Discussing Fedora Docs website improvement

Speaking of search engine rankings, old docs could be kept out of the results by explicitly disallowing indexing in robots.txt (see #166).
Should we also include rawhide documentation in that list?

This was actually a bug in the git library I used. This should be fixed now.

And a few updates on the other changes discussed so far:

  • The latest URL scheme is deployed on stg and is waiting for the end of the freeze before going to prod (PR on the infra side).
  • The banner on old/rawhide docs is deployed on prod. Note that it affects all non-current versions, regardless of EOL status.
  • Versions have been removed from the navigation menu; this is also deployed on prod.

Thanks for the effort. We should wait until all of it is visible and then review the affected issues. Maybe we can close one or the other.

This would indeed strongly reduce the problem of ending up on obsolete pages. But I assume it would imply a recurring, biannual task (a very small one, but it has to be kept in mind): each time support for a release ends, the file will need to be adjusted; so /f34 would be the next to become disallowed.

I don’t know whether maintainers/developers use search engines when working on the rawhide docs, or whether they work solely with the raw files in git. Maybe we should clarify this first, just to be sure?

I was actually referring to the old docs site. But we can extend that to the new one too.
Anyway, I think we should be able to generate this file at build time so we don’t have to think about it.
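
A minimal sketch of what generating the file at build time could look like, in Python. The function name, the hard-coded version list and the `/f<N>/` and `/rawhide/` path prefixes (modelled on the /f34 example above) are assumptions for illustration only; a real build step would have to take the EOL list from wherever release status is tracked and use the site's actual URL layout.

```python
from typing import Iterable


def build_robots_txt(eol_versions: Iterable[str], include_rawhide: bool = False) -> str:
    """Return robots.txt content that disallows crawling of the given releases.

    eol_versions: release numbers as strings, e.g. ["32", "33"] (hypothetical input).
    include_rawhide: also disallow the rawhide docs (still an open question above).
    """
    lines = ["User-agent: *"]
    # De-duplicate and sort numerically so the generated file is stable between builds.
    for version in sorted(set(eol_versions), key=int):
        lines.append(f"Disallow: /f{version}/")
    if include_rawhide:
        lines.append("Disallow: /rawhide/")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example run with made-up EOL releases.
    print(build_robots_txt(["32", "33"], include_rawhide=True), end="")
```

With something like this in the build, the only thing that changes when a release goes EOL is the input list, not the file itself.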

Yes, this is a big nuisance and exclusion from indexing may help. The sooner, the better.

Another major annoyance is wiki pages that have not been maintained for years and are orphaned.

It would be good to redirect all wiki pages whose last change was before 31.12.2017 to the docs start page (there is a macro for that, but I can’t recall it right now).

The older Antora based pages are not quite as annoying, I guess.

Currently on staging: [image]

Let me know what you think :slight_smile:
This is only a proof of concept and might not be production ready. There is not much we can customize (other than the look and feel with CSS), and it may cause a small but noticeable performance hit.


Very nice! It fits perfectly and unobtrusively into the current design.

Loading the page took some time. But maybe it’s my internet connection or Safari on my Mac, which sometimes behaves rather strangely.


I love it! Very intuitive. But I also have the loading issue. After the rest of the page has loaded, it takes a moment until I can use the search field (Firefox 98.0 from the Fedora repo, Fedora 35). Maybe that’s just a problem with the test environment?

Thanks :slight_smile:

Awesome! Overall this is a huge improvement. (It’s infinitely better than our previous search options :wink: )

I definitely noticed the page load issue. Maybe we could have it be a link to a dedicated search page? The downside is adding an additional click, but it would reduce the impact of the page load times.

One thing that might improve the experience, if we have any way to change it, is to include the breadcrumb in the search results listing. The page titles are not always helpful in providing context. If we can’t change that, we should try to make sure page titles are more contextual going forward (that’s a problem to fix over time, not an immediate issue).


One big advantage of the lunr extension is that you don’t need any additional service to handle the indexing. The downside is that the index has to be downloaded to the browser, and it’s quite big for our documentation site (~22MB).
I just found out that our staging (and prod) environments don’t have gzip compression enabled, which should really help in our case by reducing both the bandwidth requirement and the loading time.

I’ve created an infra PR for that.
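
To get a rough feeling for why gzip helps so much with a JSON index, here is a small Python sketch. The payload is a synthetic stand-in that I made up, not our real index, so the ratio it prints says nothing about the actual file; the point is only that repetitive JSON text compresses well. `gzip_sizes` is a hypothetical helper name.

```python
import gzip
import json
from typing import Tuple


def gzip_sizes(payload: bytes, level: int = 6) -> Tuple[int, int]:
    """Return (raw size, gzipped size) in bytes for the given payload."""
    compressed = gzip.compress(payload, compresslevel=level)
    return len(payload), len(compressed)


# Synthetic stand-in for a search index: many similar JSON records.
fake_index = json.dumps(
    {
        "documents": [
            {"id": i, "title": f"Page {i}", "text": "Fedora documentation search entry " * 20}
            for i in range(500)
        ]
    }
).encode("utf-8")

raw, packed = gzip_sizes(fake_index)
print(f"{raw} bytes raw, {packed} bytes gzipped ({packed / raw:.1%} of original)")
```

Running the same function on the real index file (read as bytes) would give an estimate of what the web server will send once compression is enabled.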


I wouldn’t worry too much about the page load at the moment. You don’t have to wait for the download to complete before using the page; you can use it as before. And obviously the delay occurs only on the first visit. Afterwards, the cache is used.

We should discuss our concerns with the developers. I would expect them to work on fine-tuning and performance improvements. There are many ways to handle even a download as large as the index file. As I understand it, the current version is the first release, isn’t it?

Instead of introducing detours, we should look at what we can optimize locally. Maybe we can decouple the download even further from the content loading (lazy load), so that the download in progress no longer shows as activity in the address bar and the (really slight) delay in using the content area goes away. Or we can display a text like “Please wait a moment” in the search field when you click into it before the download is complete. Or we can optimize the use of the cache.

I think the current design is nice and helpful. I find it extremely positive that the original page content can be seen next to the search result. This makes orientation much easier.

Even without the aforementioned optimizations, it is already a huge improvement and good enough to put into production.


This extension is actively maintained and most of the concerns discussed here are already reported and worked on by the developers. You can follow this kind of discussion over here: https://antora.zulipchat.com

This is still an alpha version (the latest is alpha-6), and I’ve already seen some big improvements with the latest one. The index size was around 60MB with the previous version, and it’s down to 20MB now. With gzip compression, I expect something around 5MB.

Some unrelated minor changes you can see on staging:
You’ll now find 3 icons on the top right: “Page history”, “Edit” and “Report an issue”.
[image]


I forgot to mention in our last meeting that the gzip compression is now enabled on staging.
Let me know if you see any improvement in loading time.


I have the impression it’s a bit faster, although still a bit slow.

But the loading time does not hinder normal use of the pages. You can do everything that you could do before, when there was no search function. The search function does not bring any disadvantage. Therefore I think we should introduce it soon. What would we gain from a delay?


First of all, thank you for enabling search!
That is a huge improvement, and was long overdue.

Apart from the longish load time,
one thing I have noticed is that the context of the hits is often unclear.
For example, here are some results from searching for selinux:
[image]
The two identical Troubleshooting rows are actually for Silverblue and Kinoite.
It would be useful to have that information visible in the results,
so that users of those variants know to pick that result,
and others know to ignore them.

Is there a way to add the component name or something similar to the results?

I largely agree, but not fully:

Do we expect people with limited traffic or slow connections to use the docs? I know that in most of Europe and the US we no longer think in such terms (MB of traffic), but that is not the case everywhere or for everyone. Generally, I think the majority of people who use the docs are in a situation comparable to ours, which would make me agree with @pboy, but I am nevertheless somewhat cautious. What do you think? Btw, how many MB is it with gzip?

The gzipped index is ~6MB.

In this case you can ignore my last post :partying_face:

Yesterday, I deployed the latest version of the search extension, which brings us a few improvements like a reduced index size (it’s a little less than 4MB now) and more contextual information in the results:

You’ll also find a checkbox allowing you to search only within the current doc component.


My impression is that it is very fast now. Is the index file loaded in the background when the page is loaded? Or does loading start only when you perform a search?
