Time to retire the old Docs website

When searching the internet for instructions, fixes, or guides for Fedora Linux, the search results sometimes point to the old docs platform (Release Notes).

With all the changes in Fedora over the past 5 years, much of the pre-F26 documentation is outdated: it doesn’t apply to current releases, and many instructions simply don’t work anymore.

Maybe it’s time to retire and finally unpublish the old docs websites?

None of the releases listed there is still supported and, as far as I can see, the vast majority of the docs have been ported over to the “new” Docs system.

So why not kick it? Or at least move it to something like archive.docs.fp.o and prevent Google and other search engines from indexing the site.

Removing obsolete pages from search engine indexes using robots.txt has already been discussed; the related part of that discussion began with this post.

@darknao I expect you currently have to focus on the GitLab migration, but am I right to assume that it is already planned to remove unsupported/obsolete docs pages from search engine indexes anyway?

That is already part of our team’s to-do list (see e.g. Hey docs team — I need a volunteer to drive the removal of old docs - #8 by mattdm). The current plan is to convert the old documentation (F26 and earlier) to PDF and store it in the corresponding repository along with the DocBook XML source text. That way, someone can easily retrieve the text if it is needed for some reason, and we can check whether the text or parts of it could be useful for the current documentation. Once that is done, we will ask infrastructure to remove the complete web page subtree.

The problem is that no volunteer has picked up the task so far. Therefore, the task is still on hold (wink, wink).

Is there a reason why PDFs are needed? Generally, it is always good to have them, but given the many open issues and tasks, it might make sense to focus on a solution that is less work-intensive. Otherwise, I see the risk that it takes another three years to get rid of these pages.

@augenauf's idea of just moving it to archive.docs.fp or something like that (which might be done in conjunction with adding a robots.txt to remove it from search engine indexing) might be an easy and quick to implement (interim?) solution.

What do you think?
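To make the robots.txt idea a bit more concrete, here is a minimal sketch in Python that checks whether a given rule set would keep crawlers away from the old docs while leaving the rest of the site crawlable. The rule itself is only an assumption derived from the old URL layout (`/en-US/Fedora/...`), and the helper name `is_crawlable` is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every crawler from the old docs subtree.
# The path prefix is an assumption based on the old Docs URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en-US/Fedora/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if the given robots.txt rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    old = ("https://docs.fedoraproject.org/en-US/Fedora/24/pdf/"
           "Installation_Guide/Fedora-24-Installation_Guide-en-US.pdf")
    print(is_crawlable(old))                                         # old docs
    print(is_crawlable("https://docs.fedoraproject.org/en-US/docs/"))  # other path
```

Note that this only verifies the rules themselves; how quickly search engines drop pages that are already indexed is outside of what such a check can show.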

Well, according to the latest discussion, we don’t want to leave our “history” to the Internet Archive alone, but keep the old version(s) retrievable on our own. Therefore, we want to keep the repositories. We also respect that a lot of work went into the documentation, and we don’t want to just delete and destroy it. Many parts are still up to date and can be the basis of a restructured documentation.

But for both purposes the content must be quickly accessible. PDF is one solution. There could also be others: perhaps generating readable text via a preconfigured DocBook toolchain, or a conversion by means of pandoc, or using wget/curl to create a static local copy of the web pages. This could even be script-based.

Infra would prefer to get rid of it. There are various technical circumstances that cause a comparatively large amount of work. I don’t remember the details off the top of my head. Therefore, it should be removed from the online content.
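As a rough illustration of the script-based wget variant, the following Python sketch assembles a wget call that creates a locally browsable copy of one release subtree. The start URL and target directory are placeholders, and the function names are hypothetical; the wget options used (`--mirror`, `--convert-links`, `--adjust-extension`, `--page-requisites`, `--no-parent`, `--wait`, `--directory-prefix`) are standard GNU Wget options.

```python
import subprocess


def build_mirror_command(start_url, target_dir, wait_seconds=1):
    """Assemble a wget command that mirrors one subtree as static local pages."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so the copy works offline
        "--adjust-extension",  # save HTML pages with an .html suffix
        "--page-requisites",   # also fetch CSS, images, etc.
        "--no-parent",         # never ascend above the start directory
        f"--wait={wait_seconds}",            # pause between requests
        f"--directory-prefix={target_dir}",  # where the copy is stored
        start_url,
    ]


def mirror(start_url, target_dir):
    """Run the mirror; requires wget to be installed."""
    subprocess.run(build_mirror_command(start_url, target_dir), check=True)


if __name__ == "__main__":
    # Placeholder values: adjust the release subtree and output directory.
    print(" ".join(build_mirror_command(
        "https://docs.fedoraproject.org/en-US/Fedora/24/", "old-docs-f24")))
```

The `--wait` pause is there to keep the load on the server low; someone with direct server access could skip the crawl entirely and copy the files instead.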

@pboy I was just playing a bit with wget to find easier ways to save the old Docs.

However, I found out that the PDFs already exist on the server: wget identified a /pdf/ directory for Fedora 24, 25, and 26 (I have not yet downloaded more). All of them contained well-formatted PDF files of the respective docs (I have not yet made a detailed comparison, but at first glance everything seems to be there).

The PDF files seem to be in a dedicated folder for each Fedora version, e.g.: https://docs.fedoraproject.org/en-US/Fedora/24/pdf/*

Some examples are https://docs.fedoraproject.org/en-US/Fedora/24/pdf/Installation_Guide/Fedora-24-Installation_Guide-en-US.pdf
and https://docs.fedoraproject.org/en-US/Fedora/24/pdf/System_Administrators_Guide/Fedora-24-System_Administrators_Guide-en-US.pdf.

However, general access to the */pdf folders is restricted to admins. So a full download of the directories should be done by someone with the necessary privileges, both to ensure that no PDF is missed and to avoid the heavy traffic of a deeply recursive wget run. Still, it seems that at some point in the past someone already completed the PDF task :slight_smile:

This seems to go back to Fedora 7, which is the first / oldest Fedora version contained in the old Docs: e.g., https://docs.fedoraproject.org/en-US/Fedora/7/pdf/Installation_Guide/Fedora-7-Installation_Guide-en-US.pdf . So with a little bit of luck, no PDF work has to be done.
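The URL pattern behind these examples can be written down as a small Python helper, which might be handy for checking release by release which PDFs exist. This is only a sketch: the pattern is inferred from the three URLs quoted above, so it is an assumption that every guide and release follows it, and the function name `pdf_url` is made up.

```python
BASE = "https://docs.fedoraproject.org/en-US/Fedora"


def pdf_url(release, guide, lang="en-US"):
    """Build the PDF URL for a guide, following the pattern seen in the examples.

    Assumption: all releases and guides use the same naming scheme.
    """
    return f"{BASE}/{release}/pdf/{guide}/Fedora-{release}-{guide}-{lang}.pdf"


if __name__ == "__main__":
    for release in (7, 24):
        print(pdf_url(release, "Installation_Guide"))
```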

We agreed today that we like the approach of dropping the old docs from the server and pointing people to the old fedora-docs-web repo when they need historical content. Before we do this, we’ll wait a week for comment.

I think this is a good idea to help get the documentation under control.

Should we, or can we, talk with Infra about putting in 301 redirects to /latest/? I think part of the ask from Matt was that some old docs are doing better in search than current docs. The 301s should help move the new docs up in the search results.

That would only work for the installation guide and the administrator’s guide. For all the other old docs, we don’t have a successor or direct replacement. So we would have to redirect to the docs start page, I guess.
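The redirect logic described here could look roughly like the following Python sketch. It only models the decision, not the actual web server configuration, and all target paths are placeholders that would need to be replaced with the real /latest/ locations; the names `SUCCESSORS`, `START_PAGE`, and `redirect_target` are hypothetical.

```python
from urllib.parse import urlparse

# Placeholder targets: the real /latest/ paths would have to be filled in.
START_PAGE = "/en-US/docs/"
SUCCESSORS = {
    "Installation_Guide": "/en-US/fedora/latest/install-guide/",
    "System_Administrators_Guide": "/en-US/fedora/latest/system-administrators-guide/",
}


def redirect_target(old_url):
    """Return (status, target) for an old docs URL.

    Guides with a known successor go to that successor; everything else
    falls back to the docs start page.
    """
    segments = urlparse(old_url).path.split("/")
    for guide, target in SUCCESSORS.items():
        if guide in segments:
            return 301, target
    return 301, START_PAGE


if __name__ == "__main__":
    print(redirect_target(
        "https://docs.fedoraproject.org/en-US/Fedora/24/pdf/"
        "Installation_Guide/Fedora-24-Installation_Guide-en-US.pdf"))
```

Matching on whole path segments rather than substrings keeps a guide name that merely appears inside a file name from triggering the wrong rule.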