✅ Proposal: Require just the standard set of apps to be release-blocking on Workstation x86_64, not all apps

✅ Update: This proposal has now been put into action.

Hi everyone, this is a proposed change to the Fedora Release Criteria. It is related to a wide set of changes that the Fedora Quality team needs to make this cycle, which are summarized here, so please feel free to read that for more background information and a general overview, thanks.

Proposal: No longer consider basic functionality of all apps on the Workstation x86_64 image to be release-blocking. Instead, make the requirement exactly the same for all release-blocking desktops, i.e. basic functionality of just selected apps would be considered release-blocking equally everywhere.

You can see the current release criterion here (expand the footnotes sections). It reads:

For all release-blocking desktop / arch combinations, the following applications must start successfully and withstand a basic functionality test:

  • web browser
  • file manager
  • package manager
  • image viewer
  • document viewer
  • text editor
  • archive manager
  • terminal emulator
  • problem reporter
  • help viewer
  • system settings

Basic functionality means that the app must at least be broadly capable of its most basic expected operations, and that it must not crash without user intervention or with only basic user intervention.
Additionally, for Fedora Workstation on the x86_64 architecture, all applications installed by default which can be launched from the Activities menu must meet this requirement.

Under this proposal, the last sentence would be dropped. That means that only the apps in the list would be release-blocking (their basic functionality, to be precise) on release-blocking desktops, regardless of the Edition/architecture/etc.

There are multiple long-standing issues with this criterion as currently written:

  1. It’s one of the most heavily contested criteria during blocker reviews. That stems from the fact that we haven’t found a way to define “basic functionality” clearly and consistently across multiple applications. People then argue whether e.g. drag-and-drop is a basic functionality of a file manager, or a working alarm chime a basic functionality of a clock app, or image cropping a basic functionality of a photo manager. Of course, the bugs are often more nuanced and happen only under certain conditions (which might or might not be frequent use cases). This is compounded by the fact that these bugs are often found in less prominent applications, like gnome-maps, gnome-contacts, gnome-calendar, and others. Many then feel that while the bug does affect a basic functionality of said app, it doesn’t feel serious enough to block the whole Fedora release. These discussions get difficult and long.
  2. The amount of time required to perform the testing is very high. Even basic functionality testing can take a long time, when there’s a long setup phase, e.g. you need to connect some online accounts, prepare some local data set like a photo or music collection, have a remote system to connect to, etc. Because basic functionality is just vaguely defined, you need to test more rather than less, and some applications have a lot of functionality. Testers avoid doing this test case, because it’s very time consuming and complex.
  3. Frequently, bugs in these extra applications (on top of the basic list) are discovered very close to the Final release, because of the associated issues and time requirements. That makes the whole process frustrating for both quality testers and developers, because both are then under pressure to report, fix, and verify the issues extremely fast. This causes pre-release crunch or release slips.
  4. Automating these tests via openQA is desirable due to point 2 above, but is a heavy task in itself and takes up time we could arguably more profitably spend on other work. Even once done, these tests require quite a lot of maintenance as they break when the app changes behavior (e.g. the Calculator app’s button layout changing in GNOME 49) or font rendering, GTK button rendering etc. change.

By reducing the scope on Workstation x86_64 from all apps to selected apps, we will reduce the likelihood of long blocker arguments, pre-release crunches and stress. The quality of these apps might be affected, but those are less critical apps. The most important apps, as defined in the list, will still continue to have full coverage and their blocker status unaffected.

Originally, we covered all apps mostly because occasionally some Fedora reviewer found a broken app and complained (even if it was fixed promptly, it persisted in the review). However, we question whether the current approach is worth the benefit, and more importantly, we currently lack the manpower to test all the apps.

Important note: If you’re not very familiar with the release criteria process, please read this. Reducing a release criterion doesn’t mean breaking the apps. This change doesn’t mean that all apps except those listed will suddenly be released broken in the next Fedora release. The difference is that if a problem is found, it will not be considered critical enough to block the release of the next upcoming Fedora. Instead, it will be resolved as any other standard bug (which are resolved every day, as you can see in a continuous stream of Fedora updates coming to your system regularly).

It’s bittersweet to lower our quality standards, but the bugs caught by this criterion are almost never serious enough to justify blocking a Fedora release, and the scramble to fix them so that we can release Fedora is not fun. So independent of any Fedora QA resource considerations, I agree this makes sense to do. I would consider adding “calendar” to the list of apps that should survive a basic functionality test, though.

I think Workstation Working Group can monitor other default apps that are in an unhealthy state.

It’s less bad if we can keep the GNOME stack updated throughout the life of a Fedora release. Getting those bugs reported and fixed is still important, but it means nothing if updates aren’t shipped to fix them after upstream work is done.

One clarifying question: are we asking the SIG maintainers to identify ahead of time the “Web Browser” and “Text Editor”, etc. and maintain that list? Alternately, if a particular Spin carried more than one of those application types, is it acceptable if any one of that category works?

(For a hypothetical example, suppose that the Workstation media shipped with both Firefox and Chromium installed by default. By coincidence, it turns out that Firefox won’t actually run because there’s a GTK mismatch, but Chromium works perfectly. Is that Firefox bug blocking? What if it was the other way around?)

The criterion covers that: “If there are multiple applications of the same type (e.g. several web browsers), the primary/default one must satisfy the requirements. If the primary/default application can’t be determined, at least one of said applications must satisfy the requirements.”

We’ve been through this with Fedora KDE. It means that the team responsible for the deliverable must identify which ones they are, it cannot be just any one of them. Workstation was the only deliverable that was never required to enumerate this.

That’s not what the criterion says, see Adam’s reply above. It can be changed, of course, but currently it really says it needs to be obvious (for example, when I double click on an image file, the image viewer that opens by default is the primary one; if I’m presented with several options and no default selection, none of them is default), otherwise at least one (any one) of them has to work. See “Determining primary applications” footnote under the criterion text.
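To make the rule quoted above concrete, here is a minimal sketch of how the "primary application" decision could be expressed. It is only an illustration of the criterion text as discussed in this thread; the function name `category_passes` and the way test results are represented are hypothetical, not part of any Fedora tooling.

```python
from typing import Dict, Optional


def category_passes(results: Dict[str, bool], default_app: Optional[str]) -> bool:
    """Decide whether one app category (e.g. "web browser") meets the criterion.

    results     -- hypothetical mapping: app name -> passed the basic functionality test
    default_app -- the obvious primary/default app, or None if it can't be determined
    """
    if not results:
        # Assumption: the quoted rule does not cover a category with no apps at all,
        # so this sketch refuses to guess.
        raise ValueError("no applications of this type are installed")

    if default_app is not None and default_app in results:
        # A primary/default app can be determined: that one must satisfy the requirements.
        return results[default_app]

    # No determinable default: at least one of the apps must satisfy the requirements.
    return any(results.values())


# The hypothetical Firefox/Chromium scenario from the question above:
browsers = {"firefox": False, "chromium": True}
print(category_passes(browsers, default_app="firefox"))  # default is broken
print(category_passes(browsers, default_app=None))       # no default, one works
```

With Firefox as the obvious default, the broken Firefox fails the category even though Chromium works; with no determinable default, the working Chromium is enough.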

They don’t need to, because they don’t duplicate apps in a default install. If there’s overlap, there is something set as the default handler. But KDE doesn’t need to enumerate it either.

Please note that this functionality is not changing under this proposal. It will continue working as it worked in many many previous years. So I’m happy to clarify, but if you want to change this approach (of how to determine primary apps), let’s have a separate discussion topic for it.

I’m sure this will be a highly unpopular suggestion… but has anyone considered slowing down Fedora’s release cycle and focussing on more testing and verification?

(MatH quickly ducks under table)

This is (at least slightly) offtopic, but IMHO, if someone wants a slow release cycle version of Fedora that focuses on more testing and validation, they should consider using CentOS Stream.

+1

I would be happier to also see most editions just ship less. Either way I am happy to see the criteria adjusted to be more reasonably testable.

Yes. It’s been considered several times. It was never a very popular idea.

I think the test load should be reduced. However, I think the list to be tested is too general. I think it would be easier for all if each edition and spin specified which apps must be tested and the “basic” functionality that is to be tested.

This was something I explored recently, but unfortunately I discovered that 32-bit multilib was dropped, so I can’t use Steam. Xfce wasn’t an option either, as there was no Xorg.

Since there were no opposing voices, and two members of the Workstation WG were in favor of this proposal, I’ve now put it into action. The changes are visible here:

Thanks everyone for participating in the discussion!

I haven’t forgotten about this. I’m a bit uneasy about it, because the last time I tried gnome-calendar, it was a can of worms (sorry to say that), and I think we haven’t found many more critical bugs only because we simply don’t use it during day-to-day work. Also, I suspect that the user base of gnome-calendar is much, much smaller than the user base of the other listed “important” apps. So I think if we keep it in, we’ll continue having those difficult discussions about whether e.g. event duplication is a basic functionality of a calendar app and whether the whole release should be blocked. However, if the Workstation WG would really like to have it in the blocking set, we can certainly do it. Maybe it’s a good idea to first raise it during a Workstation WG meeting, to see if people are generally in favor. And it would also be great to find someone who’s really using it daily and who can tick the test case result from time to time. It would be more beneficial than us testing it once every 6 months 🙂