Exploring our bugs, part 1: the basics

bcotton · August 17, 2021, 8:00am

Originally published at: Exploring our bugs, part 1: the basics – Fedora Community Blog

This is the first part of a series I promised during my Nest With Fedora talk (also called “Exploring Our Bugs”). In this post, I’ll review some basic statistics from analyzing bug reports filed against Fedora Linux 19 through Fedora Linux 32. If you want to do your own analysis, the Jupyter notebook and source data are available on Pagure. These posts are not written to advocate any specific changes or policies. In fact, they may raise more questions than they answer. This first post covers counts, priorities, and duplicates.

Counts

The obvious first question is “how many bug reports do we get?” The first thing I did was to plot the number of bug reports in each release, excluding duplicates.

You can see a general downward trend over time. This sounds like good news, but it may not be good. Karl Fogel says in Producing Open Source Software that “an accessible bug database is one of the strongest signs that a project should be taken seriously —and the higher the number of bugs in the database, the better the project looks”. So is the decrease in bug reports a reflection of fewer bugs or is it a reflection of less user engagement?
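The per-release count described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the notebook: the record layout, the field names `version` and `resolution`, and the sample rows are all assumptions made for the example. The only thing taken from Bugzilla itself is that duplicates are closed with the resolution `DUPLICATE`.

```python
from collections import Counter

# Hypothetical sample records; the real export's field names may differ.
bugs = [
    {"id": 1, "version": "31", "resolution": "ERRATA"},
    {"id": 2, "version": "31", "resolution": "DUPLICATE"},
    {"id": 3, "version": "32", "resolution": "EOL"},
    {"id": 4, "version": "32", "resolution": "DUPLICATE"},
    {"id": 5, "version": "32", "resolution": "NOTABUG"},
]


def count_by_release(records):
    """Count bug reports per release, skipping those closed as duplicates."""
    return Counter(
        r["version"] for r in records if r["resolution"] != "DUPLICATE"
    )


counts = count_by_release(bugs)
# Sort numerically so that, for example, release 9 would come before 10.
for version in sorted(counts, key=int):
    print(f"F{version}: {counts[version]}")
```

Feeding the resulting counts to any plotting library would give a chart of non-duplicate reports per release.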

What components have the most bugs filed against them over this time period? You’ve probably heard of all of them.

Component	Bugs
kernel	9028
selinux-policy	6477
gnome-shell	3645
anaconda	3079
dnf	1925

Top 5 components with the most non-duplicate bugs

What components have the fewest bugs filed against them? 10,549 components (or 85.95% of components with at least one bug report) have fewer than 10. 12,082 (98.44%) have fewer than 100.
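As a rough sketch of how those shares could be computed, the function below counts the components under a threshold. The component names and counts are made up for illustration. The last lines are a sanity check on the figures quoted above: both percentages should imply about the same total number of components with at least one bug report.

```python
def share_below(bug_counts, threshold):
    """Return (number, percentage) of components with fewer than `threshold` bugs."""
    below = sum(1 for n in bug_counts.values() if n < threshold)
    return below, 100 * below / len(bug_counts)


# Hypothetical per-component bug counts, for illustration only.
bug_counts = {"pkg-a": 3, "pkg-b": 7, "pkg-c": 42, "pkg-d": 150}

for threshold in (10, 100):
    number, percent = share_below(bug_counts, threshold)
    print(f"fewer than {threshold}: {number} components ({percent:.2f}%)")

# Sanity check on the published figures: total components implied by each pair.
implied_from_10 = round(10549 / 0.8595)
implied_from_100 = round(12082 / 0.9844)
print(implied_from_10, implied_from_100)
```

If the two implied totals agree, the published counts and percentages are consistent with each other.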

Priority and severity

This was perhaps the most surprising part of the analysis. I had assumed that all bug reports would be marked as urgent. That was entirely wrong. Bugzilla defaults to unspecified for both priority and severity, so most bug reports never have either field set. Looking only at the bugs that do have a value, we see a reasonable distribution. A small number are “urgent”, more are “high”, the most are “medium”, and fewer are “low”. The “low” bugs are probably under-reported, which would explain why there are fewer of them: many people won’t bother filing trivial reports.

Bug reports by priority

Bug reports by priority (excluding unspecified)

Bug reports by severity

Bug reports by severity (excluding unspecified)
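The “excluding unspecified” view can be reproduced with a small filter. The sketch below assumes the priority of each bug is available as a plain string with the values named in this section, and the sample list is invented for the example. The same function would work for severity.

```python
from collections import Counter

# Display order, from most to least pressing.
PRIORITY_ORDER = ["urgent", "high", "medium", "low"]


def priority_distribution(priorities):
    """Percentage of bugs at each priority, ignoring 'unspecified'."""
    counts = Counter(p for p in priorities if p != "unspecified")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {p: 100 * counts[p] / total for p in PRIORITY_ORDER}


# Hypothetical sample: most reports keep the Bugzilla default.
sample = (
    ["unspecified"] * 6
    + ["urgent"]
    + ["high"] * 2
    + ["medium"] * 4
    + ["low"] * 3
)

distribution = priority_distribution(sample)
for priority, percent in distribution.items():
    print(f"{priority:>7}: {percent:.1f}%")
```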

Duplicates

I’ve mentioned duplicates a couple of times in this post, so let’s look at duplicate bugs. It turns out the number of duplicate bugs has held relatively steady, despite a drop in overall reports.

Which components get the lowest percentage of duplicates?

Component	Duplicates
xen	0.65%
ansible	0.77%
389-ds-base	1.05%
btrfs-progs	1.22%
synergy	1.22%

Top 5 components by lowest duplicate percentage

And which components have the most? Well, 63 components had only duplicates. This was often due to bugs being marked duplicates of Rawhide bugs or bugs filed against other components.

Finally, I wanted to know if there was a relationship between reports and duplicate percentage. My hunch was that the percentage remains relatively constant, perhaps with an increase as the number of bug reports gets large because it’s harder for users to find an existing bug to attach to. Instead, it seems there’s a drop as you have more reports. This is probably because the triage gets more difficult. There are just as many (or more) duplicates, but nobody has marked them as such.
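To look at that relationship yourself, one option is to compute the report count and duplicate percentage for each component and then compare the two. The following is a minimal sketch under the same assumptions as before: the field names `component` and `resolution` and the sample rows are hypothetical, and it is not the analysis from the notebook.

```python
from collections import defaultdict


def duplicate_stats(records):
    """Map each component to (total reports, percent closed as DUPLICATE)."""
    totals = defaultdict(int)
    dupes = defaultdict(int)
    for r in records:
        totals[r["component"]] += 1
        if r["resolution"] == "DUPLICATE":
            dupes[r["component"]] += 1
    return {c: (totals[c], 100 * dupes[c] / totals[c]) for c in totals}


# Hypothetical sample records, for illustration only.
records = [
    {"component": "pkg-a", "resolution": "ERRATA"},
    {"component": "pkg-a", "resolution": "DUPLICATE"},
    {"component": "pkg-a", "resolution": "EOL"},
    {"component": "pkg-a", "resolution": "NOTABUG"},
    {"component": "pkg-b", "resolution": "DUPLICATE"},
    {"component": "pkg-b", "resolution": "DUPLICATE"},
]

stats = duplicate_stats(records)

# List components from most to fewest reports to eyeball the trend.
for component, (total, percent) in sorted(
    stats.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{component}: {total} reports, {percent:.2f}% duplicates")

# Components where every report was closed as a duplicate.
only_duplicates = [c for c, (_, percent) in stats.items() if percent == 100]
print(only_duplicates)
```

Plotting the total against the percentage for every component would show whether the duplicate share really falls as the number of reports grows.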

What next?

In upcoming posts, I’ll look at how bugs are closed. Are our users happy? I’ll also review our time-to-resolution stats. In the meantime, you can explore the data yourself, or look at my slides for more tables. If you have theories to explain anything you see in this post, let’s discuss in the comments.

vondruch · August 18, 2021, 4:38pm

Just FTR, I suspect that there will be even fewer errors reported, because ABRT has not been useful since RPM started to use zstd. Previously, it was possible to generate the backtrace on the server and so avoid polluting your computer. That is (AFAIK) not possible anymore.

bcotton · August 18, 2021, 4:42pm

Oh, that’s interesting. The zstd switch was in F31, which had an increase in total reports over F30. But F32 had a big drop, particularly in the abrt reports.

abitrolly · August 18, 2021, 7:00pm

Images are not loading for me.

bcotton · August 18, 2021, 7:01pm

Yeah, that’s Discourse trying to be clever and failing. Click the link at the top to view the post on WordPress and you’ll see the images.

abitrolly · August 18, 2021, 7:07pm

Are there counts of bugs that are automatically closed because of EOL, compared to those resolved some other way?

bcotton · August 18, 2021, 7:10pm

That’s part two.
