F42 Change Proposal: Opt-In Metrics for Fedora Workstation (system-wide)

I was responding to Andy there.

I think you have a very strict definition of “opt-in.” :smiley: I guess we could remove this from the title of the proposal, but I fear that’s likely to cause more confusion than anything…

Michael and I are saying the same thing, from a different perspective.

Management for the desktop team would like this, and that team are committed to working on and contributing to Fedora, and this is one way in which they would like to do that. As Michael is part of that team, this is indeed Red-Hat-assigned effort. But that doesn’t mean it comes from some nebulous, faceless corporate entity. In fact, if you ask that corporate entity through Red Hat’s public relations team, I guarantee their reaction would be confusion and someone asking me on company Slack what this is all about.

To be honest, to some degree I wish IBM cared about desktop Linux enough to have some sort of official involvement in what we do in Fedora Workstation. There are a lot of smart people and great engineers at IBM! But there absolutely is no part of this plan that comes down in that way.

I’d understand this sentiment if the “accept” button was given more weight, like in the previous proposal, but that’s no longer the case. With it being a balanced yes/no question, it can definitely be reasonably classed as “opt-in”. In practice, “no by default, yes if explicitly told” is pretty much the same as “no if explicitly told, yes if explicitly told, no default”.

Under other circumstances, I maybe could agree :wink: But this proposal is sensitive, and the first attempt ended in a massive rejection of opt-out. However, one can argue that this proposal is as much opt out as it is opt in. I fear potential perceptions in the aftermath, and that it undermines trust.

Whether people can misinterpret the outcome of the proposal as “I have to opt in to get the product” depends, in the end, on the implementation. But I am not sure at the moment that the sensitivity of the topic among people and cultures is being considered sufficiently, so the outcome may cause that issue, and I wanted to raise the point now. I agree that this is not in itself an argument against the proposal. Yet, if the proposal is accepted, I would like to suggest also seeking public confirmation of, and consensus on, the implementation, to further increase transparency and reduce users’ insecurity and suspicion.

This is one of the things that I find mystifying. Why lean on Endless OS when Red Hat already have a means to collect user data: RHEL?

My guess is that management, or even users, at paying companies are not as concerned about the acquisition of user metrics as the general public are.

While our outfit were concerned about taking care of personal identity information, they were at least equally concerned about security and system support. I don’t know about you, but I was an RHEL and pre-RHEL Red Hat user at a company long before I started using Fedora Workstation itself, and I used it exactly as I use Fedora Workstation now. That is, other than having to install my own applications due to how far our RHEL versions were from current upstream application software.

I asked about this during the last discussion phase. If there was ever a clear answer to that question in that discussion thread, I missed it.

Do you think that your user community would’ve pushed strongly for PipeWire? Or systemd? Or btrfs? Or discontinuing support for X11 on KDE Plasma? Maybe. But my take is that it is the Fedora Project that is pushing these frontiers, not users. And that is perfectly consistent with the mission of the project, to push boundaries with a developer focus. You folks don’t need user metrics to imagine that users would like better, more comprehensive, more contemporary audio tools and video capabilities. They definitely do! And I can sure imagine that your project developers have plenty of other ideas along similar lines.

Fedora and Endless OS are probably much closer to each other in userbase than either is to RHEL. My understanding is that, as an individual, you aren’t using RHEL unless you’re developing stuff directly for it. The userbase of RHEL is companies and enterprises, and even those don’t need to use RHEL for everything - Red Hat uses Fedora internally, for example! I believe they rolled their own Fedora spin, which had a specific use case?

If you want feedback which is more representative of the kind of OS Fedora is, EndlessOS sounds more appropriate compared to RHEL.

I don’t dispute some of your other points, but I thought I understood that this proposal was about getting Workstation user information with the goal of making it more competitive with distros like Ubuntu. If so, RHEL users are exactly what you’d want.

Businesses like the place I worked at, an engineering concern, forced this on us as the only Linux option available alongside Windows and macOS; we had no choice beyond those three. We were not developing for RHEL, but using it as a DE for more general work. Some of it was software development for more general engineering software products, but even for that the targets were far broader than just RHEL.

Even if Fedora had some way of peeking at Red Hat user data (we don’t — not even those of us who work for Red Hat!), I don’t think it would be very useful. The personas are very, very different.

But I want to stress another thing: this proposal is not about getting user information. It is about getting system information.

Is that accurately reflected in this?

Some have expressed their dislike of this system, because a default of “Disagree/Disabled/Off/Out” would be messaging a “privacy-first” OS.

I personally find the proposed model perfectly acceptable, and by the same reasoning I’d say the proposed model is “choice-first” which is absolutely fine if not better for my liking.

My only concern is, how are kickstart-based (GNOME) installs handled?
Are they exempt as non-Workstation installs?

If not exempt (otherwise ignore as not-applicable), you said earlier:

Is this the same setup step that firstboot handles?
And if so, does this mean that a kickstart not specifying firstboot --enable is opted-out by default because

If not specified, this option is disabled by default.

The old adage is “If you don’t measure it, it did not happen”. I am in favor of measuring.

However, I am very concerned about using an opt-in approach to gain any statistically meaningful insights (perhaps slightly better than someone saying “I think…”, but certainly not anything that anyone can draw any statistically valid conclusion on or take any action based on).

Opt-in almost always results in what is referred to as self-selection or volunteer/voluntary response bias, and the accepted consensus is that such biases can (will) result in wrong conclusions unless you add in significant additional validation work after the fact (basically do the hard survey work you are trying to not do, or eliminate the opt-in choices).

I am not going to claim either of these would happen, and they are obviously extreme examples (there are a number of more subtle ways to accomplish something equivalent), but imagine:

  • Only those with x86-64-v3 or x86-64-v4 systems opt-in, leading to a conclusion to drop anything older than x86-64-v3

  • To game the results, someone spins up billions and billions of VMs simply to make LXDE look to be the most popular DE

How would the opt-in approach detect these cases and throw the results out such that the conclusions based on any opt-in approach could be useful?

My belief is we should add opt-in metrics, and never, ever, use them as anything more than anecdotal suggestions (and if someone tries to claim that the opt-in metrics supports their plans, they should be rightfully castigated and pilloried about their failure to understand basic rules about research and statistics).
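
To make the self-selection point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names are hypothetical, and the opt-in rates are made-up numbers, not anything measured on Fedora. It only shows how the share seen among opted-in systems drifts away from the true share when the willingness to opt in correlates with the thing being measured.

```python
import random


def expected_observed_share(true_share, optin_with, optin_without):
    """Expected share of a trait among opted-in systems.

    true_share:    real fraction of systems that have the trait
    optin_with:    opt-in probability for systems that have the trait
    optin_without: opt-in probability for systems that lack it
    """
    with_trait = true_share * optin_with
    without_trait = (1.0 - true_share) * optin_without
    return with_trait / (with_trait + without_trait)


def simulate_observed_share(true_share, optin_with, optin_without,
                            n=100_000, seed=0):
    """Monte Carlo version of the same estimate, using a seeded generator."""
    rng = random.Random(seed)
    opted_in = 0
    opted_in_with_trait = 0
    for _ in range(n):
        has_trait = rng.random() < true_share
        p_optin = optin_with if has_trait else optin_without
        if rng.random() < p_optin:
            opted_in += 1
            opted_in_with_trait += has_trait
    return opted_in_with_trait / opted_in


# Hypothetical numbers: 40% of systems have the trait, but their owners
# opt in at 30% while everyone else opts in at 5%.
print(expected_observed_share(0.4, 0.30, 0.05))
print(simulate_observed_share(0.4, 0.30, 0.05))
```

With those made-up rates, a trait that 40% of systems really have shows up on about 80% of the opted-in systems. If both groups opt in at the same rate, the observed share matches the true share, which is exactly the condition an opt-in design cannot verify from the collected data alone.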

Just to be clear on the directed part: as the person managing the Red Hat team working on desktop and workstation, I asked Michael and Allan to work on this as part of their jobs at Red Hat after seeing Rob McQueen’s GUADEC 2019 talk about metrics in EndlessOS.

What I want to see come out of this effort is enough data to both direct our development efforts towards what provides Fedora users the most value, help make technical decisions on things like which GNOME extensions to enable by default, help drive further investment from hardware partners and help drive more investment in Fedora in general. The overall goal for all of that is to see strong userbase growth for Fedora, both because as someone who has been using Fedora and RHL exclusively as my desktop for 25 years I want to see it continue to prosper and because I believe that the more successful Fedora Workstation is the more successful Red Hat Workstation is.

Confusingly, “Initial Setup” and gnome-initial-setup are two completely different tools. I’m not certain, but I think that kickstart setting will disable them both, though. gnome-initial-setup checks /etc/sysconfig/anaconda and reads the post_install_tools_disabled key from the [General] section. I assume that’s what kickstart is setting.

At any rate, if gnome-initial-setup does not run, you of course wind up without metrics.
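
For anyone who wants to check a kickstarted machine, the lookup described above can be sketched in Python. This is not how gnome-initial-setup itself is implemented; the function name is hypothetical, and the sample file content (including the value being written as 1) is an assumption about what the file looks like. Only the path, the [General] section and the post_install_tools_disabled key come from the description above.

```python
import configparser


def post_install_tools_disabled(text: str) -> bool:
    """Return True if the given INI text disables the post-install tools.

    A missing section or key is treated as "not disabled".
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getboolean(
        "General", "post_install_tools_disabled", fallback=False
    )


# Assumed example content of /etc/sysconfig/anaconda after a kickstart
# install that did not enable firstboot.
sample = "[General]\npost_install_tools_disabled = 1\n"
print(post_install_tools_disabled(sample))
```

On a real system you would read the text from /etc/sysconfig/anaconda and pass it in. If the function returns True, the setup tools are skipped, the prompt is never shown, and so metrics stay off.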

It wouldn’t. I share your concerns here. Whether the data will be useful or not is going to depend on how many users opt in.

I guess this is directed more at @andyants rather than me :classic_smiley:

As a longtime Fedora Workstation user, I felt the need to create an account to chime in here. I can’t speak for everyone, but I really appreciate the effort taken to work with the community, hear their feedback and address any concerns in a kind manner. This dynamic is one of the things that makes Fedora truly special.

I care a lot about privacy, but I also recognize the value that low-level system information can provide to the developers. I want to see Fedora prosper and continue improving, which is why I voted "In favor, with reservations". My suggestion would be to include a system similar to KDE’s user feedback reporting: if the user decides to opt in, then add an additional slider that pops up to give them additional control over what data to submit. Many users will want to provide some data, but some may feel uneasy providing access to everything that’s listed. Adding this slider will further reassure users that they remain in control, since submitting data can be a granular setting, rather than all or nothing. This way, you can also get some data from those who would opt out entirely without this option. Those that would have opted in will likely submit everything either way, which is a win-win.

I truly believe that with this modification, this metrics system will receive a more positive response from the community, which will encourage more people to opt in long term. Fedora has proven that it values its community and building trust with it. The more trust that can be built, the better the outcome will be. :slightly_smiling_face:

I don’t think you’ll get any disagreement with this; nor will you find many in this thread who don’t want Fedora Workstation to be more successful.

My skepticism is about whether the move to putting software components for metrics into the Fedora Project suite would be counterproductive. I definitely don’t want it for myself, but I certainly recognize that I’m small fry as well.

I think it’s going to give a fair number of people incentive to move to/back to Debian. I guess we’ll find out.

@uraeus, I fear you’re far too optimistic about the quality of opt-in telemetry data and what kind of conclusions could be drawn from it.

To be honest, I’m extremely surprised telemetry is still being considered for Fedora after the backlash last year. The community rejected opt-out telemetry, but the goals of this proposal have stayed the same. @catanzaro stated there’s now a higher risk the data will not be representative and @gtb wrote a good post about the statistical importance of opt-in telemetry - or rather, the lack of it.

I’m sure we’ll still get some nice looking percentage based graphs that state something like “42% of Fedora users use Totem as their video player” or “76% of Fedora users open the Settings app at least once a week”. Unfortunately the graphs will have very little meaning. A classic case of lies, damned lies, and statistics.

My hunch is that having OS level opt-in telemetry in Fedora is not going to be any more useful than running totally non-controversial community surveys. Can we just do those instead, please?

1. “Opt-In”

Title

Opt-In Metrics

Content

we now propose that initial setup will show an explicit yes/no prompt which has no default value

That’s NOT opt-in, that’s just opt any. There’s no strict definition of opt-in, as there’s no strict definition of opt-out; there is just one definition:

  • opt-in: out by default, the user can opt-in; if a user can “opt in”, clearly there’s an opposite default;
  • opt-out: in by default, the user can opt-out; if a user can “opt out”, clearly there’s an opposite default.

This is how the world has seen it for years.
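
The distinction can be written down as a small sketch in Python. The names are hypothetical and this is not taken from any Fedora component; it only encodes the definitions above plus the forced yes/no prompt, under the assumption (stated elsewhere in this thread) that a prompt which is never shown leaves metrics off.

```python
from enum import Enum
from typing import Optional


class ConsentModel(Enum):
    OPT_IN = "opt-in"          # out by default, the user can opt in
    OPT_OUT = "opt-out"        # in by default, the user can opt out
    NO_DEFAULT = "no-default"  # the user must answer yes or no


def metrics_enabled(model: ConsentModel, answer: Optional[bool]) -> bool:
    """Effective state; answer is None when the user never answered."""
    if answer is not None:
        # An explicit answer wins under every model.
        return answer
    # No answer given: only opt-out leaves metrics switched on.
    return model is ConsentModel.OPT_OUT


for model in ConsentModel:
    print(model.value, metrics_enabled(model, None))
```

In this sketch the three models differ only when no answer is given. Whether a forced choice counts as opt-in therefore depends entirely on how the question is worded and presented, which the sketch cannot capture.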

If a user is forced to choose one or the other, then that’s just not the definition of opt-in, nor of opt-out. This is another “light” dark-pattern use case meant to create confusion for the user, especially in association with other dark patterns like a confusing question statement, or highlighting the opt-in option to the detriment of opt-out. At this moment, no information posted on this topic, nor on the “Privacy and Transparent Checklist”, clearly states the content of the question the user is asked or the answers they would have to pick. (Yes, there’s something defined as a “yes/no prompt”, but that’s not clear enough: since we don’t know the question, a yes/no prompt might mean “Agree”/“Disagree”, “Accept”/“Dismiss”, or “I like ice cream”/“I don’t like ice cream”.) Nor is there a UI/UX draft that would ensure the user is not influenced in any way towards one option or the other, other than by their own thought process regarding the clearly stated textual content provided.

2. Later Edit: I’d also like to add my not-that-technical feedback on the technicalities of how the opt-in/opt-out process is going to work.
a) Will there be a configuration file that would enable/disable the telemetry, or will it all be done through some kind of packaging installation logic?
b) Are the telemetry components packages installed by default, or will they be installed only when the user chooses to opt in?
c) If they are installed by default, will they collect data by default, even if the user hasn’t opted in?

d) I think this feature should be enabled in two steps, configuration file and components, each with different owners, this should mitigate to a certain point the risk of pushing malicious options;
e) To make it more clear that this feature is kind of a “meta” over the distro, the packages should be distributed in their own separate repos, different from the classic Fedora repos; this would also ensure other normal packages from the normal repos won’t make any kind of hard dependencies and enable the analytics by default; if there’s no repo installed, the package can’t be installed, and if the malicious package tries to install the repo, at least the repo key acceptance should pop-out, and give the user a chance to think about it;
f) I like the idea of having the kind of system used in Fedora third-party repositories. sudo dnf install fedora-meta-repositories, sudo dnf config-manager --set-enabled system-metrics, sudo dnf config-manager --set-disabled server-metrics.

We will have to see how well the current approach works in terms of getting representative data. But having data is critical. For instance, if we had had this years ago, we would likely have had better support from HW vendors much sooner; it turned out that saying ‘we believe you have a lot of Linux users on your hardware’ wasn’t very convincing in itself to get them to improve support. And if we can demonstrate that, for instance, certain SKUs are popular with Fedora users, then we can get those put on the list of HW that vendors try to support better with Linux.

It will also let us put endless discussions about which GNOME extensions users actually care about to rest and stop making decisions just based on hunches and anecdotal evidence.

And maybe that will be part of the driver for opting in, because if you are not opted in then your wishes and preferences will of course not be reflected at all in the statistics and thus not affect the decision making. Kinda how if you didn’t vote you don’t really have a right to complain about the outcome of an election.

I find that to be a bit backward…
SKUs become popular with Linux users (Fedora or otherwise) because the support is good.

If a vendor puts nightmare-fuel HW in their device/SKU because it saves them $.5 per unit on licensing (or more generally cost), I can promise you they do not and will not care about Linux users. And as a result their device will not be popular with Linux users.
