F40 Change Request: Privacy-preserving Telemetry for Fedora Workstation (System-Wide)

Metrics uploading will be opt-in for users who upgrade from previous versions of Fedora Workstation, because we don’t yet have a mechanism to ask the user to consent to data collection after a system upgrade like we do for new installations, but metrics collection will be opt-out.

Just so everyone is on the same page and things are crystal clear: Is this a decision that Fedora is taking strictly for Workstation (as in, the builds of Fedora which ship with GNOME Shell), with telemetry being disabled for all other spins? Or is this being enforced in a really “non-contained” way that impacts all spins?

I want to be absolutely, abundantly clear that I do not approve of opt-out metrics / telemetry gathering for Fedora Budgie Spin, which is the spin I develop with the Budgie SIG, nor the components it uses. Will those developing spins have the configuration capabilities to ensure all metrics are disabled by default, as we may value the privacy of our users differently?

It is my view that metric / telemetry gathering should be primarily left to the purview of the respective applications or desktop environments. I understand some use cases, for example types of disks / storage devices where the impact of any change without metrics would be system-wide. Metric gathering in those circumstances should be opt-in, even if that means impacting the overall dataset you can work with. But by the sounds of the post, you’re already limiting your dataset anyways by making it Workstation-specific, so why not retain user trust by making it opt-in?

Accordingly, we want to know things like which IDEs are most popular among our users

That sounds more like a decision Red Hat should take with their commercial offerings and engagement with business partners.

which runtimes are used to create containers using Toolbx

This should be opt-in telemetry that they implement for their given use cases should they desire this.

We also want to know how frequently panels in gnome-control-center are visited to determine which panels could be consolidated or removed, because there are other settings we want to add, but our usability research indicates that the current high quantity of settings panels already makes it difficult for users to find commonly-used settings.

This should be implemented, opt-in, by GNOME, should they want it. Why does Fedora need this information over the project itself? Or is this data, allegedly “anonymized and aggregated”, going to be shared with third-parties to make these sorts of actionable decisions? Is it then going to be communicated to users that data they are sending will be shared with third-parties?

To retain user trust, we need an easy way for users to understand exactly what data we are collecting.

Here is an idea for retaining user trust: Opt. In.

However, we also want to ensure that the data we collect is meaningful, so gnome-initial-setup will default to displaying the toggle as enabled

Yikes.

these users would not be representative of Fedora users as a whole

Nor would it be if it’s just Fedora Workstation in the first place?

7 Likes

I fully support this, but there is one thing I’m really curious about: what it would change.

My suggestion is, when you add this, you also start some sort of scheduled blog post or newsletter about what the telemetry information says. And also the conclusions you derive from them, as well as what you plan on doing to improve user experience based on this information.

I would say one of the only reasons people don’t want this idea, is because it was used to siphon unneeded information off of users in other services, which is obviously not being done here. But also because in people’s eyes there is no upside to this. All sorts of projects ask for this information but people don’t notice any change that occurs because of the telemetry.

I think that if you make a frequent summation of what the telemetry says-- like the hardware that most people use, or a certain feature that is most used… or what you conclude from the telemetry and what you want to change in the project to improve the experience of the people, and have that published, it would solidify telemetry in people’s eyes. At least as something that can have a positive effect.

Additionally, this is also way more transparent as well. You’re saying everything that the development team is thinking, and what you’re trying to do with this information.

Other than that, I’m just really curious about how much you folks will make use of telemetry and all the changes that’ll happen because of it, and it might inspire me to add telemetry into my own projects as well if you get a lot of bang out of the telemetry info.

5 Likes

This appears to be collecting data I would personally describe as non-intrusive. Counting installs is much different than tracking which applications are used and when.

3 Likes

Relying on telemetry is a weakness and shows a lack of clear direction or vision. Decide what you want to work on and ship it, instead of relying on stats that won’t actually give you a clear picture. You know what they say - lies, damned lies, and statistics.

Want to know about your users? Ask your users. You’re getting plenty of feedback here. This seems like a great place to gather data. Yes, this is a discussion forum that’s usually only frequented by those who care about their operating system, so you’ll only get one side of the picture. But statistics will never get you a human perspective. You can’t go “deeper” with them because there’s nobody to ask - they’re just numbers. They provide an incomplete picture. What I know from my experience assisting users with computers is that they often do things in a way you don’t expect, and that data won’t show you - many years ago I “corrected” the resolution of an employee’s monitor and she asked for me to put it back to its previous value - she had set the resolution lower at one point because of poor eyesight.

Telemetry seems, ultimately, not very useful beyond counting active users, and you already get that from repo stats - we can’t really opt out of your httpd logs. Maybe scrape those instead of installing a daemon on a user’s machine that consumes CPU time and bandwidth.

At least it’s nice you’re willing to ask here, but in the future, you could stand to wait a month or two between controversial announcements, for optics, and a better reception. As people have already said, emotions are higher than normal right now.

8 Likes

Let’s be honest: the timing for such a proposal is very, very unfortunate, and this is going to add to the bad advertising all Red Hat-related products are under now. This is to suggest that the entire proposal should be crafted and announced very carefully.

I realize that the proposal authors wanted as much data as possible to drive their decisions and development, and I respect that desire: however, this proposal in this original form is really going to backfire.

I believe no one here truly neglects the importance of telemetry when it comes to gathering insights about a representative portion of the population of users. Data in general is always useful, either in the present or in the future, and allows novel insights that simply weren’t possible before. No one questions the importance of data when driving decisions.

However, it seems to me that the way such a representative portion of consenting users is obtained is disappointing, to say the least. The idea that a representative portion of consensus can only be achieved through an opt-in button literally suggests that the users are considered incapable of deciding, or are expected to mindlessly scroll through the menus and leave the telemetry button turned on, be it by mistake or not: otherwise, users would simply turn it on if they agreed, and that’s something the original proponents seem to know the users will not do.

From a privacy-respecting and, most importantly, user-respecting operating system such as Fedora Linux, I frankly expect user privacy rights to be actually protected by design.

My idea is the following, and it can be summarized with this principle: don’t decide for the users, let the users decide for themselves. Do not present the positive, agreeing choice by default, but do not present the negative choice either. Instead, show the user two equal-standing buttons: an ‘I agree’ and an ‘I do not agree’ button. Those buttons should not be preselected by default. Those buttons should not hide behind dark patterns or resort to any similar well-known or novel trick. They should be perceived as equal in strength. They should be unequivocally clear, and should be accompanied by a simple and straightforward explanation that, if too long, should link to a “further information” section for the more meticulous. The explanation should be simple and everything should be crystal clear. Users should not be made to feel guilty if they do not consent. And, very importantly, the buttons must not be skippable: the user cannot jump to the next section until they have made a decision. This means that no user can press ‘next next next’ and find their telemetry enabled. Users should be trusted, not tricked in any way, subtle or not.
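To make the idea concrete, here is a minimal sketch in Python of the state such a page would need to track. It is only an illustration of the "no preselection, no skipping" rule; the names `Consent` and `ConsentPage` are hypothetical and are not part of gnome-initial-setup or of any existing Fedora component.

```python
from enum import Enum


class Consent(Enum):
    """Three states: the page starts with no answer at all."""
    UNDECIDED = "undecided"
    AGREED = "agreed"
    DECLINED = "declined"


class ConsentPage:
    """Hypothetical setup page with two equal buttons and nothing preselected."""

    def __init__(self):
        # Neither 'I agree' nor 'I do not agree' is chosen in advance.
        self.choice = Consent.UNDECIDED

    def choose(self, agreed: bool) -> None:
        """Record an explicit click on one of the two buttons."""
        self.choice = Consent.AGREED if agreed else Consent.DECLINED

    def can_continue(self) -> bool:
        """The 'Next' button stays disabled until the user has decided."""
        return self.choice is not Consent.UNDECIDED

    def uploads_enabled(self) -> bool:
        """Telemetry is on only after an explicit 'I agree'."""
        return self.choice is Consent.AGREED


if __name__ == "__main__":
    page = ConsentPage()
    print(page.can_continue(), page.uploads_enabled())
    page.choose(agreed=False)
    print(page.can_continue(), page.uploads_enabled())
```

The point of the third `UNDECIDED` state is that pressing ‘next next next’ can never result in uploads being enabled: the only path to `AGREED` is an explicit click.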

I think the problem here is not the telemetry itself: the issue is the lack of trust in end users, and the resort to opt-out strategies to collect meaningful telemetry data. I strongly believe the users should be trusted (and, in this case, forced) to decide ‘Yes’ or ‘No’, clearly and without the obscure wording we typically see in some user-disrespecting commercial products. There is no justification for a lack of respect towards users, no matter what is at stake.

8 Likes

2 posts were split to a new topic: Thoughts about the proposal to use Discourse for Change discussion

Yes.

No, spins decide for themselves what they want to ship and what features to enable.

1 Like

Fedora and GNOME have been fine with next to no telemetry so far. Why does it need it now, all of a sudden?

3 Likes

Regarding scope, my intention is that we’ll create separate proposals for each piece of data to be collected. Approval of this change proposal would only approve the infrastructure for collecting data, and not any actual data collection itself: that would still need to be approved separately, and specifically. The “What data might we collect?” section of this change proposal, plus my other comments in this thread, is a good starting point to see what sorts of data I intend to propose in the future. Another good document to review would be the events documentation, since that shows what the system is already capable of collecting (although Fedora would probably not want to collect this much data).

Regarding purpose, the purpose of each data point will be different and the “What data might we collect?” section of the change proposal shows several examples of what the purpose behind data collection might be. Another good example that I mentioned earlier in this discussion is how GNOME recently decided not to drop support for pre-SSE4 CPUs based on data from Endless OS. I had thought it was time to stop building 32-bit flatpaks entirely, but instead we’ll keep even older 32-bit processors working due to the quantity of users that would be cut off by requiring SSE3! Correction, this was a discussion about SSE4, not SSE3. GNOME already does not support 32-bit except for libraries required to run Steam.

Regarding protection, currently the primary protection would be HTTPS (for transferring data) and SSH (for accessing the server). There is a request earlier in this thread for us to further encrypt the data such that the web server that receives the requests cannot view it. But ultimately the data is going to be stored in a Postgres database somewhere on Fedora infrastructure.

Red Hat is a big corporation that just wants to make money. It does not really care about us except to the extent that we can help Red Hat earn money, and has demonstrated that pretty clearly in the past few months. Trusting it too far would be unwise. But you do need to have some level of trust, or else you really shouldn’t be using Fedora Linux. :wink: You probably trust that our packages are not malicious and that the operating system does not contain hidden source code changes. You probably trust that we wouldn’t hide telemetry in the OS without telling you about it first. If you participate in the Fedora community, you probably also have some trust in the various individuals who work for Red Hat, even if you don’t trust the corporation itself (you shouldn’t).

My goal with the telemetry change proposal is that you only need to trust Red Hat to be not actively malicious. All metrics to be collected should be proposed, discussed, and approved via the Fedora community, not behind closed doors. Users should be empowered to participate in the process. I’m very open to suggestions on how exactly this would work.

I do believe Red Hat is still treating Fedora and Fedora users properly, at least mostly. I’m disappointed by the layoffs that affected our staff members, and the cuts to desktop development that will severely affect core desktop packages, but I can accept it’s a business and staffing is a business decision. At least Fedora is still a community-run project, and I don’t see any risk of that changing. If it makes any difference, I was a community contributor for 5 years before I joined Red Hat, and I really don’t think I have any more power in the Fedora project today than I did before I was hired. Community power is real in Fedora, and if the community just isn’t willing to allow telemetry, then this proposal will fail.

Well I’ve already promised that you’ll be able to redirect all the telemetry to your own private metrics server and see exactly what is collected; that’s at the very top of the change proposal, and I’d like to think that people reading here have at least read the executive summary at the top. So that’s already pretty significant.

Maybe we could build a client-side application to make this easier to do, so you wouldn’t need your own server to see what is collected from you?

I think the only additional possible form of transparency would be to release the entire data set. There are multiple requests for this already. But it sounds like that’s not what you’re asking for? I’m open to suggestions regarding “transparency audits,” but I’m not sure exactly what those would look like.

1 Like

That is the exciting question! Someone believes it is significant enough of a benefit to put in their time and effort into this proposal, and see what the community at large thinks! This is a major ethos of open source communities, to propose/introduce valuable change in something you value spending time on.

Just because things have worked a certain way up until now, doesn’t mean we shouldn’t constantly be looking for ideas and initiatives to make the situation even better! Every day we are exposed to new ideas and experiences that can shake the foundations of what has come before.
This proposal may get shot down quickly and decisively, or it may be accepted unanimously. At least it has been proposed, and the community has been asked what it thinks! And this feedback will be taken into account when those who are empowered to decide make the decision.

2 Likes

Packages that are already separately collecting their own metrics would not be affected by this change proposal.

I can speak as a GNOME developer, though not on behalf of the GNOME project as a community, and say: GNOME has not been “fine” without telemetry. It’s really, really hard to get actionable feedback out of users, especially in the free and open source software community, because the typical feedback is either “don’t change anything ever” or it comes with strings attached. Figuring out how people use the system, and integrating that information into the design, development, and testing loop, is extremely hard without metrics of some form. Even understanding whether or not a class of optimisations can be enabled without breaking the machines of a certain number of users is basically impossible: you can’t do a user survey for that.

So, yes: GNOME has been able to get by without telemetry. It would have made some things easier if we had it, and going forward it’s going to be exponentially harder to do without.

3 Likes

And still your telemetry isn’t telling you that an extremely large number of GNOME users left after GNOME 3 because you removed ALL features. To this day GNOME can’t show tray icons without extensions that break at every major GNOME update.

This decision is kinda pointless. First they wanted to know what we think. The majority said telemetry is ok but only with opt-in or via the backend.
But the decision for opt-out is already fixed. So what is the point of this discussion???
And arguing that dnf already has opt-out is also pointless; it’s incredibly badly implemented. You have to manually comment out countme=1 in all the repos.
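For anyone who wants to check which repos this applies to before editing them by hand, here is a rough Python sketch that lists the repo sections with `countme` switched on. It assumes the repo files live in `/etc/yum.repos.d` and are plain INI-style files. The function name `repos_with_countme` and the set of values treated as "true" are my own assumptions, not anything taken from dnf itself.

```python
import configparser
from pathlib import Path


def repos_with_countme(repo_dir="/etc/yum.repos.d"):
    """Return (file name, repo id) pairs whose section enables countme."""
    # Assumption: these spellings count as "enabled"; adjust if dnf differs.
    truthy = {"1", "true", "yes", "on"}
    found = []
    for path in sorted(Path(repo_dir).glob("*.repo")):
        # interpolation=None keeps '%' characters in URLs from being parsed.
        parser = configparser.ConfigParser(interpolation=None, strict=False)
        try:
            parser.read(path, encoding="utf-8")
        except (configparser.Error, UnicodeDecodeError):
            continue  # skip files that are not valid INI text
        for section in parser.sections():
            value = parser.get(section, "countme", fallback="0")
            if value.strip().lower() in truthy:
                found.append((path.name, section))
    return found


if __name__ == "__main__":
    for file_name, repo_id in repos_with_countme():
        print(f"{file_name}: [{repo_id}] has countme enabled")
```

The script only reads the files and changes nothing, so it is safe to run as a normal user.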

2 Likes

I don’t think the actual numbers bear this out. Or, if so, it’s a matter of “they would have grown even more, if only…”. Or, the classic “it’s so crowded — no one goes there anymore!”.

That said, I don’t think Canonical sets the bar for us. In fact, I think this proposal is significantly better in important ways. Likewise, I absolutely do not set Chrome or VS Code as our models. And our goal isn’t to mimic, anyway. I am bringing these up because there is evidence that despite legitimate and passionate privacy concern, large numbers of users do not prioritize this at all.

Given that, I think it’s more about us doing the right thing, the most usefully and as best we can, even if what we decide doesn’t cause users to flee. If folks really do have that concern, please bring some numbers to the conversation!

3 Likes

I don’t! If you have a question, something that needs figuring out, just poll the people who are using your software. There’s no good reason for telemetry, and especially not if it’s not opt-in.

If you really need to make sure everyone sees it, you could even have a little news addon like in Manjaro so everyone sees the link to the poll.

Anyway, would this only apply to the default flavour, am I safe with my KDE spin?
EDIT: apparently it’s GNOME only, just saw the answer above, and hopefully the KDE spin team will provide sane defaults = no telemetry. Thanks @glitch9138

1 Like

A post was merged into an existing topic: Opt-in / Opt-Out? A breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

I would envision all GNOME desktop components would be required to respect the new setting, but I really do not intend for this proposal to have any impact on other non-GNOME components that are developed separately and already have their own separate data collection. But actually, I had not considered this because I had no idea that Fedora packages already collect data beyond dnf better counting. What packages do you know of that currently have their own private data collection?

1 Like

It doesn’t really fit this topic here and yes, I don’t have the telemetry, funny. I was not talking about the long-term user count, only about the time GNOME 3 was released. Everybody stopped using it. Cinnamon and MATE were developed, and Canonical stopped contributing to GNOME and switched to Unity.
The only point I wanted to make about current GNOME is that basic features of a desktop environment are still missing, which are clearly demanded by the community. And the telemetry in GNOME is not helping with these kinds of issues, because telemetry can only be collected about things already present.
So including preselected telemetry won’t magically solve all issues.
Sorry if it was badly phrased.

3 Likes

I like the general idea, and it could be very useful down the line. Seeing that the plan is to add pages to the initial setup and Settings though, maybe it would be better to upstream most everything, if that wasn’t already planned. KDE has opt-in telemetry in their System Settings, GNOME should too, and Fedora can piggyback off of that.

Keep in mind that the entire thing should be transparent - the user should be able to view exactly what is sent out locally, and tweak the level of information provided.

2 Likes