35 posts were merged into an existing topic: Decision-Making, Governance, Council, Red Hat — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation
What, if any, impact is there if the community said no to a specific metric?
I read this proposal as a “cart before the horse” type of issue - the problem statement (i.e. the metrics that need to be gathered) should be what defines the ask for telemetry to solve for X and if there is an alternative path to gather the needed data, that should be reviewed and proposed.
My expectation is that Fedora community members who want access to particular data for some particular purpose should be able to request access. But that is something we’ll need to discuss (this thread is a good place) and figure out.
The timing is mostly a coincidence; I wrote most of this change proposal in August 2021, and it just took a very long time before I felt ready to propose it. At the time we were hoping that having telemetry would make it easier to justify increased investment in Fedora. Unfortunately, with the way things have been going, that seems unlikely now.
Red Hat Display Systems Team
I have a slightly different goal: Fedora Workstation should be the premier Linux workstation operating system, period. That position is undisputedly held by Ubuntu currently.
That was the goal of the “What data might we collect?” section (above), which describes various metrics that I’ve seen requested thus far. I fully expect we would wind up collecting much more than this, but what exactly we collect would be up to the Fedora community to determine. My expectation is that Fedora packagers and other community members will have lots of use cases for data collection that I don’t know about yet.
It’s very common for developers to ask “how many of our users are doing X, Y, or Z?” and then complain that we don’t know. E.g. earlier today a user was requesting feed reader integration in Epiphany. I wanted to know how many users have a feed reader installed (because otherwise, it would be useless). A couple of weeks ago, GNOME wanted to drop support for systems that don’t support SSE4 instructions (for GNOME OS and the GNOME Flatpak runtimes); we asked Endless to show us CPU data, and learned that a significant number of their users would be unable to run Flatpaks, and so dropped that plan.
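To make the SSE4 question concrete, here is a minimal sketch of how anonymized CPU reports could be summarized. The report format (one set of CPU flag names per system) and the function names are assumptions for illustration, not part of the proposal or of the Endless metrics system. The flag names `sse4_1` and `sse4_2` are the ones Linux lists in `/proc/cpuinfo` on x86, and treating "SSE4" as requiring both is also an assumption.

```python
# Flag names as they appear on the "flags" line of /proc/cpuinfo on x86 Linux.
# Assumption: "supports SSE4" means both SSE4.1 and SSE4.2 are present.
SSE4_FLAGS = frozenset({"sse4_1", "sse4_2"})


def supports_sse4(cpu_flags: set[str]) -> bool:
    """Return True if the CPU advertises both SSE4.1 and SSE4.2."""
    return SSE4_FLAGS <= cpu_flags


def share_without_sse4(reports: list[set[str]]) -> float:
    """Return the fraction of reported systems that lack SSE4 support.

    `reports` is a hypothetical list with one set of CPU flags per system.
    """
    if not reports:
        raise ValueError("no reports to summarize")
    missing = sum(1 for flags in reports if not supports_sse4(flags))
    return missing / len(reports)
```

A single aggregate like this fraction would be enough to answer the "can we drop support?" question without looking at any individual system.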
For avoidance of doubt, I don’t have any secret plan for what additional data to collect beyond what’s already mentioned in the change proposal. But I’m sure others at Red Hat and also Fedora developers who do not work for Red Hat do have some ideas. This thread would be a good place to informally propose particular metrics that you might want to collect.
What’s really important to me is that however we wind up deciding what data to collect, it should be done publicly via some sort of Fedora community process, and not wind up with Red Hat making these decisions in secret on its own. This thread is also a good place for proposals on how exactly that would look. In this change proposal I wound up proposing that FESCo would approve metrics, but there are many other ways we could do it.
Edit: This post originally said “SSE3” but it was really SSE4.
I like your idea. But I am not sure how to achieve that with such metrics? Ubuntu can be modified by vendors so that it can be easily put on their hardware, also its kernel is modified in a way that can make things easier for the users. I love that Fedora retains the stability and security guarantees of the vanilla kernel, but the majority of average users focus on the more obvious things. There are several reasons why superseding Ubuntu ain’t that easy.
I expect that in Europe, any imposition of data collection with data transfers to the US (and any data collection at all) will deter a noteworthy number of people, whether for rational or irrational reasons. So I do not see how this measure will support that goal. The opposite could be the case in some parts of the world.
47 posts were merged into an existing topic: Approaches to data handling, safety, and avoiding individual identification — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation
Well, that depends on each particular metric. E.g. if we don’t collect gnome-control-center panel usage data, then the GNOME design team has less information to use to make design improvements. They’ve been trying to move frequently-requested settings from Tweaks into gnome-control-center but have been having trouble because user studies indicate users are already having difficulty navigating the high number of settings. So some consolidation of panels is needed. But (a) what settings to consolidate or remove? How do we decide without usage metrics? And (b) how do we know that the design changes were successful after they have been implemented? With respect to (a) specifically, I’ve seen proposals to remove most of the Sharing settings. So let’s say those settings are at risk without telemetry. (I don’t remember offhand if the Sharing settings are actually still at risk, but it’s a real non-hypothetical example of a design change where telemetry would be helpful.)
Sticking with the example of gnome-control-center panels: how else would you gather panel usage data from a representative sample of users? (i.e., not from users who choose to respond to a survey, since those users are not going to be representative of typical Fedora users.)
9 posts were merged into an existing topic: What data will be collected, exactly? — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation
So, I was initially wary of this change myself, and wanted to scream “Hell No” from the rafters. If you haven’t, I highly recommend reading the linked blog post, Endless OS’s privacy-preserving metrics system – Will Thompson
Data is valuable. That’s why everyone has opinions about their privacy, and what they want to do with their personal data. And the problem is this data is valuable to different groups in different ways.
While yes, technically a Red Hatter is proposing this change, this is one team member from a single small team inside Red Hat (Red Hat Display Systems Team). This can hardly be considered “Red Hat forcing a change,” especially since Red Hat gives us autonomy on our involvement in upstream projects, even if it goes against the interest of Red Hat. I believe that the author has a positive intent, and an inherent desire to truly improve the situations they can affect.
There is immense power here, to improve the average user interaction with Fedora and improve user retention and success. A better Fedora is (imo) in the best interest of everyone. Using data to identify and guide development investments is a significant win.
“But at what cost” - This is the scary part. Telemetry is not inherently bad, evil, etc., but it can be used in very negative ways. And for it to be valuable, it has to be truly representative of the situation. Simple feedback surveys and opt-in metrics will never get you statistically significant data to apply to the broad user base and guide design decisions.
Opt-out is, unfortunately, the best way to get this representative sample. A significant number of people will just install the defaults and go about their day. They just won’t care if telemetry is collected or not.
But what does opt-out mean? In my mind, along the lines of the example in the provided blog. A blatantly obvious installer step that is direct and explicit, and explains in sufficient but terse detail what’s going on. Then, another simple and obvious option for post-install to explicitly disable it.
As far as the stored data, I don’t think it should be viewable in detail by the general public; I’m worried that enough granularity would give those with the time and resources the ability to find weaknesses. (That is something else I changed my mind on after thinking about it.) BUT I do think reporting and an interface to query parts of the data would help put the general public at ease that this data is being used correctly and appropriately.
As already evident by the chatter on this thread and the mailing list, people have STRONG opinions about this, and if approved, we would want to ensure we took great care in doing this the right way and for the right reasons.
In summary, I do think telemetry would be valuable to Fedora, and would only be truly usable in an opt-out state. But the devil will be in the details of implementation and metrics, and a very fine line would need to be walked early on to ensure success.
(Copying from devel@, as I think this is useful here)
On Thu Jul 6, 2023 at 20:17 CDT, Michael Catanzaro wrote:
I’m attaching a screenshot to give an idea of what this would look like in
gnome-initial-setup. I don’t have a gnome-control-center screenshot handy, but
it would be similar, except there it would default to off.
Telemetry is great and useful, just ask nicely, explain what information you’re collecting, and don’t enable it by default. It would also be helpful to say that collection of this data is to improve the users’ experience when asking for it.
Honestly, love that post. You basically outlined what I thought, and am thinking, to a T. And I agree that telemetry would be important to Fedora, but that the line here would be very thin.
Does this concern only GNOME? Whatever the new default is, will it be applied on spins as well?
I think telemetry is not a bad thing at its core; however, it is not preferable. An implementation like KDE’s could work better, where the user can control the “level” of data shared rather than a true/false toggle. This would give users more control and allow them to share more or less depending on their personal views.
The default level could be a low level which shares little data so it is still “opt-out”.
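A minimal sketch of what such a level-based setting could look like, assuming three hypothetical levels and two made-up metric names. None of these names come from KDE or from the change proposal; they only illustrate the idea of gating each metric behind a minimum sharing level.

```python
from enum import IntEnum


class TelemetryLevel(IntEnum):
    """Hypothetical sharing levels, ordered from least to most data shared."""
    OFF = 0
    BASIC = 1      # e.g. coarse system information
    DETAILED = 2   # e.g. feature usage counts


# Hypothetical mapping of each metric to the minimum level required to send it.
METRIC_MIN_LEVEL = {
    "cpu_features": TelemetryLevel.BASIC,
    "settings_panel_opened": TelemetryLevel.DETAILED,
}

# A low default keeps the system "opt-out" while sharing little data.
DEFAULT_LEVEL = TelemetryLevel.BASIC


def allowed_metrics(level: TelemetryLevel) -> list[str]:
    """Return the sorted names of metrics that may be reported at this level."""
    if level == TelemetryLevel.OFF:
        return []
    return sorted(
        name for name, minimum in METRIC_MIN_LEVEL.items() if level >= minimum
    )
```

With this layout, a user who lowers the level to `OFF` sends nothing, and a user who raises it to `DETAILED` shares everything that the lower levels share plus the usage metrics.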
My biggest concern with this (other than a reiteration of the plea to make this opt-in or at least make the user decide explicitly without a default) is the PR disaster this is going to be.
I can already see the headlines and the outrage on social media: “Fedora now collects metrics at the request of Red Hat, enshittification continues.” It doesn’t matter that it’s not true because Fedora makes the call, it will be bad PR for Fedora in any case, which will actively work against the stated goal of making Fedora the premier developer platform.
Given the amount of goodwill Red Hat has recently burned in the community and its fallout on Fedora (just go and see how many people are posting that they’re switching distributions because of what Red Hat did; I’m aware none of the recent announcements affect Fedora, and Fedora isn’t in any danger, but that message either does not arrive at or does not resonate with users), I think this is not the time for this proposal.
I think you should come back with this in a year. At the moment, this is just very bad timing.
I support this and I would leave it on. I trust Fedora, Fedora infra, and want Gnome Software or Settings to be more relevant to how people are using it.
To me it seems that we miss the transparency part before we start arguing unknown details. We do not know the scope (what data), the purpose (who is gonna use it and for what purposes) and the protection (how data is transferred, stored and shared).
I do not trust corporations (yes, Fedora is a community project, but it is heavily funded, supported, and steered by Red Hat, which is owned by IBM). I cannot remember cases where big tech made the right choice for users or the community when facing a moral dilemma. Why should we believe that Red Hat and IBM are different?
On the other hand, I fully understand that getting a proper amount of statistical data is very hard if said data is very limited because of low engagement through opt-in. Having anonymized stats should help make the FW better through more effective bug fixing and improvements/optimizations.
I think we need a compromise, and I believe that full transparency would help build trust in the opt-out. That transparency has to be audited yearly, and the report must be made publicly available to the community, which has a right to know what data was used by and shared with whom, and for what exact purposes.
My personal take on the situation is this: if opt-out is chosen and transparency is not provided, I’ll opt out. That would be a sad, very sad decision, but I prefer to be safe rather than sorry. Because corporations are not people.
The way I see it is:
- Decisions need to be made
- I believe that decisions based on data are likely to be better decisions (but that’s a belief)
- I would prefer my use case to be taken into account in those decisions (I see it as a form of vote)
If the UI shows a clear way to disable telemetry, then I think the most privacy-focused people will easily be able to disable it, right? It would certainly be easier than switching distributions.
I’m not sure if this has been answered already (I did read the thread!) but I wonder whether the collected data would be accessible to everyone in the community (not just the database schema, all the final data). It may be interesting to let our community run analytics on it and unearth some facts that the Display Systems team hasn’t yet thought of. It would be empowering to the whole community, I think. And since our server configuration is accessible to all in the Ansible repo, it would help build trust that there is no personally identifiable data hidden somewhere for secret purposes.
What about packages which already collect metrics and report them somewhere (not necessarily to Red Hat)? Would these packages need to change under this proposal? If not, how do we explain this to our users?
I think that would be extremely difficult to enforce. I think this proposal should cover the basic set of installed packages under Fedora Workstation.
I guess we could ask reviewers to provide the package’s telemetry policies but to me that looks like hell waiting to happen, and would make the reviewing/packaging situation very difficult.