F40 Change Request: Privacy-preserving Telemetry for Fedora Workstation (System-Wide)

Sure, I totally agree. Poll results cannot accurately represent the whole user base. But do they really need to? A well-designed poll or survey would still provide some limited insight. Just take it for what it is and combine it with the expertise of the whole team. Usually that leads to good results. And judging by the top-notch GNOME design, it clearly has.

I would be really worried if a project (be it software, a new ice-cream flavor or anything) can only make solid design choices through 100% representative surveys or telemetry data.

1 Like

I’ve been using Fedora since 2005 (18 years) as my personal and work development distribution of choice; I rarely make my appearance known, except on occasion, but depending on decisions regarding this topic, I may need to find an alternative distribution to call home.

  1. I feel the tone of the discussion is, “this is happening, and if you want to be involved in shaping HOW it happens, speak up!”; however, I feel this is the wrong approach when you are talking about privacy and data which has a long history of being abused or slowly abused over time. Based on certain statements by a few notable individuals, it is quite clear opposition was expected and thus, it makes the way the tone has been set look extremely strategic in their favor by Red Hat devs.

  2. I think it is a bit disingenuous to put forth a proposal or plan to integrate telemetry when it isn’t at all clear as to specifically what data the telemetry will be targeting. Data is a sellable commodity in today’s society, and as such it should be a mountain of effort and heavily scrutinized even to extract just one bit. Instead, the tone and direction of this proposal / discussion aims to make the user agree to a larger picture with intentions of expansion and change as time moves on. This is wrong…

Your proposal should state plainly and clearly what data you want and then be discussed on the merits of the data; not be flippant and ask users to agree to something on the promise of not being evil and then we just have to take your word that 10 months later Red Hat won’t be siphoning any and all data, such as search, like Microsoft.

  3. Plenty of people have already stated the obvious: Polls or surveys are the best viable options, but there is a second option that I can’t discuss here due to the favorably strategic split of the discussion.

Bottom line, my 2 cents, if any telemetry makes it as default, regardless of the data it is targeting, there’s a 90% chance I say good-bye to Fedora.

Edit: replied to the wrong post, I meant to reply to post 334 or user Catanzaro

9 Likes

I’m in the process of evaluating operating systems to install on a new laptop. Fedora has been at the top of my list for a long time. Having read through this discussion, I decided to make an account to express another point of view that seems to be underrepresented so far.

There has been a lot of discussion of optics, opt-in vs. -out, the particulars of the implementation, the utility of the data, and personal privacy. While some of these are important, such as privacy, these concerns are all secondary to the fundamental issue that telemetry represents an anti-feature by definition: it is a feature that prioritizes the interests of the developers of the software over its users.

The mere suggestion that this is appropriate for a free-software project is baffling and lays bare a sense of entitlement over Fedora’s users and their machines. I realize that anti-features like this have become so normalized over the last decade by proprietary software vendors that it may now be hard to see how unacceptable the practice is, and how it is in direct conflict with the Fedora Project’s stated vision. Anti-features run counter to the ethos of software freedom and user autonomy and control.

Seeing the direction this discussion has taken, it’s unlikely that I will consider using Fedora no matter the outcome of the decision. Whether or not it gets approved now, I don’t have any interest in worrying about the addition of this or some other anti-feature when I update to the next version.

Thanks for reading. I’m happy to expand on what I’ve written if something is unclear.

5 Likes

This is going to seem like it belongs in the “what to collect” thread, but I think there’s a connection here. It feels like there are at least a couple dimensions of data that could be captured under “telemetry”…this is roughly the conceptual model I followed in my old job (as best as I can remember, and with examples that seemed relevant for this setting):

The more I think about this, the more it feels like the proposal might be asking all at once for types of data that have different sensitivity levels and are best obtained through different mechanisms.

I’m bringing this up only because I’m wondering if there’s a way to actually simplify/ease folks’ concerns about the overall initiative by scoping out “themes” of data capture - say “Hardware Specs”, “Hardware Performance”, “Software Portfolio” and “Software Usage”, and tying those themes to different opt-in/out policies or to different collection mechanisms (background harvesting, foreground prompting/submission a la ABRT, etc.). I totally appreciate and respect the caution in wanting to get clearance for the topic before then broaching specific items, but just wonder if a middle ground of specificity might help stop the leap being made to “telemetry = Windows-style keylogging = evil”?

Sorry if this is late Friday night rambling and doesn’t make sense, but wanted to throw out some thoughts that were noodling around while walking the dog earlier!

3 Likes

When I started using GNOME over a year ago, I was surprised to find there was no toggle to share usage data like KDE has. Personally, I always enable or leave on telemetry in software that I trust and enjoy using, as a way to help development. And I’m used to having to opt out, since that is how most software has handled telemetry, historically anyway. I’m not saying this is right or wrong either way; it’s just a personal observation.

So personally I have no issue with this proposal, but I do understand why others might, and why they have concerns about it and about how it is proposed to be implemented.

My understanding is that this is just the overall proposal on whether it should be done at all, but I think including a clear set of basic telemetry, with clear guidelines on how it will be handled, in the proposal right from the start might have eased some minds. As it is now, I understand how some might interpret this as “sign us a blank check and trust us to do the right thing”. I can especially understand my fellow EU citizens’ hesitation when a US-based corporation wants or will have access to their data, regardless of how low-level or anonymized it is, since history has shown us that data is valuable and can be used in many ways.

And the timing. Optics matter, regardless of how many times Fedorians state that RH does not have a finger on the scale with regard to voting etc. And I fear you lost a lot of goodwill in the community by releasing such a polarizing proposal so close to the other recent drama, regardless of how well intended this is and how detached these two projects are.

4 Likes

Well, it seems that people are extremely interested in this topic.

In two hours (if you add both threads) it became the hottest topic ever on this forum. At least the one with the most replies.

1 Like

I guess it is really due to the original nature of the open source development model.

Development was initially driven by the developers’ own needs plus voluntary contributions, so there was no need for data to find areas that needed further improvement.

That is what I consider a bottom-up model.

Now large organizations are more involved, and the model is changing to top-down.

Those at the top need a data-driven model to decide how to allocate their resources to enhance the product, based on their metric of “return”: it can be money, fame, or product popularity.

So I can see that talk about telemetry touches the foundation of open source development, and it will upset a lot of capable users.

As for me, a free rider, I will just keep using what is available, as I do not have the capacity to find a better solution or change the product to suit my desires.

1 Like

I’ve only seen it mentioned tangentially in some of the replies here, but would it be possible to use a system similar to “dnf countme” rather than a (potentially too powerful) platform like the one made by Endless? As far as I can tell, the “dnf countme” system is so simple that a similar system for other metrics would face much less criticism than the one proposed here. If I understand correctly, dnf will say “I’m here! count me!” at a random point, once during a specified time period, and send information like “what edition / spin am I running on”.

A system for other metrics could work similarly - it could collect data for “approved metrics” (approved by who? FESCo? Workstation WG?) locally, and only randomly send responses for individual metrics, but never send responses for multiple metrics together (to avoid making it possible to identify users). A system like that will likely have a lower number of responses, but a stochastic system like it would still be useful to extrapolate usage patterns.
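To make the idea concrete, here is a rough Python sketch of the client-side decision and of how the server could still extrapolate from the sparse answers. Everything in it is an assumption for illustration: the metric names, the `pick_report` and `estimate_share` helpers, and the send probability are hypothetical, and this is not how dnf countme or the Endless system is actually implemented.

```python
import random

# Hypothetical metric names; no approved list exists in this discussion.
APPROVED_METRICS = ("edition", "desktop_session", "display_server")


def pick_report(local_values, send_probability, rng):
    """Return at most one (metric, value) pair for this period, or None."""
    # Most machines stay silent in any given period.
    if rng.random() >= send_probability:
        return None
    # Anything recorded locally but not on the approved list is never sent.
    candidates = [m for m in APPROVED_METRICS if m in local_values]
    if not candidates:
        return None
    # Exactly one metric per report, with no user or machine identifier,
    # so answers to different metrics cannot be joined on the server.
    metric = rng.choice(candidates)
    return (metric, local_values[metric])


def estimate_share(reports, metric, value):
    """Fraction of received reports for `metric` that carry `value`."""
    values = [v for (m, v) in reports if m == metric]
    if not values:
        return None
    return values.count(value) / len(values)
```

Because each machine answers rarely and about one thing at a time, the server only ever sees independent samples per metric, which is enough to estimate proportions but not to build a profile. The sketch ignores network-level linkage (for example, two reports arriving from the same address), which a real design would still have to deal with.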

4 Likes

I understand the point you are making, but I too was taken aback that data would be collected in advance just in case we decide to opt-in later. This seems odd to me. Unless and until I agree, I would not expect any data to be collected locally or otherwise. To argue it is “consistent” to take this approach sort of misses the point that the approach seems wrong on the surface.

2 Likes

Hey Fedora developers and community, I created this account just to express my opinion about this “telemetry” proposal. I have been using Fedora since version 22 (or 23 I don’t remember exactly). I have reported bugs on Fedora and GNOME for long time to help them improve. But it was my choice and I had control over what data I upload to these bug reports.

But the proposed “telemetry” crosses the line for me. You are basically taking away control from users by making it opt-out. There are two ways to look at this “telemetry” proposal. Whom does it give an advantage to: users or developers? Obviously the developers, which is against the whole FOSS philosophy. FOSS should empower its users, not the manufacturer. You are going to install software that runs on the user’s machine and transmits data to the manufacturer because they think it’s their software. I strongly disagree with this kind of mentality.

Also, is there any proof that telemetry actually works? Let’s take an example: accessibility. The majority of users are not disabled, so your “telemetry” is going to reflect that. So should you de-prioritize accessibility features because the majority of users are not disabled?

If telemetry is good, then why not add it to the Linux kernel? Then kernel developers will know exactly what drivers are used and can remove the least used ones. Seems like a good idea, doesn’t it? Also, why not add telemetry to the PostgreSQL database server? Let their telemetry send samples of data from their customers’ databases. Their developers will know what kind of workloads their users are running. Maybe they can prioritize features based on that. Sounds good, right? Because after all, telemetry is ethical, right?

I am not blaming the author of this proposal. Maybe they have a different opinion and do not object to telemetry in the software they use. But not all users are alike. Many of us who use Linux do so because Windows and macOS have similar kinds of anti-user behavior.

So please vote against this proposal, whoever is in power, and keep Fedora “our OS” as it claims on its official website.

7 Likes

This is kinda off-topic, but I’m throwing this in the air for future discussion in another change proposal.
Now that you mention the “is telemetry useful?” point, it’s true the advantages of telemetry can be hard to pinpoint (at least for users without experience in programming with and without telemetry)

I think it would be better for Fedora to utilize the resources directed at questionable change proposals to prioritize proposals that can truly benefit everyone; telemetry most of the time only helps developers (if telemetry is implemented, it would be cool to see some problem that gets solved thanks to it!).
For example, if we got proposals suggesting things like good QA testing (more akin to openSUSE) or integrating the BTRFS snapshot capability out of the box with Fedora.
Having snapshots means that developers can gain time to solve issues, especially critical ones, as most of the time users can just roll back the problematic changes. That, at least for me, is more essential than telemetry in helping out users and developers.

1 Like

A post was merged into an existing topic: Opt-in / Opt-Out? A breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

I’ve been using Fedora since 2005 (18 years) as my personal and work development distribution of choice; I rarely make my appearance known, except on occasion, but depending on decisions regarding this topic, I may need to find an alternative distribution to call home.

I’ve been around since the Red Hat Linux days… :slight_smile:

  1. I feel the tone of the discussion is, “this is happening, and if you want to be involved in shaping HOW it happens, speak up!”; however, I feel this is the wrong approach when you are talking about privacy and data which has a long history of being abused or slowly abused over time. Based on certain statements by a few notable individuals, it is quite clear opposition was expected and thus, it makes the way the tone has been set look extremely strategic in their favor by Red Hat devs.

I’m sorry if the tone looks that way, it’s 100% not the case.
This is a change proposal. It may be amended, rejected, delayed or
approved. I am pretty sure there have already been a lot of changes that
the submitters want to merge in.

  2. I think it is a bit disingenuous to put forth a proposal or plan to integrate telemetry when it isn’t at all clear as to specifically what data the telemetry will be targeting. Data is a sellable commodity in today’s society, and as such it should be a mountain of effort and heavily scrutinized even to extract just one bit. Instead, the tone and direction of this proposal / discussion aims to make the user agree to a larger picture with intentions of expansion and change as time moves on. This is wrong…

Your proposal should state plainly and clearly what data you want and then be discussed on the merits of the data; not be flippant and ask users to agree to something on the promise of not being evil and then we just have to take your word that 10 months later Red Hat won’t be siphoning any and all data, such as search, like Microsoft.

Yeah, I personally wouldn’t want to approve this without deciding also
on the process for adding/removing collected data and an initial set of
those. (I’m going to make another post with all my questions.)

  3. Plenty of people have already stated the obvious: Polls or surveys are the best viable options, but there is a second option that I can’t discuss here due to the favorably strategic split of the discussion.

Sadly, I agree with those who note that polls or surveys are very
self-selecting. You get people who tend to answer polls or surveys, and
often not the people you do want feedback from. ;(

4 Likes

This is kinda off-topic, but I’m throwing this in the air for future discussion in another change proposal.
Now that you mention the “is telemetry useful?” point, it’s true the advantages of telemetry can be hard to pinpoint (at least for users without experience in programming with and without telemetry)

I think it would be better for Fedora to utilize the resources directed at questionable change proposals to prioritize proposals that can truly benefit everyone; telemetry most of the time only helps developers (if telemetry is implemented, it would be cool to see some problem that gets solved thanks to it!).

Well, “Fedora” can’t (except in pretty limited ways) direct resources to
do anything. Folks can discuss and convince people to work on things,
but since Fedora is a community, people will work on what they feel is
best.

So, in this case Gnome developers want to work on this because they feel
it will help them make a better workstation and improve the lives of
users. Fedora (via FESCo) can say “no, we don’t want to do this”, but we
can’t say “and you should work on this other unrelated thing we would
prefer you to work on”. People would just say “no thanks”. :slight_smile:

Of course Red Hat employees can be told what to work on in their work
hours, but despite all the commotion lately, Red Hat is still very
deeply an upstream-first/community/open source company. So much so that
the Code of business conduct and ethics that all employees must agree to
abide by specifically says there’s no conflict of interest when an
employee makes some decision favoring a community over the company.

For example, if we got proposals suggesting things like good QA testing (more akin to openSUSE) or integrating the BTRFS snapshot capability out of the box with Fedora.

We already do use openQA extensively! :slight_smile: It’s awesome.

I don’t know of anyone working on btrfs snapshots off hand, but anyone
could!

6 Likes

In the long run, when used properly, telemetry is beneficial for users too. The developers have a limited amount of time to work on things, and telemetry can guide the developers to spend time on things that are useful to users. Developers want to do well by users, but now we often work “in the dark”, with very little idea what features are actually used.

Please note that the proposal and subsequent discussion make a huge effort to keep the data collection anonymous and aggregated. Data that would be collected is generally not privacy-sensitive, e.g. no URLs, keystrokes, filenames, times, user or machine names, or addresses would be collected.

3 Likes

Some general thoughts from me here after reading all the replies. :slight_smile:

  • I would really want the process for approving metrics known/agreed to before this proposal could be accepted. In order to think of what that could look like though, I think I need more information. How often would they change? How long would they run? Would this be used for “we want to know how people use this thing so we can implement a change”, or more “we want to notice high level things so we can redirect resources more”, or “both”?

  • With my Fedora Infrastructure hat on, if this was approved and we wanted to deploy the server end, it would need to be in our OpenShift cluster. Our proxy network sits in front of that and proxies requests into applications. We would want to consider proxy logs (perhaps we only keep them for a day, or don’t keep them at all for that service, but that might make debugging problems hard). I’ve not looked at the application yet much, but from a high level it seems like it wouldn’t be hard to deploy in OpenShift. I assume any questions about the server software we can just work with upstream on?

  • I like the idea of making all the end data public. We already do this with a number of services in Fedora (you can get db dumps of many of our services already). This also might allow community members to look at them and visualize the data better than the people who were first looking at it.

  • I hear the concerns about opt-out and also the concerns about self selecting samples on the other side. Perhaps there’s still hope for a compromise here.

  • I saw one of the upstream developers posting here, but can’t seem to find the post now. ;( I wonder if Endless could speak to any examples of things they were able to improve due to metrics, how their community reacted to ‘opt out’ and how the system has been running?

  • I’d like to thank everyone for (mostly) keeping calm and asking questions/providing feedback. It’s very appreciated by me at least.

I’m sure I will have more questions as I ponder on it…

7 Likes

I support this proposal and I hope it’ll gain the support of the majority. It’s great to see the very active discussion: it’ll make this proposal better, or if it is rejected, inform future approaches to data collection.

As a developer of software, and maintainer of a bunch of packages in Fedora, I have quite often wished for general statistics of which packages are used. If I have 80 packages I could work on, it’d be much more useful to update the one with 10k users rather than the one with 5. But right now, the only feedback metric is bug reports, and if a package doesn’t get bug reports, it could either mean that it works as expected or it’s not used. Debian has popcon, and the statistics it collects really help the developers decide which packages are important.

The proposal is mostly geared towards Workstation, but that already is 70% of Fedora desktop users. I hope other spins would opt in to this proposal too, but it’s fine if that happens later. I’m sure we can establish a transparent mechanism for deciding which statistics are collected.

4 Likes

I drafted internal guidance for open source telemetry for one of the world’s largest companies. I can’t tell you what those guidelines are, and my opinion here is my own, but I hope that the context adds some weight. I’m not a lawyer, but the lawyers I worked with valued my perspectives.

I posted in the opt-in / opt-out discussion as well, but want to share some more general thoughts here.

It’s best simply to not do this at all. Corporate policies often prohibit software that collects telemetry, even when there’s reason to hope that telemetry is benign, or require users to opt out. Adding telemetry to Fedora Workstation will mean being banned through those policies, which runs counter to the stated goal to “make Fedora Workstation the premier developer platform for cloud software development.”

VSCodium (the privacy fork of VS Code) is desirable not only because individual users don’t trust Microsoft with their data, but because corporations don’t trust them either. It’s true that many developers are using VS Code anyway, but if they’re following company policy, they’re turning telemetry off anyway.

Fedora had a friendly relationship with its users and contributors. Collecting telemetry changes that relationship. Instead of friends, it becomes “us” collecting data and making decisions vs. “them” using our product and being the recipients of our greater wisdom. Even if you hold discussions here to hammer out what data is actually collected, you’ve still turned it into “us” vs. “them”.

It’s not too late to roll this back. I hope you do. If not, there are ways to try to keep it friendly. Positive consent is the beginning of that.

9 Likes

A post was merged into an existing topic: Opt-in / Opt-Out? A breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation