F40 Change Request: Privacy-preserving Telemetry for Fedora Workstation (System-Wide)

This is a proposed Change for Fedora Linux.

This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Break-Out Topics!

Because this is a big conversation, we have identified several important themes and created separate break-out conversations for those. For these topics in particular, please follow the appropriate thread.


:link: Privacy-preserving Telemetry for Fedora Workstation

:link: Summary

The Red Hat Display Systems Team (which develops the desktop) proposes to enable limited data collection of anonymous Fedora Workstation usage metrics.

Fedora is an open source community project, and nobody is interested in violating user privacy. We do not want to collect data about individual users. We want to collect only aggregate usage metrics that are actually needed to achieve specific Fedora improvement objectives, and no more. We understand that if we violate our users’ trust, then we won’t have many users left, so if metrics collection is approved, we will need to be very careful to roll this out in a way that respects our users at all times. (For example, we should not collect users’ search queries, because that would be creepy.)

We believe an open source community can ethically collect limited aggregate data on how its software is used without involving big data companies or building creepy tracking profiles that are not in the best interests of users. Users will have the option to disable data upload before any data is sent for the first time. Our service will be operated by Fedora on Fedora infrastructure, and will not depend on Google Analytics or any other controversial third-party services. And in contrast to proprietary software operating systems, you can redirect the data collection to your own private metrics server instead of Fedora’s to see precisely what data is being collected from you, because the server components are open source too.

Keep in mind this Fedora change proposal is just that: a proposal. It must undergo community review and must be approved by the community-elected Fedora Engineering Steering Committee (FESCo) before it can be implemented, just like any other Fedora change proposal. We welcome community participation and fully expect this proposal may need to be modified significantly depending on Fedora community feedback.

:link: Owner

:link: Current status

  • Targeted release: Fedora Linux 40
  • Last updated: 2023-07-05
  • FESCo issue:
  • Tracker bug:
  • Release notes tracker:

:link: Detailed Description

We intend to deploy the Endless OS metrics system. This blog post contains a description of how the system works. We do not plan to deploy the eos-phone-home component in Fedora.

:link: How will data collection be approved?

The proposal owners feel it is essential to ensure the Fedora community has ultimate oversight over metrics collection. Community control is required to maintain user trust. If this change proposal is approved, then we’ll need new policies and procedures to ensure community oversight over metrics collection and ensure Fedora users can be confident that our metrics collection does not violate their privacy.

We can say “we would never collect personally-identifiable data” and write software that really doesn’t collect any such data, but this alone will never be enough to ensure user confidence. We will need a metrics collection policy that describes what sort of data may be collected by Fedora (anonymous, non-invasive), and what sort of data may not be collected. Such a policy does not exist currently. We will also want to ensure the Fedora community has ultimate control over which particular metrics are collected. One option is that each metric to be collected should be separately approved by FESCo. Collection of particular metrics in a particular data format is ultimately an engineering decision, and therefore FESCo seems like an appropriate approval point. Because FESCo members are elected regularly by the Fedora community, this also provides the community with ultimate control over metrics collection via the election process. But other oversight and approval structures would work too.

:link: What data might we collect?

We are not proposing to collect any of these particular metrics just yet, because a process for Fedora community approval of metrics to be collected does not yet exist. That said, in the interests of maximum transparency, we wish to give you an idea of what sorts of metrics we might propose to collect in the future.

One of the main goals of metrics collection is to analyze whether Red Hat is achieving its goal to make Fedora Workstation the premier developer platform for cloud software development. Accordingly, we want to know things like which IDEs are most popular among our users, and which runtimes are used to create containers using Toolbx.

Metrics can also be used to inform user interface design decisions. For example, we want to collect the clickthrough rate of the recommended software banners in GNOME Software to assess which banners are actually useful to users. We also want to know how frequently panels in gnome-control-center are visited to determine which panels could be consolidated or removed, because there are other settings we want to add, but our usability research indicates that the current high quantity of settings panels already makes it difficult for users to find commonly-used settings.

Metrics can help us understand the hardware we should be optimizing Fedora for. For example, our boot performance on hard drives dropped drastically when systemd-readahead was removed. Ubuntu has maintained its own readahead implementation, but Fedora does not because we assume that not many users use Fedora on hard drives. It would be nice to collect a metric that indicates whether primary storage is a solid state drive or a hard disk, so we can see actual hard drive usage instead of guessing. We would also want to collect hardware information, such as laptop model ID, that would be useful for collaboration with hardware vendors such as Lenovo.

Other Fedora teams may have other metrics they wish to collect. For example, Fedora localization wishes to count users of particular locales to evaluate which locales are in poorer shape relative to their usage.

This is only a small sample of what we might want to know; no doubt other community members can think of many more interesting data points to collect. But note the purpose of all of the above metrics is to inform specific design decisions, not to build tracking profiles. We only need to collect data in aggregate, and have no need to associate the data we collect with particular users.

:link: Metrics transparency

Transparency is required to provide confidence that Fedora metrics collection is not creepy or invasive. Since Fedora is open source, a developer can review the source code to verify exactly what it is doing and what data is being collected. But most Fedora users are not software developers, and few software developers have time or inclination to review the source code of the operating system to see what it is doing. To retain user trust, we need an easy way for users to understand exactly what data we are collecting. We propose to maintain a documentation page showing the current metrics database schema, so users can see exactly which fields are in the database and what example data looks like.

Experienced users may gain additional confidence by building and running their own metrics collection server; all of the components of the server (discussed below) are open source, and we will provide instructions for how to run a simple server yourself and view its metrics database. You can redirect metrics from Fedora’s server to your own by changing a URL in a configuration file.

:link: User control

A new metrics collection setting will be added to the privacy page in gnome-initial-setup and also to the privacy page in gnome-control-center. This setting will be a toggle that will enable or disable metrics collection for the entire system. We want to ensure that metrics are never submitted to Fedora without the user’s knowledge and consent, so the underlying setting will be off by default in order to ensure metrics upload is not unexpectedly turned on when upgrading from an older version of Fedora. However, we also want to ensure that the data we collect is meaningful, so gnome-initial-setup will default to displaying the toggle as enabled, even though the underlying setting will initially be disabled. (The underlying setting will not actually be enabled until the user finishes the privacy page, to ensure users have the opportunity to disable the setting before any data is uploaded.) This is to ensure the system is opt-out, not opt-in. This is essential because we know that opt-in metrics are not very useful. Few users would opt in, and these users would not be representative of Fedora users as a whole. We are not interested in opt-in metrics.

To make this a little more confusing, metrics collection is actually separate from uploading. Collection is always initially enabled, while uploading is always initially disabled. The graphical toggle enables or disables both at the same time. That is, a newly-installed Fedora system will always collect metrics locally at first, but the collected metrics will be deleted and never submitted to Fedora if the user disables the metrics collection toggle on the privacy page. If the user leaves the toggle enabled, then the collected metrics may be submitted only after finishing the privacy page.

Metrics uploading will be opt-in for users who upgrade from previous versions of Fedora Workstation, because we don’t yet have a mechanism to ask the user to consent to data collection after a system upgrade like we do for new installations, but metrics collection will be opt-out. That is, your upgraded system will collect metrics locally but will never submit them to Fedora. If you visit the privacy page in gnome-control-center, then both collection and uploading will be either enabled or disabled depending on the user’s selection. Unlike gnome-initial-setup, the switch in gnome-control-center will default to off if the user has not seen the switch in gnome-initial-setup and has not previously selected a value for the setting.

This might sound complicated, but it is consistent. If the user has not yet made a decision whether to allow telemetry, we collect it locally so that it’s ready to submit if the user approves telemetry in the future, but we never upload it. Once the user makes a decision, then we either upload it or delete it and stop collecting.
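The rule above can be sketched as a small state function. This is only an illustration of the policy as described in this section, not code from gnome-initial-setup or eos-event-recorder-daemon; the names `MetricsState` and `metrics_state` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetricsState:
    collecting: bool  # events are recorded locally
    uploading: bool   # events are sent to the metrics server


def metrics_state(user_choice: Optional[bool]) -> MetricsState:
    """Return the collection/upload state implied by the user's decision.

    user_choice is None until the user has finished a privacy page
    (hypothetical representation), then True if the toggle was left on
    or False if it was switched off.
    """
    if user_choice is None:
        # Undecided: collect locally so data is ready, but never upload.
        return MetricsState(collecting=True, uploading=False)
    # Decided: the single toggle switches collection and upload together.
    return MetricsState(collecting=user_choice, uploading=user_choice)
```

A fresh install and an upgraded system both start in the undecided state; they differ only in how the toggle is displayed before the user has made a choice (enabled in gnome-initial-setup, off in gnome-control-center).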

:link: GDPR

It is Fedora Legal’s obligation to ensure our data collection complies with legal requirements in the jurisdictions in which Red Hat operates. This is not an obligation of the Fedora community, so there is no need to discuss GDPR rules on our mailing lists. The proposal owners will not respond to mailing list posts that discuss GDPR or similar legal obligations during this change proposal discussion. In short, let’s keep discussion focused on what Fedora SHOULD or SHOULD NOT do, rather than what we MUST or MUST NOT do.

That said, Fedora Legal has determined that if we collect any personally-identifiable data, the entire metrics system must be opt-in. Since we are only interested in opt-out metrics due to the low value of opt-in metrics, we must accordingly never collect any personally-identifiable data. We must also not collect any data that could become personally-identifiable if combined with other data, which notably means IP addresses must not be stored. We only want to collect anonymous data anyway, but we need to be especially mindful of the possibility that combining two “anonymous” data points could result in the data no longer being anonymous.

:link: Fedora data collection policy

Fedora Legal requires that we publish a Fedora data collection policy separate from the existing Fedora Privacy Policy, which is designed to address usage of Fedora websites. This is currently a work in progress that we’re not quite ready to share yet. You can expect it to be very short and very generic.

:link: Metrics server infrastructure

We propose to deploy Azafea, the open source metrics collection server used by Endless OS. An Azafea deployment consists of five components: an nginx proxy server, azafea-metrics-proxy, redis, azafea itself, and a Postgres database. nginx proxies HTTP requests to azafea-metrics-proxy, which is itself a simple HTTP server that adds metrics into the redis database, where they will be fetched by Azafea and stored into Postgres. We will provide instructions on how to set up your own server and see for yourself what data gets collected.
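To make the data flow concrete, here is a rough sketch of the proxy-to-queue-to-database hand-off using only in-process stand-ins. It is not Azafea code: a `deque` stands in for redis, an in-memory SQLite database stands in for Postgres, the JSON event format and the `events` table are assumptions made for the example, and all function names are hypothetical.

```python
import json
import sqlite3
from collections import deque


def make_db() -> sqlite3.Connection:
    """Create the stand-in database (SQLite here, Postgres in a real deployment)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE events (name TEXT NOT NULL, payload TEXT)")
    return db


def proxy_receive(queue: deque, body: bytes) -> None:
    """Stand-in for azafea-metrics-proxy: push the request body onto the queue."""
    queue.append(body)


def worker_drain(queue: deque, db: sqlite3.Connection) -> int:
    """Stand-in for azafea: pop queued bodies and store them as rows."""
    stored = 0
    while queue:
        event = json.loads(queue.popleft())  # assumed JSON, for illustration only
        db.execute(
            "INSERT INTO events (name, payload) VALUES (?, ?)",
            (event["name"], json.dumps(event.get("payload"))),
        )
        stored += 1
    db.commit()
    return stored
```

Decoupling the HTTP-facing proxy from the database writer through a queue means the proxy only has to accept and enqueue requests; parsing and storage happen separately.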

:link: Metrics client infrastructure

The client side consists of eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation. eos-metrics is a D-Bus interface that applications and services may use to record events, plus a GObject library that provides a simple API around the D-Bus interface. eos-event-recorder-daemon is the service that actually implements this interface: it collects incoming metrics, batches them together, and sends them to the metrics server at predefined intervals. eos-metrics-instrumentation is the component that actually collects specific metrics. Originally, we had planned to not use this component and instead write our own fedora-metrics-instrumentation that would collect only a few particular metrics that are approved via Fedora community process. However, currently we are planning to ship eos-metrics-instrumentation and instead ensure that it is not collecting more metrics than would be acceptable to the Fedora community. A review process to decide which metrics to collect and which metrics to disable will be required.
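The batching behaviour and the idea of restricting events to an approved set can be sketched roughly as follows. This is not the eos-event-recorder-daemon implementation or its D-Bus API; the class `EventRecorder`, its parameters, and the event names used below are hypothetical and exist only to illustrate the description above.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


class EventRecorder:
    """Collect approved events and hand them to `send` in batches."""

    def __init__(
        self,
        send: Callable[[List[Dict[str, Any]]], None],
        approved: Iterable[str],
        interval: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send                # e.g. a function that uploads a batch
        self._approved = set(approved)   # event names approved for collection
        self._interval = interval        # seconds between uploads
        self._clock = clock
        self._pending: List[Dict[str, Any]] = []
        self._last_flush = clock()

    def record(self, name: str, payload: Any = None) -> bool:
        """Queue an event; events that are not approved are dropped."""
        if name not in self._approved:
            return False
        self._pending.append({"name": name, "payload": payload})
        return True

    def maybe_flush(self) -> int:
        """Send the pending batch if the interval has elapsed; return its size."""
        if not self._pending or self._clock() - self._last_flush < self._interval:
            return 0
        batch, self._pending = self._pending, []
        self._send(batch)
        self._last_flush = self._clock()
        return len(batch)
```

In this sketch, disabling a metric amounts to removing its name from the approved set, which is one possible way a review process could be reflected on the client.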

:link: Data set considerations

Although we assume the metrics server administrator is not malicious and will not actively attempt to deanonymize users, we will still take reasonable precautions to make it difficult to correlate metrics to a particular user, starting by not storing any IP address information in the metrics database. Additionally, each metric that we collect will be considered individual, non-correlatable data by default, unless approved to be correlated with particular other metrics via future Fedora community process. That is, if a user submits two data points, we usually don’t want the ability to know that these data points were both submitted by the same user.

Each metric is stored in the database with a Unix timestamp indicating when it was generated on the client. If abused, this timestamp could allow correlation of data points that are collected at the same time as each other, or at a fixed time offset to other events. For example, if the system were designed to collect two metrics exactly 300 seconds after the system was booted, then just looking at the timestamps would be enough to determine that both metrics recorded at the same time were submitted by the same user. Accordingly, we should consider modifying the metrics server to reduce timestamp granularity at least somewhat.
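One simple way to reduce timestamp granularity is to round each timestamp down to the start of a fixed-size bucket before storing it. The sketch below assumes a one-hour bucket; both the function name and the bucket size are illustrative choices, not part of Azafea or of this proposal.

```python
def coarsen_timestamp(unix_ts: int, bucket_seconds: int = 3600) -> int:
    """Round a Unix timestamp down to the start of its bucket."""
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    return unix_ts - (unix_ts % bucket_seconds)
```

With hourly buckets, every event generated within the same hour is stored with the same timestamp, so two events from one machine can no longer be paired by matching exact seconds. Bucketing does not remove the risk entirely: if very few systems report within a given bucket, their events may still be easy to associate, so the appropriate bucket size would depend on the volume of data.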

:link: History

Currently Fedora’s only form of metrics collection is DNF Better Counting, but this only counts Fedora installations. That is useful, but we want to count more than just how many users we have.

Fedora’s first metrics collection attempt was Smolt, a precursor to hw-probe which collected data on user hardware. The current proposal is different from Smolt because it will collect more than just hardware data, and also because Smolt collected only opt-in data. The current proposal would be opt-out, not opt-in.

This change proposal will likely be compared to the Ubuntu spyware complaints from a decade ago, when Ubuntu desktop users’ search queries were sent to Amazon by default. Let’s not do that.

:link: Feedback

We will endeavor to update this section of the change proposal to include a summary of Fedora community discussion of this proposal.

:link: Benefit to Fedora

The main benefit to Fedora is that we will be able to use collected metrics to inform design decisions. It is very common for developers to wish to know something about how Fedora software is used, and we will finally have a way to answer such questions.

Occasionally, Red Hat might need to collect specific metrics to justify additional time spent on contributing to Fedora or additional investment in Fedora.

:link: Scope

  • Proposal owners:

This change requires substantial technical and nontechnical work from the change owners. Most notably, we will need to package eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation properly for Fedora; they are currently packaged in a copr. We also still need to modify eos-metrics-instrumentation so that it does not send events not approved for use in Fedora, as we expect to collect less data than Endless OS.

  • Other developers:

This proposal will require substantial effort by Community Platform Engineering (CPE) to host the metrics server infrastructure.

  • Release engineering: #11514

  • Policies and guidelines: New processes and guidelines are proposed above under the section “How will data collection be approved?”

  • Trademark approval: N/A (not needed for this Change)

  • Alignment with Objectives: This change does not align with any current Fedora Initiatives, which are very limited in scope. That said, one of the main purposes of metrics collection is to determine whether we are achieving other objectives not listed on the wiki page. For example, we want Fedora Workstation to become the premier developer workstation operating system. To that end, we want to know how many of our users are using particular IDEs.

:link: Upgrade/compatibility impact

We would like to enable metrics upload for upgraded systems, but this isn’t trivial because we want to obtain user consent before enabling metrics upload. This would require us to design a user interface that would run on upgraded systems and present the setting to users. We have not yet created such a user interface, so for now metrics upload will need to default to disabled for systems upgraded from older versions of Fedora. Since the underlying setting will be off by default, we don’t need to do anything special to achieve this.

:link: How To Test

The ultimate goal is to see metrics appear in the Postgres database of a metrics server, but configuring and running the server is not trivial. Accordingly, we propose to publish a separate document detailing how to set up and configure a metrics server for testing purposes, how to redirect metrics to the custom server, and how to force the client to immediately submit metrics to ease testing. Although we don’t actually expect many community members to seriously run their own metrics servers, we still want to document the steps involved so that interested developers can see exactly how it works.

:link: User Experience

A new metrics collection setting will be added to the privacy page in gnome-initial-setup and also to the privacy page in gnome-control-center. This setting will be a simple toggle that will enable or disable all metrics upload for the entire system. Users who do not want any metrics upload should feel confident that uploading can be disabled with a simple toggle.

Fedora users should be confident that Fedora metrics collection respects their privacy and collects only limited, anonymous usage data.

:link: Dependencies

Any package that wishes to collect a metric would need to depend on eos-metrics. For example, if we were to collect statistics on which system settings panels are used most frequently, then the gnome-control-center package would need to depend on eos-metrics in order to send a metric to eos-event-recorder-daemon.

:link: Contingency Plan

  • Contingency mechanism: We would need to remove the eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation packages from the workstation-product comps group, and rebuild any packages that gained a dependency on eos-metrics.
  • Contingency deadline: Beta freeze
  • Blocks release? Yes, if the change is incomplete, it will need to be reverted before release.

:link: Documentation

This feature will depend on several different upstream projects with varying amounts of documentation.

The client side consists of eos-metrics, eos-event-recorder-daemon, and eos-metrics-instrumentation. The best documentation of eos-metrics available online is its D-Bus interface XML. eos-metrics also contains normal API documentation that will be built and installed in a docs subpackage, but this is not currently available online. The eos-event-recorder-daemon and eos-metrics-instrumentation components do not appear to have any online documentation.

On the server end, the metrics server consists of azafea-metrics-proxy feeding metrics into redis, where they will be pulled by azafea and then added to a Postgres database. Documentation for azafea-metrics-proxy and azafea can be reviewed online. Events recognized by the server are documented here. Note that this documentation is currently focused on use by Endless OS rather than by Fedora, and includes documentation of many events that are no longer sent by Endless OS. This change proposal does not propose to enable sending any particular events in Fedora.

:link: Release Notes

Release Notes are not required for initial proposal. We need to write the release notes before change freeze.

8 Likes

Perhaps this is implicit in the use of eos-* but I seem to be missing a list of what metrics would be collected exactly and what is contained in messages to/from Fedora infrastructure related to these metrics.

Is this change request meant to discuss the general idea and acceptance level of adding opt-out metrics collection in Fedora?

One of the main goals of metrics collection is to analyze whether Red Hat is achieving its goal to make Fedora Workstation the premier developer platform for cloud software development.

Could you please explain how collecting metrics on which IDEs are run on Fedora systems would help Red Hat make Fedora the premier developer platform for cloud software development?

Occasionally, Red Hat might need to collect specific metrics to justify additional time spent on contributing to Fedora or additional investment in Fedora.

Fedora is upstream; collecting these metrics on RHEL systems seems like a saner place to put them if it’s to steer Red Hat prioritizations?

10 Likes

97 posts were merged into an existing topic: Opt-in / Opt-Out? A breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

I’m not proposing to collect any of these particular metrics just
yet, because a process for Fedora community approval of metrics to be
collected does not yet exist.

Right. Also, to start discussion of how particular metrics should be approved. I don’t intend to propose collection of any particular metrics until we have some agreement on how exactly metrics should be approved.

For example: we’ve been investing in developing tools like GNOME Builder and Toolbx. How much are these tools being used? Do we need to increase or reduce investment in these areas?

We already have a separate telemetry system for RHEL (Red Hat Insights) that, among other purposes, is used to prioritize RHEL development. But the system I’m proposing here would be for Fedora, not for RHEL. RHEL users are very different from Fedora users, and desktop development should primarily prioritize the needs of Fedora users.

2 Likes

A post was merged into an existing topic: Decision-Making, Governance, Council, Red Hat — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

First of all, are we talking about imposing this on users, or will users have the possibility to opt out/in?

Especially the latter is very probabilistic, and sufficient of the first can still achieve clear profiles. You can create a lot of mistrust in European countries, especially if it is complemented by points like this:

I think this proposal is very dangerous in many non-US countries and its impact is hard to foresee. By that I mean whether and how many users will leave, but also what happens to their data. That the GDPR and such regulations are a “must” and not a “should” is a personal opinion, not a fact. Please do not impose an opinion on others by making it a fact the community shall not discuss.

Be aware that data collection in the US has a problematic reputation in itself, especially in Europe. This is also an issue of different cultures.

Additionally, I am not sure how many people use Fedora also for work/business and related data: this may force them to switch to another system if this is not cared for. This has to be transparent, too. It is not acceptable imho to tell them that it is their problem to care for GDPR.


However, if the community agrees to that, I think a user should have the possibility to opt out during installation (better: opt in), and it should not be automatically enabled on updates or so.

Also, this is a matter of nudging: if you force people, they will often reject things they would accept if you leave them the choice during installation, and add good points about how this serves the community and is controlled by public policies of the community and such triggers to convince (and add a clear policy).


In either case, I think before bringing this to the community to make a decision, it needs to be much clearer what data is collected and so on: a policy (what, when, how, if, options) should be part of the proposal, not its next step after the decision. The same for what can happen to the data and about what anonymous or non-invasive is. Especially the latter two are critical and complex tasks…

Given the proposal as it is, I have to say that I do not see the responsibility from the owners I would expect for collecting any data. It asks for a general agreement without answering the related questions, and even excludes some (making it the problem of the users). Thus, the proposal already excludes …

… during the ask for agreement.

5 Likes

I understand that you’re trying to move in a step-wise direction, but I think some more transparency on what you expect to end up collecting and how you intend to use this data would make the discussion more productive.

5 Likes

35 posts were merged into an existing topic: Decision-Making, Governance, Council, Red Hat — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

What, if any, impact is there if the community said no to a specific metric?

I read this proposal as a “cart before the horse” type of issue - the problem statement (i.e. the metrics that need to be gathered) should be what defines the ask for telemetry to solve for X and if there is an alternative path to gather the needed data, that should be reviewed and proposed.

2 Likes

My expectation is that Fedora community members who want access to particular data for some particular purpose should be able to request access. But that is something we’ll need to discuss (this thread is a good place) and figure out.

The timing is mostly a coincidence; I wrote most of this change proposal in August 2021, and it just took a very long time before I felt ready to propose it. At the time we were hoping that having telemetry would make it easier to justify increased investment in Fedora. Unfortunately, with the way things have been going, that seems unlikely now. :frowning:

Red Hat Display Systems Team

I have a slightly different goal: Fedora Workstation should be the premier Linux workstation operating system, period. :wink: That position is undisputedly held by Ubuntu currently.

That was the goal of the “What data might we collect?” section (above), which describes various metrics that I’ve seen requested thus far. I fully expect we would wind up collecting much more than this, but what exactly we collect would be up to the Fedora community to determine. My expectation is that Fedora packagers and other community members will have lots of use cases for data collection that I don’t know about yet.

It’s very common for developers to ask “how many of our users are doing X, Y, or Z?” and then complain that we don’t know. E.g. earlier today a user was requesting feed reader integration in Epiphany. I wanted to know how many users have a feed reader installed (because otherwise, it would be useless). A couple weeks ago, GNOME wanted to drop support for systems that don’t support SSE4 instructions (for GNOME OS and the GNOME Flatpak runtimes); we asked Endless to show us CPU data, and learned that a significant amount of their users would be unable to run Flatpaks, and so dropped that plan.

For avoidance of doubt, I don’t have any secret plan for what additional data to collect beyond what’s already mentioned in the change proposal. But I’m sure others at Red Hat and also Fedora developers who do not work for Red Hat do have some ideas. This thread would be a good place to informally propose particular metrics that you might want to collect.

What’s really important to me is that however we wind up deciding what data to collect, it should be done publicly via some sort of Fedora community process, and not wind up with Red Hat making these decisions in secret on its own. This thread is also a good place for proposals on how exactly that would look. In this change proposal I wound up proposing that FESCo would approve metrics, but there are many other ways we could do it.

Edit: This post originally said “SSE3” but it was really SSE4.

4 Likes

I like your idea. But I am not sure how to achieve that with such metrics? Ubuntu can be modified by vendors so that it can be easily put on their hardware, also its kernel is modified in a way that can make things easier for the users. I love that Fedora retains the stability and security guarantees of the vanilla kernel, but the majority of average users focus on the more obvious things. There are several reasons why superseding Ubuntu ain’t that easy.

I expect that in Europe, any imposition of data collection with data transfers to the US (and any data collection at all) will deter a noteworthy amount of people, may it be for rational and/or irrational reasons. So I do not see how this means will support that goal. The opposite could be the case in some parts of the world.

2 Likes

47 posts were merged into an existing topic: Approaches to data handling, safety, and avoiding individual identification — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

Well that depends on each particular metric. E.g. if we don’t collect gnome-control-center panel usage data, then the GNOME design team has less information to use to make design improvements. They’ve been trying to move frequently-requested settings from Tweaks into gnome-control-center but have been having trouble because user studies indicate users are already having difficulty navigating the high number of settings. So some consolidation of panels is needed. But (a) what settings to consolidate or remove? How do we decide without usage metrics? And (b) how do we know that the design changes were successful after they have been implemented? With respect to (a) specifically, I’ve seen proposals to remove most of the Sharing settings. So let’s say those settings are at risk without telemetry. (I don’t remember offhand if the Sharing settings are actually still at risk, but it’s a real non-hypothetical example of a design change where telemetry would be helpful.)

Sticking with the example of gnome-control-center panels: how else would you gather panel usage data from a representative sample of users? (i.e., not from users who choose to respond to a survey, since those users are not going to be representative of typical Fedora users.)
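To make the self-selection concern concrete, here is a minimal sketch with entirely made-up numbers (the group names, sizes, usage rates, and response rates are all assumptions for illustration, not Fedora data). It compares the true share of users who open a given panel with the share a voluntary survey would report if enthusiasts are more likely to answer it.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    users: int         # hypothetical number of users in this group
    uses_panel: float  # hypothetical fraction of the group that uses the panel
    responds: float    # hypothetical fraction that answers a voluntary survey


def true_usage(groups):
    """Fraction of the whole population that uses the panel."""
    total = sum(g.users for g in groups)
    return sum(g.users * g.uses_panel for g in groups) / total


def survey_usage(groups):
    """Fraction of survey respondents that uses the panel."""
    respondents = sum(g.users * g.responds for g in groups)
    using = sum(g.users * g.responds * g.uses_panel for g in groups)
    return using / respondents


# Invented example population: a small, vocal group and a large, quiet one.
groups = [
    Group("enthusiasts", 10_000, 0.60, 0.20),
    Group("typical users", 90_000, 0.05, 0.01),
]

print(f"true usage:   {true_usage(groups):.1%}")
print(f"survey usage: {survey_usage(groups):.1%}")
```

With these invented inputs the survey reports roughly four times the true usage, simply because the people who answer are not the people who make up most of the user base. Different assumptions would give a different gap, but the direction of the bias is the point.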

9 posts were merged into an existing topic: What data will be collected, exactly? — a breakout topic for the F40 Change Request on Privacy-preserving telemetry for Fedora Workstation

So, I was initially wary of this change myself, and wanted to scream “Hell No” from the rafters. If you haven’t already, I highly recommend reading the linked blog post, “Endless OS’s privacy-preserving metrics system” by Will Thompson.

Data is valuable. That’s why everyone has opinions about their privacy, and what they want to do with their personal data. And the problem is this data is valuable to different groups in different ways.

While yes, technically a Red Hatter is proposing this change, this is one team member from a single small team inside Red Hat (Red Hat Display Systems Team). This can hardly be considered “Red Hat forcing a change,” especially since Red Hat gives us autonomy on our involvement in upstream projects, even if it goes against the interest of Red Hat. I believe that the author has a positive intent, and an inherent desire to truly improve the situations they can affect.

There is immense power here, to improve the average user interaction with Fedora and improve user retention and success. A better Fedora is (imo) in the best interest of everyone. Using data to identify and guide development investments is a significant win.

“But at what cost” - This is the scary part. Telemetry is not inherently bad or evil, but it can be used in very negative ways. And for it to be valuable, it has to be truly representative of the situation. Simple feedback surveys and opt-in metrics will never get you data representative enough to apply to the broad user base and guide design decisions.

Opt-out is, unfortunately, the best way to get this representative sample. A significant number of people will just install the defaults and go about their day. They just won’t care if telemetry is collected or not.

But what does opt-out mean? In my mind, something along the lines of the example in the linked blog post: a blatantly obvious installer step that is direct and explicit, and explains in sufficient but terse detail what’s going on. Then, another simple and obvious option post-install to explicitly disable it.

As for the stored data, I don’t think it should be viewable in detail by the general public. I’m worried that enough granularity would give those with the time and resources the ability to find weaknesses. (This is something else I changed my mind on after thinking about it.) BUT I do think reporting and an interface to query parts of the data will be beneficial in putting the general public at ease that this data is being used correctly and appropriately.

As is already evident from the chatter on this thread and the mailing list, people have STRONG opinions about this, and if it is approved, we would want to take great care to do this the right way and for the right reasons.

In summary, I do think telemetry would be valuable to Fedora, and would only be truly usable in an opt-out state. But the devil will be in the details of implementation and metrics, and a very fine line would need to be walked early on to ensure success.

10 Likes

(Copying from devel@, as I think this is useful here)

On Thu Jul 6, 2023 at 20:17 CDT, Michael Catanzaro wrote:

I’m attaching a screenshot to give an idea of what this would look like in
gnome-initial-setup. I don’t have a gnome-control-center screenshot handy, but
it would be similar, except there it would default to off.

3 Likes

Telemetry is great and useful; just ask nicely, explain what information you’re collecting, and don’t enable it by default. When asking for it, it would also help to say that the data is collected to improve the users’ experience.

Honestly, love that post. You basically outlined what I thought, and am thinking, to a T. And I agree that telemetry would be important to Fedora, but that the line here would be very thin.

7 Likes

Does this concern only GNOME? Whatever the new default is, will it be applied to the spins as well?