You’re right, the schema isn’t obvious from the Azafea documentation. Each class on the Events page of the documentation corresponds to a database table, and they all have a foreign key pointing to a “channel” table.
I’m sure it is possible to configure sphinx/readthedocs/sqlalchemy to render the database schema as part of the documentation. This would be good to add.
Each event is stored individually. Each recorded event has an integer ID that identifies the row in the database and a reference to a “channel”, which in Endless OS identifies the originally-installed OS image; there is no ID for each individual user, nor for the batch of events it was submitted in.
If you run the code on Fedora today, all events will be attributed to the same channel with a blank image ID, because Fedora systems don’t have the Endless OS-specific `eos-image-version` xattr on the root directory of the root filesystem.
Currently there is no automatic rule to expire & delete old events. This would be a great addition, combined with a batch process to aggregate the event stream at different resolutions (e.g. day, week, month). In practice it is rarely useful to look at individual events, only aggregated summaries. (A complication for Endless OS is that our users are often very intermittently connected, so we don’t know when the events for July 11th 2023 will stop arriving, and it’s actually very hard to determine this latency because the received_at time is not stored on each event…)
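For illustration, such a rollup-plus-expiry batch job might look something like the sketch below. This uses SQLite for convenience and a hypothetical `updater_failure_daily` rollup table; the real Azafea database is PostgreSQL, and the retention window here is an arbitrary choice, not an existing policy.

```python
import sqlite3

# Sketch of a batch job that (1) aggregates the raw event stream into daily
# counts and (2) expires raw events older than a retention window.
# Column names mirror the Azafea tables quoted below (including the real
# schema's "occured_at" spelling); the rollup table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE updater_failure_v3 (
    id INTEGER PRIMARY KEY,
    occured_at TEXT NOT NULL,
    component TEXT NOT NULL,
    error_message TEXT NOT NULL,
    channel_id INTEGER
);
CREATE TABLE updater_failure_daily (
    day TEXT NOT NULL,
    component TEXT NOT NULL,
    channel_id INTEGER,
    failures INTEGER NOT NULL,
    PRIMARY KEY (day, component, channel_id)
);
""")
conn.executemany(
    "INSERT INTO updater_failure_v3 (occured_at, component, error_message, channel_id)"
    " VALUES (?, ?, ?, ?)",
    [
        ("2021-12-21 20:38:50", "eos-updater", "disk full", 1),
        ("2021-12-21 21:02:11", "eos-updater", "disk full", 1),
        ("2023-07-11 08:31:18", "eos-updater-flatpak-installer", "missing object", 2),
    ],
)

# Step 1: roll up raw events into per-day, per-component, per-channel counts.
conn.execute("""
INSERT INTO updater_failure_daily (day, component, channel_id, failures)
SELECT date(occured_at), component, channel_id, count(*)
FROM updater_failure_v3
GROUP BY date(occured_at), component, channel_id
""")

# Step 2: delete raw events that fall outside the retention window
# (here, everything before 2023 -- an arbitrary cutoff for the example).
deleted = conn.execute(
    "DELETE FROM updater_failure_v3 WHERE occured_at < '2023-01-01'"
).rowcount
```

The intermittent-connectivity problem mentioned above would still need handling, for example by re-running the rollup for recent days as late events trickle in.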
I’ll put some table schemas and sample data behind a cut to avoid posting an even more gigantic wall of text. I hope this is helpful!
Database schemas and sample data within
To make this concrete, while bearing in mind that the Fedora proposal does not presuppose that Fedora would record the same data points that Endless OS does, and that Fedora would define a “channel” differently, here’s the schema for the “updater failure” table:
```
azafea=> \d updater_failure_v3
                      Table "public.updater_failure_v3"
    Column     |           Type           | Collation | Nullable |           Default
---------------+--------------------------+-----------+----------+------------------------------
 id            | integer                  |           | not null | generated always as identity
 os_version    | character varying        |           | not null |
 occured_at    | timestamp with time zone |           | not null |
 component     | character varying        |           | not null |
 error_message | character varying        |           | not null |
 channel_id    | integer                  |           |          |
```
And the `channel_v3` table it refers to:
```
                         Table "public.channel_v3"
      Column       |            Type             | Collation | Nullable |           Default
-------------------+-----------------------------+-----------+----------+------------------------------
 id                | integer                     |           | not null | generated always as identity
 image_id          | character varying           |           | not null |
 site              | jsonb                       |           | not null |
 dual_boot         | boolean                     |           | not null |
 live              | boolean                     |           | not null |
 image_product     | character varying           |           |          |
 image_branch      | character varying           |           |          |
 image_arch        | character varying           |           |          |
 image_platform    | character varying           |           |          |
 image_timestamp   | timestamp without time zone |           |          |
 image_personality | character varying           |           |          |
 site_id           | character varying           |           |          |
 site_city         | character varying           |           |          |
 site_state        | character varying           |           |          |
 site_street       | character varying           |           |          |
 site_country      | character varying           |           |          |
 site_facility     | character varying           |           |          |
```
- The `image_id` column is the image ID.
- The `site` column is an optional string-to-string dictionary, empty by default, which is intended for use in contexts where, for example, the same OS image is deployed in computer labs in several schools in a given region and our deployment partner wants to be able to distinguish between the different schools. (We have essentially never used this feature, which needs to be manually configured on each client.)
- The `dual_boot` and `live` columns are booleans with an obvious meaning (I hope).
- All the `image_*` and `site_*` fields are just views on the `image_id` and `site` fields.
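As a rough sketch of that relationship: the image ID appears to decompose as `<product>-<branch>-<arch>-<platform>.<datestamp>-<time>.<personality>` (inferred from the sample values quoted in this post; the real parsing logic in Azafea may well differ):

```python
# Hypothetical sketch of how the image_* columns could be derived from an
# image ID such as "eos-eos3.5-amd64-amd64.190408-212651.en". The format is
# inferred from sample data, not from the Azafea source.
def parse_image_id(image_id: str) -> dict:
    # Split off the timestamp and personality from the right, since the
    # branch component (e.g. "eos3.5") can itself contain a dot.
    head, timestamp, personality = image_id.rsplit(".", 2)
    product, branch, arch, platform = head.split("-", 3)
    return {
        "image_product": product,
        "image_branch": branch,
        "image_arch": arch,
        "image_platform": platform,
        "image_timestamp": timestamp,
        "image_personality": personality,
    }
```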
Here are a couple of randomly-selected rows from this pair of tables:
```
azafea=> select updater_failure_v3.*, channel_v3.image_id, channel_v3.site, channel_v3.dual_boot, channel_v3.live from updater_failure_v3 tablesample bernoulli(0.1) join channel_v3 on updater_failure_v3.channel_id = channel_v3.id limit 2;
-[ RECORD 1 ]-+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id            | 540
os_version    | 4.0.0
occured_at    | 2021-12-21 20:38:50.478174+00
component     | eos-updater-flatpak-installer
error_message | Couldn’t apply some flatpak update actions for this boot: Failed to read commit dd5cf78a2f925ba3892cc9a168e8102e1a0e16e4b824f8cfdcca8f3d654e3aa5: No such metadata object 1655f4f3e085fa547a37ad2ae3dc1d93a5d0e2a2c51d3e4785a5e03f701cecd3.dirtree
channel_id    | 3262
image_id      | eos-eos3.5-amd64-amd64.190408-212651.en
site          | {}
dual_boot     | f
live          | f
-[ RECORD 2 ]-+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id            | 81961
os_version    | 4.0.5
occured_at    | 2022-05-17 08:31:18.238816+00
component     | eos-updater
error_message | Error fetching update: opcode close: min-free-space-percent '3%' would be exceeded, at least 13.3 kB requested
channel_id    | 354017
image_id      | eos-eos4.0-amd64-amd64.211213-144019.base
site          | {}
dual_boot     | f
live          | f
```
(By the way, these are two of the most common classes of updater errors on Endless OS: corruption in the ostree repo and/or filesystem, and insufficient free space to pull the update. And yes, it would be better to report something more easily queried than “the `message` field of a `GError`”, but perfect is the enemy of good.)
Some data is aggregated client-side before being sent to the server. Here are two randomly-selected rows from the daily app usage table:
```
azafea=> select daily_app_usage_v3.*, channel_v3.image_id, channel_v3.site, channel_v3.dual_boot, channel_v3.live from daily_app_usage_v3 tablesample bernoulli(0.1) join channel_v3 on daily_app_usage_v3.channel_id = channel_v3.id limit 2;
-[ RECORD 1 ]+-----------------------------------------------
id           | 11797684
os_version   | 4.0.10
period_start | 2023-05-22
count        | 1955
app_id       | google-chrome.desktop
channel_id   | 138
image_id     | eosoem-eos3.6-amd64-nexthw.190923-084936.pt_BR
site         | {}
dual_boot    | f
live         | f
-[ RECORD 2 ]+-----------------------------------------------
id           | 11798176
os_version   | 5.0.2
period_start | 2023-05-22
count        | 8009
app_id       | org.chromium.Chromium.desktop
channel_id   | 376
image_id     | eos-eos4.0-amd64-amd64.211123-052013.base
site         | {}
dual_boot    | f
live         | f
```
The `count` field is a duration in seconds. (You may read this and think: shouldn’t it at least be rounded off to some lower precision? Yes, that would be great!)
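For what it’s worth, that rounding could be as simple as truncating to a fixed bucket on the client before submission. The 5-minute granularity below is an arbitrary illustrative choice, not something Endless OS does today:

```python
# Sketch of client-side coarsening of usage durations before submission.
# The bucket size is a hypothetical choice for illustration.
BUCKET_SECONDS = 5 * 60

def round_usage(seconds: int) -> int:
    """Truncate a usage duration (in seconds) to the nearest bucket boundary."""
    return (seconds // BUCKET_SECONDS) * BUCKET_SECONDS
```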
We can see which other apps were used by users in those channels on that date, but we can’t tell which specific users used which combination of apps.
On that particular date, there were 1746 users in the `eosoem-eos3.6-amd64-nexthw.190923-084936.pt_BR` channel and 30 in the `eos-eos4.0-amd64-amd64.211123-052013.base` channel. In that month, there were 3247 and 41 respectively.
Again, on Fedora as the code stands, all users would be attributed to the same channel with an `unknown` image ID. This field could be used to distinguish (for example) Silverblue from Workstation, OEM preinstall from download, and perhaps the originally-installed Fedora branch. But I would imagine that a channel would be defined much more broadly than on Endless OS.