F41 Change Proposal: Upgrade systems to createrepo_c 1.0 and change repositories metadata settings

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Summary

This is a proposal to upgrade the systems that produce composes to createrepo_c >= 1.0 and to change some of the options used to create Fedora repository metadata. Note that some of these changes are unavoidable because of behavioral changes in createrepo_c >= 1.0. We aim to change Rawhide/F41 and move all following releases to the new settings, while preserving most of the current settings for releases <= 40.

Owner

Detailed Description

With createrepo_c < 1.0 we currently use different settings for Rawhide repository metadata and stable release repository metadata.

  • Rawhide

    • gzip compression for all metadata
    • no updateinfo.xml file is generated (no updates repository exists)
    • sqlite database of repodata is generated and compressed with gzip as well
    • comps repodata is made available both as uncompressed xml and gzipped
    • zchunk is active
    • DRPMs disabled
  • F40

    • gzip compression for primary metadata
    • updateinfo.xml (for updates repository) is compressed with XZ
    • sqlite database of repodata is generated and compressed with BZ2
    • comps repodata is made available both as uncompressed xml and gzipped
    • zchunk is active
    • DRPMs disabled (approved change for F40)
  • F<40

    • gzip compression for primary metadata
    • updateinfo.xml (for updates repository) is compressed with XZ
    • sqlite database of repodata is generated and compressed with BZ2
    • comps repodata is made available both as uncompressed xml and gzipped
    • zchunk is active
    • DRPMs enabled

With createrepo_c >= 1.0 moving to zstd as the default compression type, we want consistent settings for all new releases, and we want to specify those settings explicitly so that possible future changes of defaults don’t cause unexpected breakage. We therefore propose the following settings:

  • Rawhide/F>=41

    • set --general-compress-type to zstd so that all metadata is compressed with zstd
    • updateinfo.xml (for updates repository) compressed with zstd as well
    • disable generating the additional sqlite database, as it was only useful for yum
    • comps repodata will be available only in zstd-compressed form
    • zchunk is active
    • DRPMs disabled
  • F<=40

    • nothing should change by using the --compatibility flag of createrepo_c >= 1.0
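
The per-release split above can be sketched as a small helper that picks createrepo_c arguments by release number. This is only a hypothetical illustration, not the actual pungi or bodhi configuration: the function name is made up, the other per-release options are omitted, and `--no-database` and `--zck` are createrepo_c options that the proposal itself does not name, so they should be checked against `createrepo_c --help` for the installed version.

```python
def createrepo_args(release: int) -> list[str]:
    """Return illustrative createrepo_c arguments for a Fedora release.

    Hypothetical helper: the real settings live in the pungi and bodhi
    configuration, not in code like this.
    """
    if release >= 41:
        # Rawhide/F>=41: pin zstd explicitly so that a future change of
        # createrepo_c defaults cannot silently alter the repodata.
        return [
            "--general-compress-type=zstd",
            "--no-database",  # assumption: skips the sqlite databases
            "--zck",          # assumption: keeps zchunk generation active
        ]
    # F<=40: rely on the compatibility defaults of createrepo_c >= 1.0.
    return ["--compatibility"]


if __name__ == "__main__":
    for release in (40, 41):
        args = createrepo_args(release)
        print(release, " ".join(["createrepo_c", *args, "REPO_DIR"]))
```

Passing the compression type explicitly, rather than relying on whatever the tool defaults to, is the point of the proposal: the resulting repodata stays the same even if the defaults change again.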

Note that updating bodhi-composer to use createrepo_c >= 1.0 will also introduce an unavoidable change into EPEL8 updates repositories: comps repodata will be made available only in compressed format (which is set to XZ in EPEL8). Other EPEL releases (EPEL7 and EPEL9) can use the --compatibility flag to keep their current settings.

Feedback

The zstd compression type was chosen to match createrepo_c settings. As an alternative, we might want to choose xz, especially after zlib-ng has been made the default in Fedora and brought performance improvements.

The sqlite database distributed alongside the repodata is not useful for dnf, but it might be used by some external consumer we’re not aware of. Please do let us know.

Benefit to Fedora

We aim to have consistent defaults for Rawhide and stable releases, and to prevent future changes to createrepo_c defaults from causing unexpected changes to repodata. Also, by using better compression methods and no longer generating the sqlite database, we reduce repodata disk usage.

Scope

  • Proposal owners:

    • change createrepo_c settings for Rawhide in compose-rawhide01 (which is already upgraded to f39)
    • upgrade bodhi-composer to f39 and have it use createrepo_c >= 1.0.0
    • upgrade bodhi (if a new release is published in time) or backport a patch into it to support all createrepo_c settings
    • change bodhi’s createrepo_c.ini in ansible to use the new settings for F>=41
  • Other developers:

  • Release engineering: #Releng issue number

  • Policies and guidelines: N/A (not needed for this Change)

  • Trademark approval: N/A (not needed for this Change)

  • Alignment with Community Initiatives:

Upgrade/compatibility impact

No change should be noticed while upgrading from previous releases.

How To Test

Normal day-to-day DNF usage should not be affected: upgrading and installing packages should work as before, possibly with slightly faster repodata downloads.

User Experience

User experience should not be affected.

External custom repodata consumers, if any exist, might stop working and would need to adjust to the new compression method.
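
For such an external consumer, one way to cope with a changing compression method is to look at the leading bytes of each metadata file instead of assuming a fixed format. The following is a minimal sketch using only the Python standard library; the function name and the `MAGIC` table are illustrative, and actually decompressing zstd would need a zstd library, which is left out here.

```python
import bz2
import gzip
import lzma

# Leading magic bytes of the compression formats mentioned in this proposal.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd\x37\x7a\x58\x5a\x00": "xz",
    b"\x28\xb5\x2f\xfd": "zstd",
}


def detect_compression(data: bytes) -> str:
    """Guess the compression format of a metadata file from its first bytes."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "uncompressed"


if __name__ == "__main__":
    sample = b"<metadata/>"
    # Compress the same payload with each stdlib codec and detect it again.
    print(detect_compression(gzip.compress(sample)))
    print(detect_compression(bz2.compress(sample)))
    print(detect_compression(lzma.compress(sample)))
    print(detect_compression(sample))
```

A consumer written this way keeps working whether a given repository ships gz, xz, bz2 or zstd metadata, as long as it has a decompressor for the detected format.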

Dependencies

Contingency Plan

  • Contingency mechanism: revert to createrepo_c < 1.0 and the previous settings, then wait for the next day’s compose.
  • Contingency deadline: F41-beta freeze
  • Blocks release? Yes

Documentation

https://docs.pagure.org/pungi/configuration.html#createrepo-settings

Release Notes

Last edited by @mattdm 2024-04-04T22:14:28Z

Removed f39

A preliminary discussion about this change was held here.

The zstd compression type was chosen to match createrepo_c settings. As an alternative, we might want to choose xz, especially after zlib-ng has been made the default in Fedora and brought performance improvements.

Did you mean gzip? I’m unsure how performance improvements with zlib-ng would lead to xz / lzma being more attractive.

zstd is slightly less efficient at compression than xz / lzma but much faster to compress and decompress than either xz / lzma or gzip (even when implemented via zlib-ng).

The sqlite database distributed alongside the repodata is not useful for dnf, but it might be used by some external consumer we’re not aware of. Please do let us know.

mdapi needs to be updated: Don't rely on .sqlite metadata · Issue #97 · fedora-infra/mdapi · GitHub

I think it’s the only other consumer though.

nothing should change by using the --compatibility flag of createrepo_c >= 1.0

Minor change:

createrepo_c --compatibility w/ createrepo_c 1.0 doesn’t bother generating compressed comps metadata at all, it just provides the uncompressed version (because this was much simpler to implement).

But, because the comps.xml handling in createrepo_c < 1.0 is broken for any compression type other than gzip and because of this problem dnf does not recognize the existence of such metadata, this is not actually a change in practice.

https://bugzilla.redhat.com/show_bug.cgi?id=1904360

Could we give this change a bit more of a specific name? To me (and I think to releng), we tend to think of “the compose” as the entire process of building a whole Fedora every day (actually, several Fedoras, but never mind). This seems pretty much specifically about repository metadata.

I’ve set up the wiki page with that title, then changed the in-page title to “Upgrade systems to createrepo_c 1.0 and change repositories metadata settings”, but I don’t know how to rename the wiki page… or do you mean to change the name of the thread here?

This change proposal has now been submitted to FESCo with ticket #3189 for voting.

To find out more, please visit our Changes Policy documentation.

So, yesterday together with @kevin we deployed the new createrepo settings (and updated bodhi-backend to F40 / createrepo_c >= 1.0).

I’ve just checked the latest compose results and, while the Rawhide compose is still running, I’ve spotted an unexpected difference for both stable Fedora and EPEL composes: despite using the --compatibility flag, comps repodata is provided only in uncompressed xml and zchunk forms, while the old createrepo_c provided uncompressed xml, zchunk and compressed xml (using the compression method defined for the release).
Do you think this could cause any trouble?

I’ve also noticed another difference in EL8: primary.xml, filelists.xml and other.xml are zstd compressed instead of the expected gz. I don’t think there’s a way to fix that behavior while preserving the other settings, so, again, do you think this could cause any trouble?

@mattia I said as much, and explained why it was the case, and why it wouldn’t make a significant difference, at the bottom of my post on March 20.

To go a bit deeper so you don’t need to dig into the BZ and github issues:

pre-1.0, createrepo_c was buggy in that it would append a suffix to the end of the name of the groups metadata file with an abbreviation of the compression type. The “name” here referring to the one used in repomd.xml

e.g. group_gz, group_xz, group_bz2

The “bug” is that yum / dnf only ever recognized one special case here, which is “group_gz”, and all others would be ignored. So if the metadata was XZ compressed, createrepo_c would generate uncompressed comps.xml “group” and compressed comps.xml “group_xz”, and the latter would be completely ignored.

The root of it, though, is that this pattern completely diverged from the handling of all other types of metadata (where the metadata file can have any compression type and not be renamed). “group” metadata should be treated as any other metadata which can be compressed. And DNF did fix this, but yum did not.

createrepo_c 1.0, like DNF, now treats “group” metadata the same as every other metadata type. However, because “yum” doesn’t support that, --compatibility leaves it uncompressed. When I submitted that patch I skipped generating compressed comps metadata entirely because it wasn’t worth the effort or additional complexity for the sole benefit of EL <=7 clients.
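
To make the naming issue concrete, here is a small sketch that reads a repomd.xml and reports which group-like entries a client limited to “group” and “group_gz” would actually use. The repomd.xml fragment is made up for illustration (real files carry checksums, sizes and hashed file names), and the helper name is hypothetical; only the two recognized type names come from the explanation above.

```python
import xml.etree.ElementTree as ET

REPO_NS = "{http://linux.duke.edu/metadata/repo}"

# Made-up repomd.xml fragment in the pre-1.0 style, with XZ-compressed comps.
SAMPLE_REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="group"><location href="repodata/comps.xml"/></data>
  <data type="group_xz"><location href="repodata/comps.xml.xz"/></data>
</repomd>
"""

# Per the explanation above, only these two names were ever recognized.
RECOGNIZED_GROUP_TYPES = {"group", "group_gz"}


def group_entries(repomd_xml: str) -> dict[str, bool]:
    """Map each group-like metadata type to whether a client would use it."""
    root = ET.fromstring(repomd_xml)
    result = {}
    for data in root.findall(f"{REPO_NS}data"):
        mdtype = data.get("type", "")
        if mdtype == "group" or mdtype.startswith("group_"):
            result[mdtype] = mdtype in RECOGNIZED_GROUP_TYPES
    return result


if __name__ == "__main__":
    for mdtype, used in group_entries(SAMPLE_REPOMD).items():
        print(mdtype, "used" if used else "ignored")
```

With the sample above, the “group_xz” entry is reported as ignored, which is why dropping it under --compatibility does not change what such clients download.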

Of course in this case you’re using it for clients that aren’t EL7 as a shorthand for overriding the compression options, but it’s still not a regression because Fedora has been using xz compression, and therefore DNF has been downloading the uncompressed comps.xml this entire time anyway. As per 1904360 – dnf not downloading compressed group metadata

I’ve also noticed another difference in EL8: primary.xml, filelists.xml and other.xml are zstd compressed instead of the expected gz. I don’t think there’s a way to fix that behavior while preserving the other settings, so, again, do you think this could cause any trouble?

Do you mean:

  • On EL8, the createrepo_c 1.0 tool generates zstd compressed metadata by default, which is unexpected, or

  • Unexpectedly, the repos for EL8 are being generated with zstd compressed metadata

If the former, I would say it is expected for a new major version of a tool to have different defaults. EL8 shouldn’t be pulling in a new major version of the package though.

If the latter, it should still work (because RHEL 8 has supported zstd compressed metadata since, I believe, 8.2) but you might want to avoid changes like that anyway - and depending on how it’s configured, it could be strange behavior or it could be misconfiguration.

To go a bit deeper so you don’t need to dig into the BZ and github issues:

Thank you for the detailed explanation. My concerns were more about possible “external” consumers other than dnf/yum. I tried to write the new settings to minimize any changes in the stable repositories metadata and I thought using the --compatibility flag would have made things exactly as before.
I’m glad that this small change will not affect dnf/yum at all, but I also want to raise awareness about this unexpected (to me) change that I didn’t mention in the change proposal. I don’t think it’s so critical for anyone using a custom consumer as they can always switch to use the uncompressed file.

Do you mean:

  • On EL8, the createrepo_c 1.0 tool generates zstd compressed metadata by default, which is unexpected, or

  • Unexpectedly, the repos for EL8 are being generated with zstd compressed metadata

EL8 has always had different settings from other EPEL.
EL8 uses --compress-type=xz, while other EPELs have/had --compress-type=gz/--compatibility.
Previously, repodata for EL8 had [primary,filelists,other].xml.gz, with other things like the sqlite databases compressed with xz.
Now it has [primary,filelists,other].xml.zst, with other things like the sqlite databases compressed with xz.

We can’t use --compatibility, otherwise everything will switch to gz, but we also cannot use --general-compress-type, otherwise everything will switch to xz.
I don’t think there’s a way to recreate the mixed compression as before (?).

We can’t use --compatibility, otherwise everything will switch to gz, but we also cannot use --general-compress-type, otherwise everything will switch to xz.

You should still be able to use the standard compression options --compress-type and --general-compress-type in combination with --compatibility. The --compatibility flag only changes the default values; explicitly passed compression options always override those defaults.
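
The precedence described here can be modelled in a few lines. This is a hypothetical model of the behavior as described in this thread, not createrepo_c’s actual code: the function and parameter names are invented, and the “gz” and “zstd” defaults are taken from the statements above about --compatibility and createrepo_c >= 1.0.

```python
from typing import Optional


def effective_compression(
    compatibility: bool = False,
    general_compress_type: Optional[str] = None,
) -> str:
    """Model how an explicit option overrides the mode's default.

    Hypothetical model of the described behavior, not createrepo_c code.
    """
    # The default depends only on whether --compatibility was passed.
    default = "gz" if compatibility else "zstd"
    # An explicitly passed compression type always wins over the default.
    return general_compress_type or default


if __name__ == "__main__":
    print(effective_compression())
    print(effective_compression(compatibility=True))
    print(effective_compression(compatibility=True, general_compress_type="xz"))
```

Under this model, combining --compatibility with an explicit compression type keeps the other compatibility defaults while still choosing the compression, which is the combination suggested for EL8.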

edit: The help text isn’t super clear on that point, so I’ll file a PR to make it more obvious.

So yeah, this is breaking some people on epel8… who are using createrepo_c older than 1.0 to regenerate the repodata.

https://pagure.io/fedora-infrastructure/issue/11917

https://pagure.io/releng/issue/12097

Would it be useful to backport zstd support to pre-1.0 createrepo_c (without any of the associated changes of defaults)?

libsolv (and hence dnf) on EL8 and EL9 already supports it, but obviously it’s not possible to generate / reprocess the metadata without createrepo support for doing so.

Yes, that might be nice I guess. Others might hit this case trying to use rawhide or fedora repodata on rhel…

So the above issues, one seems like mergerepo_c in koji, so it would be helped by a zstd backport, but the other one (and ones like it) seem to be due to the users having libsolv-0.7.22-6.el8, which… I am not sure where they are getting. ;(

In the meantime I enabled ‘general-compress-type’ for epel8 and pushed epel8-testing for people to test with. As expected, it’s showing xz for everything now (except the updateinfo bodhi makes, which is still bz2). I don’t know if that will ‘fix’ things for the cases above, or make it worse. :wink: I guess we could also just go back to gz there entirely like other epels.

So the above issues, one seems like mergerepo_c in koji, so it would be helped by a zstd backport, but the other one (and ones like it) seem to be due to the users having libsolv-0.7.22-6.el8, which… I am not sure where they are getting. ;(

I bet I can answer that. The user said they were mirroring EPEL, so there’s a good chance they’re using Katello / Satellite, which ships its own libsolv (I know I know… that’ll be cleaned up when they move to EL9) and the specfile for that package uses outdated configuration that doesn’t enable zstd support.

I filed an issue about that a few months ago, hopefully this will be a gentle nudge to get it dealt with :slight_smile:

Yeah. :frowning:

In the meantime the change to xz should hopefully work for them?

Is it intentional that Fedora 41 “updates” repos use zstandard (the new default), but “release” repos seem to still use gzip?

Yes, see PR#1410: Change createrepo_c settings to zstd - pungi-fedora - Pagure.io