F40 Change Proposal: DNFConditionalFilelists (System-Wide)

DNF: Do not download filelists by default

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Summary

Change the DNF behavior to not download filelists by default. These metadata, which describe all the files contained within each package, are unnecessary in the majority of use cases. They can also be large, which significantly slows down the user experience.

Owner

Current status

  • Targeted release: Fedora Linux 40
  • Last updated: 2023-11-01
  • devel thread
  • FESCo issue:
  • Tracker bug:
  • Release notes tracker:

Detailed Description

Until now, filelists were always downloaded together with other metadata. This behavior was hardcoded and could not be changed from outside DNF.

With these changes, we are proposing to not download the filelists metadata by default. This default behavior can be modified through a new DNF configuration option. Additionally, specific commands can override this behavior and request loading the filelists metadata at runtime using the existing demands object in DNF.

Note that even without filelists metadata, DNF can still resolve file provides located in the /usr/bin, /usr/sbin or /etc directories.
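As a rough illustration of the note above, the following sketch classifies a file dependency by its path. It is not DNF code: the function name and the prefix list are assumptions made for this example, and the prefixes simply mirror the three directories named in the proposal.

```python
# Hypothetical helper, not part of DNF. The prefixes mirror the three
# directories the proposal names as usable without filelists metadata.
NO_FILELISTS_PREFIXES = ("/usr/bin/", "/usr/sbin/", "/etc/")


def resolvable_without_filelists(file_dep: str) -> bool:
    """Return True if the file dependency lies under one of the
    directories listed in the proposal."""
    return file_dep.startswith(NO_FILELISTS_PREFIXES)


if __name__ == "__main__":
    for dep in ("/usr/bin/python3", "/etc/passwd", "/usr/share/fonts/foo.ttf"):
        print(dep, resolvable_without_filelists(dep))
```

Under this assumption, a dependency such as /usr/share/fonts/foo.ttf would be one of the cases that needs filelists to be enabled.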

Feedback

Benefit to Fedora

As DNF is integral to various infrastructure tasks like package building and installation, testing environment creation, and server integration tests, this change significantly reduces processing time and resource usage for these processes.

This change reduces the RAM requirements of the DNF process, addressing existing issues when running the Fedora system on low-memory machines such as the Raspberry Pi (see, e.g., Bug 1907030).

Omitting the filelists metadata download also lowers the cost of operating Fedora mirror servers.

Scope

  • Proposal owners:

    • libdnf
      • Modify the Repo object to enable conditional filelists metadata download
      • Introduce a new main configuration option to set the default behavior
    • dnf
      • Enable configuration of filelists download from commandline, DNF commands and DNF plugins
      • Implement filename pattern argument detection heuristics
  • Other developers:

    • Software using the existing DNF C interface may need to adapt if it expects the filelists metadata to be available. After this change, it has to request loading filelists explicitly through the existing API:
      • PackageKit
      • microdnf
      • API users
  • Release engineering: N/A

  • Policies and guidelines:

    • Package maintainers must follow Fedora’s packaging guidelines, particularly concerning file dependency specifications (see here)
  • Trademark approval: N/A

  • Alignment with Community Initiatives: N/A (no currently active initiatives)

Upgrade/compatibility impact

In general, applying these changes should not affect any existing user workflows, and no additional manual changes are required. However, the absence of filelists might create an issue with packages that are not correctly packaged or that originate from third-party repositories. In the current Fedora release repository there are only a few such packages; see the comment in Bug 2180842.

How To Test

When using DNF commands without a filename pattern passed as the argument, filelists metadata should not be downloaded from the remote repositories and should not be needed for the command execution. This can be tested with the following steps:

  • Clean the local metadata cache (dnf clean metadata)
  • Run a DNF command that does not involve a filename spec (e.g. dnf repoquery rpm)
  • Verify that no *-filelists.* metadata files were downloaded inside the cache subdirectories (by default under /var/cache/dnf for root)
  • Check the command works as expected
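The verification step above could be scripted along these lines. This is a minimal sketch using only the Python standard library; the function name is made up for this example, and the default path is the root cache location mentioned in the steps.

```python
import fnmatch
import os


def find_filelists(cache_dir="/var/cache/dnf"):
    """Return the sorted paths of cached files named like *-filelists.*"""
    hits = []
    # os.walk yields nothing if cache_dir does not exist or is unreadable.
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if fnmatch.fnmatch(name, "*-filelists.*"):
                hits.append(os.path.join(root, name))
    return sorted(hits)


if __name__ == "__main__":
    found = find_filelists()
    if found:
        print("filelists metadata present:")
        for path in found:
            print("  " + path)
    else:
        print("no filelists metadata found in the cache")
```

Run it as root (or point it at a user cache directory) after the clean and repoquery steps; an empty result is the expected outcome.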

The same should also apply to RPM package arguments (files ending with .rpm extension).

When using DNF commands with a filename pattern passed as the argument, filelists metadata should be downloaded from the remote repositories as before.
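The proposal does not spell out the detection heuristic, so the following is only a guess at its general shape, written to match the three cases described above (plain package names, .rpm file arguments, filename patterns). The function names and the exact rules (a leading "/" or "*/") are assumptions; the real DNF implementation may differ.

```python
def looks_like_filename_pattern(arg: str) -> bool:
    """Guess whether a command-line argument is a file path pattern.

    Assumed rules for this sketch:
      * arguments ending in .rpm are local package files, not patterns
      * anything starting with "/" or "*/" is treated as a filename pattern
    """
    if arg.endswith(".rpm"):
        return False
    return arg.startswith("/") or arg.startswith("*/")


def needs_filelists(args) -> bool:
    """Return True if any argument would require filelists metadata."""
    return any(looks_like_filename_pattern(arg) for arg in args)


if __name__ == "__main__":
    for argv in (["rpm"], ["/tmp/foo-1.0-1.noarch.rpm"], ["/usr/share/man/man8/rpm.8.gz"]):
        print(argv, needs_filelists(argv))
```

A heuristic like this keeps the common case (package names) cheap and only triggers the larger download when the argument itself suggests a file lookup.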

User Experience

Large filelists can exceed 200 MB. Downloading them can take 1-2 minutes, which greatly slows down the user experience.
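For a sense of scale, the figures above imply a particular connection speed. The short calculation below derives it; the only inputs are the 200 MB and 1-2 minute figures from the text, and decimal megabytes are assumed.

```python
def implied_mbit_per_s(size_mb: float, seconds: float) -> float:
    """Throughput in Mbit/s needed to move size_mb megabytes in the given time."""
    return size_mb * 8 / seconds


if __name__ == "__main__":
    for minutes in (1, 2):
        rate = implied_mbit_per_s(200, minutes * 60)
        print(f"200 MB in {minutes} min -> {rate:.1f} Mbit/s")
```

So the quoted download times correspond to links of roughly 13 to 27 Mbit/s; slower connections would wait proportionally longer.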

For many operations the filelists metadata are not needed, so downloading them wastes resources. Without the filelists download, DNF performance will improve significantly, mainly in network, CPU and disk space usage. Metadata download size will be reduced by about 60%. The improvement also covers deployments of customer-built RPMs to containers that have no need for filelists-level dependencies.

Dependencies

No changes should be required for any package depending on DNF to implement this behavior.

Contingency Plan

  • Contingency mechanism: Change the configuration option to download the filelists by default
  • Contingency deadline: Branch Fedora Linux 40 from Rawhide
  • Blocks release? No

Documentation

Links to the relevant DNF CLI and API documentation sections will be provided here once the related pull request is created.

Release Notes

Thank you for proposing this! Slow repodata download is currently my #1 complaint about Fedora, so this change would be very welcome.

Yeah, this is a long desired feature. Thanks for working on it!

Yay! This is huge. Not only will this reduce files use, but (at least when I last looked at it), filename-deps were like 95% of all deps considered, so this should be a significant reduction in memory use, and probably a noticeable performance gain even if metadata is already downloaded.

This change is targeted for dnf4, right? Will the changes also be carried into dnf5?

So, if I’m reading the proposal correctly, it’s a backwards incompatible change?

It sounds like installing a package that has dependencies that are only resolvable with the filelists present will fail to resolve after this is implemented, and will need to be retried with filelists enabled?

I thought there was discussion about lazy-loading filelists in cases like this, which would gracefully handle the “failure” case above with no need to retry with different settings.

Hi Kevin, the proposed behavior without filelists has been the default in DNF5 for quite some time, starting around the beginning of this year, see this PR.

Hi Fabio and thanks for your question. If you have a package that depends on filelists dependencies and you don’t have filelists metadata present, it won’t install. However, as stated in the proposal, such packages are against Fedora guidelines, and there are very few of them in the official repositories.

As for the filelists lazy loading solution, I don’t think it could be currently easily implemented in DNF itself. Predicting whether the presence of filelists would help resolve failure cases is challenging. Additional hints from the solver would be needed, but I believe this approach would be more complicated and delicate than the proposed solution.

Bug 2180842 (“Broken packages without filelists => DNF5 is unable to use them with default setting”) is the relevant bug; it was already linked from the Change text. It’s just a handful of packages now, so we should fix them and then this will not be a problem.

This proposal implements the semantics of DNF5 in DNF4. I think this should be stated explicitly, so it’s clearer what is happening. As an added benefit, this can be listed under Benefit to Fedora: “behaviour of dnf in F40 is closer to DNF5 which is planned for F41” or something like that.

This is not actually true if you read the packaging guidelines:

They SHOULD NOT include dependencies on other paths as that requires additional repository metadata to be downloaded.

This is a SHOULD NOT rule, not a MUST NOT rule, so packages CAN use dependencies like this if they need to, and it’s not against the guidelines.

Could we make it into a “MUST NOT” rule? I’m pretty unconvinced by the arguments to the contrary, given the benefit of avoiding them.

We could. Please file a PR with the Packaging Committee :wink:

Though note that Packaging Guidelines don’t apply retroactively.

SHOULD NOT is strong enough, if the consequences are properly explained. Isn’t it?

I vetoed making it MUST NOT originally and I will do so again. Lazy loading file lists needs to be implemented in DNF. It was a regression from YUM that we could not do it.

Even a dead-simple way of handling this would be “if a package’s unsatisfied dependency starts with a ‘/’, turn file lists back on and retry”.
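The “dead-simple” retry idea above could look roughly like this. It is a sketch only: the resolver is passed in as a hypothetical callable that takes a flag and returns the list of unsatisfied dependency strings, and nothing here corresponds to an actual DNF or libsolv API.

```python
from typing import Callable, List


def resolve_with_retry(resolve: Callable[[bool], List[str]]) -> List[str]:
    """Resolve without filelists first; retry with them if a file dep failed.

    `resolve(with_filelists)` is a hypothetical callable returning the
    unsatisfied dependencies left over after a resolution attempt.
    """
    unsatisfied = resolve(False)
    # A dependency starting with "/" is a file path, so filelists might help.
    if any(dep.startswith("/") for dep in unsatisfied):
        unsatisfied = resolve(True)
    return unsatisfied


if __name__ == "__main__":
    def fake_resolve(with_filelists: bool) -> List[str]:
        # Stand-in resolver: pretend /sbin/nologin only resolves with filelists.
        return [] if with_filelists else ["/sbin/nologin"]

    print(resolve_with_retry(fake_resolve))  # prints []
```

The cost of this approach is a second metadata download and a second solver run, but only for transactions that actually hit an unresolved file dependency.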

“SHOULD NOT is strong enough, if the consequences are properly explained.”

I don’t think so. We shouldn’t allow packages in the distribution that fail to install without manual package manager config changes. The package manager should either be fixed to dynamically load filelists or this should be a MUST NOT guideline.

I have not provided enough details, so please let me elaborate.

  1. “Consequences are properly explained” was probably too brief. What I wanted to say is that packagers should have enough information to make an educated decision about whether they want to use a file dependency. I don’t think that file dependencies are used very extensively, and I believe that 99% of current use cases fall into the practice described by the current guidelines [1], and this won’t change on the DNF side.

I cannot imagine a use case where the maintainer would insist on some random file dependency even though it might require manual intervention by the user. Still, this is something that can always be discussed and changed if needed.

BTW, the guidelines elaborate on this, as far as I remember: “They SHOULD NOT include dependencies on other paths as that requires additional repository metadata to be downloaded.”

  2. I still find manual user intervention acceptable if the scenario is properly reported. I don’t know what error message DNF (g-s) throws if a file dependency is missing, but if the error message were helpful and provided some guidance such as “missing file dependency, please use --include-file-list on command line to proceed”, that would not be the worst UX given how rare this situation is.

[1]

Just FTR, this is a rough list of packages which IMO stand out from the current guidelines:

Why are we using systems that don’t bother to strip part of the message? :triumph:

This is the list:

$ grep -R '^Requires: .*/' | grep -v \( | grep -v bindir | grep -v '/usr/bin' | grep -v '/usr/sbin' | grep -v '/etc/' | grep -v '%{_sysconfdir}'
autofs.spec:Requires: bash coreutils sed gawk grep module-init-tools /bin/ps
beakerlib.spec:Requires:   /bin/bash
beakerlib.spec:Requires:   /bin/sh
cobbler.spec:Requires: /sbin/service
cyrus-sasl.spec:Requires: /sbin/nologin
ddccontrol.spec:Requires:         /sbin/modprobe
diskimage-builder.spec:Requires: /bin/bash
diskimage-builder.spec:Requires: /bin/sh
gdm.spec:Requires: /sbin/nologin
gsi-openssh.spec:Requires: /sbin/nologin
gst.spec:Requires:       python3-pyxdg %dnl >= 0.28 # Try to run with old for now https://bugzilla.redhat.com/show_bug.cgi?id=2242522
gst.spec:Requires:       python3-requests %dnl >= 2.31.0 # Try to run with old for now https://bugzilla.redhat.com/show_bug.cgi?id=2189970
gwe.spec:Requires:       python3-pyxdg %dnl >= 0.28 # Try to run with old for now https://bugzilla.redhat.com/show_bug.cgi?id=2242522
gwe.spec:Requires:       python3-requests %dnl >= 2.31.0 # Try to run with old for now https://bugzilla.redhat.com/show_bug.cgi?id=2189970
komikku.spec:Requires:       python3-dateparser  %dnl >= 1.1.4 | https://bugzilla.redhat.com/show_bug.cgi?id=2115204
libvirt.spec:Requires: /sbin/zfs
libvirt.spec:Requires: /sbin/zpool
Lmod.spec:Requires:       /bin/ps
lxdm.spec:Requires:       /sbin/shutdown
munin.spec:Requires:       /bin/mail
openssh.spec:Requires: /sbin/nologin
os-prober.spec:Requires:       grep /bin/sed /sbin/modprobe
slim.spec:Requires:       scrot xterm /sbin/shutdown
pmount.spec:Requires:       /bin/mount
powerpc-utils.spec:Requires: /bin/grep
psad.spec:Requires: /bin/ps
redhat-lsb.spec:Requires: /bin/mailx
resource-agents.spec:Requires: /bin/mount
resource-agents.spec:Requires: /sbin/fsck
resource-agents.spec:Requires: /sbin/mount.nfs /sbin/mount.nfs4
resource-agents.spec:Requires: /sbin/ip
resource-agents.spec:Requires: /sbin/rpc.statd
resource-agents.spec:Requires: /sbin/findfs
resource-agents.spec:Requires: /sbin/quotaon /sbin/quotacheck
rt.spec:Requires:  /usr/share/fonts/google-droid-sans-fonts/DroidSansFallbackFull.ttf
rt.spec:Requires:  /usr/share/fonts/google-droid-sans-fonts/DroidSans.ttf
slurm.spec:Requires:       /bin/mailx
spamassassin.spec:Requires: /sbin/chkconfig /sbin/service
spectre-meltdown-checker.spec:Requires:   /bin/sh
systemtap.spec:Requires: /usr/lib/libc.so
uhd.spec:Requires:       %{_libdir}/wireshark/plugins/%{wireshark_ver}
bottles.spec:Requires:   ImageMagick             %dnl # https://bugzilla.redhat.com/show_bug.cgi?id=2227538
bottles.spec:Requires:   python3-chardet         %dnl # https://bugzilla.redhat.com/show_bug.cgi?id=2240292

Thank you, Zbigniew. I have added notes about that to the original wiki page under the sections “Detailed Description” and “Benefit to Fedora”.