Transport-independent updates

When the internet appeared as a resilient network that was supposed to be impossible to take down, there was no HTTP. And now, thanks to HTTP, we are living in the age of a worsening Splinternet: state institutions censor not only “their own citizens’” access to information but “other citizens’” access as well, and on top of that we have corporations like Cloudflare gladly profiting from the political narrative by providing the tools.

Because Fedora uses HTTP for updates, we are forced to play these political games just to update the system. We all contribute to making this OS better, and I value time, energy, and nerves more than the ephemeral concepts behind these mass-manipulation practices. In the end, none of those politicians saves me, an open source developer, from poverty and holes in my pockets. They don’t care about me, but they force me to “care” about them. Even this post, which is supposed to be purely technical, has a tainted kernel.

So I propose to make Fedora updates transport independent. If regulators require Fedora users to use HTTP, let them use HTTP, but for the rest of us, please do us a favor and give us a choice. Thank you.


What transport would you be able to replace HTTP with that will not also be controlled?

Do you have the expertise to do a proof-of-concept experiment with your proposed replacement?

Any P2P protocol where one node can request content from another node by hash. With a local directory mount it could be anything: 9P, NFS, WebDAV. It could be BitTorrent, or a Merkle-tree-based protocol for binary synchronization. The mirrors would then cache and relay any signed blobs, and `dnf update --from=<uri>` would be able to fetch from any URI, making every machine an update node as well.
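
To make the hash-addressed idea a bit more concrete, here is a minimal shell sketch, assuming a hypothetical peer that serves signed blobs under a `/blobs/<sha256>` path. The endpoint layout, host name, and the `--from` flag above are all imaginary at this point:

```bash
# Sketch: fetch a blob by hash from any reachable node and verify it locally.
# The /blobs/<sha256> URL layout is hypothetical; trust comes from the content
# hash (plus the RPM signature checked later by dnf), not from the transport.
fetch_by_hash() {
    local hash="$1" peer="$2" out="$3"
    curl -sSf -o "$out" "$peer/blobs/$hash" || return 1
    echo "$hash  $out" | sha256sum -c --quiet - || { rm -f "$out"; return 1; }
}

# Any transport that can fill the output file works the same way: an HTTP peer,
# a torrent client, or a directory mounted over NFS/9P/WebDAV.
fetch_by_hash "$EXPECTED_SHA256" "http://peer.example:8080" kernel-core.rpm
```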

For the PoC, I can only try to “vibe”-code it, if you like.


AI coding is coming along in leaps and bounds, but I’m not sure I’d trust AI to get a security-critical app right.

But without a PoC this proposal is unlikely to go anywhere.

And I’m not sure P2P solves anything in the real world.

Yep, I have to completely agree with that :frowning:

For example, in Russia most foreign services are already blocked: Telegram barely updates text messages without a VPN and is scheduled for a total block starting April 1, 2026. I have to update Fedora with a VPN on, otherwise download speeds are miserable and some things simply will not download at all. The government firewall will impose whitelists very soon (they are already being tested), with the intent of isolating Russia from the rest of the Internet.

Similar processes are happening in some other countries too :frowning:
It is obvious that ubiquitous, guaranteed international and P2P connectivity on the internet is a thing of the past (hopefully only for some time!). The internet is falling apart into fragmented, walled gardens. Therefore the model where each user and each app is supposed to download all data on the fly is becoming obsolete.

Interestingly, Windows Update has allowed fetching updates from other users via P2P for many years already, since Windows 10.

Should Linux distros (and their respective package managers) adapt to this decentralization process?

What are possible mitigation options?
Users need an alternative way to get and install update files offline, like in the early Windows days. Offline installation itself is not a problem, right? We only need to automate: 1) getting the list of updates, 2) downloading them as files into a local folder, and 3) using this folder as the installation source.
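
For reference, steps 2) and 3) can already be approximated with existing tooling; a rough sketch, assuming the online and offline machines have a similar package set (the `/srv/offline-updates` path is arbitrary):

```bash
# On a machine that can still reach the mirrors: download pending updates as plain RPMs.
sudo dnf upgrade --downloadonly --downloaddir=/srv/offline-updates
createrepo_c /srv/offline-updates          # turn the folder into a regular repo

# On the offline machine, after carrying the folder over (USB stick, rsync, ...):
sudo dnf --repofrompath=offline,/srv/offline-updates --repo=offline upgrade
```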

Perhaps a distro could publish daily lists of updated files with basic metadata (sequence number, date, name, version, size, origin, hash, etc.) which some downloader could fetch in order to pull the updates locally; they could later be installed via dnf and shared with other machines as plain files.

The only issue I see is that each client only needs some of the files updated/published on a particular day, not all of them.
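
Just to make that concrete, here is one possible shape for such a daily list and a downloader that only fetches what it is missing; the manifest format, mirror URL, and file names are entirely made up:

```bash
# Hypothetical daily manifest, tab-separated: seq, date, name, version, size, sha256
#   10231  2026-02-14  kernel-core  6.15.3-200.fc43  38211456  3b5a...
#   10232  2026-02-14  vim-minimal  9.1.1000-1.fc43    812344  9c01...

MIRROR=https://example-mirror/fedora/updates    # any source serving the same files
mkdir -p cache
while read -r seq date name version size hash; do
    f="$name-$version.rpm"
    [ -e "cache/$f" ] && continue               # only fetch what is missing locally
    curl -sSf -o "cache/$f" "$MIRROR/$f" &&
        echo "$hash  cache/$f" | sha256sum -c --quiet - || rm -f "cache/$f"
done < updates-2026-02-14.tsv
```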

IPFS is theoretically built for this, though I don’t know its internals or the packaging internals well enough to know if it’s a fit. I do know that its security claims have been pretty thoroughly debunked, though it might still be a viable way to distribute binaries.

Someone would have to sort out the details of storage, including what that means for server resources. Then you would need to patch all of our repository tools to push to it. And then you would need to patch all of our end-user tools to access that storage + account for the likely “eventually consistent” nature of those files.

Certainly possible, but it would be an enormous task. If you want to push this forward you would need to do all of those initial “technology alignment” investigations. Once you have a detailed technical plan, share it and ask around for feedback.


https://syncthing.net/ is another option to sync cache dirs. With its well-defined specs (see “Specifications” in the Syncthing documentation) it may be easier to adapt for the update use case.

The interesting problem is how to download only the new files. I am not sure which structure would be best: a Merkle tree or a Prolly tree. I need to watch some videos to refresh my knowledge.

If we can agree that the result of the transport operation is a “refreshed cache dir for the next update”, then the surface modification to dnf could be just “check signatures and install updates from this dir”.
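
As a crude baseline before reaching for a Merkle or Prolly tree, two snapshots of a cache dir can be compared with flat hash listings; a sketch (directory names are placeholders):

```bash
# Hash every file in two snapshots of the cache dir and diff the listings.
# A Merkle/Prolly tree does the same job hierarchically, so unchanged subtrees
# can be skipped without rehashing everything.
(cd old-cache && find . -type f -print0 | xargs -0 sha256sum | sort) > old.sums
(cd new-cache && find . -type f -print0 | xargs -0 sha256sum | sort) > new.sums

# Lines present only in new.sums are files a peer still needs to fetch.
comm -13 old.sums new.sums | awk '{print $2}' > files-to-fetch.txt
```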

Syncthing could be useful for maintaining an ad hoc network of mirrors, but tools like dnf won’t be able to use it directly. You would have to stand up your own mirror from the Syncthing data and point dnf at it.

Additionally, Syncthing requires both ends to approve the connection, and the protocol isn’t really designed for more than a handful of connected hosts. So it could certainly be useful at small scales, but not in a “distribute from one main upstream to thousands of downstreams” mode.

In case anyone stumbles across this thread, I did a bit of research and found that the apt team attempted IPFS package distribution several years ago.

The problem they ran into was that they tried to post the entire mirror (an entire directory) as a single entity (a single “CID” in IPFS lingo). This is a huge nested object, and:

  • Uploading a huge nested set of hashes to IPFS was not feasible at high speed at the time.
  • Changing any single file underneath the main directory basically means that the main directory is now a different object than it was before, so you have to repeat the entire slow huge upload.

They abandoned the approach because they needed all three attributes: huge, fast, and often.

I still know very little about IPFS, but it seems to me that rather than uploading the entire directory, each package should be its own CID. Then you just need a single CID for the equivalent of repodata.xml, which points to the “set” of CIDs at that time, which are (as far as IPFS is concerned) not actually related in any way.

This would allow you to achieve “huge” without needing “fast”, and with a somewhat reasonable range for “often”.

You would need to build some special pinning software to ensure that entire mirror sets are pinned, and that old sets can be unpinned. But from a distance, this seems feasible.
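
With the stock `ipfs` CLI, the per-package-CID idea sketches out roughly like this; the package names and the index format are hypothetical, and the pinning policy mentioned above is not covered:

```bash
# Each package becomes its own object; `ipfs add -Q` prints just the resulting CID.
CID_KERNEL=$(ipfs add -Q kernel-core-6.15.3.rpm)
CID_VIM=$(ipfs add -Q vim-minimal-9.1.rpm)

# The "repodata equivalent" is a tiny index listing those CIDs; only this one
# small object changes (and needs to be announced) when the package set changes.
printf '%s kernel-core-6.15.3.rpm\n%s vim-minimal-9.1.rpm\n' "$CID_KERNEL" "$CID_VIM" > index.txt
INDEX_CID=$(ipfs add -Q index.txt)

# A client resolves the index and fetches only the packages it is missing.
ipfs cat "$INDEX_CID" | while read -r cid name; do
    [ -e "cache/$name" ] || ipfs get -o "cache/$name" "$cid"
done
```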


The problem they ran into was that they tried to post the entire mirror (an entire directory) as a single entity (a single “CID” in IPFS lingo). This is a huge nested object,

That’s interesting. Too bad there is no write-up to reference, so that we could check on Fedora files whether the results still hold true in 2026. How does that “huge nested object” compare to the index of updates on a Fedora mirror?

Changing any single file underneath the main directory basically means that the main directory is now a different object than it was before, so you have to repeat the entire slow huge upload.

I don’t think so. Each file has its own CID, so the only things that need to be uploaded are the new file and the CIDs of the updated parent directories. There might be other weak points, but without a write-up it is hard to remember how it was 10 years ago.

“Huge”, “fast”, and “often” need to be measurable metrics. Then it would be possible to attack them one by one, picking bits and pieces from other protocols. In fact, the Fedora update system could be a living paper, like a Jupyter notebook with experiments, instead of boring documentation that doesn’t really explain things because they are “internal”. The Fedora update process could be near real-time, but not while it relies on downloading and parsing XML files over HTTP.

IPFS was indeed slow back in the day. In the end, it is still other people’s machines that serve the files, and there is the DHT that needs to be maintained, but if speed on the end-user machine is not critical and the rest of the overhead is minimal, why not have this option? The only thing dnf needs is a pluggable interface and a documented API/protocol for that interface.

I realize that it is an oxymoron, but my point is that this distributed index may require too much memory or too many CPU cycles from peers to keep it fresh and updated. Again, this needs to be measured, with less text and more data (pictures).