First, thank you for the verification you did (something that I wouldn’t do :P). I really didn’t want to make you go a long way analyzing this. As I said in the beginning, the problem is that nobody is going to work on it anyway, so it almost doesn’t matter how great and efficient it can be.
IF there was anybody who’d want to work on it, at least some of the issues you mentioned could be fixed (and for the others, someone should decide which tradeoffs to make). For example, I think the exponential growth of drpm files can be fixed; e.g. the combinedeltarpm step could happen on the user’s system before applying the generated rpm.
That’d be great, if there is going to be really active development. But the PR has been stalled since 2020. I wonder if the upcoming Fedora changes to the ostree distribution method will push it to become active again.
That’d lose you a lot of the savings — if the delta between two revision bumps of a 66MB package is 14MB, then the best hope of really reducing download sizes is if combined deltas add up to significantly less than the sum of their sizes.
(Which should be the case. If an A..B.drpm is 14MB, and a B..C.drpm is another 14MB, then the size of the A..C.drpm can be anywhere from 14MB to 28MB. [1] It all depends on how much overlap there is in what’s changed by both of those deltas.)
Hopefully… and (fortunately) also probably… it would be closer to 14MB than 28MB. Many of the things that change in the package with each new version build are likely to be the same things (the dynamic elements of the package). So by actually creating all of the A..C.drpm files and making that available directly, you get the maximum possible download size reduction.
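To make the overlap argument concrete, here is a toy sketch. It assumes a delta can be modeled as the set of files it changes, with a size equal to the sum of those files' sizes. That is a simplification for illustration only: real drpm sizes depend on binary diffs and compression, and the file names and sizes below are made up.

```python
# Toy model (an assumption, not how drpm really works): a delta is the set of
# files it changes, and its size is the sum of the sizes of those files.

def delta_size(changed, file_sizes_mb):
    """Size in MB of a delta that has to ship every file in `changed`."""
    return sum(file_sizes_mb[name] for name in changed)

def combined_delta(first, second):
    """A file changed in either step has to be shipped once in the combined delta."""
    return first | second

# Hypothetical package contents (sizes in MB).
file_sizes_mb = {
    "bin/app": 10,
    "lib/core.so": 4,
    "doc/notes": 4,
    "share/data": 8,
    "share/icons": 6,
}

a_to_b = {"bin/app", "lib/core.so"}             # 14 MB
b_to_c_same = {"bin/app", "lib/core.so"}        # 14 MB, full overlap with A..B
b_to_c_partial = {"bin/app", "doc/notes"}       # 14 MB, partial overlap
b_to_c_disjoint = {"share/data", "share/icons"} # 14 MB, no overlap

for label, b_to_c in [("same", b_to_c_same),
                      ("partial", b_to_c_partial),
                      ("disjoint", b_to_c_disjoint)]:
    size = delta_size(combined_delta(a_to_b, b_to_c), file_sizes_mb)
    print(f"A..C with {label} changes: {size} MB")
```

In this model the combined delta lands at 14MB when both steps touch the same files and at 28MB when they touch entirely different ones, with anything in between for partial overlap.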
Notes
(Technically the possible size of A..C.drpm is anywhere from 0KB to 28MB. In the degenerate case where a lot of things are changed in A..B, then those changes are all reverted in B..C, the two deltas could “cancel out” to almost nothing. But in practice, combined deltas smaller than the individual deltas being combined would be a rarity.)
For the Fedora situation specifically, what you could do to rein in the explosion of .drpms is throw away all those deltas that end somewhere other than the current package version. (After you’ve combined them with the newest delta.)
In the example above, after that 5th version you actually only need to keep 5 deltas out of the possible 16: A..F.drpm, B..F.drpm, C..F.drpm, D..F.drpm, E..F.drpm.
All of the others, the ones that produce some package version other than the latest, are no longer useful. Those “end” version packages are no longer available even in the full .rpm form.
Next version, after you’ve generated F..G.drpm, then combined it with all of the other deltas, you can throw away the set of *..F.drpm files.
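The bookkeeping described here can be sketched in a few lines. This is only an illustration of which delta files would be kept, assuming each delta is identified by a (from, to) version pair; the function names `release` and `simulate` are made up, and nothing here calls the real deltarpm tools.

```python
def release(kept, old_latest, new_latest):
    """Return the set of (from, to) deltas to keep after new_latest is published.

    Every kept delta that ends at old_latest is combined with the new
    old_latest..new_latest step, so that it ends at new_latest instead.
    Anything ending at an older version is dropped.
    """
    step = (old_latest, new_latest)
    combined = {(src, new_latest) for (src, dst) in kept if dst == old_latest}
    return combined | {step}

def simulate(versions):
    """Walk through successive releases and list the delta files kept at the end."""
    kept = set()
    for old, new in zip(versions, versions[1:]):
        kept = release(kept, old, new)
    return sorted(f"{src}..{dst}.drpm" for src, dst in kept)

print(simulate("ABCDEF"))
print(simulate("ABCDEFG"))
```

With this scheme the number of stored deltas grows by one per release rather than with every possible pair of versions, because only deltas ending at the newest version survive each round.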
For the Fedora situation specifically, what you could do to rein in the explosion of .drpms is throw away all those deltas that end somewhere other than the current package version. (After you’ve combined them with the newest delta.)
True, that’d be nice. Even if older versions of the packages were still available, it would not be necessary to provide drpm support for them, since needing them is not very common.
We could even limit the number of “from” versions, either by a count (e.g. only the 3 prior versions), or probably more appropriately, by the release time of the previous versions; e.g. no need to provide deltas if the user has not updated the package for more than 3 months.
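A rough sketch of such a limit, assuming we have a list of (version, release date) pairs for a package. The function name, the parameters, and the 90-day stand-in for "3 months" are all just illustrative choices; this version applies both the age cutoff and the count cap together, though either could be used alone.

```python
from datetime import date, timedelta

def select_from_versions(releases, today, max_count=3, max_age_days=90):
    """Pick the prior versions that should get a delta to the current version.

    releases: list of (version, release_date), oldest first; the last entry
    is the current version and is never a "from" version.
    """
    previous = releases[:-1]
    cutoff = today - timedelta(days=max_age_days)
    # Age limit: skip versions released before the cutoff date.
    recent = [version for version, released in previous if released >= cutoff]
    # Count limit: keep only the newest max_count of what is left.
    return recent[max(0, len(recent) - max_count):]

# Hypothetical release history for one package.
releases = [
    ("A", date(2024, 1, 10)),
    ("B", date(2024, 3, 1)),
    ("C", date(2024, 4, 15)),
    ("D", date(2024, 5, 20)),
    ("E", date(2024, 6, 1)),
    ("F", date(2024, 6, 20)),  # current version
]

print(select_from_versions(releases, today=date(2024, 7, 1)))
```

Users whose installed version falls outside the selection would simply download the full rpm, as they do today when no delta is available.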