Hey all
TL;DR: I’d like to plead the case for the recently adopted-into-C2y N3366 to make it into RHEL 10, even though this is perhaps a hail mary operation.
I’m opening this here because I don’t really know of a better place to start this discussion. But since RHEL essentially branches off from Fedora at some point, it’s a necessary precondition to get it into Fedora anyway – though I have no idea if the decision on where to branch off RHEL 10 has already been made (and if so, I assume it cannot be communicated yet). If that ship has already sailed, too bad. The reason I’m opening this thread is the sliver of hope that it hasn’t yet, and perhaps the small chance that wider discussion of the subject may still influence things.
For some background on myself, I help maintain a large cross-language & cross-platform ecosystem[1] called https://conda-forge.org, which has >2 billion monthly downloaded packages, the lion’s share of which happens on linux. This is only tangentially related to the story below, except perhaps that it provides a bit of background when I appeal to the role of all of us as stewards[2] of the computing ecosystem in the wider sense.
Text processing in C is unfortunately a wasteland, and since C is effectively the kind of lingua franca that every other language needs to interface with, this leads to the extreme prevalence of encoding issues we see everywhere. This should have been fixed in the C standard yesterday (or rather yesterdecade), but alas, it didn’t happen until a few weeks ago, and that was only because JeanHeyd Meneide fought for this with the fervor of a million suns for 5 years. Despite all efforts, it unfortunately missed the recent C23 standard, but at least it’s accepted now, which opens the door for implementation in glibc.
Of course, a common response is “just use UTF-8 everywhere, dude”, and this might work in various places, but the ecosystem is a vast place, and unfortunately it doesn’t apply everywhere by a long shot.
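To make the failure mode concrete, here is a minimal Python sketch of what happens when two sides of an interface disagree about the encoding. The helper name `misread` and the sample name are my own, purely for illustration; this is not meant to represent any particular C API or anything from N3366.

```python
def misread(text, written_as, read_as):
    """Encode `text` in one encoding, then decode the bytes as another."""
    return text.encode(written_as).decode(read_as)


name = "José"  # hypothetical sample input

# UTF-8 bytes interpreted as Latin-1: no error is raised,
# the name is just silently wrong afterwards.
garbled = misread(name, "utf-8", "latin-1")
print(garbled)

# Latin-1 bytes interpreted as UTF-8: the decoder rejects the input.
try:
    misread(name, "latin-1", "utf-8")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)
```

The first case is arguably the nastier one, because nothing fails loudly and the corrupted text just propagates.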
Speaking of the wider computing ecosystem, a lot of modern infrastructure is built on top of derivatives of RHEL (CentOS, Alma, Rocky, etc.), because it has proven to be the best baseline w.r.t. longevity, ABI stability and an up-to-date toolchain (which is hugely non-trivial effort, and thanks to all involved there!).
This is especially true for anyone needing to do binary distribution. Concrete examples I’m involved with are manylinux (underlying the main binary distribution format for Python packages), and conda-forge (which also has RHEL-derived infrastructure), though I know that other ecosystems have likewise learned from manylinux (or independently came to the same conclusions).
Because glibc is so central to the (OS’s) ABI that it effectively becomes the “clock” measuring the age of any given distribution, and because RHEL is by far the longest-lived and with the most stuff built on top, progress in the ecosystem is effectively discretized by the RHEL lifecycle.
This is because the available glibc features are effectively determined by what the infrastructure baseline offers (which is in turn what package authors will generally target), and this only makes a leap when said infrastructure jumps from one ancient RHEL version to a slightly-less-ancient one (for example, only once RHEL 7 is EOL, glibc features from >2.17,<=2.28 become broadly usable).
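As a rough sketch of that ">2.17,<=2.28" window: a feature only becomes newly usable if the glibc version it needs is above the old baseline and at or below the new one. The function names below are made up for illustration, and the sample feature versions are arbitrary.

```python
def parse_version(text):
    """Turn a dotted version string such as "2.28" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def newly_usable(feature_glibc, old_baseline, new_baseline):
    """True if a feature needing `feature_glibc` becomes usable when the
    baseline moves from `old_baseline` to `new_baseline`."""
    needed = parse_version(feature_glibc)
    return parse_version(old_baseline) < needed <= parse_version(new_baseline)


# Baseline moving from glibc 2.17 to 2.28, the range quoted above.
print(newly_usable("2.25", "2.17", "2.28"))  # True
print(newly_usable("2.17", "2.17", "2.28"))  # False, already usable before
print(newly_usable("2.34", "2.17", "2.28"))  # False, still too new
```

Comparing integer tuples rather than strings matters here, since a plain string comparison would sort "2.9" after "2.17".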
In other words, if the functionality from N3366 doesn’t make it into RHEL 10, that equates to losing roughly another 3-5 years until those features can be used broadly (i.e. when RHEL 10 goes EOL in 10+ years, rather than “merely” when RHEL 9 goes EOL).
This is – in short – why the glibc version that ends up in RHEL has a huge impact on something that really affects a huge amount of the (lack of) quality in our digital lives – people not being able to enter their names correctly, corrupted files, outputs, and so much more. And the problem is that the timescales involved in actually fixing these things are colossally big, so losing another few years would be a Real Bummer™.
Now for the inevitable snag: this isn’t implemented in glibc yet. The good news is that glibc is generally very quick to support freshly-standardized features, and JeanHeyd himself (who I’m in loose correspondence with) is planning to get this into glibc 2.41, which is expected in early 2025. This obviously depends on the collaboration and review of the glibc folks, but since much of this is already implemented, I’m hoping-slash-assuming that this will not be the crux of the issue.
So, what I’m looking for here is: input from people on whether this is at all feasible, support/opposition/discussion about the subject, or sharing this with people who are involved in and/or likely affected by this.
Thank you for your time