Six months ago, there was a change proposal to introduce (optional) optimized binaries into Fedora.
Although it ended up being removed and deferred to a later date, I decided to look into some of the feedback it received.
The mechanism suggested for optimized libraries seems simple enough, but the one for optimized executables raised a lot of concerns.
The following were the most common:
- Setting $PATH is finicky and not guaranteed to work - programs can decide to modify it, which would break this mechanism.
- This mechanism doesn’t work if the program is launched using its absolute path.
I’ve thought about a different mechanism, which solves these concerns, and would like some feedback on it!
Introduce a new executable called hwcaps-loader. It first determines the maximum feature level the CPU supports, then executes the best available version of the requested program whose feature level is equal to or below the CPU’s.
When a package opts in to this mechanism, its executables in /usr/bin/ link to /usr/bin/hwcaps-loader, which finds the actual binary in /usr/bin/glibc-hwcaps/*/.
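As a rough illustration of the "best version at or below the CPU's level" rule, here is a minimal sketch. The level numbers, the directory names (borrowed from the x86-64 glibc-hwcaps levels) and the function names are all assumptions made for this example, not taken from the prototype. What should happen when no candidate exists is left to the caller, since this sketch only returns `None` in that case.

```rust
use std::path::{Path, PathBuf};

/// Feature levels, best first, paired with a directory name.
/// ASSUMPTION: names mirror the x86-64 glibc-hwcaps levels.
const LEVELS: [(u32, &str); 3] = [
    (4, "x86-64-v4"),
    (3, "x86-64-v3"),
    (2, "x86-64-v2"),
];

/// Directory names a CPU at `cpu_level` is allowed to use, best first.
fn eligible_levels(cpu_level: u32) -> Vec<&'static str> {
    LEVELS
        .iter()
        .filter(|(level, _)| *level <= cpu_level)
        .map(|(_, dir)| *dir)
        .collect()
}

/// Returns the first candidate under `hwcaps_root` for which `exists`
/// reports true, or `None` if no eligible level provides the binary.
/// `exists` is injected so the selection logic can be tested without
/// touching the real filesystem.
fn pick_binary(
    hwcaps_root: &Path,
    name: &str,
    cpu_level: u32,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    eligible_levels(cpu_level)
        .into_iter()
        .map(|dir| hwcaps_root.join(dir).join(name))
        .find(|candidate| exists(candidate.as_path()))
}
```

In a real loader, `exists` could simply be `|p| p.exists()`. A level-4 CPU falls back to a lower-level build when no v4 or v3 build is installed.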
There’s no separate “loader” program for each executable to keep storage costs as low as possible. Instead, a hard link is made from /usr/bin/{EXEC_NAME} to /usr/bin/hwcaps-loader.
hwcaps-loader determines which program it should execute by reading the path behind /proc/self/exe (a symlink to the process’s binary file), which will correspond to the hard link.
Using this strategy, hwcaps-loader knows exactly which executable it has to run without needing to receive any launch parameters!
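The lookup described above could be sketched roughly as follows. On Linux the symlink is /proc/self/exe. The helper names and the way the target path is assembled are assumptions for illustration and may differ from the prototype.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Path the kernel recorded for the running binary (Linux only).
/// When started through a hard link, this is the link's own path.
fn invoked_path() -> io::Result<PathBuf> {
    std::fs::read_link("/proc/self/exe")
}

/// Maps e.g. `/usr/bin/foo` to `/usr/bin/glibc-hwcaps/<level_dir>/foo`.
/// Returns `None` if `invoked` has no parent directory or file name.
/// ASSUMPTION: `level_dir` is a directory name chosen elsewhere.
fn optimized_path(invoked: &Path, level_dir: &str) -> Option<PathBuf> {
    let bin_dir = invoked.parent()?;
    let name = invoked.file_name()?;
    Some(bin_dir.join("glibc-hwcaps").join(level_dir).join(name))
}
```

Keeping the path mapping separate from the /proc read makes it testable with plain paths.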
Here’s a visualization of what the filesystem looks like:
And here’s a logic flow of hwcaps-loader:
Here are the main advantages of this mechanism:
- No reliance on $PATH: Optimized binaries can be executed simply by running the program’s usual path. Direct references still provide optimized binaries and the environment of the command never interferes.
- Minimal storage footprint: No need to have one wrapper/loader for each optimized binary.
- Minimal overhead: A specialized binary launcher will take less time to launch the program compared to an equivalent shell script.
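On the overhead point: one way to keep the launch cheap is to have the loader replace itself with the target via exec, so no extra process stays around. This is only a sketch of that idea using Rust's standard library. `exec_target` is a hypothetical name, and whether the prototype launches programs this way is an assumption.

```rust
use std::ffi::OsString;
use std::io;
use std::os::unix::process::CommandExt;
use std::path::Path;
use std::process::Command;

/// Replaces the current process image with `target`, forwarding `args`.
/// On success this never returns; the returned error describes why the
/// exec failed (for example, the target does not exist).
/// Note: argv[0] of the new program will be `target`'s path here.
fn exec_target(target: &Path, args: impl IntoIterator<Item = OsString>) -> io::Error {
    Command::new(target).args(args).exec()
}
```

A caller would typically pass `std::env::args_os().skip(1)` as `args` so that the user's arguments reach the optimized binary unchanged.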
There’s one major drawback, though: packaging must take great care that /usr/bin/hwcaps-loader is never deleted while there are still other links to its inode. Otherwise, the inode and all the other links will linger.
Recreating /usr/bin/hwcaps-loader doesn’t fix this - it simply creates a new inode which none of the existing links will use. A mechanism would need to be made to ensure all references are removed before hwcaps-loader itself is removed.
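One possible building block for such a mechanism is checking the inode's link count before removal. The sketch below is a hypothetical guard, not part of the proposal or the prototype, and the function names are made up for this example.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Number of hard links to `path`'s inode other than `path` itself.
fn other_links(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.nlink().saturating_sub(1))
}

/// Hypothetical guard a removal step could run first: true only when
/// no other name still refers to the loader's inode.
fn safe_to_remove(loader: &Path) -> io::Result<bool> {
    Ok(other_links(loader)? == 0)
}
```

This only reports how many links remain, not where they are, so a real packaging solution would still need to track which executables were linked.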
Using symlinks instead of hard links would solve this issue, but, unfortunately, /proc/self/exe points to the actual binary file instead of the symlink, so a hard link must be used instead.
I’ve written a prototype of this concept in Rust. It cannot determine CPU capabilities yet, but all of the loading logic is implemented and works correctly.




