Considering that a major use case of Fedora (and Linux in general) is cloud computing and big data, why is the de-facto way of setting up Hadoop and Spark (key big data processing frameworks) in 2021 still downloading and unpacking archives from upstream, instead of simply fetching binary packages from the official Fedora repositories with an appropriate dnf install command? Unless I’m missing something, having such commonly used frameworks prepackaged for Fedora would bring a number of tangible benefits to a vast user base, such as (but not limited to):
Especially with (2), I’m not sure how reliable that claim is, since it doesn’t appear to be officially endorsed by Fedora. In any case, I did a fresh install of Fedora 35 Server to see for myself: dnf search hadoop did not turn up any results, while dnf search spark gave mostly unrelated results, the closest being some Spark client for Azure(?). If (2) is indeed true, why was the package removed in Fedora 32?
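For reference, the upstream tarball route described above looks roughly like this. This is just a sketch of the manual workflow, not an endorsement; the version number, mirror URL, and install prefix are illustrative assumptions, not something Fedora documents:

```shell
# De-facto upstream route: fetch and unpack a release archive by hand.
# (Version number and mirror URL are illustrative only.)
curl -LO https://downloads.apache.org/spark/spark-3.2.0/spark-3.2.0-bin-hadoop3.2.tgz
sudo tar -xzf spark-3.2.0-bin-hadoop3.2.tgz -C /opt

# The user then wires up the environment manually, with no package
# manager tracking the files or delivering updates.
export SPARK_HOME=/opt/spark-3.2.0-bin-hadoop3.2
export PATH="$SPARK_HOME/bin:$PATH"

# The hoped-for packaged route, which (as noted above) currently
# finds nothing in the official Fedora 35 repositories:
sudo dnf install hadoop spark
```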
Yeah… Java is notoriously difficult to package in Fedora, and really in Linux distros in general. The language ecosystem has a bunch of norms that don’t fit well with the rpm (or deb) way of doing things: for example, Maven-based builds typically pin and bundle their dependency jars, while Fedora guidelines expect each dependency to be built from source and shipped as its own unbundled package. So that added a lot of challenge beyond just “work on the big data stuff we’re actually interested in”.