That’s one of the things I explicitly want to stop allowing, honestly. Code should not be pushed into Fedora that isn’t ready for use.
I can’t think of any example off the top of my head where you would want someone to push unbuilt code into the repository. Even if the change was just to add new tests, you’d still want them to run the MR to make sure those new tests actually work. If we want to add a label to tell it not to actually submit the resulting build to Bodhi after the merge, that’s an option I think we could meet in the middle on.
I’m trying to think of a situation where we wouldn’t want it to even bother doing the draft build or running tests, but I’m coming up empty. If you have a good example for this, please let me know.
The group we select could be different, but I strongly feel that “proven packagers” are a group that is dangerously overpowered in a world where supply chain attacks are growing more and more common (and more insidious). While forcing everything to go through some basic CI checks is by no means foolproof, it at least offers a little bit of protection. I think “every provenpackager can hit the merge button on any package” is still a lot of power. It’s just not to the current level of “a provenpackager can decide to ignore all process and push things directly”. This has been an issue in the past (and no, not just one time or one person).
Building without going through an MR means building without testing. That should be an exceedingly rare event outside of scheduled mass rebuilds.
I fix typos and formatting, and make other changes that don’t warrant an immediate build, all the time. People also do mass changes, like changing the syntax of some macros. Or removing Group. Or changing License tags. Etc.
A mass distro-wide commit to change the license tag format, for instance. Or change from %patch1 to %patch 1 (or whatever that exact syntax change was, I forget).
Mass events would be scheduled and almost certainly fall under the aegis of the exception cases like mass-rebuilds. I feel like individual packages making an update to their license tag would either bundle it up with another update or else we’d plan to batch them with one of the big scheduled events.
I should have been more clear in my question: Can you think of an example where you’d want to make a change to just one package (or a small number) without building it?
A “Draft Build” is sort of like a persistent scratch-build. It’s not a full build in the Koji parlance because it doesn’t consume the NVR (so we don’t have to keep bumping the NVR if the merge request goes through multiple revisions), but it’s more than a scratch-build because it has to follow the rules of real builds for where the sources can come from. When it’s “promoted”, it mostly just means that it consumes the NVR for all time and also gets switched over to the lifecycle of a full build.
Updating the license tag in response to a ‘hey, we need you to do this’ email. Fixing a typo or wrong date in the changelog. Correcting or adding a comment.
What CI system do you see doing this? I.e., is this some CI you see us running with forge.fedoraproject.org? I don’t know that I have seen any final answer on whether there’s going to be such a thing, or which one it is. Perhaps @jednorozec could chime in?
Unless it’s changed in recent times, Git LFS is no good because it prevents you from being able to mirror git repos (and have the LFS objects work on the mirror, since the hashes are local).
I’ll mention something that some proponent is likely to bring up at some point: lots of the issues around collections of packages could be solved by a monorepo. (I.e., you just make an MR that updates all the needed packages in one side tag, tests them as one thing, and promotes them together. People could even collaborate in the MR as maintainers of different packages.) But of course it has its downsides too.
I do think the MRs would need some well-defined/documented set of actions to take, including ways to override things if needed.
I don’t think we should allow draft builds from sources that are “discarded”.
In addition to editing the License tag: I might fix an incorrect date in the changelog, or add/delete/modify a comment. Yesterday I removed an rm statement that deleted any libtool archives from a package that had been ported to meson. Last week I fixed a bunch of packages that had manual changelog entries added above %autochangelog. This is all very common. You can do a draft build and wait for CI to pass, but it’s a waste of time when we don’t need any build.
In GitLab, the solution is to just add [ci skip] to the commit message.
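For reference, the skip marker is just part of the commit message; no other configuration is needed. Forgejo Actions honors the same `[ci skip]` / `[skip ci]` convention. A minimal sketch in a throwaway repository (file name and message are hypothetical):

```shell
set -e
# Create an ad-hoc repo to demonstrate the convention.
repo=$(mktemp -d); cd "$repo"
git init -q .
git config user.email demo@example.com   # throwaway identity for the demo
git config user.name "Demo"
echo "%changelog fix" > package.spec     # hypothetical changelog-only change
git add package.spec
# GitLab (and Forgejo Actions) skip the pipeline when the commit
# message contains [ci skip] or [skip ci].
git commit -q -m "Fix changelog typo [ci skip]"
git log -1 --format=%B
```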
A monorepo would be a huge change, and I’m not sure it’s a good idea, but it would solve the largest problem here. If we’re not willing to do a monorepo, then I recommend thinking much harder about how to handle CI for updates that depend on each other. I’m sure this is solvable, but I don’t see the answer yet.
This was something else I didn’t understand from the original post: if the automation fetches the source and adds it to the lookaside, how does the sources file in the repository get updated? Does the automation have to add an extra commit to the branch?
What if I’ve already added the source to the lookaside and updated sources as part of creating the branch for the MR, as I would do in order to be able to test the build before I push it?
Also, what are the security implications of this? I thought fetching of source files had historically and deliberately not been automated, because (at least in theory) people were supposed to be checking the files they were adding.
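To make the question concrete, here is a sketch of what the automation would presumably have to do after uploading the tarball to the lookaside: regenerate the checksum entry in the `sources` file and push an extra commit onto the MR branch. The package name and the throwaway repository here are purely hypothetical; the `SHA512 (file) = hash` entry format is the one real dist-git `sources` files use.

```shell
set -e
# Work in a throwaway repo standing in for the MR branch.
work=$(mktemp -d); cd "$work"
git init -q .
git config user.email bot@example.com   # hypothetical automation identity
git config user.name "lookaside-bot"
printf 'demo payload' > foo-1.0.tar.gz  # stand-in for the fetched source
# Regenerate the dist-git "sources" entry (BSD-style checksum format).
hash=$(sha512sum foo-1.0.tar.gz | awk '{print $1}')
echo "SHA512 (foo-1.0.tar.gz) = $hash" > sources
# This is the "extra commit" the question is about.
git add sources
git commit -q -m "Update sources for foo-1.0"
cat sources
```

If the contributor already uploaded the source and updated `sources` themselves, the automation would presumably detect that the entry matches and skip this commit entirely.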
Thanks for the detailed proposal and the time you put into it.
I think making Fedora CI for pull requests better is great, as it currently has many issues:
CI is completely useless for packages that need to be built in a specific order in a side tag.
CI builds run with a lower priority, so anytime there’s a large rebuild or other activity going on in Koji, builds don’t run or time out. During the mass rebuild, there was an issue in the golang package that broke every package with Go binaries (hundreds of packages), and we had to get a change in, which wouldn’t have been possible if we had to wait on CI.
Installability tests are broken on EPEL and have been since I joined Fedora 4 years ago. This has been reported multiple times.
There are various issues with parts of the build pipeline running on old versions of Fedora making it a problem when new SRPM macros packages or RPM features get introduced and cannot be used until CI is updated.
Oftentimes (especially recently), builds don’t trigger, or Testing Farm (TF) fails to provision resources and tests fail with cryptic errors.
(See the previous point about TF failures.) The testing infra seems very fragile, with lots of pieces strung together that are hard to understand: Jenkins, TF, Greenwave, and confusing TMT and gating.yaml syntax that I had trouble finding easy-to-understand, up-to-date documentation for when I introduced gating for some packages in the Ansible stack. Sometimes one (or more) of the pieces breaks for hours, or a whole weekend, or during US hours when the Fedora CI team at RH is not available, and then it prevents volunteers who work during those times from doing their work unless they bypass it.
For user-facing tools that we try to keep up-to-date on stable branches or when dealing with a Branched release like f43, having to submit PRs for multiple branches and then dealing with Pagure’s slow UI to click through and merge each one is tedious, especially considering the lack of automation for any part of this process.
In general, there are multiple longstanding issues with CI (see Issues - fedora-ci/general - Pagure.io for some of them) that it seems the RH team behind it has not been given the proper resources to address, even though some of these have already been solved in CentOS Stream land.
Most importantly: The pull request workflow won’t scale for .so bumps or mass rebuilds. Doing these is hard enough. Please don’t add extra steps or make us join additional special “ExtraProvenpackager” groups to keep doing these.
To be clear, I’m not trying to denigrate the people who work on the Fedora CI system or the packagers who make heavy use of it, but I think there are issues to be solved, and I think we should focus on improving it for people who want to use it and encouraging more people to adopt it instead of forcing it.
I rely on the pull-request workflow for some packages where it makes sense (I have submitted 600+ distgit pull requests and have almost 500 distgit forks), so I am not trying to bash a system I don’t use. But there are other times where it doesn’t work, for one of the reasons above (or some other issue): there’s nobody to review my changes, I’ve done my own testing locally with mock --chain or Copr, or the infra breaks down, and using pull requests would add extra friction to the packaging process.
Telling volunteer contributors, especially those who maintain many packages and have their own workflows in place, that they have to go through extra steps and use not-entirely-reliable tooling to maintain their packages doesn’t sit well with me. I think the calculus is much different in Fedora than it is for RHEL. We have fewer resources and less time to deal with cumbersome processes and to maintain the necessary infrastructure, even though we move at a much faster pace, make many more changes, and mostly work as volunteers.
++1. I agree with this wholeheartedly. I’m happy you want to improve the Fedora CI flow, but as mentioned, I think these should happen separately from breaking common workflows.
While I applaud the idea of offering more workflow improvements (especially improving continuous delivery options), I am also of the opinion that if you take away my ability to directly push, I probably would have to drop almost all my stuff. The overhead of pull requests on top of everything else I need to do would be enough drag to reduce my involvement.
Additionally, pull requests don’t really solve the problems I have: I have build chains that I need to manage manually. Most of the packages I deal with these days are collections of things that need to be updated and built in order. I am not willing to add the overhead of pull requests to that.
I think it’s great for casual contributors and also creating a pathway for helping newer folks gain experience working across the distribution, but I think most of this proposal is just excessive pain for not enough gain. Even the steps you’re talking about automating as a consequence of moving to pull requests don’t really help me as a packager. They are the short and trivial parts, not the crunchy and time-consuming ones.
(Also keep in mind we will replace our forge system in ten years if the pattern holds, so we should be wary of project specific coupling to the forge that we don’t have a straightforward way to migrate when we replace the system again. If that’s not a concern you want to deal with, that’s fine, but given how poorly the STI→TMT migration is going, I think it’s worth thinking about.)
Anything that involves preparing ordered builds. The whole KDE stack and increasingly the GNOME stack is done this way. FFmpeg updates are also done this way. The Rust stack too. It’s a very common paradigm.
You want to get rid of this? Then we need resources to have a build system that does that part for us, like how openSUSE does with their build service. Nothing Red Hat is producing today (Copr, Koji, Konflux, etc.) has the necessary capabilities to improve our packaging experience in this manner.
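To make the ordering problem concrete: what a side tag really encodes is a topological order over the stack’s dependency graph. A minimal sketch using coreutils’ tsort, with hypothetical KDE package names (in a real side tag, each line of output would become a build, with a wait for the repo to regenerate between steps):

```shell
# Each input line is "dependency dependent"; tsort prints a valid
# build order in which every package appears after its dependencies.
# Package names below are hypothetical examples, not a real KDE graph.
tsort <<'EOF'
kf6-kcoreaddons kf6-kio
kf6-kio dolphin
kf6-kcoreaddons kf6-kconfig
kf6-kconfig kf6-kio
EOF
```

This is exactly the part that packagers currently do by hand, and that a build service would have to compute (and schedule) automatically to replace the manual workflow.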
We already require Source URLs and have a sources file with checksums. There’s practically no reason to couple this to a pull request; we could just have it happen on push, as a server-side hook.
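The verification such a push-time hook would run is straightforward; a sketch of the core check, recomputing each entry in the `sources` file and comparing it to the recorded checksum (file names here are hypothetical, and a real pre-receive hook would of course read the pushed refs rather than a scratch directory):

```shell
set -e
# Set up a scratch directory standing in for the pushed tree.
work=$(mktemp -d); cd "$work"
printf 'payload' > bar-2.0.tar.gz
echo "SHA512 (bar-2.0.tar.gz) = $(sha512sum bar-2.0.tar.gz | awk '{print $1}')" > sources
# The hook's core check: recompute and compare every listed checksum.
# Entries look like: SHA512 (file.tar.gz) = <hash>
while read -r _ name _ expected; do
  file=${name#\(}; file=${file%\)}
  actual=$(sha512sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ] && echo "OK: $file" || { echo "MISMATCH: $file"; exit 1; }
done < sources
```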
This is a great time, and we should celebrate the community’s flexibility in agreeing to switch git forges to improve everyone’s experience. The day we move will be a significant point in our careers.
However, the difficulty of this process and the sheer number of issues we’re aiming to solve at once isn’t a good sign. A git forge should be just that—a forge. We shouldn’t get locked into a specific implementation again, as we are with Pagure. We need to avoid deep, custom modifications or forge modernization efforts. Instead, we should use the forge “as is” and build our software pipelines on top of it.
By treating the forge as a core but still “replaceable component” of our infrastructure, any future migrations won’t be as difficult. As others have pointed out, our goal should be a minimal switch. We must avoid reinventing tools we don’t need to, re-implementing existing principles, or creating new policies. These changes can be done now (with Pagure) or anytime in the future.
Whenever we group multiple changes into a bigger one, we increase the chance that people will have mixed feelings, potentially disliking the change as a whole. That would be a huge missed opportunity, as this particular change should be one that everyone can fully embrace.
These changes are drastic and big; we will have a hard time finding the manpower to implement them, and it will be hard to find consensus on all items at once. My recommendation is: divide it into subtasks and work on those. E.g., formalize the script that customizes sources that cannot be fetched from a URL. Or enforcing MRs on all changes…
A lot of the automation (automatic build, testing, submitting to Bodhi) can already be done by Packit. It has not been made the default because FESCo rejected that, and Packit lacks the manpower to implement the features people are asking for. See Changes/PackitDistgitCI - Fedora Project Wiki
To add to @msuchy 's point, I would say: you should always build The Thing and make The Thing awesome before you mandate The Thing. Ideally, before you even suggest mandating The Thing.
That is: do the forgejo dist-git migration and build an awesome CI experience on top of it before you suggest making that experience mandatory. This avoids an awful lot of genuine fear, uncertainty and doubt.
+1 Just a small correction that this was postponed by one Fedora cycle and not rejected.
(We would like to do this step when we are sure we can fully support and actively develop this service.)
In general, I like the ideas, but I would leave the choice of generic CI system as an implementation detail – for the user, it can and should look like it’s described, but under the hood I would suggest not going with Forgejo Actions, for multiple reasons:
Let’s not tie ourselves to a forge-specific solution.
Webhook-based systems (e.g. Packit) allow better support for multi-step pipelines that require waiting for an external service (e.g. a build finishing, …).
It’s already there. With Packit, we’ve offered to provide such a solution, including the Forgejo integration (the people behind the current solutions are aware that they do not need to do this integration). This was also approved. We can have various theoretical discussions and comparisons, but the crucial part is that someone needs to provide/maintain/develop this.
I like the idea of something-on-git (= everything is done as part of a PR/MR), and Packit can incorporate this as a next step (I hope we discussed this around Flock/DevConf). But the first step is to replace the current CI solution and then provide new features. (But sure, we’ll keep this in mind.)
It’s worth mentioning that I would differentiate between generic CI and single-package setups; for the latter, I would welcome Forgejo Actions being usable.
Monorepo will add more problems than it solves. For example, the consumption of disk space and bandwidth, since every packager will need to download and store gigabytes or even terabytes of data. Even some big corporations have realized that the monorepo was a mistake.
Thanks for all the points; most of them were actually the reason why we, in Packit, agreed to step in and offered that Packit is technically able to do this and that we would like to improve the situation. (Also because Packit-provided dist-git PRs have value only if the CI works well.)
We already integrate various services (TF, Bodhi, Koji, …) and have introduced various mechanisms to overcome temporary issues. (But we can’t resolve issues in the underlying services, that’s true.)