Fedora-Council/tickets ticket #473: Git Forge Evaluation 2024

@amoloney filed Fedora-Council/tickets ticket #473. Discuss here and record votes and decisions in the ticket.

Ticket text:

7 Likes

Thanks for tackling this, @amoloney! I agree it’s time to re-start this discussion. I’ve created git-forge-future which we can use for this. It’s a sub-tag of council, so tag posts council and then that should also be available.[1]

For now, of course, we can discuss the meta-proposal in this topic.


  1. We’ll see if that’s too confusing! ↩︎

2 Likes

It’s important for us to separate our different uses of git forges, I think. They are:

  1. dist-git — the very specific tool we use as part of our build pipeline for RPMs, containers, and flatpaks
  2. Team and subproject repos, like Docs, Legal, Design, Council, Infrastructure, Rel-Eng, etc. (Many of these currently using gitlab.com). This mixes git storage with issue tracking and project management.
  3. Documentation repos (separate from the docs team organization) for docs.fedoraproject.org
  4. Infrastructure code hosting for various tools used by and for Fedora (here, “infrastructure” in a broad sense, not just the infra team)
  5. Possible future dist-git-like things — source git (or some successor concept), possibly a Rust crate repo, etc.
  6. General code hosting somewhat related to Fedora (the long-tail descendants of Fedora Hosted)

Did I miss anything? Are all of these in scope?

2 Likes

In the entire scheme of things, yes, I believe they all should be in scope. But we might want to start slow and choose one, maybe two areas to focus in on. I know the Community Platform Engineering team are currently looking at the areas where Fedora uses Pagure as a dist-git, and once that investigation is complete its findings could be a nice baseline to use when evaluating another dist-git alternative.

3 Likes

It’s probably also worth discussing What about Gitlab? There was, of course, a big discussion about that several years ago. Recently, I’ve heard both of these things from people: “I thought we decided on Gitlab” and also “I thought we rejected Gitlab”. Clearly, we’re lacking some clarity. Here’s the situation as I know it:

  1. We are taking part in GitLab’s program for open source projects — this is https://gitlab.com/fedora. This provides access to the non-open-source “Ultimate” plan.
  2. We tried to get Gitlab to provide us with (paid by Red Hat) dedicated hosting using their open source Community Edition. This looked promising, but didn’t work out.
  3. Even if we went for their non-open-source commercial dedicated program, their pricing model is per-user. With our project scale, this is out of reach, unless we would severely restrict access, which would be fundamentally at odds with how Fedora works (and strategy2028’s guiding star!)

The current gitlab.com arrangement has some limitations and issues that I am not thrilled about. I don’t want to side-track on this too much, but I think there’s enough to make “we’ll just use that!” not an easy answer after all:

  1. It’s not open source. In the previous discussion, it was very clear that this is a fundamental community requirement for dist-git. Even if some of us may be open to taking advantage of non-open tools for more “peripheral” activities, dist-git is so core to Fedora that people felt overwhelmingly that we need to use free and open source software here.
  2. It uses gitlab.com accounts. This has some advantages (network effect?) but also can be really confusing. (Is @mattdm on gitlab.com me? Yes, but only by fortune!) And, the double-sign-on thing is a mess.
  3. There is no guarantee that GitLab won’t change their minds about this on short notice at any point in the future. We must actively renew our participation every year.
  4. There are no Service Level Expectations, let alone an agreement.
  5. The terms of the program restrict use to OSI-approved licenses. Fedora’s own Allowed License List is a superset of this.
  6. Because the “Ultimate tier” includes a bazillion features, the UI feels like a space shuttle flight control console — intimidating and hard to navigate if you’re not immersed in it.
  7. If we were to use this for dist-git, it’s unclear what the data-transfer overhead might mean vs. storage near to our builders.
6 Likes

Another oddity of GitLab, which can get quite confusing: the tier you are on is not tied to the user, it is tied to the namespace the user is working in. For instance, kernel-ark is not a part of the Red Hat namespace, so anything I am doing there is not considered as a Red Hat user from a support standpoint. This means we either have to have everyone under a “fedora” or similar namespace and run them as a named user, or have some sort of MR automation and hope that packagers making those changes in their local repositories do not run into an issue that requires support.

2 Likes

Pagure exists, is already in use, and is open source. One option that’s not being mentioned here is to continue investing in it and focus on addressing its current shortcomings. Is that being considered at all? Something that could also help this discussion is to state clearly what “important features” Pagure is lacking.

2 Likes

That’s something that Red Hat could choose to do. I don’t see a way for Fedora to sustainably invest in it. We don’t have people we can assign to it and there’s no way to throw money at it. You might say “well we can ask Fedora contributors to work on it”, but

  1. This is not the sort of activity that we’ve historically had success in sustaining solely on volunteer effort, so I’m not optimistic This Time It Will Be Different™. There’s been nothing stopping people from working on Pagure for the last few years (although the years-old pull requests are not encouraging)
  2. Fedora is an integration project built around producing an operating system, not a software development project. I’m of the opinion that projects shouldn’t be writing their own infrastructure (unless that’s the point of the project). If there were a distinct upstream that we could provide financial support to, that’d be good. Fedora contributors who are also interested in developing a git forge can, as mentioned, already contribute to Pagure.
3 Likes

To be clear, some of the problem has been that the CI was essentially non-functional throughout the pandemic, which led to me and others having issues with validating pull requests to merge. That situation changed this year and there is work going on now because of it.

The current focus is bringing the stack up to the latest dependencies so that we can run it on RHEL 9+ and drop Python 2 support with Pagure 6.0. Then we can move more toward feature work too.

@wombelix and I have been working on this for the past several months.

Even with that, I just merged today a pull request from @zlopez to add API endpoints for managing groups and reviewed a pull request from @mattia for an API endpoint to verify the validity of API keys.

3 Likes

I’m glad that things are more functional. I know it was really frustrating when I had a docs improvement sit for a long time (and I truly appreciate your work to finally land it).

I’d feel a lot better about Fedora investing in Pagure if (and this is a little counter-intuitive) it were a fully-separate project. The line is kind of blurry right now and it’s not clear what the governance model is, etc. What I’d like to see is for Pagure to get a fiscal sponsor (e.g. Conservancy) so that we could throw some Fedora money at it. That would also make it easier to get other projects & companies to contribute too. Of course, Red Hat apparently owns at least some of the copyright and all of the infrastructure, so this isn’t a simple option.

That said, I think I’d prefer a move to GitLab CE hosted by CPE. We’d need to port the dist-git pieces to GitLab from Pagure, but at least then we’re only on the hook for our “secret sauce” and not the commodity pieces it’s built on top of. It’s not anything against Pagure, but a question of “what gets us the most value for what little we have to put into it?”

2 Likes

FYI, please be aware that any migration will also be dependent upon coming up with a solution for Issue #11822: Numerous Packages in Fedora fail `git-fsck` - releng - Pagure.io since all of the non-Pagure systems I’m aware of will refuse to import packages with this issue.
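To get a rough sense of how many local clones are affected, something like the following could be a starting point. It is only a sketch: the function names and the injectable `run` parameter are made up for illustration, it relies on `git fsck` exiting non-zero when it finds errors, and a forge may apply stricter checks on import than a plain local `git fsck`, so a clean result here is not a guarantee.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable


def fsck_repo(repo: Path, run: Callable = subprocess.run) -> tuple[bool, str]:
    """Run `git fsck` in one repository and return (passed, combined output).

    `run` is injectable (hypothetical design choice) so the logic can be
    exercised without a real git checkout.
    """
    result = run(
        ["git", "-C", str(repo), "fsck", "--no-dangling"],
        capture_output=True,
        text=True,
        check=False,  # a failing fsck is an expected outcome, not an exception
    )
    return result.returncode == 0, (result.stdout + result.stderr).strip()


def failing_repos(repos: Iterable[str], run: Callable = subprocess.run) -> dict[str, str]:
    """Map each repository path that fails `git fsck` to the output it produced."""
    failures: dict[str, str] = {}
    for repo in repos:
        passed, output = fsck_repo(Path(repo), run)
        if not passed:
            failures[str(repo)] = output
    return failures
```

Calling `failing_repos(["/path/to/clone-a", "/path/to/clone-b"])` (placeholder paths) would return only the clones that need attention, along with the messages `git fsck` printed for each.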

1 Like

I think one more thing to consider would be bug support integration within the git forge and abandoning Bugzilla.
I mean, Bugzilla is just fine for what we need, but with RH moving away from it there was some discussion not long ago about whether Fedora could lose access or have to move to a self-maintained instance. So it would be better to consider this aspect too when evaluating a possible git forge change.

1 Like

Fedora is an integration project built around producing an operating system, not a software development project. I’m of the opinion that projects shouldn’t be writing their own infrastructure (unless that’s the point of the project). If there were a distinct upstream that we could provide financial support to, that’d be good. Fedora contributors who are also interested in developing a git forge can, as mentioned, already contribute to Pagure.

I will disagree here. Or maybe change a direction of the argument.

There is a difference between writing an infrastructure app and building the infrastructure. Your argument applies to the first part; it doesn’t apply to the second as well.

Project should not write generic infra tooling and applications - because writing generic applications is not the project’s job.

Yet building the infrastructure out of the existing building blocks, designing the APIs, rules and policies around these blocks, and managing that infrastructure is the main job the Project has ever done.

An integration project is a project which integrates. And development infra is the thing we own and design in our unique way to make that integration happen.


Yes, years ago you would treat Git server as a generic shared “file storage with history”. And in that setup it wouldn’t really matter where and by whom it is hosted, because the interesting things happened elsewhere. Now we are moving into the era where the whole lot of your project’s life is in the Merge Request. In things you write, in things you trigger, in things you set permissions to and so on.

We could drop half of our contributors doc, put the carefully designed MR interface in the center and that would be the 80% of the project for you.

That is why moving to SaaS platform is so scary.

They do not just provide Git hosting as a backend where you drop files while you do your project work. They take control of the entire project. They will tell you how to manage users, groups and permissions, they will tell you how to manage triggers and events, they will tell you which levels of visibility you can have, they will tell you your retention policy, they will tell you your terms of service and your moderation rules…

And it can be that what they tell you now is not bad, and usable and nice. But the problem is that you do not own any of those decisions anymore.


I can agree that writing generic infra tools is not our job. But running reliable development infrastructure which we own and can shape to the needs of the project is our job. Even though it is hard and requires expertise, it is not a good reason to run away from it.

We do not outsource Fedora Change process to a consulting firm running market research, because, however painful, it is our process. Same for the dev infra. We need to learn how to make it better, not how to let someone else do it without us.


This is not an argument for Pagure, though. Because I think Pagure has the same problem as other forges - it is a generic tool which solves a generic problem, while the Fedora Project needs a custom infrastructure solution designed for the Project’s needs but built on top of existing building blocks.

And Pagure may or may not be a good building block.

4 Likes

I’m mostly in agreement with you. But I don’t think us running the infrastructure is a hard requirement if the data is exportable.

1 Like

That’s again the problem: with CI/CD and GitOps, GitForge is not only a data storage, it is the engine which drives your project. Exporting data is a requirement, but you need also the control and customization over the real-time interaction with it.

It doesn’t help if you are free to export your data, if you can not change how it is handled.

Consider an example:

As a FOSS project we would like to be open and share our things with the world. So ideally each project should have a public RSS feed with all changes.

But currently you cannot subscribe to the GitLab events unless you have registered an account on GitLab and obtained a token.

So the choice of GitLab as a platform creates a barrier between us and our users. It is a small barrier, and there are ways to overcome it in most cases (sanctioned countries come to mind), but it is there and we have no power to change it.

So our platform starts to impact our core behavior.
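To make the RSS idea concrete: if we controlled the platform, a public change feed could be as simple as the following sketch. Everything here is hypothetical (the `Change` record, the `build_rss` function, the field choices); it only shows that producing an RSS 2.0 document from a list of changes needs nothing beyond the Python standard library, and it says nothing about how any existing forge implements its feeds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime
import xml.etree.ElementTree as ET


@dataclass
class Change:
    """One change to publish (hypothetical record, e.g. a commit or merged MR)."""
    title: str
    link: str
    when: datetime


def build_rss(project: str, project_url: str, changes: list[Change]) -> str:
    """Render the changes as an RSS 2.0 document, newest first."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{project} changes"
    ET.SubElement(channel, "link").text = project_url
    ET.SubElement(channel, "description").text = f"Public change feed for {project}"
    for change in sorted(changes, key=lambda c: c.when, reverse=True):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = change.title
        ET.SubElement(item, "link").text = change.link
        # RSS expects RFC 822 style dates, which format_datetime produces
        ET.SubElement(item, "pubDate").text = format_datetime(change.when)
    return ET.tostring(rss, encoding="unicode")
```

Anyone could then fetch the resulting document without an account or a token, which is exactly the property that is missing in the example above.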

3 Likes

I have almost written the whole article about git forges again, but then I came back to re-read the original post in the thread.

Do I understand it correctly that the ask here, in this particular thread, is

  • We agree that the primary place for this discussion is Discourse
  • We agree that relevant conversation will be tagged as git-forge-future
  • We agree that we start the conversation from square 0

I agree with points above and I add three questions:

  1. Do we wait for you or anyone specific to start additional threads to guide the conversation or do we invite people to do so?

  2. Once we gather thoughts and collect requirements, who are the decision-makers? Fedora Infrastructure team, CPE, Fedora Council?

  3. What are the time expectations (for this new round :slight_smile: ) 6 months? 12 months?

1 Like

I wasn’t sure if you were asking everybody, or just fellow Fedora Council members. Just in case it’s the former, a few thoughts as the current overall CPE manager…

I think we’re looking for guidance from the Fedora Council on how to best proceed. We know git forge choices matter and changing what has been established is sensitive. We want to proceed in a manner that respects this reality.

In delegation poker I think we want to land somewhere between Consult and Agree at each of those levels.

It’s too early to realistically answer this, we have to see where the conversation takes us. I don’t have a specific deadline today. Sooner is preferable of course, but what matters most to me is that the community is part of the journey.

3 Likes

Here is the first iteration of my thoughts on this topic. It is not an answer, rather a braindump :slight_smile:

Firstly, I am worried that structuring the conversation around requirements will not lead us anywhere.

Requirements are important and we should discuss them. But they are not enough to build a reasonable path forward. And they are not useful if we only see one path forward.

Personally I can write a huge list of things I want from the Git infrastructure, but I would have zero understanding of how we would get any of the items on that list and whether these dreams are even remotely possible. We would then get into a fight over which Git Forge checks more items on the list, but “number of checkmarks” is not the criterion of success I would be looking for.

One possible way to balance the requirements conversation is to talk about proposals.

So instead of asking “what is our requirement”, we may ask “what is our goal and what is our plan” or “what are our options”.

=> Do we want to invite the community to work on describing problems, or to work on designing solutions?


The second thing is:

The way I understand the problem: CPE identifies Pagure as a risk for the Fedora Project infrastructure and looks for a way to replace it.

The question, though: is it Pagure which is the risk, or is it the lack of a strategic vision for Git-related infra in the Fedora Project, which puts the current and all other implementations at risk? And if so, does replacing Pagure with anything really address the main problem?

=> At which level should we enter the discussion, should we look for alternatives to the tool, or the alternatives to the way we treat this part of the infrastructure? Both?


The third point relates to the list of use cases which Matthew posted.

As a project we are not really in the business of running a generic Git Forge for a generic FOSS enthusiast. There are plenty of other services doing that. We have 3 focus areas, where we need to provide the ways for project contributors to collaborate. While we use a Git Forge to solve all of them right now, it doesn’t mean we must continue doing that.

dist-git is our most critical platform. But it also has the least overlap with requirements of a generic GitForge. By using generic GitForge software to run dist-git we are making everything harder for ourselves, increasing the attack surface and the maintenance burden.

=> Can we address the cases independently by different tools/services?


All in all, I probably would prefer to replace the original question of “evaluating git forges” by “evaluating Fedora Infrastructure plans regarding the git-based collaboration”. Or something like that.

1 Like

Yes, this seems like a healthy way to look at it. I’ll offer a slightly different version of this that might help frame discussion, too: The number of requirements should be really small, but the number of nice-to-haves should be really large. A few constraints and a bunch of dreams is a fine thing.

Assume this one is for your fellow Councilors…


Either is valid as long as the outcome is sustainable infrastructure. Pagure is being used as a core piece of infrastructure and we have a problem with Pagure because of its lack of maintenance. The goal isn’t to switch one git forge for another, it’s to remove a serious known liability. Using a different model where a forge isn’t needed at all also serves the underlying motivation. And if participation in Fedora is in some way hampered because a forge is being used where it’s actually a bad fit, moving to a better way of working, as long as it doesn’t entail a new liability, is better than a forge swap.

From my standpoint, yes. We don’t need a single magic tool that solves everything. Rather, we want to adopt tools that can be maintained with existing staffing, that foster contribution from new and existing community members. It would be nice if that was a single tool, but it doesn’t have to be.

This is a nice, holistic way of thinking of it. Every change of this sort is an opportunity to innovate.

1 Like