AI/ML SIG "Revival"

The idea of “reviving/combining” the AI/ML-related SIGs was brought up in the PyTorch thread, but the resulting discussion seemed to focus mostly on PyTorch and not on coordinating AI/ML-related activity in Fedora. This thread is my attempt to extract the non-PyTorch bits from that conversation.

Based on the activity in the PyTorch thread and around the HC SIG, it seems like there is broad interest in AI/ML, and I assume that there is also interest in keeping the work coordinated. @kaitlynabdo seemed to indicate that the ML SIG folks are on board with consolidating the groups, but I didn’t see replies from other groups in that thread.

Seeing as the winds of change are blowing toward Discourse and Matrix, those seem like the places to start. Discourse already has the ai-ml-sig tag, and that implies a SIG name of ai-ml or ai/ml.

I’ve requested a “Fedora AI/ML” room on chat.fedoraproject.org and had it bridged to ‘#fedora-ai-ml’ on libera.chat. I know that ‘#fedora-ml’ already exists but, as far as I know, it’s not terribly active, and keeping the naming consistent across Discourse/Matrix/IRC makes more sense to me.

The remaining questions I have are:

  • Do the HC SIG folks want to join this “new” SIG or would they prefer to keep their own setup?
    • I don’t see a way to contact them, so I will be sending an email out informing them of the new thread and asking for input
  • Does anyone violently object to the SIG naming?
  • What do we do with the existing wiki pages?

The question of “what to do with the existing wiki pages” came up and I don’t have any strong feelings about it, personally.

So long as the data is reasonably up to date, any pages we remove as part of the consolidation have redirects instead of 404s and information is easy enough to find, it’ll work. In an ideal world, we’d have all the information in one place for discoverability and ease of documenting where to send folks but we don’t exactly live in an ideal world.

If nobody has any objections, I propose the following:

  • Move the ML SIG wiki page to ai-ml so that it matches the naming scheme we seem to be going with
  • Refactor the HC SIG wiki page
    • move the membership stuff to the “new” ai-ml SIG wiki page
    • leave the packaging status information as either a sub page of the ai-ml SIG or something separate, but make sure there’s a link from the ai-ml SIG page
    • make sure that a redirect from the current HC SIG page points to the “new” ai-ml SIG page
  • Leave the existing ml@lists.fp.o list alone for now: there seems to be a push to move everything here to Discourse, but deleting the list seems silly at the moment

If it’s not clear - I don’t have strong feelings about any of this and I’m not trying to push anyone out of any efforts.

I’m just a believer that Fedora is a “do-ocracy” - voting and discussions don’t get terribly far; if you want to see a change, propose and (after a reasonable delay for feedback) do it; if there are objections, then prepare to re-think what you’ve proposed.

This organization and consolidation makes sense to me given the current states of the AI/ML related SIGs in Fedora. That being said, I’m certainly not omniscient - if you have better ideas or see how these changes could cause problems, please say something.

We now have a Matrix room for the SIG.

The Matrix room is bridged to #fedora-ai-ml on libera.chat if you prefer IRC.

The recent activity has been on the ROCm/HC side and not as much on the ML side. Creating an AI/ML SIG seems like we are killing off the HC side.

What I would like to see is ubiquitous integration of ROCm and similar into other Fedora packages, like blender (a package) and pytorch (not yet a package). And though I am all in on AI/ML, I think the SIG should try to help all the use cases that we can apply accelerated compute to.

My opinion is that AI/ML without acceleration is pretty limited, which is why I am working both sides. PyTorch has a number of BuildRequires that are not yet packaged, in addition to ROCm, and it is not clear who other than myself is working on them.

Tom

I’m not sure I understand. Do you want to see the HC and AI/ML related SIGs remain separate or are you concerned that the non-ML parts of HC would get lost or forgotten if these groups are combined or something else that I’m not getting?

To be honest, I hadn’t really thought about the other uses for things like ROCm. My primary goals are to keep information and groups easy enough to find and not create extra overhead where it’s not needed. I figured that combining all the semi-related groups would be a good way to do those things.

With the ROCm accelerator stack integrated into Fedora, any package that does big math can benefit from an insane performance improvement over CPUs. Last year, when I was looking at the BLAS math library performance of OpenCL packages, GPU hardware was roughly two orders of magnitude faster, so something taking a minute on a GPU would take an hour on a good CPU. When the problems get big, a day on a GPU is 2 months on a CPU, which rules out a CPU for these problems.
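
For a rough sense of scale, here is a small sketch of that arithmetic. It assumes a single flat speedup factor, and the factors used are illustrative rather than measured. A factor of 60 reproduces the minute-to-hour and day-to-two-months figures above, while a literal 100x would stretch them further.

```python
# Rough scale check, assuming one flat GPU-over-CPU speedup factor.
# The factors below are illustrative assumptions, not benchmark results.

MINUTE = 60            # seconds
HOUR = 60 * MINUTE
DAY = 24 * HOUR


def cpu_seconds(gpu_seconds, speedup):
    """Estimate CPU wall time for a job that takes gpu_seconds on a GPU."""
    return gpu_seconds * speedup


for speedup in (60, 100):
    minute_job = cpu_seconds(MINUTE, speedup) / HOUR   # result in hours
    day_job = cpu_seconds(DAY, speedup) / DAY          # result in days
    print(f"{speedup}x: 1 GPU minute ~ {minute_job:.1f} CPU hours, "
          f"1 GPU day ~ {day_job:.0f} CPU days")
```

Real workloads will not scale this cleanly, so treat the output as an order-of-magnitude guide only.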

AI/ML is the latest class of problems that need to do big math. There are others I have been involved in. Some examples:

Folding@Home/scientific applications
gimp/blender/image processing applications
octave/pure math applications

I do not want these and other use cases to be lost, so I don’t think collapsing the two SIGs together is a good idea.

On the AI SIG page, I think it would be a good idea to reference what big thing(s) we want to do, like the ROCm table on HC, and then work towards it.

This :arrow_up: basically sums up exactly my thought process up to now.

This makes sense to me, as you explain it — as long as there is energy for both, and close communication.

Is there enough here for ROCm and then OneAPI for Fedora to have a general goal of end-to-end open accelerated compute for all of the packages?

Then HC and ML could move into a new Accelerated Compute SIG.

I’m not sure I follow - enough what? I’d love to see ROCm and OneAPI in Fedora but as I understand it, OneAPI isn’t in a state we can do that right now.

If there’s enough energy and interest, I have no issues with keeping the AI/ML and HC SIGs separate. Is there enough interest and energy to have another Discourse tag and Matrix room for HC that aren’t going to be dead?

Either way, it’d be nice to have some better communication in place. Just in the last week or so, we’ve had at least two instances of more than one of you, me and @mystro256 working on the same package at the same time.

I listed out all the packages in the wiki page. I think we just need to add another column for “Who’s working on it” or similar, and update it if you start working on it.

With that said, I was working on rocprim at the same time as Tom, and it helped give me some insight on the library, so I reviewed it much quicker, despite being very busy.

We’ve both marked what we’re working on in the “notes” column, which should be OK unless a bunch of people start working on the ROCm packages. That’d be cool, but it seems unlikely that we’d get much more than one or two more people.

You do have a point about learning, though. I’m learning a lot by just trying to make these modules/things compile.

I’m now reviving a revival thread that I started. Apologies for letting this fall off my plate.

The silence on this topic does seem to reinforce my concern over splitting up an already small group. I’ve watched plenty of SIGs wither on the vine due to a small user base.

I propose that we do the following:

  1. Keep the HC SIG separate for now but use the ai-ml SIG’s discourse tag and matrix room as contact points until there’s enough momentum to justify adding more things which will need to be watched
  2. Update the HC SIG wiki page with appropriate contact information, existing resources etc.
  3. Combine existing ML SIG into the newer AI/ML SIG
  4. Deactivate the existing mailing list for the ML SIG and redirect/rename its wiki page to the AI-ML SIG

If you object to any of this, speak up in the next couple of days. I assume that if anyone had objections to the general ideas, they would have been voiced by now and the overall reaction is “meh, stop emailing me about this already” :slight_smile:

Yes, please leave the HC SIG as-is. It has an OK amount of information on it now, and more is better. I am trying to have a packages section that goes into deeper info on how this or that HC is applied to other packages. No point reinventing the wheel: if AI/ML has a writeup, let’s just reference it.

I’ve done the following:

Closing the ML mailing list is going to be a bit more complex because it is the owner of a bug and the QA contact for several other bugs in bugzilla.

Do we have a FAS group for the SIG? If not, can we get one created and hooked up to the gitlab namespace? It’ll make it easier to track membership and get work done down the road.

I couldn’t attend last Monday’s meetings. Are there any next steps or to-do lists available anywhere?

I’m glad you asked this as I was under the impression that we were going to have to manage GitLab membership manually. As it turns out, we do not have to do that and can set up FAS groups such that the group members would be mapped to a role for a project in GitLab.

We do not currently have a FAS group for the SIG but it sounds like we should because I really don’t want to manage the GitLab stuff by hand.

The remaining question is how do we want to map things and what do we call the FAS group? It looks like there has been some recent discussion on the infra list about how groups should be named but there doesn’t appear to be a conclusion yet.

I’ll poke the thread to see if they’ve made a decision so we can request an appropriately-named group.

Unfortunately, we don’t have anything concrete yet. To be honest, we’re still trying to figure out where lists/documents/etc. should live :slight_smile:

Is there anything in particular that you were looking to help with?

ai-ml-sig or pytorch-sig would probably work. I think the only requirement is that it ends with -sig.