@amsharma filed Fedora-Council/tickets ticket #503. Discuss here and record votes and decisions in the ticket.
That seems interesting to me. If accepted, I will probably try to participate.
Could you please edit the Stakeholders section of the wiki page to make it more readable? Also, the link to Model Choices in the Risks section appears to be broken.
Thanks @hricky for showing interest; in fact, we need more people to contribute to this. For the wiki fixes, @davdunc or I will get to it soonish.
Does efficiency cover the amount of electrical power needed to run the chatbot?
I think it's an interesting idea and well worth trying out, but we need to be careful.
I am not sure introductions is the best place for this. The scope seems more general than that. Perhaps we could give it its own room in the bot space, like "autohelp" or something?
Also, I think it might be good if it didn't answer direct messages, only in the room. This would allow people watching its results to know when it went off the rails. But on the other hand, that could be very noisy if multiple people were talking to it at the same time.
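A room-only policy like that is straightforward to enforce in a bot's message handler. Here is a minimal sketch; the room ID, function name, and handler shape are all hypothetical, not part of any existing bot:

```python
# Hypothetical sketch: answer only in one designated public room, never in DMs.
# The room ID below is invented for illustration.

ALLOWED_ROOM = "!autohelp:fedora.im"  # hypothetical room ID

def should_answer(room_id: str, is_direct_message: bool) -> bool:
    """Reply only in the designated public room; ignore all direct messages."""
    if is_direct_message:
        return False
    return room_id == ALLOWED_ROOM

# The bot's event handler would call this gate before generating any reply.
print(should_answer("!autohelp:fedora.im", False))  # True
print(should_answer("!autohelp:fedora.im", True))   # False
```

Keeping all interactions public also preserves the audit trail the comment above describes: anyone in the room can see when the bot goes off the rails.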
This is an interesting idea but I have concerns.
- There is no logic model for this Initiative. I want to see a logic model before casting an official vote, because the scope of the Initiative is unclear to me. See past Initiatives for examples of the logic model.
- What is meant by #introductions.im? This does not actually exist. Do you mean Discourse or Matrix? Do you mean the Fedora Join SIG Matrix room or something else? The proposal is not clear to me about where this chatbot is supposed to be used. This is more or less the same concern as @kevin above.
- Who are the community stakeholders? It is unclear to me which group is being targeted for the chatbot and whether this group has been consulted as a stakeholder for this Initiative. My initial guess is the Fedora Join SIG, but it is not clear to me, and they are not named in this proposal. The proposal names "Fedora Community" as a stakeholder, but that is too broad to target; it needs to be more specific.
- What quality controls are considered for the source data that trains the model? A lot of documentation and wiki content is simply outdated and incorrect. I see a high risk of the chatbot providing false information, with no easy pathway for fixing it. For example, if we source any wiki content about the Ambassadors program and someone asks the chatbot a question about Ambassadors, the answer is almost certainly going to be wrong, because there is no up-to-date data source about how Ambassadors functions in the 2020s. There is no easy fix for incorrect data about Ambassadors.
- Who is the development team? The Initiative names a development team but does not provide details about who is leading this development effort.
- Who is responsible for long-term maintenance? Is there a team that owns this, or is it implied to be the Fedora Infrastructure Team / Red Hat Community Platform Engineering team? If so, they need to be named as a stakeholder and also need to be consulted before voting on this proposal.
- Was the Fedora Docs team consulted previously as a stakeholder? They seem to play an important role in this proposal, but I don't see any specific Docs Team members named as stakeholders of the Initiative.
I consider myself to be an optimistic skeptic when it comes to the use of AI inside of Fedora. I believe that AI can help us become more efficient and modernize some key aspects of how we work as a contributor community. However, I feel like this proposal needs more work as it is currently written and I am unsure whether the stakeholders needed for its success have been consulted. An Initiative needs to be proposed with the stakeholders and team involved at the outset; an Initiative does not build the Initiative team after it is approved.
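The data-quality concern raised above could be partly addressed with a simple ingestion filter that excludes stale source pages before they reach the model. This is only an illustrative sketch; the cutoff date, page structure, and function name are assumptions, not anything the proposal specifies:

```python
# Hypothetical sketch of one quality control: drop source pages that have not
# been edited recently before including them in the training/retrieval corpus.
from datetime import date

FRESHNESS_CUTOFF = date(2022, 1, 1)  # assumed policy threshold for illustration

def select_fresh_pages(pages):
    """Keep only pages whose last edit is on or after the cutoff date."""
    return [p for p in pages if p["last_modified"] >= FRESHNESS_CUTOFF]

wiki_pages = [
    {"title": "Ambassadors", "last_modified": date(2014, 6, 1)},     # stale
    {"title": "Joining Fedora", "last_modified": date(2024, 3, 5)},  # fresh
]
print([p["title"] for p in select_fresh_pages(wiki_pages)])  # ['Joining Fedora']
```

A date filter alone will not catch pages that are recent but wrong, so it would complement, not replace, the human curation and stakeholder review discussed in this thread.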
@amsharma, I can help with this (but let's get some of the other things resolved before we start). A good logic model will help answer some key questions, like "what resources do we need, and which of those do we actually have?" at one end and "why are we doing this?" at the other, with "what exactly will we do?" and "how will we measure success?" in the middle.
I'm still unconvinced that LLMs are the answer to "everything" :). They do have their uses, but one has to be quite careful about the problem they're being used to solve.
So, may I please take a step back and ask: what is the problem that this is looking to solve?
I really like the personal approach of actual humans responding to introductions, and the human connections of the Join process. It really emphasizes the "friends" aspect of Fedora in a way that an LLM-AI never will. But I can think of two situations where a bot might be helpful:
- Everyone's different. Some people may find it more comfortable to get the basics from something automated, maybe because they're afraid of feeling unqualified, of getting a dismissive, gatekeepy answer, or are just shy.[1]
- Sometimes, there just aren't humans around, and it'd be nice to have something other than crickets.
[1] Those first situations shouldn't happen in Fedora, but that hasn't always been the case, and they're still unfortunately common experiences in the Linux world at large.
Yeh, this is certainly a use case. It's a tricky one though.
We certainly should make it easier for folks to do their own thing without forcing them to interact with others. There are plenty of things in Fedora that folks can do with minimal human contact, and that's great.
When it comes to new contributors who are feeling unqualified or shy or afraid of "asking silly questions", though, we do want them in the safe spaces we set up as early as possible, because interacting with others there is how they learn that they're not unqualified, that there aren't any silly questions, and that we're all friends. It's other humans reaching out to them, welcoming them, and sharing their experiences that makes newcomers feel comfortable and, over time, part of the community.
So, my worry here is that the presence of a bot will make it rather easy to just "default to bot", and while that's great for people who'd prefer not to interact with other humans, it may also increase the time it takes for others to enter the safe spaces where they can make new friends.
I'm not sure I've written it super well, but I hope it gets the point across.
Yeh, but for this one, is the answer a bot, or other initiatives to grow our community and/or diversify it to cover more time zones and so on?
Some of this will come down to analysing cost/benefits: do we know how many resources are needed to set this up (in terms of human power and infra/financial resources), and could these resources be used in other projects/initiatives that could more efficiently address the same problems, or is the bot the best/most efficient way forward?
Yeah, makes sense to me.
In fact, if this goes forward, I think we should make sure this, or something like it, is part of the bot's instructions.
This is my main concern as well. And even with a curated and up-to-date training set, the bot would likely still hallucinate in some scenarios and provide inaccurate or misleading information in an authoritative tone, which a user (and especially a new contributor) might not be able to identify as wrong. Even worse, people might take a wrong answer at face value and repeat it somewhere else, perpetuating the mistake.
I don't have a good suggestion to counter this; I suppose we could (and probably should) prominently label the bot and its replies with a disclaimer, but people would likely ignore it or tune it out. We could have a human in the loop reviewing all answers before they go out, but that seems unlikely to scale in any meaningful way. We could have humans review interactions after the fact and try to correct them, but that also seems difficult to scale, and it wouldn't necessarily prevent the spreading of incorrect information.
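The labeling idea is at least cheap to implement. A minimal sketch of wrapping every generated reply with a standing disclaimer; the bot name, disclaimer wording, and function name are all hypothetical:

```python
# Hypothetical sketch: prefix every bot reply with a prominent disclaimer so
# answers are never mistaken for authoritative human responses.

DISCLAIMER = (
    "I am an automated assistant; my answers may be wrong or outdated. "
    "Please verify against the docs or ask a human in this room."
)

def label_reply(answer: str) -> str:
    """Wrap a generated answer with the bot identity and standing disclaimer."""
    return f"[Fedora AI bot] {DISCLAIMER}\n\n{answer}"

print(label_reply("You could start by joining the Fedora Join SIG room."))
```

As the comment above notes, a label does not solve hallucination; it only makes the bot's nature and limits visible in every single message rather than in a one-time notice people can miss.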
I donât know that we have that kind of visibility, but we can make the decision to use scalable methods in our public cloud providers where they identify use of renewable energy sources.
It feels like deception.
If the community is quiet, it's because the community is quiet. The community voluntarily associates with the forum.
I associate with what I presume to be a community of other real-world people. Someone's AI machine is not a real-world person. I don't like the idea of intermingling AI anywhere into communities of real-world people, especially when it's implemented to replace the role of a real person (introduction → "hi").
Heck, that's my largest issue with Reddit today: you can't tell if you're interacting with a person or a bot with a mission.
You raise an interesting point: if we have an AI bot answering, then it must clearly identify that it's the Fedora AI bot and not a fake human.
All sections have been updated for review.
The AI regulation was voted in the EU in May 2024 and should be applicable soon (fully applicable by May 2026, but some parts come into force sooner, in November 2024, February 2025, and May 2025). But I do not see any Fedora Legal review in the timeline.
Shouldn't it be added somewhere? Given that the regulation starts at page 155 and ends at page 377 of a 419-page document, that's 222 pages (2.5 times the size of the GDPR). I assume the regulation may impact the project, and verification will take some time that should be accounted for.
Thanks for updating the sections.
The link to Model Choices in the Risks section still seemed broken, so I took the liberty of fixing it. I hope this is not too presumptuous and that it will be helpful.
I think that it should be very clear that it is a bot as well. I think it's important that we provide some routing for new volunteers as quickly as we can. If the community is quiet, it's likely not the result of a lack of interest in new volunteers, but rather that we are having difficulty prioritizing the basic routing requirements.
I don't think that there is any expectation that these new volunteers will see this as a human response, and there is certainly a benefit to identifying that it represents the AI/ML special interest group. Deception is not what we want to provide here. It is more a matter of routing that decreases the amount of time we spend trying to discover which special interest groups are going to help them begin their interactions.
Beyond that, we can use the bot to follow up on their experience to determine where we need more assistance, or ways we can lower the barrier to entry for additional new volunteers. The assistance I see the chatbot providing to new volunteers is as follows:
- self-identifying as an opportunity to engage in our automated routing using Fedora AI/ML techniques, and providing a link to the documentation on how we use the information provided;
- zeroing in on a suitable mentor;
- identifying an area of expertise that is consistent with the interests of the new volunteer;
- offering and routing the opportunity to discuss with someone from the referred teams;
- following up with that new volunteer based on observed participation between the introduction and some as-yet-undetermined timeline;
- notifying CommOps of the sentiment of the volunteer based on the interaction, the recorded moniker, and any FAS ID they opt to provide.
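The routing step above could start as something very simple, well before any model is involved. A minimal sketch of keyword-based matching from an introduction to candidate groups; the keyword table and function name are invented for illustration and naive substring matching would need word-boundary handling in practice:

```python
# Hypothetical sketch of the routing step: match stated interests in an
# introduction message to candidate special interest groups.

SIG_KEYWORDS = {
    "packaging": "Fedora Packaging",
    "docs": "Fedora Docs",
    "design": "Fedora Design",
    "infrastructure": "Fedora Infrastructure",
}

def route_volunteer(introduction: str) -> list[str]:
    """Return candidate groups whose keywords appear in an introduction."""
    text = introduction.lower()
    return [group for keyword, group in SIG_KEYWORDS.items() if keyword in text]

print(route_volunteer("Hi! I'd like to help with docs and packaging."))
# ['Fedora Packaging', 'Fedora Docs']
```

Even a lookup table like this could shorten the time between an introduction and a pointer to the right team, with an LLM layered on later only if the simple approach proves too brittle.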
+1 @davdunc. Also, I agree that a logic model will help us address many questions here. Your help with this, @mattdm, will be really awesome.
I don't see a lot of technical difficulty in implementing a small chatbot that can be initially trained with limited data and later fine-tuned. It could also be open to the Fedora Community for adding skills.
The larger and more complex area to address here is building policies around AI/ML solutions in the Fedora community, which is not just related to this particular initiative but needs a much bigger discussion and solution community-wide. I am open and happy to work towards that solution too, but it can be a parallel road and should not stop a small innovation from happening. Thoughts?