Council Policy Proposal: Policy on AI-Assisted Contributions


impact impact impact. everybody wants to make a big impression. but they want a guaranteed result. you’re asking someone else to do it for you and there’s no shame in asking!

to have a genuine experience you need to have determined to do it or at least seek to determine something, and that’s when you find out - in a way we can’t yet express in words. it’s because language is woefully corrupt. if you consider where it originates this isn’t hard to see. we’re thinking using words invented by ghosts essentially, if you believe the history we are told. so we find it very difficult to communicate. rest assured, it will not always (or ever) be the case actually. but yeah it’s interesting. the things we have, even our bodies and yes the computer itself, are on loan from the future

I don’t know if it counts… Windows 11? :person_shrugging:

I get where you’re coming from, but I think it’s worth considering that writing policies is not Fedora’s primary purpose.

I think the point being made at the section in the policy you’ve quoted is that the most valuable contributions are the kinds of creative works that humans are best at. We reason and solve problems. We understand the world in a way that machines do not. And all of that can be true at the same time that a machine can summarize the unique points made in a thread and eliminate redundant statements, offering the human author of a policy a concise statement from which to work.

Machines can generate output much faster than humans, and so an unfiltered stream of machine-generated output could overwhelm humans, waste their time, and prevent them from doing the kind of creative and valuable work that they are good at. I don’t think we’re seeing that happen here, especially if writing policies is not the most valuable thing a human could be doing.

My native language is not English, and in almost all of my contributions to the Fedora Project, including this one, I rely on Google Translate. According to the Wikipedia article on Google Translate, it is a multilingual neural machine translation service. Neural machine translation is an approach to machine translation that uses an artificial neural network.

When the Policy on AI-Assisted Contributions goes into effect, will I have to explicitly write notes in all my contributions about using Google Translate?

Another option would be to stop using Google Translate, but then my contributions would still be technically correct in terms of program code and command line examples, but probably grammatically and syntactically incorrect in terms of the English language.

This is a decent start. Some small feedback, which is an echo of what some others have said. In short, take a clearer stand and use more decisive language in some areas:

  • “Be transparent about your use of AI” — in this section, I strongly agree with @sgallagh, do add a note to the effect of: “if you repeatedly don’t declare, or hide, non-trivial assistance from AI tools, your trust in the community can be jeopardized.”
  • “Limit AI Tools for Reviewing”: here too, please add a note on disclosure, something to the effect: “when you use AI tools for review work, declare it clearly and plainly. Especially when there’s a mix of human and AI, clearly specify the ‘author’ of each part.”
  • As a leading distribution, Fedora should acknowledge that the legal situation is still evolving (e.g. there’s no clarity on the copyright of AI-generated “complex artefacts”) and that Fedora is constantly adjusting its approach as significant new information emerges.

Contributing to Fedora means vouching for the quality, license compliance, and utility of your submission.

How can I vouch for the license compliance of an AI-generated contribution? Are there tools available to everyone that can do this?

What is the goal with disclosure of usage of AI tools? An ‘Assisted-by: ’ tag is effectively providing continued free advertising for commercial tools.

Will knowing the specific choice of tool provide a meaningful benefit to Fedora? If an AI tool has been used to partially author a commit, as a maintainer, does it really help me if I know it was ChatGPT vs Gemini? Would it suffice to simply ask to disclose that AI was used, without the advertising?

This is not an entirely new issue for OSS, because for a long time people have advertised use of a certain well known commercial static analysis tool in commit messages when it has discovered security bugs they’ve reported. In such cases what matters is not the name of the tool, but the full details of the problem being reported & solved.

This touches on something that is again not a new problem for Fedora. The questions/ethics around sending user data to a remote service have come up in the context of Fedora on a number of occasions in the past, before AI was involved. They have always triggered a lively and contentious debate, made more complex by having insufficient guiderails for how we evaluate things.

I thus wonder whether this point is something that belongs in the AI policy at all.

It rather feels like Fedora ought to have a general policy on how it evaluates & approves tools which process user data, which should be considered for any tool whether using AI or not.

IOW, we should avoid making the AI policy be a stand-in for pre-existing scenarios where Fedora has been missing a satisfactory policy.

Is there a link to the results of that survey somewhere?

Is this point actually related to the policy headline of “AI-Assisted contributions”? Surely it isn’t Fedora contributors who would be triggering such disruptive scraping actions, rather 3rd parties with whom the project has no direct interaction. Disruptive scraping as a problem pre-dates AI, albeit at a lower level of severity.

If we want to make a statement about use of Fedora infrastructure and mass scraping that is of course reasonable, but does it belong in the AI usage policy for Fedora contributors, as opposed to in the project website terms of service?

I’m hoping we can avoid that. I hope we can get to a point where we can make it easy for contributors to note when it’s general practice to use something in the ML/LLM space as part of their workflow for translation generally.

What I want is to have a way for us to have an honest conversation about the situations where AI usage is adding value to the contributor experience. And the only way I can see to do that is to make it possible for people to point out how often they are using the technology as part of their personal workflow and the utility it provides.

I have a suspicion that translation is one of those areas where it’s being used far more widely than some people realize right now, with varying degrees of success. I’d like to see people who are using it in particular ways find each other and start having a conversation about good practices. And more importantly, once we build a bridge with the open source ML/LLM development community, start having conversations with that developer community so we can figure out how to iterate on good practices in an arc towards a more ethical, higher-quality future state.

I am fully aware of the deep irony of talking about having more conversations about technology for translation. In my personal ranked pantheon of possible social good uses for this technology, a high quality universal translator is pretty high for me.

And for my part, I’m trying very hard to be honest about my inherent privilege as a native English speaker in this project, and trying to imagine what it would be like for me to contribute as a non-English speaker. If this project communicated primarily in Spanish, I could get by with un poco de ayuda. If it communicated primarily in French, I’d have to rely heavily on some sort of translation tool.

I talked about it in the 2024 State of Fedora:

Slides 7-14: State of Fedora, Flock 2024.

Fedora Council Meeting 2024-09-11: AI/ML Survey Analysis Discussion goes into a lot more depth.

@gwmngilfen do you have a blog post or link to your slides / graphs / charts from that?

The first thing I did was read all the comments myself. I used NotebookLM to check each change I made against specific pieces of feedback, linked to who said it and what their exact words were.

As I said in the disclosure, I stand behind the contribution I made and I’m fully prepared to continue discussing it.

Yes, it’s left in because it’s one of the clear policy-type pieces of content, but you make a good point about whether it belongs here.

Also a good point. This is in here because it was a major concern that people expressed – the assurance that we weren’t going to put an AI assistant onto their Fedora desktops without their explicit approval. But I agree that this is the sort of thing we seem to have already committed to not doing elsewhere, as in opt-out vs opt-in telemetry.

I assume most of the “AI-assisted contributions” we’re meant to discuss involve “generative AI technologies like LLMs and such” (correct me if I’m wrong). But yes, I think it’s good manners to disclose usage of machine translation in discussions.

When people see someone using a specific language in a discussion, they usually assume that person has a certain level of confidence in their skill with that language, such as being able to express their opinions clearly. Machine translation cannot always guarantee that: sometimes machine translation programs/services don’t get the user’s point, or they make mistakes. So if a user explicitly announces they are using machine translation, others can anticipate those mistakes and be ready when they are made.

It doesn’t have to be “I used machine translation in this one post” every time you write something; you can add that to your profile’s “about me” section (take Discourse as an example).

I am somewhat confident in my English language skills and I think I can express my opinions relatively clean. I use Google Translate mainly as a spell and grammar checker and I almost always correct the translation because it is usually incorrect in terms of its Linux, and technical suggestions in general, as you mentioned. Rarely, on occasion, when I suggest something wrong and noticed it afterwards, I admit it and make the necessary clarifications and corrections in a following post.

In this post I am not relying on Google Translate. Do you think it is understandable?

That is my point and what I am asking for.

As in my previous post, I don’t use Google Translate. However, I am using Cambridge English Dictionary, so I checked if it relies on some form of AI. It turns out that Cambridge Dictionary is powered by IDM.

Although I am not using the Cambridge English Dictionary as a translator in my last two posts it is very helpful for grammar and spell checking and it is unlikely that I will stop using it entirely, no matter how my English language skills evolve. Maybe it would be useful if the Policy on AI-Assisted Contributions could explain such a use case.

I’m also not a native speaker, and I could understand you very well :slight_smile: Grammar and spelling is hard and even native speakers sometimes do mistakes, you’re fine.
