F42 Change Proposal: ibus-speech-to-text (self-contained)

ibus-speech-to-text

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Wiki
Announced

:link: Summary

ibus-speech-to-text will provide voice dictation capabilities to any application supporting IBus input methods in Fedora Linux 42, using VOSK for local voice recognition.

:link: Owner

:link: Detailed Description

  • ibus-speech-to-text provides a new input method that enables voice dictation in any application supporting IBus
  • Uses VOSK for local voice recognition, not requiring internet connectivity
  • Supports multiple languages through downloadable voice recognition models
  • Includes a setup tool built with GTK 4 and libadwaita for model management and configuration

:link: Feedback

:link: Benefit to Fedora

This package will bring several benefits to Fedora:

  • Provides accessibility improvements through voice input capabilities
  • Offers offline voice recognition, preserving user privacy
  • Integrates seamlessly with existing IBus infrastructure
  • Supports multiple languages through downloadable models
  • Enhances user productivity through voice commands

:link: Scope

  • Proposal owners:

    • Package ibus-speech-to-text (review) [done]
    • Package dependencies: gst-vosk (bz) and vosk-api (bz) [done]
  • Other developers: Parag Nemade

  • Release engineering: #Releng issue number

  • Policies and guidelines: N/A (not needed for this Change)

  • Trademark approval: N/A (not needed for this Change)

  • Alignment with the Fedora Strategy:

:link: Upgrade/compatibility impact

:link: Early Testing (Optional)

Do you require ‘QA Blueprint’ support? N

:link: How To Test

:link: Functionality Test

  1. Install the required package: sudo dnf install ibus-speech-to-text

  2. Restart IBus by running the ibus restart command

  3. Add Speech To Text to your input sources

  4. Launch the IBus STT Setup tool from the input method preferences to configure it and download a language model

  5. Open a text editor

  6. This Input Method can also be enabled and disabled with the default shortcut (“Win + Space”) used to switch between IBus Input Methods
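After steps 1 and 2, a quick sanity check can confirm that the package is installed and that IBus lists a matching engine. The following is only a minimal sketch: it assumes the rpm and ibus command-line tools are on PATH, and the helper names (run, package_installed, engines_matching) are hypothetical. The keyword used to find the engine is a guess based on the "Speech To Text" label from step 3, not a documented engine name.

```python
import shutil
import subprocess

PACKAGE = "ibus-speech-to-text"  # package name taken from the proposal


def run(cmd):
    """Run a command; return (exit_code, stdout), or (None, "") if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return None, ""
    done = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return done.returncode, done.stdout


def package_installed(package=PACKAGE):
    """True if `rpm -q` reports the package as installed."""
    code, _ = run(["rpm", "-q", package])
    return code == 0


def engines_matching(keyword):
    """Return lines of `ibus list-engine` output that contain the keyword (case-insensitive)."""
    _, out = run(["ibus", "list-engine"])
    return [line.strip() for line in out.splitlines() if keyword.lower() in line.lower()]
```

Calling package_installed() and then engines_matching("speech") from a Python shell gives a rough indication of whether the input method is available before you add it in the input source settings. An empty list may only mean that the keyword guess is wrong.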

:link: User Experience

Users will be able to:

  • Dictate text in any application supporting IBus
  • Switch between typing and voice input easily
  • Manage language models through a modern IBus STT Setup tool

:link: Dependencies

:link: Contingency Plan

  • Contingency mechanism: Remove the package
  • Contingency deadline: N/A
  • Blocks release? N/A

:link: Documentation

:link: Release Notes

ibus-speech-to-text has been added to Fedora

Last edited by @amoloney 2025-01-23T20:13:03Z

How do you feel about the proposal as written?

  • Strongly in favor
  • In favor, with reservations
  • Neutral
  • Opposed, but could be convinced
  • Strongly opposed
0 voters

If you are in favor but have reservations, or are opposed but something could change your mind, please explain in a reply.

We want everyone to be heard, but many posts repeating the same thing actually makes that harder. If you have something new to say, please say it. If, instead, you find someone has already covered what you’d like to express, please simply give that post a :heart: instead of reiterating. You can even do this by email, by replying with the heart emoji or just “+1”. This will make long topics easier to follow.

Please note that this is an advisory “straw poll” meant to gauge sentiment. It isn’t a vote or a scientific survey. See About the Change Proposals category for more about the Change Process and moderation policy.

I see that the documentation for the current version states that this runs entirely offline. Is there any danger that a future update might (accidentally or otherwise) start using online resources? Could the “offline” policy be enforced somehow (e.g. SELinux rules)? Maybe “offline” should even be part of the package name to make that explicit and, if anyone later wants to use some sort of online system, they would have to build a different package?

This change proposal has now been submitted to FESCo with ticket #3363 for voting.

To find out more, please visit our Changes Policy documentation.

This change has been accepted by FESCo for Fedora Linux 42. A full list of approved changes to date can be found on the Change Set Page.

To find out more about how our changes policy works, please visit our docs site.

Is there any intent here to install this by default? Or just to provide it as an option?

I am not sure if the Change owner Manish is aware of this discussion topic. Let me ping him personally.

Also, I see that the Change wiki page does not contain a link to this discussion thread. I have fixed this now.

Hi @kevin, currently ibus-speech-to-text is provided as an option. It is not installed by default, and there is no plan to include it in the default Fedora installation.

I think we can consider this more of a “Tech Preview”. It is probably neither ready nor mature enough for general use, but it is still an interesting open project that people can try out and that will hopefully improve over time.

I am not sure how to do that. I think it would really require upstream work, but any pointers to similar handling are welcome.

Note it does require downloading voice model data when setting up. We will also be testing it as part of our I18N Test Week.

I don’t know how to write such SELinux rules, but I’ve seen them work, so I’m pretty confident it can be done. For example, someone once tried to add a custom Python script to the smartd service that would send them notifications via the online Pushbullet service, but they kept getting the SELinux denial below.

type=AVC msg=audit(1659121294.93:468): avc:  denied  { name_connect } for  pid=1770 comm="python" dest=443 scontext=system_u:system_r:fsdaemon_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0

SELinux was preventing the service that was running as fsdaemon_t from opening a TCP socket. That’s the sort of thing that I would like to happen if this new speech-to-text service were to try to route any data to any online service.
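To make the denial record above easier to read, here is a minimal Python sketch that splits it into its fields. It is just an illustration of how the log line is laid out, not an official audit-log parser, and the helper name parse_avc is hypothetical.

```python
import re

# The AVC record quoted above, copied verbatim.
AVC_LINE = (
    'type=AVC msg=audit(1659121294.93:468): avc:  denied  { name_connect } for  '
    'pid=1770 comm="python" dest=443 scontext=system_u:system_r:fsdaemon_t:s0 '
    'tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0'
)


def parse_avc(line):
    """Split one AVC audit record into a dict of its fields."""
    # The permission set sits between braces, e.g. "{ name_connect }".
    perms = re.search(r"\{([^}]*)\}", line)
    # Everything else of interest is key=value, optionally double-quoted.
    fields = {k: v.strip('"') for k, v in re.findall(r'(\w+)=("[^"]*"|\S+)', line)}
    fields["result"] = "denied" if " denied " in line else "granted"
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


record = parse_avc(AVC_LINE)
# SELinux contexts have the form user:role:type:level; the type is the third part.
source_domain = record["scontext"].split(":")[2]
target_type = record["tcontext"].split(":")[2]
print(f"{source_domain} was {record['result']} {record['permissions']} "
      f"on {record['tclass']} ({target_type}, port {record['dest']})")
```

The fields that matter for the "offline" question are the source domain (fsdaemon_t here), the target type (http_port_t), the object class (tcp_socket) and the permission (name_connect). A confined domain for the speech-to-text service that is not allowed name_connect on network port types would produce the same kind of denial.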

The full context of the above example can be found here: SELinux is preventing python from name_connect access on the tcp_socket port 443

P.S. I personally think locking this service down is quite important and I wouldn’t want to see it installed without such security measures. IMO, the potential of having an “open mic” plugged into the world-wide-web is a pretty serious concern.

OK. Thanks for the info! This looks like a pretty interesting thing.

Could you open a bug so we can discuss and track it?
Sorry for the late response.

Hi Jens.

I made an attempt at filing a bug report for this issue:

Thanks.
