F42 Change Proposal: ibus-speech-to-text (self-contained)

amoloney · January 23, 2025, 8:09pm

ibus-speech-to-text

This is a proposed Change for Fedora Linux.
This document represents a proposed Change. As part of the Changes process, proposals are publicly announced in order to receive community feedback. This proposal will only be implemented if approved by the Fedora Engineering Steering Committee.

Wiki
Announced

Summary

ibus-speech-to-text will provide voice dictation capabilities to any application supporting IBus input methods in Fedora Linux 42, using VOSK for local voice recognition.

Owner

Name: Manish Tiwari
Email: matiwari [at] redhat [dot] com

Detailed Description

ibus-speech-to-text provides a new input method that enables voice dictation in any application supporting IBus
Uses VOSK for local voice recognition, not requiring internet connectivity
Supports multiple languages through downloadable voice recognition models
Includes a setup tool built with GTK 4 and libadwaita for model management and configuration

Feedback

Benefit to Fedora

This package will bring several benefits to Fedora:

Provides accessibility improvements through voice input capabilities
Offers offline voice recognition, preserving user privacy
Integrates seamlessly with existing IBus infrastructure
Supports multiple languages through downloadable models
Enhances user productivity through voice commands

Scope

Proposal owners:
- Package ibus-speech-to-text (review) [done]
- Package dependencies: gst-vosk (bz) and vosk-api (bz) [done]
Other developers: Parag Nemade
Release engineering: #Releng issue number
Policies and guidelines: N/A (not needed for this Change)
Trademark approval: N/A (not needed for this Change)
Alignment with the Fedora Strategy:

Upgrade/compatibility impact

Early Testing (Optional)

Do you require ‘QA Blueprint’ support? N

How To Test

Functionality Test

Install the required package: sudo dnf install ibus-speech-to-text
Restart IBus using the ibus restart command
Add Speech To Text in input sources
Launch the IBus STT Setup tool from the preferences to configure the input method and download a language model
Open a text editor
This Input Method can also be enabled and disabled with the default shortcut (“Win + Space”) used to switch between IBus Input Methods
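
As a quick sanity check after the steps above, a short Python sketch can confirm whether IBus lists a speech-to-text engine. It filters the output of ibus list-engine for a keyword. The keyword "speech" and the helper names find_engines and list_ibus_engines are assumptions made for illustration; the proposal does not state the engine name that the package registers.

```python
import shutil
import subprocess


def find_engines(listing: str, keyword: str) -> list[str]:
    """Return the lines of an `ibus list-engine` listing that mention keyword."""
    needle = keyword.lower()
    return [line.strip() for line in listing.splitlines() if needle in line.lower()]


def list_ibus_engines() -> str:
    """Run `ibus list-engine` and return its output, or "" if ibus is unavailable."""
    if shutil.which("ibus") is None:
        return ""
    result = subprocess.run(
        ["ibus", "list-engine"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    # "speech" is an assumed keyword, not a documented engine name.
    matches = find_engines(list_ibus_engines(), "speech")
    if matches:
        print("Found candidate engines:")
        for line in matches:
            print("  " + line)
    else:
        print("No matching engine listed; check the install and restart steps.")
```

The text filtering is kept separate from the subprocess call so it can be tried on any saved listing without IBus running.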

User Experience

Users will be able to:

Dictate text in any application supporting IBus
Switch between typing and voice input easily
Manage language models through a modern IBus STT Setup tool

Dependencies

Contingency Plan

Contingency mechanism: Remove the package
Contingency deadline: N/A
Blocks release? N/A

Documentation

Release Notes

ibus-speech-to-text has been added to Fedora

Last edited by @amoloney 2025-01-23T20:13:03Z


system · January 23, 2025, 8:09pm

How do you feel about the proposal as written?

Strongly in favor
In favor, with reservations
Neutral
Opposed, but could be convinced
Strongly opposed


If you are in favor but have reservations, or are opposed but something could change your mind, please explain in a reply.

We want everyone to be heard, but many posts repeating the same thing actually make that harder. If you have something new to say, please say it. If, instead, you find someone has already covered what you’d like to express, please simply give that post a heart instead of reiterating. You can even do this by email, by replying with the heart emoji or just “+1”. This will make long topics easier to follow.

Please note that this is an advisory “straw poll” meant to gauge sentiment. It isn’t a vote or a scientific survey. See About the Change Proposals category for more about the Change Process and moderation policy.

glb · January 23, 2025, 9:08pm

I see that the documentation for the current version states that this runs entirely offline. Is there any danger that a future update might (accidentally or otherwise) start using online resources? Could the “offline” policy be enforced somehow (e.g. SELinux rules)? Maybe “offline” should even be part of the package name to make that explicit and, if anyone later wants to use some sort of online system, they would have to build a different package?

amoloney · February 17, 2025, 10:36pm

This change proposal has now been submitted to FESCo with ticket #3363 for voting.

To find out more, please visit our Changes Policy documentation.

amoloney · February 18, 2025, 3:17pm

This change has been accepted by FESCo for Fedora Linux 42. A full list of approved changes to date can be found on the Change Set Page.

To find out more about how our changes policy works, please visit our docs site.

kevin · February 18, 2025, 10:20pm

Is there any intent here to install this by default? Or just provide it as an option?

pnemade · February 19, 2025, 2:51am

I am not sure if the Change owner Manish is aware of this discussion topic. Let me ping him personally.

pnemade · February 19, 2025, 3:07am

Also, I see that the Change wiki page does not contain a link to this discussion thread. I have fixed this now.

matiwari · February 19, 2025, 5:36am

Hi @kevin, currently ibus-speech-to-text is provided as an option and is not installed by default. There is no plan to include it in the default Fedora installation.

petersen · February 19, 2025, 5:56am

I think we can consider this as more of a “Tech Preview” - it is probably neither ready nor mature enough for general use, but it is still an interesting open project, which people can try out and which hopefully will improve over time.

petersen · February 19, 2025, 6:01am

I am not sure how to do that. It would really require upstream work, I think, but any pointers to similar handling are welcome.

Note it does require downloading voice model data when setting up. We will also be testing it as part of our I18N Test Week.

glb · February 19, 2025, 6:33am

I don’t know how to write such SELinux rules, but I’ve seen them work, so I’m pretty confident it can be done. For example, someone once tried to add a custom Python script to the smartd service that would send them notifications via the online Pushbullet service, but they kept getting the SELinux denial below.

type=AVC msg=audit(1659121294.93:468): avc:  denied  { name_connect } for  pid=1770 comm="python" dest=443 scontext=system_u:system_r:fsdaemon_t:s0 tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0

SELinux was preventing the service that was running as fsdaemon_t from opening a TCP socket. That’s the sort of thing that I would like to happen if this new speech-to-text service were to try to route any data to any online service.
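
To make the fields of that denial easier to read, here is a minimal Python sketch that splits the record into its parts and checks whether it describes a blocked outbound TCP connection. The function names (parse_avc, selinux_type, is_blocked_tcp_connect) are made up for illustration, and the regular expressions only assume the layout visible in the log line above; this is not a general audit log parser.

```python
import re

# Matches the "avc: denied { perms } for key=value ..." part of an audit record.
AVC_RE = re.compile(
    r"avc:\s+(?P<result>denied|granted)\s+\{(?P<perms>[^}]*)\}\s+for\s+(?P<rest>.*)"
)
# Matches key=value pairs; values may be double-quoted.
KV_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')

# The denial quoted above, copied verbatim.
SAMPLE = (
    'type=AVC msg=audit(1659121294.93:468): avc:  denied  { name_connect } for  '
    'pid=1770 comm="python" dest=443 scontext=system_u:system_r:fsdaemon_t:s0 '
    'tcontext=system_u:object_r:http_port_t:s0 tclass=tcp_socket permissive=0'
)


def parse_avc(line):
    """Return a dict of fields from one AVC record, or None if it does not match."""
    m = AVC_RE.search(line)
    if m is None:
        return None
    fields = {key: value.strip('"') for key, value in KV_RE.findall(m.group("rest"))}
    fields["result"] = m.group("result")
    fields["perms"] = m.group("perms").split()
    return fields


def selinux_type(context):
    """Extract the type from a user:role:type:level SELinux context string."""
    parts = context.split(":")
    return parts[2] if len(parts) >= 3 else ""


def is_blocked_tcp_connect(fields):
    """True if the record is a denied name_connect on a TCP socket."""
    return (
        fields["result"] == "denied"
        and "name_connect" in fields["perms"]
        and fields.get("tclass") == "tcp_socket"
    )


if __name__ == "__main__":
    record = parse_avc(SAMPLE)
    print(selinux_type(record["scontext"]), "->", selinux_type(record["tcontext"]))
    print("port:", record["dest"], "blocked:", is_blocked_tcp_connect(record))
```

The source type (the confined service) and the target type (the port label) are the two values a policy for a speech-to-text service would need to pair up in the same way.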

The full context of the above example can be found here: SELinux is preventing python from name_connect access on the tcp_socket port 443

P.S. I personally think locking this service down is quite important and I wouldn’t want to see it installed without such security measures. IMO, the potential of having an “open mic” plugged into the world-wide-web is a pretty serious concern.

kevin · February 19, 2025, 5:10pm

OK, thanks for the info! This looks like a pretty interesting thing.

petersen · May 29, 2025, 2:53pm

Could you open a bug so we can discuss and track it?
Sorry for the late response

glb · May 29, 2025, 3:53pm

Hi Jens.

I made an attempt at filing a bug report for this issue:

Thanks.
