OpenAI Whisper ASR - whisper.cpp with GPU support on Fedora 39-41

Hi Folks,

Spent a day or so farting about trying to get the above installed and working. It was a bit painful, and I do not have it running as yet.

For comparison, I built the same on Debian 12, which worked first go within 60-90 minutes, obviously building on what I had learned along the way from the Fedora attempts.

I have read most of the posts about RPM Fusion. I also read the post from the GPU developer with a load of manual tweaks to get things going.

Can anyone provide a short set of instructions which result in a working Fedora system?

Is it possible Fedora is perhaps not the optimal platform for GPU work?

Many thanks

What is whisper.cpp and where does it come from?

It is very cool stuff, if you have a use for it and these things excite you.

I am testing it to convert live sporting commentary to text files.

It can be very interesting to record live commentary and then compare the version you have and the one you can download from the cloud post match. From time to time you could potentially find differences.

If you check out your browser console, or redirect through a proxy such as OWASP ZAP, you will easily understand how it is very possible to manipulate a live broadcast via the CDN. It's just lots of numbered buckets. Who is going to stop any particular broadcaster adding a few extra buckets, perhaps at the request of a third party? Who is looking for this technique and catching it?

We inherently seem to trust what is being broadcast, as we all receive the same for a live group broadcast, perhaps harking back to terrestrial days.

What is interesting is if you can catch a broadcaster at it and then test whether they are actually keeping a record of those additional buckets or not. A spectrum analyzer will not lie, and you can pretty much demonstrate the packets are real.

Our login to a streaming service identifies us. Our browser signature is unique for most of us, most of the time. Apparently an individual or a group can be readily identified.

Anyway, there are a few AI ASR models out there, and this one does a very good job of transcribing.

Now imagine if I can combine this live broadcast manipulation with a feedback channel: phone, TV remote, etc.

The defense to this feedback channel is a buffer at your premises.

There are many techniques in this space. This is one which can be caught.

I am sure you get the picture.

HTH, thanks for asking.

The prerequisites should be the following:

DNF5 requires a minor syntax adjustment.
Otherwise, the instructions should apply as is.
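For what it's worth, the DNF5 adjustment I am aware of is in the config-manager plugin, which many of the RPM Fusion/NVIDIA guides use; assuming that is the adjustment meant here, the old and new forms look roughly like this (the repo id is just an example):

```shell
# dnf4 (Fedora <= 40) style, as written in many older guides:
#   sudo dnf config-manager --set-enabled rpmfusion-nonfree-nvidia-driver
#   sudo dnf config-manager --add-repo <url-to-.repo-file>

# dnf5 (Fedora 41) moved config-manager to subcommand-style syntax:
sudo dnf config-manager setopt rpmfusion-nonfree-nvidia-driver.enabled=1
sudo dnf config-manager addrepo --from-repofile=<url-to-.repo-file>
```

Package install/remove commands are unchanged; it is mainly the plugin subcommands that differ.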


Then try using their CUDA Docker image:
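The whisper.cpp README advertises a prebuilt CUDA image; assuming the NVIDIA driver and nvidia-container-toolkit are already installed, running it looks roughly like this (the tag and paths are from the README, so check them against the current docs):

```shell
# pull the CUDA-enabled whisper.cpp image
docker pull ghcr.io/ggerganov/whisper.cpp:main-cuda

# run a transcription; the image takes the whisper.cpp command as a quoted string
docker run --rm --gpus all \
  -v /path/to/models:/models \
  ghcr.io/ggerganov/whisper.cpp:main-cuda \
  "./main -m /models/ggml-base.en.bin -f /models/input.wav"
```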

Remember about SELinux:
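With SELinux enforcing on Fedora, a plain bind mount will usually be blocked inside the container; the usual fixes are to relabel the volume with `:z`/`:Z` or to disable labeling for that container. A sketch, reusing the hypothetical model path from above:

```shell
# :z relabels the host directory so the container may read it (shared label)
docker run --rm --gpus all \
  -v /home/myusername/models:/models:z \
  ghcr.io/ggerganov/whisper.cpp:main-cuda \
  "./main -m /models/ggml-base.en.bin -f /models/input.wav"

# alternatively, skip relabeling entirely for this one container:
#   docker run --security-opt label=disable ...
```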

Can you please give me the URL of the post(s) from the GPU developer to whom you refer?

I have it working on Fedora 41, xfce4 spin, on an Acer Aspire 3 315-44p laptop, which carries the AMD Ryzen 7 5700U CPU with an integrated AMD Radeon (Renoir) iGPU (rocminfo calls it a gfx900). Obviously CUDA ain't gonna happen, and I haven't managed to get it to work with ROCm either. Nevertheless, it transcribes from an .mp4 video quite well.

I compiled whisper.cpp from git source, downloaded into /home/myusername/Downloads/whisper.cpp, using GGML_VULKAN=1 make -j. It can use either the CPU or the GPU via Vulkan. Vulkan transcription is about twice as fast, but has an unacceptable dropout rate. Using the CPU, in the following bash script, transcription of a 43-minute video takes about 3 minutes:

#!/bin/sh
#transcribe-whisper.cpp.sh
#usage: transcribe-whisper.cpp.sh /path-to-.mp4-file/name-of-.mp4-file
rm -f /home/myusername/Downloads/whisper.cpp/whisper.cpp/input.wav
rm -f /home/myusername/Downloads/whisper.cpp/whisper.cpp/input.wav.txt
origdircpp=$PWD
#use ffmpeg to transform the original into the 16 kHz mono .wav format whisper.cpp likes:
ffmpeg -i "$1" -ar 16000 -ac 1 /home/myusername/Downloads/whisper.cpp/whisper.cpp/input.wav
sleep 2
cd /home/myusername/Downloads/whisper.cpp/whisper.cpp || exit 1
#the following gives you a console window so you can monitor progress.
#Please note that the final .txt transcription, in the folder alongside your original .mp4 video,
#is not created until you close this monitor window.
#Please note that you must do
#cd /home/myusername/Downloads/whisper.cpp/whisper.cpp
#sh ./models/download-ggml-model.sh base.en
#to obtain the base.en model first, before you run this script.
#-ng enforces "don't use GPU" when whisper.cpp is compiled with Vulkan support
#-nt = no timestamps, omit if you want timestamps.
#-otxt provides output in text (.txt) form
#-t 7 means use 7 threads. Use -t 4 if you don't have an 8-core CPU.
xterm -hold -e ./main --ov-e-device CPU -ng -nt -otxt -t 7 --model /home/myusername/Downloads/whisper.cpp/whisper.cpp/models/ggml-base.en.bin -f "/home/myusername/Downloads/whisper.cpp/whisper.cpp/input.wav"
sleep 2
cd "$origdircpp"
#modify the output textfile name to suit your taste:
mv -f /home/myusername/Downloads/whisper.cpp/whisper.cpp/input.wav.txt "$1".cpp.no-ts.cpu-base.en-t7.txt
rm -f /home/myusername/Downloads/whisper.cpp/whisper.cpp/input.wav
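For completeness, the compile-and-download steps mentioned above amount to roughly the following; the build-dependency package names are my assumption for a Fedora system, so adjust to taste:

```shell
# toolchain plus the Vulkan headers/compiler the GGML_VULKAN backend needs (assumed names)
sudo dnf install gcc-c++ make git vulkan-loader-devel glslc

# clone and build whisper.cpp with the Vulkan backend enabled
cd ~/Downloads
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
GGML_VULKAN=1 make -j

# fetch the small English model the script above expects
sh ./models/download-ggml-model.sh base.en
```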