Hello Fedora Community,
I am hoping to get some expert eyes on a persistent issue I’m facing while trying to set up a containerized AI environment on my new, custom-built Fedora workstation. I have been working on this for many days with the help of Google’s Gemini, and we have solved numerous issues (drivers, networking, container syntax), but we are stuck on what appears to be the final hurdle.
The Goal:
To run a custom-built, GPU-accelerated Ollama container (`ollama-local-xpu`) and the standard `open-webui` container as systemd user services, managed by Quadlet.
The Hardware (The ‘Fedora Creator Station’):
This is an all-Intel build, running a fresh installation of Fedora 42 (Workstation Edition).
- CPU: Intel Core Ultra 7 265K (with integrated Arrow Lake-S graphics)
- Motherboard: MSI Z890 GAMING PLUS WIFI
- dGPU: ASRock Arc Pro B60 Passive 24GB (Battlemage G21)
- RAM: 128GB (2x64GB) DDR5 6000MHz
- Storage: Samsung 9100 PRO 4TB M.2 NVMe PCIe 5.0 SSD
Summary of Steps Taken & Issues Solved:
- Driver Configuration: Successfully forced the system to use the `xe` kernel driver for both the iGPU and dGPU via a GRUB parameter (`xe.force_probe=*`). `lspci -k` confirms both devices are using `xe`.
- Container Build: The pre-built Intel containers were unreliable, so we built a custom image (`ollama-local-xpu`) from the `ipex-llm` GitHub source.
- Initial Failures: Early attempts to run the containers with `podman run` failed due to various issues, including networking ("This site can't be reached"), incorrect container arguments, and SELinux policies.
- Switch to Quadlets: We moved to Quadlet `.container` files in `~/.config/containers/systemd/` for a stable, automated setup.
- Syntax Errors: With the help of the `podman-system-generator --user --dryrun` command, we fixed several syntax errors in our `.container` files (e.g., changing `Device` to `AddDevice` and `Privileged` to `PodmanArgs=--privileged`).
- Missing Dependency: We discovered the container was crashing with `ModuleNotFoundError: No module named 'uvloop'`. We have since edited the Dockerfile to include `RUN pip install uvloop` and successfully rebuilt the container (confirmed with `podman images`, which shows a recent build time). The verification commands I can re-run are sketched just after this list.
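For reference, here is the quick verification sketch for those steps that I can re-run and post output from. The generator path is the standard Fedora location (adjust if yours differs), and the last command assumes `python` is on the image's PATH; the traceback below shows Python 3.11, so `python3` may be needed instead.

```bash
# Confirm both the iGPU and dGPU are bound to the xe driver
lspci -k | grep -EA3 'VGA|Display'

# Validate the Quadlet .container files without starting anything
/usr/lib/systemd/system-generators/podman-system-generator --user --dryrun

# Prove (or disprove) that the rebuilt image contains uvloop;
# --entrypoint overrides whatever entrypoint is baked into the image
podman run --rm --entrypoint python localhost/ollama-local-xpu:latest \
    -c "import uvloop; print(uvloop.__version__)"
```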
The Current Problem: A Persistent Crash Loop
Even after rebuilding the image with the `uvloop` dependency, `ollama.service` is still stuck in a crash loop. `systemctl --user status` shows it repeatedly starting and failing.
Here is the full, unabridged log from `journalctl --user -u ollama.service -b` for the failing service:
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: Traceback (most recent call last):
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: File "<frozen runpy>", line 198, in _run_module_as_main
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: File "<frozen runpy>", line 88, in _run_code
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: File "/usr/local/lib/python3.11/dist-packages/ipex_llm/vllm/xpu/entrypoints/openai/api_server.py", line 22, in <module>
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: import uvloop
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: ModuleNotFoundError: No module named 'uvloop'
Oct 05 19:09:45 fedora-ai-workstation systemd[1504]: ollama.service: Main process exited, code=exited, status=1/FAILURE
Oct 05 19:09:45 fedora-ai-workstation systemd[1504]: ollama.service: Failed with result 'exit-code'.
... (and so on, repeating) ...
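Since the traceback is byte-for-byte identical to the pre-rebuild failure, my next step is to confirm which image the unit is actually running. A sketch of the check (the `systemd-ollama` name is Quadlet's default `systemd-<unit>` container name, visible in the log prefix above; because Quadlet runs containers with `--rm`, the first command may need to be timed while the unit is mid-restart):

```bash
# Image name and ID the crashing container was started from
podman container inspect systemd-ollama --format '{{.ImageName}} {{.Image}}'

# Image ID the rebuilt tag currently points to -- these should match
podman image inspect localhost/ollama-local-xpu:latest --format '{{.Id}}'
```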
Final Configuration Files
Here are the exact, final versions of the Quadlet files we are using:
`~/.config/containers/systemd/ollama.container`:
[Unit]
Description=Ollama Service
After=network-online.target
[Container]
Image=localhost/ollama-local-xpu:latest
AddDevice=/dev/dri
Network=host
Volume=ollama:/root/.ollama
Exec=python -m ipex_llm.vllm.xpu.entrypoints.openai.api_server --model meta-llama/Llama-3-8B-Instruct --host 0.0.0.0 --port 11434
PodmanArgs=--privileged
[Service]
Restart=always
[Install]
WantedBy=default.target
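One idea we have not tried yet: pinning the unit to the exact image ID, so the tag cannot resolve to anything stale. A sketch of the change to the `[Container]` section above (the ID placeholder comes from `podman images`; `Pull=` is supported by the Podman 5.x Quadlet that Fedora 42 ships, though I am not sure it matters for `localhost/` images, which are never pulled):

```ini
[Container]
# Hypothetical: replace the tag with the exact image ID from `podman images`
Image=<image-id-of-the-fresh-build>
# Fail fast instead of ever pulling from a registry
Pull=never
```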
`~/.config/containers/systemd/webui.container`:
[Unit]
Description=Open WebUI Service
After=ollama.service
Requires=ollama.service
[Container]
Image=ghcr.io/open-webui/open-webui:main
Network=host
Environment=OLLAMA_BASE_URL=http://127.0.0.1:11434
Volume=open-webui:/app/backend/data
[Service]
Restart=always
[Install]
WantedBy=default.target
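And the exact reload/restart sequence we use after every edit to these files, in case we are missing a step:

```bash
systemctl --user daemon-reload
systemctl --user restart ollama.service webui.service
journalctl --user -u ollama.service -f
```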
The Question
Why would `ollama.service` still be failing with `ModuleNotFoundError: No module named 'uvloop'` when we have explicitly added the `pip install uvloop` command to the Dockerfile and successfully rebuilt the image?
It feels like the systemd service is somehow running an old, cached version of the image, even though `podman images` shows the new build. I have run `systemctl --user daemon-reload` many times.
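Two further checks I can run and post output from, if useful (a sketch; `systemctl --user cat` should print the generated unit, including the exact `podman run` command line Quadlet produced):

```bash
# Layer history of the rebuilt image; the `pip install uvloop` layer
# should be visible near the top if the rebuild took effect
podman history localhost/ollama-local-xpu:latest

# The generated ollama.service, with the full podman run invocation
systemctl --user cat ollama.service
```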
Any insight or diagnostic steps the community could provide would be immensely appreciated. Thank you for your time and expertise.