Quadlet Service Failing to Start Custom-Built GPU Container (ModuleNotFoundError)

Hello Fedora Community,

I am hoping to get some expert eyes on a persistent issue I’m facing while trying to set up a containerized AI environment on my new, custom-built Fedora workstation. I have been working on this for many days with the help of Google’s Gemini, and we have solved numerous issues (drivers, networking, container syntax), but we are stuck on what appears to be the final hurdle.

The Goal:
To run a custom-built, GPU-accelerated Ollama container (ollama-local-xpu) and the standard open-webui container as systemd user services, managed by Quadlet.

The Hardware (The ‘Fedora Creator Station’):
This is an all-Intel build, running a fresh installation of Fedora 42 (Workstation Edition).

  • CPU: Intel Core Ultra 7 265K (with integrated Arrow Lake-S graphics)
  • Motherboard: MSI Z890 GAMING PLUS WIFI
  • dGPU: ASRock Arc Pro B60 Passive 24GB (Battlemage G21)
  • RAM: 128GB (2x64GB) DDR5 6000MHz
  • Storage: Samsung 9100 PRO 4TB M.2 NVMe PCIe 5.0 SSD

Summary of Steps Taken & Issues Solved:

  1. Driver Configuration: Successfully forced the system to use the xe kernel driver for both the iGPU and the dGPU via a GRUB kernel parameter (xe.force_probe=*); lspci -k confirms both devices are using xe. (One way of setting the parameter is sketched after this list.)
  2. Container Build: The pre-built Intel containers were unreliable. We successfully built a custom image (ollama-local-xpu) from the ipex-llm GitHub source.
  3. Initial Failures: Early attempts to run the containers with podman run failed for a variety of reasons, including networking ("This site can't be reached" in the browser), incorrect container arguments, and SELinux denials.
  4. Switch to Quadlets: We moved to using Quadlet .container files in ~/.config/containers/systemd/ for a stable, automated setup.
  5. Syntax Errors: With the help of the podman-system-generator --user --dryrun command, we fixed several syntax errors in our .container files (e.g., changing Device to AddDevice and Privileged to PodmanArgs=--privileged).
  6. Missing Dependency: We discovered the container was crashing with ModuleNotFoundError: No module named 'uvloop'. We have since edited the Dockerfile to include RUN pip install uvloop and have successfully rebuilt the container (confirmed with podman images, which shows a recent build time). The Dockerfile change, rebuild command, and dry-run check are sketched after this list.
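
For anyone retracing step 1: on Fedora, a kernel parameter like this can be added with grubby. This is only a sketch of that route, not necessarily the exact commands from my session:

# Append the parameter to every installed kernel entry
sudo grubby --update-kernel=ALL --args="xe.force_probe=*"

# After a reboot, confirm the parameter is active and check the driver binding
cat /proc/cmdline
lspci -k | grep -EA3 'VGA|Display'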
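
For steps 5 and 6, the Dockerfile addition, the rebuild, and the Quadlet dry-run check looked roughly like this (the image tag matches the .container file further down; the build-context path is just a placeholder):

# Line added to the ipex-llm Dockerfile so the API server can import uvloop
RUN pip install uvloop

# Rebuild with the same tag the Quadlet file references (run from the build-context directory)
podman build -t localhost/ollama-local-xpu:latest .

# Validate the generated user units without starting anything
/usr/lib/systemd/system-generators/podman-system-generator --user --dryrun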

The Current Problem: A Persistent Crash Loop

Even after rebuilding the container with the uvloop dependency, the ollama.service is still stuck in a crash loop. The systemctl --user status command shows it repeatedly starting and failing.

Here is the full, unabridged log from journalctl --user -u ollama.service -b for the failing service:

Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: Traceback (most recent call last):
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]:   File "<frozen runpy>", line 198, in _run_module_as_main
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]:   File "<frozen runpy>", line 88, in _run_code
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]:   File "/usr/local/lib/python3.11/dist-packages/ipex_llm/vllm/xpu/entrypoints/openai/api_server.py", line 22, in <module>
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]:     import uvloop
Oct 05 19:09:44 fedora-ai-workstation systemd-ollama[1702]: ModuleNotFoundError: No module named 'uvloop'
Oct 05 19:09:45 fedora-ai-workstation systemd[1504]: ollama.service: Main process exited, code=exited, status=1/FAILURE
Oct 05 19:09:45 fedora-ai-workstation systemd[1504]: ollama.service: Failed with result 'exit-code'.
... (and so on, repeating) ...

Final Configuration Files

Here are the exact, final versions of the Quadlet files we are using:

~/.config/containers/systemd/ollama.container:

[Unit]
Description=Ollama Service
After=network-online.target

[Container]
Image=localhost/ollama-local-xpu:latest
AddDevice=/dev/dri
Network=host
Volume=ollama:/root/.ollama
Exec=python -m ipex_llm.vllm.xpu.entrypoints.openai.api_server --model meta-llama/Llama-3-8B-Instruct --host 0.0.0.0 --port 11434
PodmanArgs=--privileged

[Service]
Restart=always

[Install]
WantedBy=default.target

~/.config/containers/systemd/webui.container:

[Unit]
Description=Open WebUI Service
After=ollama.service
Requires=ollama.service

[Container]
Image=ghcr.io/open-webui/open-webui:main
Network=host
Environment=OLLAMA_BASE_URL=http://127.0.0.1:11434
Volume=open-webui:/app/backend/data

[Service]
Restart=always

[Install]
WantedBy=default.target
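
For completeness, after editing these files or rebuilding the image, I reload and restart the user units roughly like this:

systemctl --user daemon-reload
systemctl --user restart ollama.service webui.service
systemctl --user status ollama.service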

The Question

Why would the ollama.service still be failing with ModuleNotFoundError: No module named 'uvloop' when we have explicitly added the pip install uvloop command to the Dockerfile and successfully rebuilt the image?

It feels like the systemd service is somehow still running a container based on the old, cached image, despite podman images showing the new build. I have run systemctl --user daemon-reload many times.
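
If it helps, these are the kinds of checks I can run and post the output of (a sketch; systemd-ollama is the container name Quadlet gives this unit by default, which matches the journal lines above):

# Show the unit Quadlet actually generated, including the podman run command line
systemctl --user cat ollama.service

# Image ID the container was created from (may need to catch it between restarts)
podman inspect --format '{{.Image}}' systemd-ollama

# Image ID of the freshly built tag referenced in the .container file
podman image inspect --format '{{.Id}}' localhost/ollama-local-xpu:latest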

Any insight or diagnostic steps the community could provide would be immensely appreciated. Thank you for your time and expertise.

Are you absolutely positive that the uvloop module was installed into the same Python installation that ollama is using?

python -m pip list will give you a full list of all modules; uvloop should be in there (version 0.21.0 is the version I have)

You could also prove it's usable and working by firing up the Python REPL inside your container and importing it. If you can import it there, I can't think of a reason why ollama shouldn't be able to as well.
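
Something along these lines, run against the same image the service uses, would settle it (assuming python is on the image's PATH; add --entrypoint python if the image sets a different entrypoint):

# Is uvloop in the image's package list?
podman run --rm localhost/ollama-local-xpu:latest python -m pip list | grep -i uvloop

# Non-interactive equivalent of the REPL import test
podman run --rm localhost/ollama-local-xpu:latest python -c "import uvloop; print(uvloop.__version__)"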

Thank you, Steve. Your advice enabled me to make significant progress in configuring the system. Ultimately, I have run into what looks like a bug in the in-development Intel Xe driver, which I have reported upstream on GitLab.
