[Fedora Silverblue] Ollama with ROCm - failed to check permission on /dev/kfd: open /dev/kfd: invalid argument

Hello,

I am trying to get Ollama running with ROCm, and no matter what I try I get this error:
failed to check permission on /dev/kfd: open /dev/kfd: invalid argument

The whole startup looks like this:

2024/09/20 00:38:49 routes.go:1153: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/home/nmcbride/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2024-09-20T00:38:49.103-04:00 level=INFO source=images.go:753 msg="total blobs: 9"
time=2024-09-20T00:38:49.103-04:00 level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-09-20T00:38:49.103-04:00 level=INFO source=routes.go:1200 msg="Listening on 127.0.0.1:11434 (version 0.3.11)"
time=2024-09-20T00:38:49.104-04:00 level=INFO source=common.go:135 msg="extracting embedded files" dir=/tmp/ollama2084712407/runners
time=2024-09-20T00:38:54.891-04:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[rocm_v60102 cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
time=2024-09-20T00:38:54.891-04:00 level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-09-20T00:38:54.894-04:00 level=WARN source=amd_linux.go:60 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-09-20T00:38:54.898-04:00 level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=0 gpu_type=gfx1102
time=2024-09-20T00:38:54.898-04:00 level=WARN source=amd_linux.go:341 msg="amdgpu is not supported" gpu=1 gpu_type=gfx1103 library=/var/home/nmcbride/Downloads/ollama/v0.3.11/lib supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-09-20T00:38:54.898-04:00 level=WARN source=amd_linux.go:343 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-09-20T00:38:54.899-04:00 level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=2 gpu_type=gfx1100
time=2024-09-20T00:38:54.899-04:00 level=ERROR source=amd_linux.go:364 msg="amdgpu devices detected but permission problems block access" error="failed to check permission on /dev/kfd: open /dev/kfd: invalid argument"
time=2024-09-20T00:38:54.899-04:00 level=INFO source=gpu.go:346 msg="no compatible GPUs were discovered"
time=2024-09-20T00:38:54.899-04:00 level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="60.6 GiB" available="52.7 GiB"

/dev/kfd is owned by root and render. I have tried starting ollama both as root and with my user added to the render group. Neither of these made any difference.

crw-rw-rw-. 1 root render 235, 0 Sep 20 00:29 /dev/kfd
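(As a side note, a minimal sketch of the kind of checks this involves; the render group name and the /dev/kfd path are taken from above, and group changes only take effect after logging back in:)

# confirm the current user is actually in the render group
id -nG
getent group render

# confirm the device node itself can be opened
test -r /dev/kfd && test -w /dev/kfd && echo "/dev/kfd readable+writable" || echo "/dev/kfd not accessible"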

I am not sure if this is an actual permission issue or if there is some other Silverblue quirk getting in the way.

Any help would be appreciated; to be honest, I've been fighting to get ROCm working with Ollama on Silverblue for quite a while. My workaround has been to run it on a Windows machine with a 3090 and access it across the network, but I'd like it to work locally.

Thank you.

For image-based Fedora systems (Atomic Desktops), I would recommend running Ollama and ROCm in a container.

@hricky

Yes, I did try that as my first option, but I get the exact same issue. It really looks like some sort of issue with the device on the host side, but I am just not sure.

me@framework-16-sb:~/Downloads/ollama/v0.3.11$ podman run --security-opt label=disable --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
Resolved "ollama/ollama" as an alias (/var/home/me/.cache/containers/short-name-aliases.conf)
Trying to pull docker.io/ollama/ollama:rocm...
Getting image source signatures
Copying blob d720c73460f5 done   | 
Copying blob 6414378b6477 done   | 
Copying blob 8ec3a60293bb done   | 
Copying blob 90ad675a9966 done   | 
Copying blob 54193bcffb20 done   | 
Copying blob aaaa61dc9370 done   | 
Copying blob 85543e5712b9 done   | 
Copying blob b48fcfdf2702 done   | 
Copying config 44e9f274fd done   | 
Writing manifest to image destination
2024/09/20 16:58:06 routes.go:1153: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2024-09-20T16:58:06.204Z level=INFO source=images.go:753 msg="total blobs: 9"
time=2024-09-20T16:58:06.204Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-09-20T16:58:06.204Z level=INFO source=routes.go:1200 msg="Listening on [::]:11434 (version 0.3.11)"
time=2024-09-20T16:58:06.205Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[rocm_v60102 cpu cpu_avx cpu_avx2]"
time=2024-09-20T16:58:06.205Z level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-09-20T16:58:06.206Z level=WARN source=amd_linux.go:60 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-09-20T16:58:06.209Z level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=0 gpu_type=gfx1102
time=2024-09-20T16:58:06.211Z level=WARN source=amd_linux.go:341 msg="amdgpu is not supported" gpu=1 gpu_type=gfx1103 library=/usr/lib/ollama supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-09-20T16:58:06.211Z level=WARN source=amd_linux.go:343 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-09-20T16:58:06.212Z level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=2 gpu_type=gfx1100
time=2024-09-20T16:58:06.212Z level=ERROR source=amd_linux.go:364 msg="amdgpu devices detected but permission problems block access" error="failed to check permission on /dev/kfd: open /dev/kfd: invalid argument"
time=2024-09-20T16:58:06.212Z level=INFO source=gpu.go:346 msg="no compatible GPUs were discovered"
time=2024-09-20T16:58:06.212Z level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="60.6 GiB" available="52.7 GiB"

I haven't run into any errors, but I've found that I don't get GPU/ROCm acceleration unless I run the container as root, i.e. run the podman commands with sudo. Otherwise, Ollama runs on the CPU, even when using the ollama:rocm image.

For reference, I'm using an AMD 6900 XT and Kinoite 40.


I'm using an Ollama/ROCm Fedora-based container image built for my own use, but this one from docker.io should work too.

Here's an example command that will probably work, but you'll need to modify it for your needs.

name="ollama-rocm"
image="docker.io/ollama/ollama:rocm"

sudo podman container run \
                --name $name \
                --detach \
                --interactive \
                --tty \
                --device /dev/kfd \
                --device /dev/dri \
                --security-opt label=disable \
                --volume $HOME/.ollama:/root/.ollama \
                --publish 11434:11434 \
                $image
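Once it's up, something like this should confirm whether the GPU was actually picked up (the container name matches the $name variable above; mistral is just an example model):

sudo podman logs ollama-rocm | grep -i gpu
sudo podman exec -it ollama-rocm ollama run mistral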

@guiltydoggy @hricky

I just gave that a shot and have the exact same issue:

Command:

sudo podman container run --name ollama-rocm --detach --interactive --tty --device /dev/kfd --device /dev/dri --security-opt label=disable --volume $HOME/.ollama:/root/.ollama --publish 11434:11434 "docker.io/ollama/ollama:rocm"

Output:

root@framework-16-sb:~# podman logs ollama-rocm
2024/09/20 18:54:48 routes.go:1153: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2024-09-20T18:54:48.621Z level=INFO source=images.go:753 msg="total blobs: 9"
time=2024-09-20T18:54:48.621Z level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-09-20T18:54:48.621Z level=INFO source=routes.go:1200 msg="Listening on [::]:11434 (version 0.3.11)"
time=2024-09-20T18:54:48.621Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 rocm_v60102]"
time=2024-09-20T18:54:48.621Z level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-09-20T18:54:48.623Z level=WARN source=amd_linux.go:60 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-09-20T18:54:48.624Z level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=0 gpu_type=gfx1102
time=2024-09-20T18:54:48.625Z level=WARN source=amd_linux.go:341 msg="amdgpu is not supported" gpu=1 gpu_type=gfx1103 library=/usr/lib/ollama supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-09-20T18:54:48.625Z level=WARN source=amd_linux.go:343 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-09-20T18:54:48.626Z level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=2 gpu_type=gfx1100
time=2024-09-20T18:54:48.626Z level=ERROR source=amd_linux.go:364 msg="amdgpu devices detected but permission problems block access" error="failed to check permission on /dev/kfd: open /dev/kfd: invalid argument"
time=2024-09-20T18:54:48.626Z level=INFO source=gpu.go:346 msg="no compatible GPUs were discovered"
time=2024-09-20T18:54:48.626Z level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="60.6 GiB" available="51.0 GiB"

"amdgpu devices detected but permission problems block access" error="failed to check permission on /dev/kfd: open /dev/kfd: invalid argument"

Could you please share the output of the following command?

grep --extended-regexp --regexp='.*ollama' /etc/passwd /etc/group /usr/lib/passwd /usr/lib/group

I do not have any ollama user / group on my system.

Here's how it's set up on my end.

grep --extended-regexp  --regexp='.*ollama' /etc/passwd /etc/group /usr/lib/passwd /usr/lib/group

/etc/passwd:ollama:x:960:960::/usr/share/ollama:/bin/false
/etc/group:video:x:39:hricky,ollama
/etc/group:render:x:105:hricky,ollama
/etc/group:ollama:x:960:hricky,root

Replicate it by changing hricky to your user, reboot, and try the container again. Not sure, but maybe it will help.
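(A rough sketch of what replicating that might look like; this is my own guess rather than anything from Ollama's docs. On Atomic desktops the render and video groups may only be defined in /usr/lib/group, so they might need to be copied into /etc/group first for usermod to accept them. youruser is a placeholder.)

# sketch only: copy render/video into /etc/group if they are only defined in /usr/lib/group
for g in render video; do
    grep -q "^$g:" /etc/group || grep "^$g:" /usr/lib/group | sudo tee -a /etc/group
done

# create a system account for ollama and add both accounts to the GPU groups
sudo useradd --system --user-group --shell /bin/false --home-dir /usr/share/ollama ollama
sudo usermod -aG render,video ollama
sudo usermod -aG render,video,ollama youruser   # replace youruser with your own account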

Here's how I've run ollama with podman:

podman run --pull newer --detach --security-opt label=type:container_runtime_t --replace --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm

(The largest difference: this avoids the label=disable used in the above solutions, and the container still gets GPU access on an AMD system from a regular (rootless) Podman user account.)

And to run the ollama command manually, do this:

podman exec -it ollama ollama

(Add "run" or whatever after it if you want to run a model, like podman exec -it ollama ollama run mistral; this will download the model the first time you run it.)

Then, if you want a web UI for it, there's this too:

podman run --replace --pull newer -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

After you have ollama and the web UI running, the web UI can be accessed at http://localhost:8080/

If you want to be really fancy, you can create quadlets (podman containers run as systemd services):

Create a file at ~/.config/containers/systemd/ollama.container with this as the contents:

[Container]
ContainerName=ollama
Image=ollama/ollama:rocm
Volume=ollama:/root/.ollama
Pull=newer
AddDevice=/dev/kfd
AddDevice=/dev/dri
PublishPort=11434:11434
SecurityLabelType=container_runtime_t

# quadlet-generated units cannot be enabled with systemctl enable;
# this section is what makes the service start automatically at login
[Install]
WantedBy=default.target

And this as ~/.config/containers/systemd/ollama-webui.container

[Container]
ContainerName=open-webui
Image=ghcr.io/open-webui/open-webui:main
Pull=newer
Volume=open-webui:/app/backend/data
Environment=OLLAMA_BASE_URL=http://127.0.0.1:11434
Network=host

[Install]
WantedBy=default.target

Then run these commands to start them now (the [Install] section above is what makes them start again when you log in, since generated units can't be enabled with systemctl enable):

systemctl --user daemon-reload
systemctl --user start ollama ollama-webui
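To confirm the generated units actually came up (the unit names here are assumed to match the .container file names above):

systemctl --user status ollama ollama-webui
journalctl --user -u ollama -f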

Notes:

  1. If you donā€™t want the web UI, just use the ollama container.
  2. If you don't want this to run when you log in, leave out the [Install] section and use systemctl --user start ollama ollama-webui to start it on demand. (The daemon-reload command is only needed after adding or changing the systemd-related files.)
  3. Like the above podman commands, it will take a while on first start, as it needs to download. It also will take a little while to start when the containers change, as it'll download the new parts. You'll see the changes when running the podman command, but the quadlet version via systemctl will download changes in the background.
  4. This has persistent storage, thanks to the volume.
  5. There's a bunch of neat stuff you can do with an LLM (translations in a more conversational way, suggestions of all kinds of things in various topics, writing a rhyme based on some text, etc.), but don't trust it. The level of trust you should place in its outputs is about the same as for a random person you don't know at a bar who starts talking with you: they might know what they're talking about, or they might've had something to drink and are completely making things up, intentionally or not. Or somewhere in between.
  6. Seriously, again; this is worth repeating: don't trust it. To get a feel for it, start by asking it several things on topics you know very well, and then apply that same skepticism to anything else it ever says. (This goes for all LLMs, not just ones that can run in ollama.)

I'm trying this now.

Would you mind showing your permissions on /dev/dri and /dev/kfd?

$ sudo ls -Zlathri /dev/dri /dev/kfd

491 crw-rw-rw-. 1 root render system_u:object_r:hsa_device_t:s0 235, 0 21 Sep 12:32 /dev/kfd

/dev/dri:
total 0
570 drwxr-xr-x.  3 root root   system_u:object_r:device_t:s0          100 21 Sep 12:32 .
571 crw-rw-rw-.  1 root render system_u:object_r:dri_device_t:s0 226, 128 21 Sep 12:32 renderD128
576 drwxr-xr-x.  2 root root   system_u:object_r:device_t:s0           80 21 Sep 12:32 by-path
572 crw-rw----+  1 root video  system_u:object_r:dri_device_t:s0 226,   1 22 Sep 10:22 card1
  1 drwxr-xr-x. 22 root root   system_u:object_r:device_t:s0         4,5K 22 Sep 10:23 ..

I think mine looks OK; I do not see anything that is majorly different:

569 crw-rw-rw-. 1 root render system_u:object_r:hsa_device_t:s0 235, 0 Sep 21 23:12 /dev/kfd

/dev/dri:
total 0
 763 crw-rw-rw-.  1 root render system_u:object_r:dri_device_t:s0 226, 130 Sep 21 23:12 renderD130
 755 crw-rw-rw-.  1 root render system_u:object_r:dri_device_t:s0 226, 128 Sep 21 23:12 renderD128
 217 drwxr-xr-x.  3 root root   system_u:object_r:device_t:s0          180 Sep 21 23:13 .
1387 crw-rw-rw-.  1 root render system_u:object_r:dri_device_t:s0 226, 129 Sep 21 23:13 renderD129
 774 drwxr-xr-x.  2 root root   system_u:object_r:device_t:s0          160 Sep 21 23:13 by-path
   1 drwxr-xr-x. 21 root root   system_u:object_r:device_t:s0         5.0K Sep 21 23:13 ..
 764 crw-rw----+  1 root video  system_u:object_r:dri_device_t:s0 226,   3 Sep 22 22:30 card3
1388 crw-rw----+  1 root video  system_u:object_r:dri_device_t:s0 226,   0 Sep 24 16:40 card0
 756 crw-rw----+  1 root video  system_u:object_r:dri_device_t:s0 226,   1 Sep 24 16:51 card1

Do you guys think it could be because I have multiple cards?

I don't think that should be an issue, but it should be pretty quick to test yourself.
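(If it helps, one quick way to test that theory: ROCR_VISIBLE_DEVICES shows up in Ollama's env config in your logs, so you could try exposing only a single ROCm device to the container. The index 0 below is just a guess; adjust it for whichever card you want to test.)

sudo podman run --rm --device /dev/kfd --device /dev/dri \
    --security-opt label=disable \
    -e ROCR_VISIBLE_DEVICES=0 \
    -p 11434:11434 docker.io/ollama/ollama:rocm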

What does your setup look like? Isn't gfx1103 a mobile APU? Do you have the other GPUs attached via OCuLink or something? I have no idea if that would be relevant to the issue at all, but I'm curious.

Yeah, the one you mentioned is the one built into the processor, which is not supported.

Then I have an S7700 dedicated card on the bus, which is supported.

As well as a 7900 XTX on Thunderbolt in an eGPU enclosure, which is supported.