Hello,
I am trying to get ollama running with ROCm, and no matter what I try I get this error:
failed to check permission on /dev/kfd: open /dev/kfd: invalid argument
The whole startup looks like this:
2024/09/20 00:38:49 routes.go:1153: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/home/nmcbride/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2024-09-20T00:38:49.103-04:00 level=INFO source=images.go:753 msg="total blobs: 9"
time=2024-09-20T00:38:49.103-04:00 level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-09-20T00:38:49.103-04:00 level=INFO source=routes.go:1200 msg="Listening on 127.0.0.1:11434 (version 0.3.11)"
time=2024-09-20T00:38:49.104-04:00 level=INFO source=common.go:135 msg="extracting embedded files" dir=/tmp/ollama2084712407/runners
time=2024-09-20T00:38:54.891-04:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[rocm_v60102 cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
time=2024-09-20T00:38:54.891-04:00 level=INFO source=gpu.go:199 msg="looking for compatible GPUs"
time=2024-09-20T00:38:54.894-04:00 level=WARN source=amd_linux.go:60 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-09-20T00:38:54.898-04:00 level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=0 gpu_type=gfx1102
time=2024-09-20T00:38:54.898-04:00 level=WARN source=amd_linux.go:341 msg="amdgpu is not supported" gpu=1 gpu_type=gfx1103 library=/var/home/nmcbride/Downloads/ollama/v0.3.11/lib supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942]"
time=2024-09-20T00:38:54.898-04:00 level=WARN source=amd_linux.go:343 msg="See https://github.com/ollama/ollama/blob/main/docs/gpu.md#overrides for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-09-20T00:38:54.899-04:00 level=INFO source=amd_linux.go:346 msg="amdgpu is supported" gpu=2 gpu_type=gfx1100
time=2024-09-20T00:38:54.899-04:00 level=ERROR source=amd_linux.go:364 msg="amdgpu devices detected but permission problems block access" error="failed to check permission on /dev/kfd: open /dev/kfd: invalid argument"
time=2024-09-20T00:38:54.899-04:00 level=INFO source=gpu.go:346 msg="no compatible GPUs were discovered"
time=2024-09-20T00:38:54.899-04:00 level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="60.6 GiB" available="52.7 GiB"
/dev/kfd is owned by root and render. I have tried starting ollama both as root and with my user added to the render group. Neither of these made any difference.
crw-rw-rw-. 1 root render 235, 0 Sep 20 00:29 /dev/kfd
I am not sure whether this is an actual permission issue or whether some other Silverblue quirk is getting in the way.
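One way to narrow this down is to open the device directly and look at which errno comes back. "invalid argument" corresponds to EINVAL, whereas a real permission failure would normally show up as EACCES or EPERM. Below is a minimal sketch of such a probe. It is not ollama's own check, and the function name `probe_device` is hypothetical; it only assumes a Linux system with Python 3.

```python
import errno
import os
import stat


def probe_device(path="/dev/kfd"):
    """Try to open `path` read/write and return a short diagnosis string.

    Hypothetical helper for illustration; not part of ollama or ROCm.
    """
    try:
        st = os.stat(path)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"stat failed: {name} ({exc.strerror})"

    # /dev/kfd is expected to be a character device node.
    if not stat.S_ISCHR(st.st_mode):
        return "exists but is not a character device"

    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        if exc.errno in (errno.EACCES, errno.EPERM):
            return f"permission problem: {name}"
        # Anything else (for example EINVAL) points away from file permissions.
        return f"not a permission problem: {name} ({exc.strerror})"

    os.close(fd)
    return "opened read/write successfully"


if __name__ == "__main__":
    print(probe_device())
```

Running it once as my user and once as root should show whether the failure depends on who opens the device. If both runs report EINVAL, file ownership and group membership are unlikely to be the cause.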
Any help would be appreciated. To be honest, I've been fighting to get ROCm working with ollama on Silverblue for quite a while. My workaround has been to run it on a Windows machine with a 3090 and access it across the network, but I'd like it to work locally.
Thank you.