NO NO NO!
So, I tried to update my system and everything broke. Firefox wouldn’t even start. OMG, AMD, you drive me insane. I had to uninstall AMDGPU. I’ll try to start all over again. Maybe 21.50 doesn’t work on F35.
Well, there goes another 5 hours of my life. Just after I gave AMD praise, too. They got me again! I have no one to blame but myself, I guess. I really shouldn’t keep falling for it.
So, that “update” idea didn’t work at all. I finally got my system back and, as usual, it was so difficult that I don’t even remember how I ended up fixing it.
I did learn one thing very useful, though: it took way fewer packages than I had installed the last several times I did this. Check this out:
amdgpu-dkms.noarch 1:5.13.11.21.50.50000-1373477.el8 @amdgpu-prd
amdgpu-dkms-firmware.noarch 1:5.13.11.21.50.50000-1373477.el8 @amdgpu-prd
amdgpu-pro-core.noarch 21.50-1373477.el8 @amdgpu-proprietary
ocl-icd-amdgpu-pro.x86_64 21.50-1373477.el8 @amdgpu-proprietary
So, here’s one thing to watch out for: if you have the amdgpu-install pkg installed, upgrading it will recreate your /etc/yum.repos.d/ files, destroying any modifications you made to make the damn thing work in the first place. So, hint #1: build your own repo files so the amdgpu installer cannot mess them up.
Also, it doesn’t look like you need amdgpu-install, for exactly the reason I cited before: you’re better off without it! Hint #2: DO NOT INSTALL AMDGPU-INSTALL. Pretty much the worst thing you can do is follow AMD’s instructions. I think amdgpu-install is now worse than it has ever been for Fedora. It used to limp along and work with some cheating, but not anymore.
So, here’s something else: when I finally gave up and decided to start over, I ran amdgpu-install --uninstall…but that failed, of course, and left me with a bunch of installed pkgs. I used dnf’s reporting of the install repo to find everything and back out. That’s Hint #3 if you didn’t already know about it.
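The back-out step can be sketched roughly like this. It is only a sketch: the function name is made up, and it assumes the three-column "name version @repo" layout that dnf list installed prints (lines that dnf wraps, and the header line, are simply skipped).

```shell
# Pick out installed packages by the repo they came from.
pkgs_from_repo() {
  # $1: regex matched against the repo column, e.g. '^@amdgpu'
  awk -v pat="$1" 'NF == 3 && $3 ~ pat { print $1 }'
}

# Sample input shaped like the listings in this thread.
sample='Installed Packages
amdgpu-dkms.noarch 1:5.13.11.21.50.50000-1373477.el8 @amdgpu-prd
amdgpu-pro-core.noarch 21.50-1373477.el8 @amdgpu-proprietary
xorg-x11-drv-amdgpu.x86_64 21.0.0-1.fc35 @fedora'

leftovers=$(printf '%s\n' "$sample" | pkgs_from_repo '^@amdgpu')
printf '%s\n' "$leftovers"

# On a real system (not run here):
#   dnf list installed | pkgs_from_repo '^@amdgpu'
# Review the names it prints before handing them to `sudo dnf remove`.
```

Matching on the repo column rather than the package name keeps Fedora's own packages (like xorg-x11-drv-amdgpu) out of the removal list.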
Finally, I don’t even think the amdgpu-dkms pkgs do anything. Boot just throws a bunch of errors about them. So, I uninstalled them, and they didn’t affect the other pkgs. So, after all that pain, all I needed was TWO packages???
Oh, also, the gpg repo keys are in the amdgpu-install pkg. So, I guess you can either disable that for your repos or install it, copy the keys, and then uninstall it??? Whatever. I’ll fix it later. Need sleep.
Okay so, I can’t stop picking at this…
I managed to mess it up, again, but recovered by reinstalling some installed packages. So here is my MWE for amdgpu 21.50:
$ sudo -i dnf list installed '*amdgpu*' '*rocm*' '*roct*' '*hsa*' | grep -Ev 'procmail|setproctitle'
Installed Packages
amdgpu-pro-core.noarch 21.50-1373477.el8 @amdgpu-proprietary-prd
hsa-rocr.x86_64 1.5.0.50000-49.el8 @rocm-prd
hsakmt.x86_64 1.0.6-17.rocm3.9.0.fc35 @fedora
hsakmt-roct-devel.x86_64 20211222.1.5.50000-49.el8 @rocm-prd
ocl-icd-amdgpu-pro.x86_64 21.50-1373477.el8 @amdgpu-proprietary-prd
rocm-core.x86_64 5.0.0.50000-49.el8 @rocm-prd
rocm-ocl-icd.x86_64 2.0.0.50000-49.el8 @rocm-prd
rocm-opencl.x86_64 2.0.0.50000-49.el8 @rocm-prd
rocm-runtime.x86_64 3.9.0-2.fc35 @fedora
rocminfo.x86_64 3.9.0-2.fc35 @fedora
xorg-x11-drv-amdgpu.x86_64 21.0.0-1.fc35 @fedora
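A side note on filtering that listing: plain grep treats | as a literal character, so an either/or filter needs -E (or GNU grep's \|). A tiny sketch with made-up package names shows the difference:

```shell
# Made-up package names, just to exercise the filter.
sample='procmail.x86_64
python3-setproctitle.x86_64
rocm-core.x86_64'

# Without -E, "|" is a literal character, so no line matches the pattern
# and nothing is filtered out.
basic=$(printf '%s\n' "$sample" | grep -v 'procmail|setproctitle')

# With -E, "|" means "or", so both unwanted names are dropped.
extended=$(printf '%s\n' "$sample" | grep -Ev 'procmail|setproctitle')

printf 'basic:\n%s\n\nextended:\n%s\n' "$basic" "$extended"
```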
But, that doesn’t tell the whole story. Consider these confusing factors:
[…] amdgpu-core, but it failed, but that’s always been true, since the amdgpu-core pkg was first introduced.
[…] libamdocl64.so…but that’s not in amdgpu-pro-core
libamdocl64.so is actually in rocm-opencl
rocm-opencl is not a dependency of any other installed pkg!
/etc/ld.so.conf.d/10-rocm-opencl.conf is required to get linking to work. It points to /opt/rocm-5.0.0/opencl/lib, which contains libamdocl64.so
[…] /etc/ld.so.conf.d/10-rocm-opencl.conf! So, I have no idea what pkg install created it or how. Why would a package create a simple file with a script instead of just delivering the file?
[…] libOpenCL.so.1.2. I think one is for ROCm-based access to OpenCL and one is for, uh, normal OpenCL?
/opt/rocm-5.0.0/opencl/lib/libOpenCL.so.1.2
/opt/amdgpu-pro/lib64/libOpenCL.so.1.2
Arrgh!!!
amdgpu-pro-core
ocl-icd-amdgpu-pro
[…] clinfo. I know my system is working when there are 2 “platforms” detected. One is old OpenCL/Mesa/Clover and doesn’t work (although it should, it used to), and the other one is newer OpenCL and says “AMD-APP ([version])”, where [version] is what changes when you get an update. It’s worth keeping track of this!
[…] dnf update??
So, because of this recent disaster, upgrading with dnf is still untested on Fedora. I suppose I could go back and test, but…no. I will report back here after the next release.
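For the duplicate-library and mystery-file questions in that list, here is a rough way to see what the dynamic linker cache actually knows about. It is a sketch under assumptions: the function name is mine, and the sample cache lines are hypothetical (only the two directories come from this post). It parses text shaped like ldconfig -p output.

```shell
# Print the paths the linker cache holds for a library name prefix.
# Reads `ldconfig -p` style text on stdin.
cached_paths() {
  # $1: library name prefix, e.g. libOpenCL.so
  awk -v lib="$1" 'index($1, lib) == 1 && $(NF-1) == "=>" { print $NF }'
}

# Hypothetical cache lines (real output indents each line with a tab,
# which awk ignores when splitting fields).
sample='libOpenCL.so.1 (libc6,x86-64) => /opt/rocm-5.0.0/opencl/lib/libOpenCL.so.1
libOpenCL.so.1 (libc6,x86-64) => /opt/amdgpu-pro/lib64/libOpenCL.so.1
libamdocl64.so (libc6,x86-64) => /opt/rocm-5.0.0/opencl/lib/libamdocl64.so'

opencl_paths=$(printf '%s\n' "$sample" | cached_paths libOpenCL.so)
printf '%s\n' "$opencl_paths"

# On a real system (not run here):
#   ldconfig -p | cached_paths libOpenCL.so
#   rpm -qf /etc/ld.so.conf.d/10-rocm-opencl.conf   # "not owned" would suggest a scriptlet wrote it
#   rpm -q --scripts rocm-opencl                    # prints the package's install scriptlets
```

If rpm -qf reports the .conf file as not owned by any package, that would fit the guess that a package script created it instead of shipping it.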
The clinfo that comes with rocm (/opt/rocm…/) doesn’t give any errors. But, the default clinfo that is part of the base Fedora distro might give an error like this:
fatal error: cannot open file '/usr/lib64/clc/gfx1010-amdgcn-mesa-mesa3d.bc': No such file or directory
This is the reason the old OpenCL “platform” doesn’t work anymore. But, AFAIK, this is only a problem with NAVI10 (gfx1010/5700XT) GPUs. This missing file is the “secret sauce” that Mesa hasn’t been able to include, but it is included for other, older cards, which are known to work without any extra amdgpu packages from AMD. It may also be a problem for the latest gen (6800s); I don’t have one to test. But, don’t assume you need extra AMDGPU stuff to make OpenCL work; it didn’t use to be like this!
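To see whether your card is one of the affected ones, you can check for the bitcode file named in that error. A minimal sketch, assuming the file naming follows the pattern in the error message (target name, then -amdgcn-mesa-mesa3d.bc); the function name is mine, and the gfx900 file in the demo is just a made-up stand-in for an older card:

```shell
# Return success if the Mesa OpenCL bitcode for a GPU target is present.
# $1: directory to search, $2: target name such as gfx1010
has_clc_target() {
  [ -f "$1/$2-amdgcn-mesa-mesa3d.bc" ]
}

# Demo against a throwaway directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
touch "$demo_dir/gfx900-amdgcn-mesa-mesa3d.bc"   # stand-in for an older card

for target in gfx900 gfx1010; do
  if has_clc_target "$demo_dir" "$target"; then
    echo "$target: bitcode present"
  else
    echo "$target: bitcode missing"
  fi
done

# On a real system (not run here): has_clc_target /usr/lib64/clc gfx1010
```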
Okay, here we go! Looks like some amdgpu packages were updated in the ‘…/latest/…’ RHEL 8.5 RPM repository. DNF seems to be doing the right thing by identifying them as updated, automatically. I’ll update and reboot and see how it goes…
Hey, I think it worked. I installed the new pkgs from the AMD repo (manual repo config) with many other system updates and…no problems detected.
$ sudo -i dnf list installed '*amdgpu*' '*rocm*' '*roct*' '*hsa*' | grep -Ev 'procmail|setproctitle'
Installed Packages
amdgpu-pro-core.noarch 22.10-1395274.el8 @amdgpu-proprietary-prd
hsa-rocr.x86_64 1.5.0.50100-36.el8 @rocm-prd
hsakmt.x86_64 1.0.6-17.rocm3.9.0.fc35 @fedora
hsakmt-roct-devel.x86_64 20220128.1.7.50100-36.el8 @rocm-prd
ocl-icd-amdgpu-pro.x86_64 22.10-1395274.el8 @amdgpu-proprietary-prd
rocm-core.x86_64 5.1.0.50100-36.el8 @rocm-prd
rocm-ocl-icd.x86_64 2.0.0.50100-36.el8 @rocm-prd
rocm-opencl.x86_64 2.0.0.50100-36.el8 @rocm-prd
rocm-runtime.x86_64 3.9.0-2.fc35 @fedora
rocminfo.x86_64 3.9.0-2.fc35 @fedora
xorg-x11-drv-amdgpu.x86_64 22.0.0-1.fc35 @updates
$ clinfo
Number of platforms 2
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3423.0)
...
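Since the AMD-APP version is the thing that changes between updates, here is a rough way to pull it out of clinfo output so it can be noted before and after an upgrade. The function name is mine, and the sample text is just the first lines of the output above:

```shell
# Print the AMD-APP version string(s) found in clinfo output on stdin.
amd_app_version() {
  sed -n 's/.*AMD-APP (\([^)]*\)).*/\1/p'
}

sample='Number of platforms 2
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3423.0)'

version=$(printf '%s\n' "$sample" | amd_app_version)
echo "AMD-APP version: $version"

# On a real system (not run here): clinfo | amd_app_version | sort -u
```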
I guess I should update the solution to this thread, now that I have been able to confirm that, at least for the most recent update, AMD’s public RPM repository can be made to work as expected.
[…] amdgpu-install
[…] dnf/yum repo files in /etc/yum.repos.d for yourself.
[…] gpgkey key in those repo files. You’ll need it for later. A good place to put it might be /etc/pki/rpm-gpg/ with all your other ones.
[…] amdgpu-install
[…] amdgpu-install created. This is all so that it will not break when we uninstall it, and so that, if you ever have to install amdgpu-install again, it will not overwrite or ruin what you have done.
[…] baseurl key configured for your new remote repos to point to the URL that will get updates, instead of the one amdgpu-install created, which points to a URL for just that specific version of amdgpu. (I know, it’s so crazy that it doesn’t make sense when you write it out.) There is still a different URL for each distribution, though. For F35, for example, I figure CentOS8 or RHEL/8.5 is the closest match, so, I used:
amdgpu: https://repo.radeon.com/amdgpu/latest/rhel/8.5/main/x86_64
amdgpu-pro: https://repo.radeon.com/amdgpu/latest/rhel/8.5/proprietary/x86_64
rocm: https://repo.radeon.com/rocm/centos8/rpm
[…] sudo -i dnf makecache. That should list the new repositories you created and fetch the available packages. If it’s working, you can now browse the remote repository however you like.
[…] ocl-icd-amdgpu-pro (and its dependencies), but I found ldd was linking to a .so in rocm-opencl, so I’m leaving that installed, plus its dependencies. I don’t think you really need any HSA stuff.
I don’t think what I wrote above is “wrong”, but I now realize it’s not quite accurate. Here’s what I recently learned.
[…] amdgpu-install (which does not work on Fedora).
[…] amdgpu-install at all! The GPG key for those RPMs is downloadable, and you can just create a .repo file using any example you find. All you need to know is the following (and name it something unique, like myrocm):
baseurl=https://repo.radeon.com/rocm/centos8/rpm/
[…] amdgpu-core, ocl-icd-amdgpu-pro, and dependencies. This is great news, because those old pkgs had kernel dependencies and amdgpu-core would always fail to install. But, you don’t need it anymore!
[…] rocm-language-runtime, rocm-opencl-runtime, & rocm-ocl-icd (and deps)
So, now, I have OpenCL working without any of the packages I used to think were absolutely essential. The world is changed!
Installed Packages
hsakmt-roct-devel.x86_64 20220128.1.7.50100-36.el8 @rocm-prd
rocm-core.x86_64 5.1.0.50100-36.el8 @rocm-prd
rocm-language-runtime.x86_64 5.1.0.50100-36.el8 @rocm-prd
rocm-ocl-icd.x86_64 2.0.0.50100-36.el8 @rocm-prd
rocm-opencl.x86_64 2.0.0.50100-36.el8 @rocm-prd
rocm-opencl-runtime.x86_64 5.1.0.50100-36.el8 @rocm-prd
(You also need some official, Fedora pkgs, which are dependencies of these, but I didn’t show them.)
And, I’m sure you don’t need -devel either.
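For anyone who wants a starting point for the hand-made repo file mentioned above, here is a minimal sketch. The repo id myrocm and the baseurl are from this post; the name= text and the key file name are my own placeholders, so point gpgkey at wherever you actually saved AMD's key. It writes to a temp directory by default so it can be tried without touching /etc/yum.repos.d.

```shell
# Write a hand-maintained repo file. REPO_DIR defaults to a temp directory.
repo_dir="${REPO_DIR:-$(mktemp -d)}"
repo_file="$repo_dir/myrocm.repo"

cat > "$repo_file" <<'EOF'
[myrocm]
name=ROCm for CentOS 8 (hand-maintained)
baseurl=https://repo.radeon.com/rocm/centos8/rpm/
enabled=1
gpgcheck=1
# Assumed key location and file name; adjust to where you saved the key.
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-myrocm
EOF

cat "$repo_file"
```

Once the file looks right, copy it into /etc/yum.repos.d/ as root and run sudo -i dnf makecache to confirm the repo is picked up.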
[…] gfx1030 mesa3d.bc data, which is missing, just like gfx1010 was missing for the 5700XT. I think these don’t exist anymore in part because support has moved to ROCm. Rather than helping MESA OSS, AMD is just doing it through ROCm, which is their OSS platform. I thought it was some kind of mistake and that MESA support would return, but now I think I understand why it never materialized for NAVI10. So, things are making a lot more sense, now.
So, there are problems still to be fixed. You might want to install some ROCm packages that you can’t, due to Fedora/AMD packaging disagreements. Many of the new ROCm pkgs for CentOS8 (which are the same as for RHEL8) require /usr/libexec/platform-python, which is deprecated, AFAICT, and Fedora has appropriately removed it since RHEL8 was introduced. This affects many of the packages that the install documentation for ROCm (see below) discusses, like HIP, ML, & OpenMP runtimes for ROCm. These are cool for programming, but are not necessary for getting compiled OpenCL programs to work over ROCm. So, not a problem unless you want to write or build code.
But, even here, there is hope. This issue about /usr/libexec/platform-python goes back to Python 2.7, so it’s old. And, I see an empty stub for RHEL9 (Index of /rocm/rhel9/) already on the ROCm RPM repo server. So, fingers crossed, we’ll get to install these pkgs when AMD publishes them for RHEL9, which, presumably, will have not just newer Python, but the Python pkgs built for Fedora that we are using now. So, there’s a chance that, as long as Fedora doesn’t go too far ahead, RHEL9 will be sufficiently like Fedora on launch that we’ll get to use those pkgs.
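If you want to know in advance which packages will trip over this, you can look at their requirement lists. A small sketch; the function name is mine and both sample lists are hypothetical, shaped like what rpm -qR prints:

```shell
# Succeeds if a list of RPM requirements on stdin names platform-python.
needs_platform_python() {
  grep -q '^/usr/libexec/platform-python'
}

# Hypothetical requirement lists, one requirement per line.
blocked='/bin/sh
/usr/libexec/platform-python
rocm-core'
fine='/bin/sh
rocm-core'

if printf '%s\n' "$blocked" | needs_platform_python; then
  echo "first list: needs platform-python"
fi
if ! printf '%s\n' "$fine" | needs_platform_python; then
  echo "second list: no platform-python requirement"
fi

# On a real system (not run here):
#   rpm -qpR ./some-package.rpm | needs_platform_python
#   dnf repoquery --whatrequires /usr/libexec/platform-python
```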
I had a bit of a scare when I replaced my 5700XT with a 6800XT. I was super excited because it seemed to be crunching BOINC Einstein@Home tasks very fast, but then I played a game on it, which crashed, and, when I came back the next day, I noticed that all BOINC GPU tasks were running to completion but finishing with a “compute error” code. I freaked out, thinking there was something wrong with the whole setup. I removed the amdgpu-opencl pkgs and went searching online for how it’s supposed to work, again, which I do every so often, and I can never figure it out. Well, after rebooting, I noticed that clinfo showed I still had a working OpenCL “platform”! This got me thinking in the right way, finally. I went to the ROCm page and started reading the install docs. Look what I found:
[…] rocm-language-runtime layer interfacing all kernel layer communication, and you see an OpenCL layer on top of that, plus there exists a rocm-ocl-icd. So, that makes sense; that is how it works, now, and that’s how your OpenCL programs can work without anything like amdgpu-opencl-.
You might find these links useful for more information:
https://docs.amd.com/bundle/ROCm_Installation_Guidev5.0/page/Meta-packages_in_ROCm_Programming_Models.html#_ROCm_Package_Naming
https://docs.amd.com/bundle/ROCm_Installation_Guidev5.0/page/Overview_of_ROCm_Installation_Methods.html
[…] amdgpu-install?
I think what I learned is a net positive for the Fedora community. ROCm seems to be making life easier for us because it exists as another layer of indirection. It looks like we don’t even need to mess with amdgpu-install any more…at least not for a while. I recommend upgrading to a NAVI10+ card for this reason.
Today, I noticed that some ROCm stuff got updated…but not from the AMD RHEL 8 repo. The updates came directly from the Fedora ‘updates’ repo!
So, now my system is in some type of hybrid state. I don’t like it, but so far, no issues to report. Although, I just rebooted so that could be misleading.
I’m reporting this here because I don’t even think I need any stuff from AMD’s repo any more. I didn’t try removing any pkgs, but I think I could. I now have rocm-runtime, rocminfo, and rocm-opencl pkgs from Fedora, official-like. I think this is another good sign. I checked, and I don’t see those pkgs in the ‘fedora’ repo, they only exist in ‘updates’, so I don’t think I missed them before when I upgraded to F36.
$ sudo -i dnf list installed "*rocm-*" rocminfo "*hsa*" "*hip*"
Installed Packages
hsa-rocr.x86_64 1.5.0.50200-65.el8 @rocm-prd
hsa-rocr-devel.x86_64 1.5.0.50200-65.el8 @rocm-prd
hsakmt.x86_64 1.0.6-23.rocm5.2.0.fc36 @updates
hsakmt-roct-devel.x86_64 20220426.0.86.50200-65.el8 @rocm-prd
rocm-comgr.x86_64 5.2.0-1.fc36 @updates
rocm-core.x86_64 5.2.0.50200-65.el8 @rocm-prd
rocm-device-libs.x86_64 5.2.0-1.fc36 @updates
rocm-language-runtime.x86_64 5.2.0.50200-65.el8 @rocm-prd
rocm-ocl-icd.x86_64 2.0.0.50200-65.el8 @rocm-prd
rocm-opencl.x86_64 5.2.0-1.fc36 @updates
rocm-runtime.x86_64 5.2.0-1.fc36 @updates
rocm-smi.noarch 4.0.0-5.fc36 @fedora
rocminfo.x86_64 5.2.0-1.fc36 @updates
I think I’m okay, at least until the AMD and Fedora repos get out of sync; if AMD’s repo updates rocm-opencl but fedora doesn’t, I’m guessing DNF will overwrite the fedora pkgs with newer ones from AMD, which could be less compatible. I’d like to stick with fedora ones, assuming they’re going to keep them updated.
However, I checked for any “hip” stuff or “hsa” stuff from fedora, and there’s nothing. So, that’s still an issue yet to be resolved. But for pure OpenCL…I think AMD is actually doing it completely the right way and Fedora is supported???
I would (temporarily maybe) disable the ‘rocm-prd’ repo and see what happens with updates.
Fedora does not usually release packages that still depend upon a 3rd party repo. They also do not arbitrarily release packages that do not have a maintainer to keep them up to date. It seems likely that the packages (I see 7) that came from fedora are all that are needed. The others that have names ending in .el8 are probably superfluous and could be removed.
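Disabling the repo for a trial run could look roughly like this. It is a sketch, not a tested procedure: the function name is mine, the demo file is a throwaway stand-in for whatever file defines rocm-prd on your system, and the sed call flips every enabled=1 line in the file, so it suits a file that defines a single repo.

```shell
# Flip enabled=1 to enabled=0 in one repo file (GNU sed, as shipped on Fedora).
disable_repo_file() {
  sed -i 's/^enabled=1$/enabled=0/' "$1"
}

# Demo on a throwaway file; the baseurl is assumed from earlier in the thread.
demo=$(mktemp)
printf '%s\n' '[rocm-prd]' 'baseurl=https://repo.radeon.com/rocm/centos8/rpm' 'enabled=1' > "$demo"
disable_repo_file "$demo"
cat "$demo"

# For a one-off test without editing any files (not run here):
#   sudo dnf --disablerepo=rocm-prd upgrade
```

The --disablerepo form is the easier one to back out of, since nothing on disk changes.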
Right, that was my conclusion, too. But, that doesn’t make sense, since these packages are completely new, so, does that mean F36 was incomplete when released? The changelog shows July 5 was the first rocm-opencl pkg for fedora.
It makes sense to not let these packages update from AMD’s official RHEL8 repo, but, on the other hand, we are desperate for a HIP RPM that is compatible with F36.
Hmm, I checked, and now, there are actually RPMs in the AMDGPU RHEL9 repo, new pkgs updated yesterday. But, still no rocm RHEL9 files. I think that’s what we are waiting for. Or, maybe we’ll get the other pkgs + hip in the near future. Fingers crossed!
Actually it does make sense.
Fedora always lags a bit behind the upstream source of packages, and they also tend to repackage things in a way that makes sense to the people at fedoraproject.
Sometimes upstream packages may be combined to fewer packages or may be split into more pieces depending on how it fits into the OS. The intent seems to be to make things just fit with less redundancy and to not overwrite already existing libraries/files (which could break something else). Package dependencies handle that.
Right, sure. I get the lag. But, that’s not what I meant. AFAICT, these packages didn’t exist on fedora…ever, and they were/are necessary. I mean, I looked for them and related pkgs many times in the past, and they were not there. Without them, the functionality did not exist, so I’m certain it wasn’t contained in other packages. Looks like they just appeared at version 5.2. So, it’s not lag. To have new functionality show up, suddenly, seems unusual. I could be wrong about that, but I would have expected such a thing between major releases.
I’m still not sure all the functionality is present, now, either. I’m pretty sure it’s not, actually. I’m not seeing anything about HIP or ROCr. So, it’s not clear if fedora plans to create packages for this functionality or leave it to AMD to do, which they will not do…at least not officially. So, it’s not as though we can afford to ignore these AMD repos. There’s no way to know if the problem is just that AMD is not supporting fedora, or fedora just hasn’t gotten around to repackaging some AMD functionality that they have provided for other distributions, or if fedora has decided to leave functionality to a third party.
Between major releases, I remove all these packages and start over. Looking forward to F37. I’ll see what we get then.
As I understand it, fedora does not ever distribute packages that are subject to licensing, copyright, or patent encumbrances. I have no idea what, if any, of those restrictions apply to what you are asking about, but if so restricted then fedora will not include them in their repos. 3rd party repos may provide them however.
Oh yes, I’m sure you are right about that. Maybe that’s why these other packages are still missing. Could be. It would be really helpful if there was some documentation on this. As far as AMD is concerned, they’re providing these technologies to the community, but they’re not willing to support fedora. So, only fedora could explain to users what is missing from the repos and why. I don’t use Blender, but I think those folks need these other pieces.
I’m trying to run Davinci Resolve (OpenCL) with my AMD Radeon RX 6700 XT via an EGPU on my thunderbolt laptop.
After weeks of trying other guides, the furthest I got was with AMDGPU drivers, the kernel args “radeon.cik_support=0 amdgpu.cik_support=1”, and running mokutil --import /root/mok.der. I was able to open Resolve and it saw my GPU, but pictures and videos wouldn’t render. Minecraft used the GPU seemingly without issue. After a reboot, though, Minecraft no longer used the GPU and Resolve no longer detected it, and I had many DKMS problems.
This thread seemed like the cleanest solution, so I would like to look for assistance here. However, if my issue is more appropriate as a new post, I can post there instead.
My GPU isn’t detected by Resolve and isn’t used by Minecraft.
OS: Fedora 37
Kernel: 6.0.12-300.fc37.x86_64
Kernel args: ro rootflags=subvol=root rd.luks.uuid=luks-x-x-x-x-x rhgb quiet module_blacklist=hid_sensor_hub nvme.noacpi=1
Display server: Wayland
Device: Framework Laptop 12th gen (thunderbolt 4)
Secure boot: Enabled
EGPU: Akitio Node Titan (thunderbolt 3)
GPU: Gigabyte (AMD) Radeon RX 6700 XT
AMD Main repo: Index of /amdgpu/5.4.1/rhel/9.1/main/x86_64/
AMD Proprietary repo: Index of /amdgpu/5.4.1/rhel/9.1/proprietary/x86_64/
ROCM repo: Index of /rocm/rhel9/rpm/
All relevant packages (as far as I’m aware) (sudo dnf list installed '*amd-gpu*' '*amdgpu*' '*rocm*' '*roct*' '*hsa*' '*mesa*' '*vulkan*' | grep -Ev 'procmail|setproctitle'):
Installed Packages
amd-gpu-firmware.noarch 20221109-144.fc37 @updates
hsa-rocr.x86_64 1.7.0.50400-72.el9 @rocm-copy
hsa-rocr-devel.x86_64 1.7.0.50400-72.el9 @rocm-copy
hsakmt.x86_64 1.0.6-26.rocm5.3.0.fc37 @updates
hsakmt-roct-devel.x86_64 20221020.0.2.50400-72.el9 @rocm-copy
mesa-dri-drivers.i686 22.2.3-1.fc37 @updates
mesa-dri-drivers.x86_64 22.2.3-1.fc37 @updates
mesa-filesystem.i686 22.2.3-1.fc37 @updates
mesa-filesystem.x86_64 22.2.3-1.fc37 @updates
mesa-libEGL.i686 22.2.3-1.fc37 @updates
mesa-libEGL.x86_64 22.2.3-1.fc37 @updates
mesa-libGL.i686 22.2.3-1.fc37 @updates
mesa-libGL.x86_64 22.2.3-1.fc37 @updates
mesa-libGLU.x86_64 9.0.1-7.fc37 @fedora
mesa-libOSMesa.i686 22.2.3-1.fc37 @updates
mesa-libOSMesa.x86_64 22.2.3-1.fc37 @updates
mesa-libgbm.i686 22.2.3-1.fc37 @updates
mesa-libgbm.x86_64 22.2.3-1.fc37 @updates
mesa-libglapi.i686 22.2.3-1.fc37 @updates
mesa-libglapi.x86_64 22.2.3-1.fc37 @updates
mesa-libxatracker.x86_64 22.2.3-1.fc37 @updates
mesa-va-drivers.i686 22.2.3-1.fc37 @updates
mesa-vulkan-drivers.i686 22.2.3-1.fc37 @updates
mesa-vulkan-drivers.x86_64 22.2.3-1.fc37 @updates
rocm-comgr.x86_64 5.3.0-1.fc37 @updates
rocm-core.x86_64 5.4.0.50400-72.el9 @rocm-copy
rocm-device-libs.x86_64 1.0.0.50400-72.el9 @rocm-copy
rocm-language-runtime.x86_64 5.4.0.50400-72.el9 @rocm-copy
rocm-ocl-icd.x86_64 2.0.0.50400-72.el9 @rocm-copy
rocm-opencl.x86_64 2.0.0.50400-72.el9 @rocm-copy
rocm-opencl-runtime.x86_64 5.4.0.50400-72.el9 @rocm-copy
rocm-runtime.x86_64 5.3.0-2.fc37 @updates
rocm-smi.noarch 4.0.0-6.fc37 @fedora
rocminfo.x86_64 1.0.0.50400-72.el9 @rocm-copy
vulkan-loader.i686 1.3.216.0-3.fc37 @fedora
vulkan-loader.x86_64 1.3.216.0-3.fc37 @fedora
This is the process I used to test GPU detection in this section:
1a. Resolve: “Unsupported GPU Processing Mode. Please review the GPU drivers and GPU configuration under preferences.”
1b. Minecraft: F3 menu GPU name “Mesa Intel(R) Graphics (ADL GT2) - 4.6 (Core Profile) Mesa 22.2.4 (git-80df10f902)”
1c. lspci - GPU is detected whilst EGPU is connected via thunderbolt (sudo lspci -vnn | grep VGA -A 12)
00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:4626] (rev 0c) (prog-if 00 [VGA controller])
Subsystem: Device [f111:0002]
Flags: bus master, fast devsel, latency 0, IRQ 149, IOMMU group 1
Memory at 605c000000 (64-bit, non-prefetchable) [size=16M]
Memory at 4000000000 (64-bit, prefetchable) [size=256M]
I/O ports at 3000 [size=64]
Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
Capabilities: [40] Vendor Specific Information: Len=0c <?>
Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [100] Process Address Space ID (PASID)
Capabilities: [200] Address Translation Service (ATS)
--
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M] [1002:73df] (rev c5) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device [1458:2331]
Flags: fast devsel, IRQ 207, IOMMU group 24
Memory at 6000000000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 6010000000 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at 4000 [disabled] [size=256]
Memory at 7c000000 (32-bit, non-prefetchable) [virtual] [size=1M]
Expansion ROM at 7c100000 [virtual] [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
1d. clinfo - no OpenCL devices detected:
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3513.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Host timer resolution 1ns
Platform Name AMD Accelerated Parallel Processing
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No devices found in platform
1e. glxinfo - the integrated Intel GPU is being used rather than the AMD GPU (glxinfo | grep "OpenGL renderer"):
OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)
The difference here (compared to GPU Detection 1) is I plug in the EGPU at the Gnome lockscreen instead of pre-boot. Doing this allows clinfo to see my GPU.
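The quickest way to compare the two cases is the "Number of devices" line that clinfo prints for each platform. A small sketch; the function name is mine, and the two samples are trimmed to the relevant lines of the clinfo output in this post:

```shell
# Print the device count of each platform found in clinfo output on stdin.
device_counts() {
  awk '/^ *Number of devices/ { print $NF }'
}

cold_boot='Platform Name AMD Accelerated Parallel Processing
Number of devices 0'
hot_plug='Platform Name AMD Accelerated Parallel Processing
Number of devices 1'

before=$(printf '%s\n' "$cold_boot" | device_counts)
after=$(printf '%s\n' "$hot_plug" | device_counts)
echo "plugged in before boot: $before device(s); plugged in at lock screen: $after device(s)"

# On a real system (not run here): clinfo | device_counts
```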
This is the process I used to test GPU detection in this section:
2a. Resolve: “Unsupported GPU Processing Mode. Please review the GPU drivers and GPU configuration under preferences.”
2b. Minecraft: F3 menu GPU name “Mesa Intel(R) Graphics (ADL GT2) - 4.6 (Core Profile) Mesa 22.2.4 (git-80df10f902)”
2c. lspci - GPU is detected whilst EGPU is connected via thunderbolt (sudo lspci -vnn | grep VGA -A 12)
00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:4626] (rev 0c) (prog-if 00 [VGA controller])
Subsystem: Device [f111:0002]
Flags: bus master, fast devsel, latency 0, IRQ 149, IOMMU group 1
Memory at 606c000000 (64-bit, non-prefetchable) [size=16M]
Memory at 4000000000 (64-bit, prefetchable) [size=256M]
I/O ports at 4000 [size=64]
Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
Capabilities: [40] Vendor Specific Information: Len=0c <?>
Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [100] Process Address Space ID (PASID)
Capabilities: [200] Address Translation Service (ATS)
--
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M] [1002:73df] (rev c5) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device [1458:2331]
Flags: fast devsel, IRQ 207, IOMMU group 24
Memory at 6000000000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 6010000000 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at 3000 [disabled] [size=256]
Memory at 52000000 (32-bit, non-prefetchable) [virtual] [size=1M]
Expansion ROM at 52100000 [virtual] [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 <?>
Capabilities: [50] Power Management version 3
Capabilities: [64] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
2d. clinfo - AMD GPU is now detected:
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3513.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Host timer resolution 1ns
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx1031
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0
Driver Version 3513.0 (HSA1.1,LC)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) AMD Radeon RX 6700 XT
Device PCI-e ID (AMD) 0x73df
Device Topology (AMD) PCI-E, 0000:06:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 20
SIMD per compute unit (AMD) 4
SIMD width (AMD) 32
SIMD instruction width (AMD) 1
Max clock frequency 2725MHz
Graphics IP (AMD) 10.3
Device Partition (core)
Max number of sub-devices 20
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple (kernel) 32
Wavefront width (AMD) 32
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 12868124672 (11.98GiB)
Global free memory (AMD) 12566528 (11.98GiB) 12566528 (11.98GiB)
Global memory channels (AMD) 6
Global memory banks per channel (AMD) 4
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 10937905968 (10.19GiB)
Unified memory for Host and Device No
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing Yes
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 10937905968 (10.19GiB)
Preferred total size of global vars 12868124672 (11.98GiB)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 29663
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 8192 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 16384x16384x8192 pixels
Max number of read image args 128
Max number of write image args 8
Max number of read/write image args 64
Max number of pipe args 16
Max active pipe reservations 16
Max pipe packet size 2347971376 (2.187GiB)
Local memory type Local
Local memory size 65536 (64KiB)
Local memory size per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 10937905968 (10.19GiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 262144 (256KiB)
Max size 8388608 (8MiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Number of P2P devices (AMD) 0
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 0ns (Thu Jan 1 01:00:00 1970)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No
Number of async queues (AMD) 8
Max real-time compute queues (AMD) 8
Max real-time compute units (AMD) 20
printf() buffer size 4194304 (4MiB)
Built-in kernels (n/a)
Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No platform
clCreateContext(NULL, ...) [default] No platform
clCreateContext(NULL, ...) [other] Success [AMD]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1031
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1031
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1031
1e. glxinfo
- CPU is being used rather than the GPU (glxinfo | grep "OpenGL renderer"):
OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)
I’ve tried every guide I could find over the past few weeks, but to no avail. Any ideas or suggestions would be really appreciated.
Well, I’m not really sure what your issue is, so it’s hard to say if it’s best addressed in this thread or not. But, let’s think through a few things:
In the past when using an Nvidia GPU with Resolve I’m given the options “CUDA” and “OpenCL” in the GPU settings.
To my understanding, it’s used to import secure boot keys so that DKMS can work along side secure boot.
I found them during my research into getting AMD GPUs to work on Fedora. During that one instance I mentioned where Resolve would start up and Minecraft would utilise the GPU, glxinfo was reporting my AMD GPU as being the primary display driver (rather than Intel internal graphics). Perhaps this is related to those kernel arguments?
The first GPU is my CPU’s Intel internal graphics. I’m not sure they support OpenCL, as if they did I would expect Resolve to start up and allow me to use Intel internal graphics as its GPU. To clarify, Resolve requires a GPU to get beyond the settings page and actually start up.
I’m unfortunately unable to do this.
I have documented my attempt at removing all the packages and simply relying on the amdgpu kernel driver. In short, the GPU was seemingly only seen by lspci, and no applications used it. Perhaps there is a kernel argument I could use to force it as the primary GPU so that all applications use it?
I use the “free” version of DaVinci Resolve rather than the paid “DaVinci Resolve Studio”, so the support I receive may be limited. However, I have seen threads about AMD GPUs on Linux on their forums from this year, so it may still be worth a go.
I was using “Minecraft” as my test application alongside Resolve as my expectation was if the kernel GPU driver was working correctly then Minecraft would automatically use the AMD GPU rather than my CPU. It is possible that the amdgpu kernel module is functioning correctly, and the only problem is for some reason my system prefers the CPU so uses that as the primary GPU. If I could figure out how to force the AMD GPU into being the primary GPU, the results of my tests with various package configurations may change.
This is an eGPU though, so ideally even if it is set as the primary GPU, if my system boots with it disconnected I definitely want to fall back to Intel internal graphics as the GPU so I’m not dumped into a TTY session with no GNOME GUI.
I have local copies of the AMD Main, AMD Proprietary, and AMD ROCM repositories so I should have full control of which AMD Pro packages I install without depending on the amdgpu-install package.
I have further configuration attempts which I have posted on the Framework forum, but I’m unsure if it’s appropriate to continue sharing verbose logs to this “solved” thread. Is it a better idea to link to the Framework forum here, or create a new thread on this forum so members of this forum can contribute ideas without needing to sign up to another forum?
I found them during my research into getting AMD GPUs to work on Fedora. During that one instance I mentioned where Resolve would start up and Minecraft would utilise the GPU, glxinfo was reporting my AMD GPU as being the primary display driver (rather than Intel internal graphics). Perhaps this is related to those kernel arguments?
I doubt it. I’m pretty confident those options are irrelevant. The ‘radeon’ driver doesn’t even work with your GPU, and you’re past CI generation, so AMDGPU is the only driver that could work.
I don’t think I see where you got your eGPU to show up as “primary”; looks like the same output both times.
The first GPU is my CPU’s Intel internal graphics. I’m not sure they support OpenCL, as if they did I would expect Resolve to start up and allow me to use Intel internal graphics as its GPU. To clarify, Resolve requires a GPU to get beyond the settings page and actually start up.
I’ve never had an integrated GPU, so I have no experience to lend, here. But, I know that, in the past, sometimes, they can be disabled in the BIOS. This may make your system unbootable, or it might just delay video output until later in the boot process where it finds your eGPU. It might just think it’s an embedded device, which is fine. You may want to configure SSHD or something for remote access. But, even if not, and if you can disable it, it would be worth trying. It’s not going to hurt anything to try.
I’m guessing your integrated GPU is supported for OpenGL via MESA, at least, but it looks like OpenCL isn’t working on it. Not sure where to look for that, but it doesn’t seem important.
I have documented my attempt at removing all the packages and simply relying on the amdgpu kernel driver. In short, the GPU was seemingly only seen by lspci, and no applications used it. Perhaps there is a kernel argument I could use to force it as the primary GPU so that all applications use it?
Perhaps; I don’t know. But, if it is possible to make your eGPU be your primary video card, then you don’t need your internal one. Again, if you can disable it, I’d try that. You will not get dumped to a TTY if you have NO working video adapter. And, if you only had the eGPU working, that would be very helpful for testing.
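Before disabling anything, it may help to check whether the amdgpu kernel driver has actually bound to the eGPU, rather than the card merely showing up in lspci. Here is a minimal sketch that reads the “Kernel driver in use” line from `lspci -k`. It assumes lspci’s usual layout (device line at column 0, indented detail lines), and gpu_drivers is just a helper name I made up for this example:

```shell
# gpu_drivers: read `lspci -k` text on stdin and print "<slot> <driver>"
# for every VGA/Display/3D controller that has a kernel driver bound.
# (Hypothetical helper; assumes lspci's usual output layout.)
gpu_drivers() {
    awk '
        # Device lines start at column 0 with the PCI slot address.
        /^[0-9a-f]/ {
            slot = $1
            is_gpu = ($0 ~ /VGA compatible controller|Display controller|3D controller/)
        }
        # Detail lines are indented; keep only the bound-driver line of GPUs.
        is_gpu && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print slot, $0
        }
    '
}

# Usage on a live system:
#   lspci -k | gpu_drivers
```

If the eGPU’s slot is missing from the output, no kernel driver is bound to it, and that would be the thing to chase before worrying about which GPU is “primary”.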
Sure, you’d like it to fall back, but that’s something your BIOS/UEFI config would have to support as a feature. As you have it now, you’re trying to make it work with two different GPUs. That’s not the most straightforward hardware configuration to get working. In the long past, X11 wouldn’t support two different drivers, IIRC, but most dual head setups just had two identical cards or dual-head cards so nobody cared much. Even after that, there was SLI, so again, people usually had cards that worked together. I think your heterogeneous situation should work and is possible, but it requires features that are NOT required for more common setups.
Regardless, any AMD card that is supported by AMDGPU (or radeon) doesn’t need any third-party pkgs for basic functionality. AFAIK, and as I have shown in this thread for at least my cards (meaning RDNA+), even basic OpenGL/CL works without the AMD repo stuff. I still think you should go back to full Fedora pkgs. If you see it in clinfo, then your system is probably fine, and I get that without any non-Fedora pkgs. That’s what we’ve shown in this thread.
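To see what you would have to back out to get to “full Fedora pkgs”, you can use the install repo that dnf reports, as in the package listing earlier in this thread (@amdgpu-prd, @amdgpu-proprietary). A rough sketch, assuming your AMD repo ids also start with “amdgpu” (local mirrors may be named differently) and that each package sits on a single line; amd_repo_pkgs is a made-up helper name:

```shell
# amd_repo_pkgs: read `dnf list installed` style text on stdin and print
# the packages whose install repo (last column) starts with "@amdgpu".
# (Hypothetical helper; the repo-id prefix is an assumption, adjust it
# to match your own .repo files.)
amd_repo_pkgs() {
    awk '$NF ~ /^@amdgpu/ { print $1 }'
}

# Usage on a live system (review the list before removing anything):
#   dnf list installed | amd_repo_pkgs
```

dnf can wrap long lines in its listing, in which case a package name and its repo land on different lines and this simple filter will miss or mangle them, so treat the output as a starting point.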
Since you say Resolve uses OpenCL, then I suggest using an OpenCL app to test. clpeak will let you benchmark a specific GPU. You should definitely try that. If that works, then your system is functional and the problem is a compatibility issue between the app and the system, which may or may not be something you can fix on the system side.
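To point clpeak at one GPU you need its platform and device indices. The sketch below pulls the platform index out of `clinfo -l`. It assumes that listing prints lines of the form “Platform #N: name”, and that your clpeak build accepts -p (platform) and -d (device); check `clpeak --help` if in doubt. platform_index is a hypothetical helper name:

```shell
# platform_index: read `clinfo -l` text on stdin and print the index of
# the first platform whose name contains the string given as $1.
# (Hypothetical helper; assumes "Platform #N: <name>" lines.)
platform_index() {
    awk -v pat="$1" '
        /^Platform #[0-9]+:/ && index($0, pat) {
            sub(/^Platform #/, "")   # drop the prefix, leaving "N: name"
            sub(/:.*/, "")           # drop the name, leaving "N"
            print
            exit
        }
    '
}

# Usage on a live system:
#   p="$(clinfo -l | platform_index 'AMD Accelerated Parallel Processing')"
#   clpeak -p "$p" -d 0
```

The platform name used here is the one from your clinfo output above. Device 0 is a guess that only holds if the eGPU is the first (or only) device on that platform.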
I was using “Minecraft” as my test application alongside Resolve as my expectation was if the kernel GPU driver was working correctly then Minecraft would automatically use the AMD GPU rather than my CPU.
I’m pretty sure your AMDGPU stuff is fine. I don’t know Minecraft, but unless you know how to make it use a specific GPU on a multi-GPU system, then it’s not much help in this test. For GLXgears to work, you need to have a display working in the GPU, but you should be able to do that; I’m assuming you see both GPUs as adapters in GNOME under Wayland.
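For the OpenGL side, Mesa has the DRI_PRIME environment variable for asking a single program to render on the non-default GPU, which is less drastic than changing the primary adapter. A small sketch for comparing the two renderers; that DRI_PRIME=1 maps to the eGPU on your system is an assumption, and renderer_of is a made-up helper name:

```shell
# renderer_of: read glxinfo text on stdin and print only the value of
# the "OpenGL renderer string" line. (Hypothetical helper.)
renderer_of() {
    sed -n 's/^OpenGL renderer string:[ ]*//p' | head -n 1
}

# Usage on a live system:
#   glxinfo -B | renderer_of               # default GPU
#   DRI_PRIME=1 glxinfo -B | renderer_of   # ask Mesa for the other GPU
#   DRI_PRIME=1 glxgears                   # run one app on the other GPU
```

If the second command reports the AMD card while the first reports Intel, then OpenGL on the eGPU works and it is only a question of which GPU each application picks.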
I have local copies of the AMD Main, AMD Proprietary, and AMD ROCM repositories so I should have full control of which AMD Pro packages I install without depending on the amdgpu-install package.
Finally, and this is why I started this thread, the pkgs are NOT all that amdgpu-install does. You CANNOT install AMDGPU-PRO by installing pkgs from the AMD -PRO repo. In fact, you cannot install AMDGPU this way, either. The only thing you can do with the AMD repos is get pkgs COMPATIBLE with the Fedora AMDGPU driver and install them. This can add functionality, but not all pkgs there are compatible and there’s nothing you can do to make it work if it doesn’t. This is the whole thing with this thread. AMDGPU-PRO compiles a new kernel module for your system with proprietary blobs. AFAIK, this is guaranteed to fail on Fedora. I’ve tried many times in the past. There are people who can tear into the installer and have said they can get it to work by manually extracting packages and fixing multiple scripts. I assume I could do this, but it would take a long time for me to figure it out, and no one has published on this method lately that I’ve found. Let me know if you do.
Not that I want AMDGPU-PRO. If I did, I would have given up on Fedora long ago. If you want AMDGPU-PRO, I’d say Fedora is the wrong distro.
Is it a better idea to link to the Framework forum here, or create a new thread on this forum so members of this forum can contribute ideas without needing to sign up to another forum?
Yeah, I think you should probably start a new thread. I appreciate thinking it through with you, though. I think you should link here so we can follow you.
Just to follow up with yet another test, I can confirm, yet again, that the solution to this thread is: just don’t. You DO NOT NEED amdgpu-install on Fedora.
I got myself a 7900, which is sooner after release than I usually pick up hardware. I had some trouble that made me question my understanding, but I have confirmed that OpenGL and OpenCL are fully supported, and it seems they have been supported from (nearly?) the initial release day.
So, don’t get tricked into fiddling with extra pkgs for these two features. Fedora has you covered. For OpenCL, just make sure you don’t also have Mesa’s OpenCL installed. It doesn’t work, and some software doesn’t let you choose which OpenCL stack to use, so it can confuse your program. It doesn’t prevent the ROCm stack from working, though.
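A quick way to see which OpenCL stacks are registered is to look at the .icd files the ICD loader reads. The sketch below assumes the usual location, /etc/OpenCL/vendors; list_icds is a hypothetical helper name, and the file names you see will depend on which packages you have:

```shell
# list_icds: print each registered OpenCL ICD file and the library it
# points to. $1 overrides the directory (default: /etc/OpenCL/vendors,
# which is an assumption about where your loader looks).
list_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    for f in "$dir"/*.icd; do
        [ -e "$f" ] || continue   # skip the unexpanded glob if no files
        printf '%s: %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
}

# Usage on a live system:
#   list_icds
#   rpm -qf /etc/OpenCL/vendors/*.icd   # which package owns each entry
```

If a Mesa entry shows up next to the ROCm one, that is the situation described above, and `rpm -qf` should tell you which package to remove.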
Finally, your application may need more than just OGL & OCL. There are many other features implemented on top of ROCm, like PRIMs, BLAS, FFT, etc., but these may not be working, yet. There have been some nice tables showing support for each sub-component by GPU, but I cannot find a static location where the most up-to-date information can be found, easily. The code for ROCm is OSS on github, but the documentation is not easy to find/read. Maybe refer to your particular application’s community.
But, what I have learned is that, “full support” should appear before the professional cards based on the same architecture come out. So, for example, 7900 is gfx11. When there is a professional card based on gfx11, then you can be sure ROCm will have support for it. Such cards were just announced earlier last week or something. Information about ROCm & the pro cards is easy to find in the ROCm documentation/release information under “Requirements”. (AMD Documentation - Portal) But, as I say, functionality will likely be available earlier than that, but not much earlier.