How to deal with AMD's new amdgpu-installer 20.40

NO NO NO!

So, I tried to update my system and everything broke. Firefox wouldn’t even start. OMG, AMD, you drive me insane. I had to uninstall AMDGPU. I’ll try to start all over again. Maybe 21.50 doesn’t work on F35.

Well, there goes another 5 hours of my life. Just after I gave AMD praise, too. They got me again! I have no one to blame but myself, I guess. I really shouldn’t keep falling for it.

So, that “update” idea didn’t work at all. I finally got my system back and, as usual, it was so difficult that I don’t even remember how I ended up fixing it.

I did learn one very useful thing, though: it took way fewer packages than I had installed the last several times I did this. Check this out:

amdgpu-dkms.noarch                                1:5.13.11.21.50.50000-1373477.el8       @amdgpu-prd
amdgpu-dkms-firmware.noarch                       1:5.13.11.21.50.50000-1373477.el8       @amdgpu-prd
amdgpu-pro-core.noarch                            21.50-1373477.el8                       @amdgpu-proprietary
ocl-icd-amdgpu-pro.x86_64                         21.50-1373477.el8                       @amdgpu-proprietary                                                                                             

So, here’s one thing to watch out for: if you have the amdgpu-install pkg installed, upgrading it will recreate its files in /etc/yum.repos.d/, destroying any modifications you made to make the damn thing work in the first place. So, hint #1: build your own repo files so amdgpu-install cannot mess them up.

Also, it doesn’t look like you need amdgpu-install, for exactly the reason I cited before: you’re better off without it! Hint #2: DO NOT INSTALL AMDGPU-INSTALL. Pretty much the worst thing you can do is follow AMD’s instructions. I think amdgpu-install is now worse than it has ever been for Fedora. It used to limp along and work with some cheating, but not anymore.

So, here’s something else: when I finally gave up and decided to start over, I ran amdgpu-install --uninstall…but that failed, of course, and left me with a bunch of installed pkgs. I used dnf’s reporting of the install repo (the @repo column in dnf list installed) to find everything and back out. That’s Hint #3 if you didn’t already know about it.
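
The repo-reporting trick can be sketched as a small filter. This is only a minimal sketch, not the exact commands used above: pkgs_from_repo is a hypothetical helper name, and it assumes the three-column layout of dnf list installed (name.arch, version, @repo), so wrapped or header lines are simply skipped.

```shell
# Hypothetical helper: print the names of installed packages that came from one repo.
# Reads `dnf list installed` output on stdin; assumes columns: name.arch, version, @repo.
pkgs_from_repo() {
    # $1 is the repo id without the leading "@", e.g. amdgpu-prd
    awk -v repo="@$1" 'NF == 3 && $3 == repo { print $1 }'
}

# Example (not run here); review the list before removing anything:
#   dnf list installed | pkgs_from_repo amdgpu-prd
```

Reading the list first, and only then handing it to dnf remove, keeps a half-failed uninstall from taking unrelated packages with it.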

Finally, I don’t even think the amdgpu-dkms pkgs do anything. Boot just throws a bunch of errors about them. So, I uninstalled them, and they didn’t affect the other pkgs. So, after all that pain, all I needed was TWO packages???

Oh, also, the gpg repo keys are in the amdgpu-install pkg. So, I guess you can either disable that for your repos or install it, copy the keys, and then uninstall it??? Whatever. I’ll fix it later. Need sleep.


Ranting Section

Okay so, I can’t stop picking at this…

I managed to mess it up, again, but recovered by reinstalling some installed packages. So here is my MWE for amdgpu 21.50:

$ sudo -i dnf list installed '*amdgpu*' '*rocm*' '*roct*' '*hsa*' | grep -Ev 'procmail|setproctitle'
Installed Packages
amdgpu-pro-core.noarch       21.50-1373477.el8           @amdgpu-proprietary-prd
hsa-rocr.x86_64              1.5.0.50000-49.el8          @rocm-prd              
hsakmt.x86_64                1.0.6-17.rocm3.9.0.fc35     @fedora                
hsakmt-roct-devel.x86_64     20211222.1.5.50000-49.el8   @rocm-prd              
ocl-icd-amdgpu-pro.x86_64    21.50-1373477.el8           @amdgpu-proprietary-prd
rocm-core.x86_64             5.0.0.50000-49.el8          @rocm-prd              
rocm-ocl-icd.x86_64          2.0.0.50000-49.el8          @rocm-prd              
rocm-opencl.x86_64           2.0.0.50000-49.el8          @rocm-prd              
rocm-runtime.x86_64          3.9.0-2.fc35                @fedora                
rocminfo.x86_64              3.9.0-2.fc35                @fedora                
xorg-x11-drv-amdgpu.x86_64   21.0.0-1.fc35               @fedora

But, that doesn’t tell the whole story. Consider these confusing factors:

  • There are no pkgs installed from the non-proprietary repo. That makes no sense at all, right? I’m trying to use the non-pro stack. Why are the pkgs I need both labelled ‘-pro’?
  • dnf tried to install amdgpu-core, and it failed, but that has always been true, ever since the amdgpu-core pkg was first introduced.
  • one of the most critical files is libamdocl64.so…but that’s not in amdgpu-pro-core
  • amdgpu-pro-core has only one file, the EULA agreement. Gee, thanks a lot.
  • libamdocl64.so is actually in rocm-opencl
  • But, rocm-opencl is not a dependency of any other installed pkg!
  • Also, the file /etc/ld.so.conf.d/10-rocm-opencl.conf is required to get linking to work. It points to /opt/rocm-5.0.0/opencl/lib, which contains libamdocl64.so
  • BUT no package owns /etc/ld.so.conf.d/10-rocm-opencl.conf! So, I have no idea what pkg install created it or how. Why would a package create a simple file with a script instead of just delivering the file?
  • Also, there are two copies of libOpenCL.so.1.2. I think one is for ROCm-based access to OpenCL and one is for, uh, normal OpenCL?
    /opt/rocm-5.0.0/opencl/lib/libOpenCL.so.1.2
    /opt/amdgpu-pro/lib64/libOpenCL.so.1.2
  • I don’t know if the ROCm stuff is strictly necessary for what I’m doing with OpenCL. ROCm seems critical from the way AMD talks about it, but that might be marketing. But, my system did seem to break when I uninstalled it.
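
The ownership questions in this list can be checked mechanically. Here is a minimal sketch, assuming an RPM-based system; unowned_files is a hypothetical helper name, and it relies only on rpm -qf exiting non-zero for a file that no installed package owns.

```shell
# Hypothetical helper: print each given path that no installed RPM owns.
# `rpm -qf PATH` exits non-zero when the file is not owned by any package.
unowned_files() {
    for f in "$@"; do
        rpm -qf "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}

# Examples (not run here):
#   unowned_files /etc/ld.so.conf.d/10-rocm-opencl.conf /opt/rocm-5.0.0/opencl/lib/libamdocl64.so
#   ldconfig -p | grep -E 'libamdocl64|libOpenCL'   # what the linker cache currently resolves
```

If a file turns out to be unowned, one possible explanation is that a package scriptlet generated it at install time; rpm -q --scripts followed by a package name prints those scriptlets, which may show where the file came from.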

Arrgh!!!

What I learned: How to get OpenCL to work without installing AMDGPU-PRO (21.50)

  1. I suggest installing amdgpu-install, first. It will install some .repo files and the gpg key for those repos.
  2. Copy all of those and make your own .repo files with them; you have to have different names. It took some time, but now my repo configs are nice and clean, and I can install/remove pkgs from the remote repos.
  3. Then, uninstall amdgpu-install; it will delete the .repo files.
  4. Then, install
    amdgpu-pro-core
    ocl-icd-amdgpu-pro
  5. Now, check your configuration with clinfo. I know my system is working when there are 2 “platforms” detected. One is old OpenCL/Mesa/Clover and doesn’t work (although it should, it used to), and the other one is newer OpenCL and says “AMD-APP ([version])”, where [version] is what changes when you get an update. It’s worth keeping track of this!
  6. If you are lucky, in the future, try dnf update ??
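
The clinfo check in step 5 can be turned into something you can log before and after each update. This is a rough sketch under the assumption that clinfo prints a line starting with "Number of platforms" and one "Platform Version" line per platform; both helper names are hypothetical.

```shell
# Hypothetical helper: read clinfo output on stdin, print the reported platform count.
clinfo_platform_count() {
    awk '/^Number of platforms/ { print $NF; exit }'
}

# Hypothetical helper: print each "Platform Version" value, which carries the
# AMD-APP version string that changes when the OpenCL stack is updated.
clinfo_platform_versions() {
    awk '/^ *Platform Version/ { sub(/^ *Platform Version */, ""); print }'
}

# Example (not run here):
#   clinfo | clinfo_platform_count
#   clinfo | clinfo_platform_versions
```

Saving the version string alongside the date of each dnf update makes it easier to tell which update changed the OpenCL runtime.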

So, because of this recent disaster, upgrading with dnf is still untested on Fedora. I suppose I could go back and test, but…no. I will report back here after the next release.

More about clinfo and why things are so difficult

The clinfo that comes with rocm (/opt/rocm…/) doesn’t give any errors. But, the default clinfo that is part of the base Fedora distro might give an error like this:

fatal error: cannot open file '/usr/lib64/clc/gfx1010-amdgcn-mesa-mesa3d.bc': No such file or directory

This is the reason the old OpenCL “platform” doesn’t work anymore. But, AFAIK, this is only a problem with NAVI10 (gfx1010/5700XT) GPUs. This missing file is the “secret sauce” that Mesa hasn’t been able to include, but is included for other, older cards, and is known to work without any extra amdgpu packages from AMD. It may also be a problem for the latest gen (6800s); I don’t have one to test. :frowning: But, don’t assume you need extra AMDGPU stuff to make OpenCL work; it didn’t used to be like this!


Okay, here we go! Looks like some amdgpu packages were updated in the ‘…/latest/…’ RHEL 8.5 RPM repository. DNF seems to be doing the right thing by identifying them as updated, automatically. I’ll update and reboot and see how it goes…

Hey, I think it worked. I installed the new pkgs from the AMD repo (manual repo config) with many other system updates and…no problems detected.

$ sudo -i dnf list installed '*amdgpu*' '*rocm*' '*roct*' '*hsa*' | grep -Ev 'procmail|setproctitle'
Installed Packages
amdgpu-pro-core.noarch       22.10-1395274.el8           @amdgpu-proprietary-prd
hsa-rocr.x86_64              1.5.0.50100-36.el8          @rocm-prd              
hsakmt.x86_64                1.0.6-17.rocm3.9.0.fc35     @fedora                
hsakmt-roct-devel.x86_64     20220128.1.7.50100-36.el8   @rocm-prd              
ocl-icd-amdgpu-pro.x86_64    22.10-1395274.el8           @amdgpu-proprietary-prd
rocm-core.x86_64             5.1.0.50100-36.el8          @rocm-prd              
rocm-ocl-icd.x86_64          2.0.0.50100-36.el8          @rocm-prd              
rocm-opencl.x86_64           2.0.0.50100-36.el8          @rocm-prd              
rocm-runtime.x86_64          3.9.0-2.fc35                @fedora                
rocminfo.x86_64              3.9.0-2.fc35                @fedora                
xorg-x11-drv-amdgpu.x86_64   22.0.0-1.fc35               @updates               
$ clinfo                                                             
Number of platforms                               2                     
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3423.0)
...

I guess I should update the solution to this thread, now that I have been able to confirm that, at least for the most recent update, AMD’s public RPM repository can be made to work as expected.

Start here if you don’t have a copy of AMD RPM GPG signing key AND/OR you don’t want to create the RPM repository config files from scratch

  • Temporarily install amdgpu-install
  • Make a copy of all the new dnf/yum repo files in /etc/yum.repos.d for yourself.
  • Make a copy of the AMD RPM GPG signing key referenced by the gpgkey key in those repo files. You’ll need it for later. A good place to put it might be /etc/pki/rpm-gpg/ with all your other ones.
  • Uninstall amdgpu-install
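
The copy step could look something like the sketch below. It is an illustration only: clone_repo_files is a hypothetical helper, the suffix is whatever string you choose, and it assumes each repo file uses plain [id] section headers on their own lines.

```shell
# Hypothetical helper: copy every *.repo file from SRC to DEST under a new name,
# appending SUFFIX to each file name and to each [section] id, so the copies
# cannot collide with (or be overwritten by) the files amdgpu-install manages.
# SUFFIX must not contain "/" or "&", since it is pasted into a sed replacement.
clone_repo_files() {
    src="$1"; dest="$2"; suffix="$3"
    for f in "$src"/*.repo; do
        [ -e "$f" ] || continue          # no matching files: nothing to do
        base=$(basename "$f" .repo)
        sed -E "s/^\[(.+)\]\$/[\1-$suffix]/" "$f" > "$dest/$base-$suffix.repo"
    done
}

# Example (not run here): stage the copies somewhere safe, then inspect them.
#   mkdir -p ~/amd-repos && clone_repo_files /etc/yum.repos.d ~/amd-repos mine
```

Staging into a separate directory first means uninstalling amdgpu-install cannot touch the copies; they can be moved into /etc/yum.repos.d/ afterwards.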

Skip to here if you have a copy of AMD RPM GPG signing key AND you don’t need any help creating remote RPM repository configurations

  • Define your own remote RPM repository config files (/etc/yum.repos.d/) (or use the ones amdgpu-install installs as a guide from above) for each of the following:
    – amdgpu-[foo]
    – amdgpu-pro-[foo]
    – rocm-[foo]
    where [foo] is some string that differentiates your manually defined remote repositories from the ones amdgpu-install created. This is all so that it will not break when we uninstall it, or if you ever have to install amdgpu-install, again, it will not overwrite or ruin what you have done.
  • change the baseurl key configured for your new remote repos to point to the URL that will get updates, instead of the one amdgpu-install created, which points to a URL for just that specific version of amdgpu. (I know, it’s so crazy that it doesn’t make sense when you write it out.) There is still a different URL for each distribution, though. For F35, for example, I figure CentOS8 or RHEL/8.5 is the closest match, so, I used:
    amdgpu:https://repo.radeon.com/amdgpu/latest/rhel/8.5/main/x86_64
    amdgpu-pro:https://repo.radeon.com/amdgpu/latest/rhel/8.5/proprietary/x86_64
    rocm:https://repo.radeon.com/rocm/centos8/rpm
    The key is the part of the URL that is ‘…/latest/…’; that’s all we are really achieving by doing all this. For rocm, there isn’t a ‘latest’, but whatever. Make sure the URL you are specifying exists using a web browser or curl or something.
  • Update software package cache. I assume this works from GNOME Software, but I didn’t test. I used sudo -i dnf makecache. That should list the new repositories you created and fetch the available packages. If it’s working, you can now browse the remote repository however you like.
  • install just the packages you want. I still don’t know what the MWE is. I think you just need ocl-icd-amdgpu-pro (and its dependencies), but I found ldd was linking to a .so in rocm-opencl, so I’m leaving that installed, plus its dependencies. I don’t think you really need any HSA stuff.
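
The baseurl change described in these steps amounts to swapping one path component. A minimal sketch, assuming the version-pinned URLs follow the /amdgpu/[version]/... layout seen in this thread; to_latest_url is a hypothetical helper name.

```shell
# Hypothetical helper: rewrite a version-pinned amdgpu baseurl to the "latest" one.
# Assumes the layout https://repo.radeon.com/amdgpu/<version>/<distro>/...
to_latest_url() {
    printf '%s\n' "$1" | sed -E 's#(/amdgpu/)[^/]+/#\1latest/#'
}

# Example (not run here): rewrite, then confirm the result exists before using it.
#   url=$(to_latest_url https://repo.radeon.com/amdgpu/5.4.1/rhel/9.1/main/x86_64)
#   curl -fsI "$url/" >/dev/null && echo "ok: $url"
```

The rewrite is idempotent, so running it on a URL that already says latest leaves it unchanged. It does not apply to the rocm URLs, which have no latest component.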

Epilogue: Forget AMDGPU-PRO, You Only Need ROCm

I don’t think what I wrote above is “wrong”, but I now realize it’s not quite accurate. Here’s what I recently learned.

  • The above method works fine; it will install a working OpenCL for AMDGPU. And I still think it’s worth the trouble if you want to have all the AMD repositories set up without amdgpu-install (which does not work on Fedora).
  • But, you could skip all that! You only need the ROCm repo (Index of /rocm/centos8/rpm/), which you can set up yourself, manually, without messing with amdgpu-install at all! The GPG key for those RPMs is downloadable, and you can just create a .repo file using any example you find. All you need to know is the following (and name it something unique, like myrocm)
    – baseurl: https://repo.radeon.com/rocm/centos8/rpm/
    – gpgkey: https://repo.radeon.com/rocm/rocm.gpg.key
  • And, you don’t need (the old) OpenCL pkgs anymore! I mean, you need OpenCL support, but not the old packages. AFAICT, there is no direct support for OpenCL on AMDGPU for “newer” cards (i.e. Navi10+). A direct OpenCL layer only exists for cards prior to AMD Navi10. (Also, no MESA support, see below.)
  • New cards, meaning Navi10 and newer, provide OpenCL interface on top of ROCm, not directly.
  • So, you don’t need the “old” opencl pkgs with which you are familiar. By “old” OpenCL pkgs I mean amdgpu-core, ocl-icd-amdgpu-pro, and dependencies. This is great news, because those old pkgs had kernel dependencies and amdgpu-core would always fail to install. But, you don’t need it anymore!
  • What you really need is the ROCm-based OpenCL stuff: rocm-language-runtime, rocm-opencl-runtime, & rocm-ocl-icd (and deps)
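
Put together, a hand-written ROCm repo file might look like the sketch below. Only the baseurl and gpgkey values come from the list above; the repo id myrocm, the file name, the name= text, and the choice of enabled=1 and gpgcheck=1 are assumptions to adapt. write_rocm_repo is a hypothetical helper that writes into a directory you pass in, so it can be tried in a scratch directory first.

```shell
# Hypothetical helper: write a minimal ROCm .repo file into directory DIR.
# baseurl and gpgkey are the values quoted in this thread; everything else
# (repo id, file name, name=, enabled, gpgcheck) is an assumption to adapt.
write_rocm_repo() {
    dir="$1"
    cat > "$dir/myrocm.repo" <<'EOF'
[myrocm]
name=ROCm (manually configured)
baseurl=https://repo.radeon.com/rocm/centos8/rpm/
enabled=1
gpgcheck=1
gpgkey=https://repo.radeon.com/rocm/rocm.gpg.key
EOF
}

# Example (not run here):
#   write_rocm_repo /tmp && cat /tmp/myrocm.repo
#   sudo cp /tmp/myrocm.repo /etc/yum.repos.d/ && sudo dnf makecache
```

After dnf makecache picks up the new repo, the packages named in the last bullet (rocm-language-runtime, rocm-opencl-runtime, rocm-ocl-icd) would be the ones to try installing.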

My Working (AMD-supplied) PKG List

So, now, I have OpenCL working without any of the packages I used to think were absolutely essential. The world is changed!

Installed Packages
hsakmt-roct-devel.x86_64         20220128.1.7.50100-36.el8           @rocm-prd  
rocm-core.x86_64                 5.1.0.50100-36.el8                  @rocm-prd  
rocm-language-runtime.x86_64     5.1.0.50100-36.el8                  @rocm-prd  
rocm-ocl-icd.x86_64              2.0.0.50100-36.el8                  @rocm-prd  
rocm-opencl.x86_64               2.0.0.50100-36.el8                  @rocm-prd  
rocm-opencl-runtime.x86_64       5.1.0.50100-36.el8                  @rocm-prd  

(You also need some official, Fedora pkgs, which are dependencies of these, but I didn’t show them.)
And, I’m sure you don’t need -devel either.

More about OpenCL and why things are so difficult (Continued)

  • Also, I was wrong about Mesa support; it’s not just NAVI10 that doesn’t work, it’s all “new” cards. The 6800XT needs gfx1030 mesa3d.bc data, which is missing, just like gfx1010 was missing for the 5700XT. I think these don’t exist anymore in part because support is moved to ROCm. Rather than helping MESA OSS, AMD is just doing it through ROCm, which is their OSS platform. I thought it was some kind of mistake and that MESA support would return, but now I think I understand why it never materialized for NAVI10. So, things are making a lot more sense, now.

Caveats: There are still problems

So, there are problems still to be fixed. You might want to install some ROCm packages that you can’t, due to Fedora/AMD packaging disagreements. Many of the new ROCm pkgs for CentOS8 (which are the same as for RHEL8) require /usr/libexec/platform-python, which is deprecated, AFAICT, and Fedora has appropriately removed it since RHEL8 was introduced. This affects many of the packages that the install documentation for ROCm (see below) discusses, like HIP, ML, & OpenMP runtimes for ROCm. These are cool for programming, but are not necessary for getting compiled OpenCL programs to work over ROCm. So, not a problem unless you want to write or build code.
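
One way to find out ahead of time whether a given package is affected is to look at its dependency list before trying to install it. A minimal sketch, assuming the dependency appears as a literal /usr/libexec/platform-python line in the output of dnf repoquery --requires; requires_platform_python is a hypothetical helper name.

```shell
# Hypothetical helper: read a package's dependency list on stdin (one per line)
# and succeed if it requires /usr/libexec/platform-python.
requires_platform_python() {
    grep -q '^/usr/libexec/platform-python'
}

# Example (not run here); replace <package> with the ROCm package of interest:
#   dnf repoquery --requires <package> | requires_platform_python \
#       && echo "needs platform-python, unlikely to install cleanly on Fedora"
```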

But, even here, there is hope. This issue about /usr/libexec/platform-python goes back to Python 2.7, so it’s old. And, I see an empty stub for RHEL9 (Index of /rocm/rhel9/) already on the ROCm RPM repo server. So, fingers crossed, we’ll get to install these pkgs when AMD publishes them for RHEL9, which, presumably, will have not just newer Python, but the Python pkgs built for Fedora that we are using now. So, there’s a chance that, as long as Fedora doesn’t go too far ahead, RHEL9 will be sufficiently like Fedora on launch that we’ll get to use those pkgs.

Background

I had a bit of a scare when I replaced my 5700XT with a 6800XT. I was super excited because it seemed to be crunching BOINC Einstein@Home tasks very fast, but then I played a game on it, which crashed, and, when I came back the next day, I noticed that all BOINC GPU tasks were running to completion but finishing with a “compute error” code. I freaked out, thinking there was something wrong with the whole setup. I removed amdgpu-opencl pkgs and went searching online for how it’s supposed to work, again, which I do every so often, and I can never figure it out. Well, after rebooting, I noticed that clinfo showed I still had a working OpenCL “platform”! This got me thinking in the right way, finally. I went to the ROCm page and started reading the install docs. Look what I found:


So, what you see is that there is the rocm-language-runtime layer interfacing all kernel layer communication, and you see an OpenCL layer on top of that, plus there exists a rocm-ocl-icd. So, that makes sense; that is how it works now, and that’s how your OpenCL programs can work without anything like amdgpu-opencl-.

You might find these links useful for more information:

https://docs.amd.com/bundle/ROCm_Installation_Guidev5.0/page/Meta-packages_in_ROCm_Programming_Models.html#_ROCm_Package_Naming
https://docs.amd.com/bundle/ROCm_Installation_Guidev5.0/page/Overview_of_ROCm_Installation_Methods.html

Full Circle: What’s with amdgpu-install?

I think what I learned is a net positive for the Fedora community. ROCm seems to be making life easier for us because it exists as another layer of indirection. It looks like we don’t even need to mess with amdgpu-install any more…at least not for a while. I recommend upgrading to a NAVI10+ card for this reason.


Yet More Epilogue

Today, I noticed that some ROCm stuff got updated…but not from the AMD RHEL 8 repo. It came directly from the Fedora ‘updates’ repo!

So, now my system is in some type of hybrid state. I don’t like it, but so far, no issues to report. Although, I just rebooted so that could be misleading.

I’m reporting this here because I don’t even think I need any stuff from AMD’s repo any more. I didn’t try removing any pkgs, but I think I could. I now have rocm-runtime, rocminfo, and rocm-opencl pkgs from Fedora, official-like. I think this is another good sign. I checked, and I don’t see those pkgs in the ‘fedora’ repo, they only exist in ‘updates’, so I don’t think I missed them before when I upgraded to F36.

$ sudo -i dnf list installed "*rocm-*" rocminfo "*hsa*" "*hip*"
Installed Packages
hsa-rocr.x86_64                                                        1.5.0.50200-65.el8                                                    @rocm-prd
hsa-rocr-devel.x86_64                                                  1.5.0.50200-65.el8                                                    @rocm-prd
hsakmt.x86_64                                                          1.0.6-23.rocm5.2.0.fc36                                               @updates 
hsakmt-roct-devel.x86_64                                               20220426.0.86.50200-65.el8                                            @rocm-prd
rocm-comgr.x86_64                                                      5.2.0-1.fc36                                                          @updates 
rocm-core.x86_64                                                       5.2.0.50200-65.el8                                                    @rocm-prd
rocm-device-libs.x86_64                                                5.2.0-1.fc36                                                          @updates 
rocm-language-runtime.x86_64                                           5.2.0.50200-65.el8                                                    @rocm-prd
rocm-ocl-icd.x86_64                                                    2.0.0.50200-65.el8                                                    @rocm-prd
rocm-opencl.x86_64                                                     5.2.0-1.fc36                                                          @updates 
rocm-runtime.x86_64                                                    5.2.0-1.fc36                                                          @updates 
rocm-smi.noarch                                                        4.0.0-5.fc36                                                          @fedora  
rocminfo.x86_64                                                        5.2.0-1.fc36                                                          @updates 

I think I’m okay, at least until the AMD and Fedora repos get out of sync; if AMD’s repo updates rocm-opencl but fedora doesn’t, I’m guessing DNF will overwrite the fedora pkgs with newer ones from AMD, which could be less compatible. I’d like to stick with fedora ones, assuming they’re going to keep them updated.

However, I checked for any “hip” stuff or “hsa” stuff from fedora, and there’s nothing. So, that’s still an issue yet to be resolved. But for pure OpenCL…I think AMD is actually doing it completely the right way and Fedora is supported???

I would (temporarily maybe) disable the ‘rocm-prd’ repo and see what happens with updates.

Fedora does not usually release packages that still depend upon a 3rd party repo. They also do not arbitrarily release packages that do not have a maintainer to keep them up to date. It seems likely that the packages (I see 7) that came from fedora are all that are needed. The others that have names ending in .el8 are probably superfluous and could be removed.
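
To see which installed packages fall into that .el8 group, the package listing can be filtered on its version column. This is just a sketch: el_built_pkgs is a hypothetical helper, and it assumes the three-column dnf list installed layout shown in the listing above.

```shell
# Hypothetical helper: read `dnf list installed` output on stdin and print the
# packages whose version-release ends in .elN (built for RHEL, not Fedora),
# together with the repo they were installed from.
el_built_pkgs() {
    awk 'NF == 3 && $2 ~ /\.el[0-9]+$/ { print $1, $3 }'
}

# Example (not run here):
#   dnf list installed '*rocm*' rocminfo '*hsa*' | el_built_pkgs
#   sudo dnf --disablerepo=rocm-prd upgrade   # one-off update that ignores the AMD repo
```

The --disablerepo option only affects that single dnf run, which makes it a low-risk way to see what updates look like without the third-party repo before removing anything.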

Right, that was my conclusion, too. But, that doesn’t make sense, since these packages are completely new, so, does that mean F36 was incomplete when released? The changelog shows July 5 was the first rocm-opencl pkg for fedora.

It makes sense to not let these packages update from AMD’s official RHEL8 repo, but, on the other hand, we are desperate for a HIP RPM that is compatible with F36.

Hmm, I checked, and now, there are actually RPMs in the AMDGPU RHEL9 repo, new pkgs updated yesterday. But, still no rocm RHEL9 files. I think that’s what we are waiting for. Or, maybe we’ll get the other pkgs + hip in the near future. Fingers crossed!

Actually it does make sense.

Fedora always lags a bit behind the upstream source of packages, and they also tend to repackage things in a way that makes sense to the people at fedoraproject.

Sometimes upstream packages may be combined to fewer packages or may be split into more pieces depending on how it fits into the OS. The intent seems to be to make things just fit with less redundancy and to not overwrite already existing libraries/files (which could break something else). Package dependencies handle that.

Right, sure. I get the lag. But, that’s not what I meant. AFAICT, these packages didn’t exist on fedora…ever, and they were/are necessary. I mean, I looked for them and related pkgs many times in the past, and they were not there. Without them, the functionality did not exist, so I’m certain it wasn’t contained in other packages. Looks like they just appeared at version 5.2. So, it’s not lag. To have new functionality show up, suddenly, seems unusual. I could be wrong about that, but I would have expected such a thing between major releases.

I’m still not sure all the functionality is present, now, either. I’m pretty sure it’s not, actually. I’m not seeing anything about HIP or ROCr. So, it’s not clear if fedora plans to create packages for this functionality or leave it to AMD to do, which they will not do…at least not officially. So, it’s not as though we can afford to ignore these AMD repos. There’s no way to know if the problem is just that AMD is not supporting fedora, or fedora just hasn’t gotten around to repackaging some AMD functionality that they have provided for other distributions, or if fedora has decided to leave functionality to a third party.

Between major releases, I remove all these packages and start over. Looking forward to F37. I’ll see what we get then.

As I understand it, fedora does not ever distribute packages that are subject to licensing, copyright, or patent encumbrances. I have no idea what, if any, of those restrictions apply to what you are asking about, but if so restricted then fedora will not include them in their repos. 3rd party repos may provide them however.

Oh yes, I’m sure you are right about that. Maybe that’s why these other packages are still missing. Could be. It would be really helpful if there was some documentation on this. As far as AMD is concerned, they’re providing these technologies to the community, but they’re not willing to support fedora. So, only fedora could explain to users what is missing from the repos and why. I don’t use Blender, but I think those folks need these other pieces.

I’m trying to run Davinci Resolve (OpenCL) with my AMD Radeon RX 6700 XT via an EGPU on my thunderbolt laptop.

After weeks of trying other guides, the furthest I got was with AMDGPU drivers, the kernel args “radeon.cik_support=0 amdgpu.cik_support=1”, and running mokutil --import /root/mok.der. I was able to open Resolve and it saw my GPU, but pictures and videos wouldn’t render. Minecraft used the GPU seemingly without issue. Minecraft no longer used and Resolve no longer detected my GPU after a reboot, and I had many DKMS problems.

This thread seemed like the cleanest solution, so I would like to look for assistance here. However, if my issue is more appropriate as a new post, I can post there instead.

My GPU isn’t detected by Resolve and isn’t used by Minecraft.

This is my current configuration:

OS: Fedora 37
Kernel: 6.0.12-300.fc37.x86_64
Kernel args: ro rootflags=subvol=root rd.luks.uuid=luks-x-x-x-x-x rhgb quiet module_blacklist=hid_sensor_hub nvme.noacpi=1
Display manager: Wayland

Device: Framework Laptop 12th gen (thunderbolt 4)
Secure boot: Enabled

EGPU: Akitio Node Titan (thunderbolt 3)
GPU: Gigabyte (AMD) Radeon RX 6700 XT

AMD Main repo: Index of /amdgpu/5.4.1/rhel/9.1/main/x86_64/
AMD Proprietary repo: Index of /amdgpu/5.4.1/rhel/9.1/proprietary/x86_64/
ROCM repo: Index of /rocm/rhel9/rpm/

All relevant packages (as far as I’m aware) (sudo dnf list installed '*amd-gpu*' '*amdgpu*' '*rocm*' '*roct*' '*hsa*' '*mesa*' '*vulkan*' | grep -Ev 'procmail|setproctitle'):

Installed Packages
amd-gpu-firmware.noarch             20221109-144.fc37                 @updates  
hsa-rocr.x86_64                     1.7.0.50400-72.el9                @rocm-copy
hsa-rocr-devel.x86_64               1.7.0.50400-72.el9                @rocm-copy
hsakmt.x86_64                       1.0.6-26.rocm5.3.0.fc37           @updates  
hsakmt-roct-devel.x86_64            20221020.0.2.50400-72.el9         @rocm-copy
mesa-dri-drivers.i686               22.2.3-1.fc37                     @updates  
mesa-dri-drivers.x86_64             22.2.3-1.fc37                     @updates  
mesa-filesystem.i686                22.2.3-1.fc37                     @updates  
mesa-filesystem.x86_64              22.2.3-1.fc37                     @updates  
mesa-libEGL.i686                    22.2.3-1.fc37                     @updates  
mesa-libEGL.x86_64                  22.2.3-1.fc37                     @updates  
mesa-libGL.i686                     22.2.3-1.fc37                     @updates  
mesa-libGL.x86_64                   22.2.3-1.fc37                     @updates  
mesa-libGLU.x86_64                  9.0.1-7.fc37                      @fedora   
mesa-libOSMesa.i686                 22.2.3-1.fc37                     @updates  
mesa-libOSMesa.x86_64               22.2.3-1.fc37                     @updates  
mesa-libgbm.i686                    22.2.3-1.fc37                     @updates  
mesa-libgbm.x86_64                  22.2.3-1.fc37                     @updates  
mesa-libglapi.i686                  22.2.3-1.fc37                     @updates  
mesa-libglapi.x86_64                22.2.3-1.fc37                     @updates  
mesa-libxatracker.x86_64            22.2.3-1.fc37                     @updates  
mesa-va-drivers.i686                22.2.3-1.fc37                     @updates  
mesa-vulkan-drivers.i686            22.2.3-1.fc37                     @updates  
mesa-vulkan-drivers.x86_64          22.2.3-1.fc37                     @updates  
rocm-comgr.x86_64                   5.3.0-1.fc37                      @updates  
rocm-core.x86_64                    5.4.0.50400-72.el9                @rocm-copy
rocm-device-libs.x86_64             1.0.0.50400-72.el9                @rocm-copy
rocm-language-runtime.x86_64        5.4.0.50400-72.el9                @rocm-copy
rocm-ocl-icd.x86_64                 2.0.0.50400-72.el9                @rocm-copy
rocm-opencl.x86_64                  2.0.0.50400-72.el9                @rocm-copy
rocm-opencl-runtime.x86_64          5.4.0.50400-72.el9                @rocm-copy
rocm-runtime.x86_64                 5.3.0-2.fc37                      @updates  
rocm-smi.noarch                     4.0.0-6.fc37                      @fedora   
rocminfo.x86_64                     1.0.0.50400-72.el9                @rocm-copy
vulkan-loader.i686                  1.3.216.0-3.fc37                  @fedora   
vulkan-loader.x86_64                1.3.216.0-3.fc37                  @fedora

GPU detection 1

This is the process I used to test GPU detection in this section:

  1. Shut down laptop
  2. Connect to EGPU
  3. Boot up laptop
  4. Login at Gnome lockscreen
  5. Open terminal and type in the respective command / test respective application

1a. Resolve: “Unsupported GPU Processing Mode. Please review the GPU drivers and GPU configuration under preferences.”

1b. Minecraft: F3 menu GPU name “Mesa Intel(R) Graphics (ADL GT2) - 4.6 (Core Profile) Mesa 22.2.4 (git-80df10f902)”

1c. lspci - GPU is detected whilst EGPU is connected via thunderbolt (sudo lspci -vnn | grep VGA -A 12)

00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:4626] (rev 0c) (prog-if 00 [VGA controller])
	Subsystem: Device [f111:0002]
	Flags: bus master, fast devsel, latency 0, IRQ 149, IOMMU group 1
	Memory at 605c000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 4000000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 3000 [size=64]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [100] Process Address Space ID (PASID)
	Capabilities: [200] Address Translation Service (ATS)
--
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M] [1002:73df] (rev c5) (prog-if 00 [VGA controller])
	Subsystem: Gigabyte Technology Co., Ltd Device [1458:2331]
	Flags: fast devsel, IRQ 207, IOMMU group 24
	Memory at 6000000000 (64-bit, prefetchable) [disabled] [size=256M]
	Memory at 6010000000 (64-bit, prefetchable) [disabled] [size=2M]
	I/O ports at 4000 [disabled] [size=256]
	Memory at 7c000000 (32-bit, non-prefetchable) [virtual] [size=1M]
	Expansion ROM at 7c100000 [virtual] [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [64] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
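
Because the grep above stops 12 lines after each match, this output does not show whether a kernel driver is actually bound to the AMD card. A possible follow-up check is sketched below; driver_for_slot is a hypothetical helper, and it assumes the usual lspci -k layout where each device starts at column 0 and its "Kernel driver in use:" line is indented beneath it.

```shell
# Hypothetical helper: read `lspci -k` output on stdin and print the kernel driver
# bound to the device at SLOT (e.g. 06:00.0); prints nothing if none is bound.
driver_for_slot() {
    awk -v slot="$1" '
        /^[0-9a-f]/ { in_dev = (index($0, slot " ") == 1) }   # a new device stanza begins
        in_dev && /Kernel driver in use:/ { print $NF }
    '
}

# Example (not run here):
#   lspci -k | driver_for_slot 06:00.0
```

An empty result for the eGPU slot, while the Intel slot reports a driver, would suggest the problem is at the kernel driver level rather than in the OpenCL packages.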

1d. clinfo - no OpenCL devices detected:

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3513.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform

1e. glxinfo - the Intel integrated GPU is being used rather than the AMD GPU (glxinfo | grep "OpenGL renderer"):

OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)

GPU Detection 2

The difference here (compared to GPU Detection 1) is that I plug in the EGPU at the Gnome lockscreen instead of pre-boot. Doing this allows clinfo to see my GPU.

This is the process I used to test GPU detection in this section:

  1. Shut down laptop
  2. Disconnect EGPU
  3. Boot up laptop
  4. Connect EGPU at Gnome lockscreen
  5. Login at Gnome lockscreen
  6. Open terminal and type in the respective command
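
The checks in step 6 can be collected into a couple of small helpers, so both boot scenarios get compared the same way. This is only a sketch: the function names are made up for illustration, and it assumes `lspci -k` prints a "Kernel driver in use:" line for each bound device, which the 12-line grep window used in this post may cut off.

```shell
# Hypothetical helpers (names are illustrative, not from any package).

# Sum the "Number of devices" lines of `clinfo` output read from stdin.
count_cl_devices() {
    awk '/^[ \t]*Number of devices/ { n += $NF } END { print n + 0 }'
}

# For each VGA controller in `lspci -k` output on stdin, print
# "<slot> <kernel driver>". Devices with no bound driver print nothing.
vga_drivers() {
    awk '
        /^[^ \t]/ { slot = "" }                      # a new device block starts
        /VGA compatible controller/ { slot = $1 }    # remember the PCI slot
        /Kernel driver in use:/ && slot != "" { print slot, $NF }
    '
}

# Run both checks against the live system (needs pciutils and clinfo).
gpu_summary() {
    lspci -k | vga_drivers
    echo "OpenCL devices: $(clinfo | count_cl_devices)"
}
```

Running `gpu_summary` once after a pre-boot connection and once after a lockscreen hot-plug should show whether amdgpu is bound in both cases, and whether only the OpenCL device count differs.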

2a. Resolve: “Unsupported GPU Processing Mode. Please review the GPU drivers and GPU configuration under preferences.”

2b. Minecraft: F3 menu GPU name “Mesa Intel(R) Graphics (ADL GT2) - 4.6 (Core Profile) Mesa 22.2.4 (git-80df10f902)”

2c. lspci - GPU is detected whilst EGPU is connected via thunderbolt (sudo lspci -vnn | grep VGA -A 12)

00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-P Integrated Graphics Controller [8086:4626] (rev 0c) (prog-if 00 [VGA controller])
	Subsystem: Device [f111:0002]
	Flags: bus master, fast devsel, latency 0, IRQ 149, IOMMU group 1
	Memory at 606c000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 4000000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 4000 [size=64]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [100] Process Address Space ID (PASID)
	Capabilities: [200] Address Translation Service (ATS)
--
06:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M] [1002:73df] (rev c5) (prog-if 00 [VGA controller])
	Subsystem: Gigabyte Technology Co., Ltd Device [1458:2331]
	Flags: fast devsel, IRQ 207, IOMMU group 24
	Memory at 6000000000 (64-bit, prefetchable) [disabled] [size=256M]
	Memory at 6010000000 (64-bit, prefetchable) [disabled] [size=2M]
	I/O ports at 3000 [disabled] [size=256]
	Memory at 52000000 (32-bit, non-prefetchable) [virtual] [size=1M]
	Expansion ROM at 52100000 [virtual] [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [64] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>

2d. clinfo - AMD GPU is now detected:

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (3513.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback 
  Platform Extensions function suffix             AMD
  Platform Host timer resolution                  1ns

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx1031
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 
  Driver Version                                  3513.0 (HSA1.1,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Type                                     GPU
  Device Board Name (AMD)                         AMD Radeon RX 6700 XT
  Device PCI-e ID (AMD)                           0x73df
  Device Topology (AMD)                           PCI-E, 0000:06:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               20
  SIMD per compute unit (AMD)                     4
  SIMD width (AMD)                                32
  SIMD instruction width (AMD)                    1
  Max clock frequency                             2725MHz
  Graphics IP (AMD)                               10.3
  Device Partition                                (core)
    Max number of sub-devices                     20
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple (kernel)     32
  Wavefront width (AMD)                           32
  Preferred / native vector sizes                 
    char                                                 4 / 4       
    short                                                2 / 2       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 1 / 1        (cl_khr_fp16)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             No
    Round to nearest                              No
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              12868124672 (11.98GiB)
  Global free memory (AMD)                        12566528 (11.98GiB) 12566528 (11.98GiB)
  Global memory channels (AMD)                    6
  Global memory banks per channel (AMD)           4
  Global memory bank width (AMD)                  256 bytes
  Error Correction support                        No
  Max memory allocation                           10937905968 (10.19GiB)
  Unified memory for Host and Device              No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   Yes
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    10937905968 (10.19GiB)
  Preferred total size of global vars             12868124672 (11.98GiB)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        16384 (16KiB)
  Global Memory cache line size                   64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             29663
    Max size for 1D images from buffer            134217728 pixels
    Max 1D or 2D image array size                 8192 images
    Base address alignment for 2D image buffers   256 bytes
    Pitch alignment for 2D image buffers          256 pixels
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             16384x16384x8192 pixels
    Max number of read image args                 128
    Max number of write image args                8
    Max number of read/write image args           64
  Max number of pipe args                         16
  Max active pipe reservations                    16
  Max pipe packet size                            2347971376 (2.187GiB)
  Local memory type                               Local
  Local memory size                               65536 (64KiB)
  Local memory size per CU (AMD)                  65536 (64KiB)
  Local memory banks (AMD)                        32
  Max number of constant args                     8
  Max constant buffer size                        10937905968 (10.19GiB)
  Preferred constant buffer size (AMD)            16384 (16KiB)
  Max size of kernel argument                     1024
  Queue properties (on host)                      
    Out-of-order execution                        No
    Profiling                                     Yes
  Queue properties (on device)                    
    Out-of-order execution                        Yes
    Profiling                                     Yes
    Preferred size                                262144 (256KiB)
    Max size                                      8388608 (8MiB)
  Max queues on device                            1
  Max events on device                            1024
  Prefer user sync for interop                    Yes
  Number of P2P devices (AMD)                     0
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        0ns (Thu Jan  1 01:00:00 1970)
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Thread trace supported (AMD)                  No
    Number of async queues (AMD)                  8
    Max real-time compute queues (AMD)            8
    Max real-time compute units (AMD)             20
  printf() buffer size                            4194304 (4MiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program 

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [AMD]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1031
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1031
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 AMD Accelerated Parallel Processing
    Device Name                                   gfx1031

2e. glxinfo - the Intel integrated GPU is still being used rather than the AMD GPU (glxinfo | grep "OpenGL renderer"):

OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)

I’ve tried every guide I could find over the past few weeks, but to no avail. Any ideas or suggestions would be really appreciated.

Well, I’m not really sure what your issue is, so it’s hard to say if it’s best addressed in this thread or not. But, let’s think through a few things:

  • Do you know if you need OpenCL or OpenGL or both? If you only need OpenGL, I would drop the OpenCL stuff, if you can, just to focus on the important components until you get this figured out.
  • I have no idea what mokutil is; you’re scaring me.
  • Those kernel options seem old. I recognize them from a long time ago. Not saying they aren’t needed, but your eGPU is very new and I don’t think anything ‘cik’ would be useful. IIRC, that’s GCN 1/2 stuff? Your eGPU is RDNA2, right? That’s generations apart.
  • You have two GPUs.
    – Have you tried with just the one GPU you want to use? This may not be possible, but if it is, this would be well worth trying. Can you disable the internal one?
    – Your clinfo looks great. That means that you’ve done your job at the sysadmin level. Could it be that the software you want to use just cannot use it, not because it’s unusable, but because that software isn’t clever enough to find and access it?
    – Can you make your software use BOTH GPUs?
    – Can you try your eGPU on another Fedora system or another GNU/Linux system without another GPU to get in the way?
  • Have you tried removing all the EL9 stuff? If I was going to use AMD’s repos, I’d use EL9, too, but, as others have suggested here and elsewhere, you should try with Fedora repos only, first. Remove everything, boot with no AMD stuff that isn’t absolutely necessary for booting, then disable the AMD repos and install all the AMD stuff you want from the official Fedora repos. But, I think it’s clear that
  • Davinci Resolve looks like a serious, professional product. Have you tried the community for that product? Sure, Fedora users are pretty eclectic, but users of that software would probably be better able to help you. I feel like this is a software issue, not a system issue; first impression. You said Resolve actually detected the GPU at one point, but didn’t work. Sounds like that was close and you could ask product support about it.
  • I don’t know anything about this software. It’s going to be hard to troubleshoot this problem outside the software, itself, because I only know other software, so, even if we try some other things, there’s no guarantee it will lead to a useful insight. My only idea is to find some other software that we can try to make use your eGPU, like, say, BOINC, and then at least we’ll know if the problem is software compatibility or missing system functionality. But BOINC is OpenCL, not OpenGL. Can you get glxgears to work with a specific GPU? Hmm, yes, it works on a particular display. If you can hook up your eGPU to a display and set up Wayland to work with it, you should be able to test OpenGL accel on it and compare with your other monitor. Maybe? I mean, I’m grasping at this point. If I think of anything, I’ll let you know, but I hope you can see where I’m going with this approach.
  • Last idea is to go full AMDGPU-PRO, but that has to be worse today than ever before. You’d really have to tear into the installer and fix it for Fedora. I’m not aware of anyone using -PRO anymore on Fedora. It used to work and break things…now it just doesn’t seem possible or necessary.

In the past when using an Nvidia GPU with Resolve I’m given the options “CUDA” and “OpenCL” in the GPU settings.

To my understanding, it’s used to import secure boot keys so that DKMS can work alongside secure boot.

I found them during my research into getting AMD GPUs to work on Fedora. During that one instance I mentioned where Resolve would start up and Minecraft would utilise the GPU, glxinfo was reporting my AMD GPU as being the primary display driver (rather than Intel internal graphics). Perhaps this is related to those kernel arguments?

The first GPU is my CPU’s Intel internal graphics. I’m not sure they support OpenCL, as if they did I would expect Resolve to start up and allow me to use Intel internal graphics as its GPU. To clarify, Resolve requires a GPU to get beyond the settings page and actually start up.

I’m unfortunately unable to do this.

I have documented my attempt at removing all the packages and simply relying on the amdgpu kernel driver. In short, the GPU was seemingly only seen by lspci, and no applications used it. Perhaps there is a kernel argument I could use to force it as the primary GPU so that all applications use it?

I use the “free” version of Davinci Resolve, rather than the paid “Davinci Resolve Studio” so the support I receive may be limited, however I have seen threads about AMD GPUs on Linux on their forums from this year so it may still be worth a go.

I was using “Minecraft” as my test application alongside Resolve as my expectation was if the kernel GPU driver was working correctly then Minecraft would automatically use the AMD GPU rather than my CPU. It is possible that the amdgpu kernel module is functioning correctly, and the only problem is for some reason my system prefers the CPU so uses that as the primary GPU. If I could figure out how to force the AMD GPU into being the primary GPU, the results of my tests with various package configurations may change.

This is an eGPU though, so ideally even if it is set as the primary GPU, if my system boots with it disconnected I definitely want to fall back to Intel internal graphics as the GPU so I’m not dumped into a TTY session with no Gnome GUI.

I have local copies of the AMD Main, AMD Proprietary, and AMD ROCM repositories so I should have full control of which AMD Pro packages I install without depending on the amdgpu-install package.

I have further configuration attempts which I have posted on the Framework forum, but I’m unsure if it’s appropriate to continue sharing verbose logs to this “solved” thread. Is it a better idea to link to the Framework forum here, or create a new thread on this forum so members of this forum can contribute ideas without needing to sign up to another forum?

I found them during my research into getting AMD GPUs to work on Fedora. During that one instance I mentioned where Resolve would start up and Minecraft would utilise the GPU, glxinfo was reporting my AMD GPU as being the primary display driver (rather than Intel internal graphics). Perhaps this is related to those kernel arguments?

I doubt it. I’m pretty confident those options are irrelevant. The ‘radeon’ driver doesn’t even work with your GPU, and you’re past CI generation, so AMDGPU is the only driver that could work.

I don’t think I see where you got your eGPU to show up as “primary”; looks like the same output both times.

The first GPU is my CPU’s Intel internal graphics. I’m not sure they support OpenCL, as if they did I would expect Resolve to start up and allow me to use Intel internal graphics as its GPU. To clarify, Resolve requires a GPU to get beyond the settings page and actually start up.

I’ve never had an integrated GPU, so I have no experience to lend, here. But, I know that, in the past, sometimes, they can be disabled in the BIOS. This may make your system unbootable, or it might just delay video output until later in the boot process where it finds your eGPU. It might just think it’s an embedded device, which is fine. You may want to configure SSHD or something for remote access. But, even if not, and if you can disable it, it would be worth trying. It’s not going to hurt anything to try.

I’m guessing your integrated GPU is supported for OpenGL via MESA, at least, but it looks like OpenCL isn’t working on it. Not sure where to look for that, but it doesn’t seem important.

I have documented my attempt at removing all the packages and simply relying on the amdgpu kernel driver. In short, the GPU was seemingly only seen by lspci, and no applications used it. Perhaps there is a kernel argument I could use to force it as the primary GPU so that all applications use it?

Perhaps; I don’t know. But, if it is possible to make your eGPU be your primary video card, then you don’t need your internal one. Again, if you can disable it, I’d try that. You will not get dumped to a TTY if you have NO working video adapter. And, if you only had the eGPU working, that would be very helpful for testing.

Sure, you’d like it to fall back, but that’s something your BIOS/UEFI config would have to support as a feature. As you have it now, you’re trying to make it work with two different GPUs. That’s not the most straightforward hardware configuration to get working. In the long past, X11 wouldn’t support two different drivers, IIRC, but most dual-head setups just had two identical cards or dual-head cards so nobody cared much. Even after that, there was SLI, so again, people usually had cards that worked together. I think your heterogeneous situation should work and is possible, but it requires features that are NOT required for more common setups.

Regardless, any AMD card that is supported by AMDGPU (or radeon) doesn’t need any third-party pkgs for basic functionality. AFAIK, and as I have shown in this thread for at least my cards, meaning RDNA+, even basic OpenGL/CL works without the AMD repo stuff. I still think you should go back to full Fedora pkgs. If you see it in clinfo, then your system is probably fine, and I get that without any non-Fedora pkgs. That’s what we’ve shown in this thread.

Since you say Resolve uses OpenCL, I suggest using an OpenCL app to test. clpeak will let you benchmark a specific GPU. You should definitely try that. If that works, then your system is functional and the problem is a compatibility issue between the app and the system, which may or may not be something you can fix on the system side.
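
Something like the following should do for the clpeak test. It is a sketch, not a recipe: the wrapper name is made up, and the platform/device indices of 0 are an assumption based on the clinfo output above listing a single platform with a single device. As far as I know, clpeak takes `--platform` and `--device` to pick which device to benchmark.

```shell
# Hypothetical wrapper: benchmark one OpenCL device with clpeak.
# Usage: bench_cl_device [platform-index] [device-index]   (both default to 0)
bench_cl_device() {
    local platform="${1:-0}"
    local device="${2:-0}"

    # Reject anything that is not a non-negative integer.
    case "$platform$device" in
        *[!0-9]*)
            echo "usage: bench_cl_device [platform-index] [device-index]" >&2
            return 2
            ;;
    esac

    if ! command -v clpeak >/dev/null 2>&1; then
        echo "clpeak not found; install it first" >&2
        return 127
    fi

    clpeak --platform "$platform" --device "$device"
}
```

If `bench_cl_device 0 0` prints bandwidth and compute numbers for gfx1031, the OpenCL stack is doing real work on the eGPU, and what remains is between Resolve and the system.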

I was using “Minecraft” as my test application alongside Resolve as my expectation was if the kernel GPU driver was working correctly then Minecraft would automatically use the AMD GPU rather than my CPU.

I’m pretty sure your AMDGPU stuff is fine. I don’t know Minecraft, but unless you know how to make it use a specific GPU on a multi-GPU system, then it’s not much help in this test. For glxgears to work, you need to have a display working on the GPU, but you should be able to do that; I’m assuming you see both GPUs as adapters in GNOME under Wayland.
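
One way to point a single OpenGL program at the second GPU is Mesa’s DRI_PRIME environment variable. The sketch below compares the renderer string with and without it. Assumptions: both GPUs are driven by Mesa, the eGPU is the non-default one so `DRI_PRIME=1` selects it, and the helper names are made up. Whether this behaves the same with a hot-plugged Thunderbolt eGPU is something I have not verified.

```shell
# Print the renderer name from `glxinfo` output read from stdin.
gl_renderer() {
    sed -n 's/^OpenGL renderer string: *//p'
}

# Show which GPU renders by default and which one with PRIME offload.
compare_renderers() {
    if ! command -v glxinfo >/dev/null 2>&1; then
        echo "glxinfo not found" >&2
        return 127
    fi
    echo "default:     $(glxinfo | gl_renderer)"
    echo "DRI_PRIME=1: $(DRI_PRIME=1 glxinfo | gl_renderer)"
}
```

If the second line names the AMD card, the same prefix can be put in front of the program under test (for example `DRI_PRIME=1 glxgears`) without changing which GPU the desktop itself uses, so the Intel fallback stays in place when the eGPU is unplugged.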

I have local copies of the AMD Main, AMD Proprietary, and AMD ROCM repositories so I should have full control of which AMD Pro packages I install without depending on the amdgpu-install package.

Finally, and this is why I started this thread, the pkgs are NOT all that amdgpu-install does. You CANNOT install AMDGPU-PRO by installing pkgs from the AMD -PRO repo. In fact, you cannot install AMDGPU this way, either. The only thing you can do with the AMD repos is get pkgs COMPATIBLE with the Fedora AMDGPU driver and install them. This can add functionality, but not all pkgs there are compatible and there’s nothing you can do to make it work if it doesn’t. This is the whole thing with this thread. AMDGPU-PRO compiles a new kernel module for your system with proprietary blobs. AFAIK, this is guaranteed to fail on Fedora. I’ve tried many times in the past. There are people who can tear into the installer and have said they can get it to work by manually extracting packages and fixing multiple scripts. I assume I could do this, but it would take a long time for me to figure it out, and no one has published on this method lately that I’ve found. Let me know if you do.

Not that I want AMDGPU-PRO. If I did, I would have given up on Fedora long ago. If you want AMDGPU-PRO, I’d say Fedora is the wrong distro.

Is it a better idea to link to the Framework forum here, or create a new thread on this forum so members of this forum can contribute ideas without needing to sign up to another forum?

Yeah, I think you should probably start a new thread. I appreciate thinking it through with you, though. I think you should link here so we can follow you.

I’ve created a new thread for my specific issue here.

Just to follow up with yet another test, I can confirm, yet again, that the solution to this thread is: just don’t. You DO NOT NEED amdgpu-installer on Fedora.

I got myself a 7900, which is sooner after release than I usually pick up hardware. I had some trouble that made me question my understanding, but I have confirmed that OpenGL and OpenCL are fully supported, and it seems they have been supported from (nearly?) the initial release day.

So, don’t get tricked into fiddling with extra pkgs for these two features. Fedora has you covered. For OpenCL, just make sure you don’t also have Mesa’s OpenCL implementation installed. It doesn’t work, and some software doesn’t let you choose which OpenCL stack to use, so its presence can confuse your program. It doesn’t prevent the ROCm stack from working, though.
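
A quick way to check for a stray Mesa OpenCL install is to look at the registered ICD files. This is a sketch under assumptions: I believe the Fedora package is called mesa-libOpenCL and that it registers itself as mesa.icd or rusticl.icd under /etc/OpenCL/vendors, but verify the names on your own system. The function name is made up.

```shell
# Hypothetical helper: list ICD files that look like Mesa's OpenCL.
# Usage: mesa_icds [vendors-dir]   (defaults to /etc/OpenCL/vendors)
mesa_icds() {
    local dir="${1:-/etc/OpenCL/vendors}"
    local f
    for f in "$dir"/*.icd; do
        [ -e "$f" ] || continue          # the glob matched nothing
        case "$(basename "$f")" in
            mesa*|rusticl*) echo "$f" ;; # assumed Mesa ICD file names
        esac
    done
}
```

If `mesa_icds` prints anything, `rpm -qf` on that file should tell you which package owns it, and removing that package leaves only the ROCm ICD for applications to find.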

Finally, your application may need more than just OGL & OCL. There are many other features implemented on top of ROCm, like PRIMs, BLAS, FFT, etc., but these may not be working, yet. There have been some nice tables showing support for each sub-component by GPU, but I cannot find a static location where the most up-to-date information can be found, easily. The code for ROCm is OSS on GitHub, but the documentation is not easy to find/read. Maybe refer to your particular application’s community.

But, what I have learned is that, “full support” should appear before the professional cards based on the same architecture come out. So, for example, 7900 is gfx11. When there is a professional card based on gfx11, then you can be sure ROCm will have support for it. Such cards were just announced earlier last week or something. Information about ROCm & the pro cards is easy to find in the ROCm documentation/release information under “Requirements”. (AMD Documentation - Portal) But, as I say, functionality will likely be available earlier than that, but not much earlier.
