I switched from an RTX 3060 Ti to an RX 6900 XT, and I’ve been having a lot of issues with system (mostly graphics) stability.
To explain the issue: anywhere from every few seconds to every few minutes, the system hitches. My cursor (and everything else on screen) freezes for a split second, and about half the time the audio glitches (pops/crackles) too. It gets worse the longer the system has been on, to the point where it happens several times per second, and eventually my primary monitor freezes completely; I then have to turn both monitors off for a while before it goes back to normal. (Edit: Just had it happen where nothing I did would unfreeze either of my displays, had to REISUB it.)
I never once had issues like this on my Nvidia card, so I don’t know what’s up with this.
The only other factor I can think of is I’m currently on an ungrounded outlet (GFCI protected, but ungrounded) - but from what I could tell from brief searching, this shouldn’t have any direct effect on system stability. I can probably get an extension cord to the basement if I really had to test that.
Maybe reduce the HDMI version in the monitor settings.
The amdgpu open-source driver on Linux is currently limited to the HDMI 2.0 standard for its core functionality. While modern AMD Radeon hardware (RX 6000 series and newer) physically supports HDMI 2.1, the amdgpu driver cannot provide full HDMI 2.1 features—such as 4K at 120Hz—due to licensing restrictions imposed by the HDMI Forum.
I don’t know how to do this, but I don’t think I even can anyway. My card has one HDMI port, and plugged into it is an older Dell monitor, 1920x1080 60hz. I don’t think HDMI 2.0 was even around when this monitor was built, let alone 2.1.
My other monitor (primary) is a Dell G2724D, 2560x1440 at 165 Hz (but I run it at 144), plugged into the first (middle) DP port. The second DP port is occupied by my VR headset (it was not connected when these issues started, so it’s unrelated).
I’m already using all three of my ports, best I could do is HDMI>DP but I don’t see how that would actually fix anything (since the origin port is still HDMI), unless it really would make a difference.
Okay, well that answers that.
No errors whatsoever while any of this is going on. (There are unrelated logs, but not errors.)
Fun extra note: after my REISUB incident, and a couple other unrelated restarts, I never had a single hitch for say, two hours afterwards (until I shut it down for the night). I wish this issue was more consistent, but I also can’t complain too much when it decides to disappear entirely.
One of those restarts was to roll back to Mesa 25.2.4 for issues I was having with VR. Unfortunately the hitches are still happening as I currently type this, so that hasn’t helped this problem at all (not that I expected it to).
You mean from HDMI to DP? That does not work.
You can go from DP to anything else.
DP is the native implementation on all GPUs, as I understand it.
They then convert down to HDMI etc.
This may be something else, but my Lenovo laptop has an AMD GPU, [AMD/ATI] Rembrandt [Radeon 680M], and an Nvidia one, [GeForce RTX 3060 Mobile].
When I do not install the Nvidia drivers, my system is also unstable. I need to install and use the Nvidia drivers and GPU to make it stable. Sometimes it would be nice to just use the AMD one to save battery power, but with this system and these OSes it’s not possible.
So, it’s appearing that HDMI was/is the issue. I’ve unplugged the HDMI cable from my card, and for the entire day I haven’t had a single hiccup, even while running VR.
Considering I would like to have two monitors, as well as run VR, but I only have two DP ports, what are my options? Ideally I would like the HDMI port to just work, whatever that takes. But I could probably use a display switch if I had to.
Wrong again! I just had another case where my keyboard completely stopped responding, which forced me to reboot the system, and upon restarting I’m immediately getting the hitches again, even without my HDMI monitor plugged in. And yet again, absolutely no logs in journald, between losing keyboard input and telling the system to restart. I have no idea what to do at this point.
I’m more puzzled by what could have allowed it to last for the entire day without a single issue, and then it’s suddenly back again.
Some further investigations:
Ran systemctl --user restart wireplumber pipewire pipewire-pulse in terminal
Random popping stopped
Opened sound panel from tray icon, several pops happen and then it starts doing the random ones again
Restarted everything again, opened Discord, it popped once, and after about 10 seconds it locked up my entire display and I had to turn my monitor off and back on for it to unfreeze
So it seems like opening the sound panel OR opening Discord causes the popping to start (both actions briefly touch the sound devices), which is weird because I had Discord open for the entire day when it was working fine. Worth noting that if I restart the sound system, I also have to restart Discord or no audio will go in or out of it.
So I found the missing factor that made it work fine for the whole day. I was having audio dropouts in VR, and was directed to this guide, which did solve the issue but also gives an immense amount of delay to all system audio (maybe 100-200ms or so?). Apparently it also solved the random popping and freezes. The reason it stopped working just now is because I had disabled that config file again and restarted all the audio services, thinking it was unrelated, but apparently that was the tape holding it all together.
So… this isn’t even graphics related whatsoever. It’s the audio system freaking out. But that’s strange, since my audio setup hasn’t changed at all between switching my graphics cards, so I don’t get why that would suddenly come up. Question is, what can I do with that? Should I just make a new post at this point? Are there some other things I can try to see if they would fix the issue?
Unfortunately that did not change anything.
Shortly after booting with the downgraded versions of those two packages, I was hit with another system-completely-unresponsive hangup with severe graphical glitches on screen, which forced me to hold the power button to turn off the system. So basically no different than what’s already been happening…
I’ve continued experimenting with the alsa config file, raising api.alsa.headroom on every restart, since 8192 worked but caused a significant amount of audio delay. I started at 128 and have doubled it twice since then, to 512, and what I’m noticing is that every time I double it, it takes twice as long for the hitches to start occurring. This feels like some kind of memory leak, or more accurately a process leak? Like something is continually taking up more and more headroom until everything crashes.
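For anyone following along, here is a rough sketch of how much delay each headroom value adds. It assumes api.alsa.headroom is counted in samples and that the device runs at 48 kHz (the usual PipeWire default, but not something confirmed in this thread); the function name is just for illustration.

```python
# Assumption: 48 kHz sample rate; the thread never states the actual rate.
SAMPLE_RATE_HZ = 48_000


def headroom_delay_ms(headroom_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a headroom given in samples to the extra delay in milliseconds."""
    return headroom_samples / sample_rate_hz * 1000.0


# The values tried so far, the next planned one, and the original 8192.
for headroom in (128, 256, 512, 1024, 8192):
    print(f"{headroom:>5} samples -> {headroom_delay_ms(headroom):6.1f} ms")
```

Under those assumptions 8192 samples works out to about 171 ms, which lines up with the 100-200 ms delay estimated by ear earlier, while 512 is only around 11 ms.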
In any case, my current data point: I have the headroom set to 512, I’ve been booted for about an hour and 45 minutes, and the hitching started maybe 10-15 minutes ago. It has continually gotten worse, to where it’s happening at least twice (and up to ten times or more) every minute. I will be trying 1024 next and keeping an eye on how long it takes to deteriorate.
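If the "double the headroom, double the time" pattern really is linear, the single data point above can be extrapolated. This is only a sketch of that hypothesis: the 90 minute onset is rounded from "booted 1h45m, hitching started 10-15 minutes ago", and the linear scaling itself is an unverified assumption, not a measured result.

```python
# Reference data point from the post: headroom 512, onset after roughly 90 minutes.
REF_HEADROOM = 512
REF_ONSET_MIN = 90.0  # rounded estimate, not a precise measurement


def projected_onset_min(headroom, ref_headroom=REF_HEADROOM, ref_onset_min=REF_ONSET_MIN):
    """Project time until hitching starts, assuming it scales linearly with headroom."""
    return ref_onset_min * headroom / ref_headroom


for headroom in (1024, 2048, 8192):
    minutes = projected_onset_min(headroom)
    print(f"headroom {headroom:>5}: ~{minutes:.0f} min (~{minutes / 60:.1f} h)")
```

By this estimate 1024 should hold for around three hours, and 8192 for roughly a day, which would at least be consistent with the full trouble-free day seen with the guide's config. The 1024 test is a direct check of whether the linear assumption holds.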
I’ve been using the Kjournald browser to view my logs, but there’s absolutely nothing relevant showing up in there. I’ll look into btop and see if it shows me anything useful.
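Since the journal shows nothing, one option is to timestamp the hitches directly and compare those times against what btop or the logs show. Below is a minimal sketch of a stall detector: it sleeps in short intervals and flags wake-ups that arrive much later than requested. The function names and thresholds are made up for illustration, and it assumes the hitch also stalls ordinary userspace processes, which may not be the case if only the display or audio path is affected.

```python
import time


def find_stalls(timestamps, interval_s, threshold_s):
    """Return (timestamp, lateness) pairs where a wake-up was late by more than threshold_s."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        late = (cur - prev) - interval_s
        if late > threshold_s:
            stalls.append((cur, late))
    return stalls


def sample(duration_s, interval_s=0.01):
    """Collect monotonic timestamps by sleeping interval_s at a time for duration_s seconds."""
    stamps = [time.monotonic()]
    end = stamps[0] + duration_s
    while stamps[-1] < end:
        time.sleep(interval_s)
        stamps.append(time.monotonic())
    return stamps
```

For example, `find_stalls(sample(60), 0.01, 0.05)` would record one minute of wake-ups and list any that were more than 50 ms late. Counting stalls per minute over a session would also give a less subjective measure of when the deterioration begins for each headroom value.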