GNOME/X hang

Hello everyone! :wave:

Recently I’ve been noticing more problems with GNOME/X. While this one does not occur frequently, it is disastrous when it does, as it leaves the system unusable.

The desktop hangs and applications stop working correctly; Spotify, for example, stops playing audio.

The system is not completely locked up, though: “num lock” still responds, services continue to run, and messages continue to be logged.

I’ve attempted to switch to another virtual terminal for recovery (e.g. Ctrl+Alt+F3), but it just gives me a flashing underscore instead of a login prompt. I’m not sure what else to do at that point, so I reboot the system.

Looking through journalctl, I see very little which is helpful. It starts off with gdm complaining about my system being too slow, which I find hard to believe is accurate.

Jul 02 17:40:12 fedora /usr/libexec/gdm-x-session[2992]: (EE) client bug: timer event9 debounce: scheduled expiry is in the past (-97ms), your system is too slow

There is not much else that is helpful. However, one thing stands out to me.

Jul 02 17:41:01 fedora systemd-oomd[1447]: Killed /user.slice/user-1000.slice/user@1000.service/app.slice/app-gnome-opera-19368.scope due to memory pressure for /user.slice/user-1000.slice/user@1000.service being 63.22% > 50.00% for > 20>

As an aside, this line is incomplete. What is the best way to copy such a long line to the clipboard? I tried triple-clicking it, but that only selected the visible portion.

I searched for this message but did not find anything I could use. I’m not sure exactly what it means. It sounds like Opera was using about 63% of my memory. That is a massive amount, but it should not be enough to hang the system so drastically.

It also sounds like it was killed, so that should not cause the hang at all.

However, it seems something is using too many resources, causing the system to become unresponsive for an extended period of time.

How would I go about monitoring this, or finding the root cause of the problem? I’m not sure what steps to take, given that the system is practically unusable in this state.

There is no warning as far as I’m aware either. Everything is running fine one second, then the desktop hangs the next.
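For what it’s worth, one idea I’m considering is watching the kernel’s pressure-stall information (PSI), since the oomd message above is pressure-based. This is only a sketch, and it assumes a kernel with PSI enabled (Linux 4.20 or newer); the polling interval is an arbitrary choice of mine:

```shell
# Print the system-wide "some" memory-pressure line from PSI.
# systemd-oomd acts on per-cgroup pressure; these global averages should
# still spike visibly before the desktop hangs.
sample_pressure() {
    grep '^some' /proc/pressure/memory
}

# Intended usage from a second terminal (or over ssh) while the problem builds:
#   while sleep 5; do date +%T; sample_pressure; done
sample_pressure
```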

Thanks for reading, as well as any ideas! :slight_smile:

FF

This happened again shortly after my first post, and I realized it coincided with starting my backup script (a bash script running duplicity).

I waited more patiently for another terminal this time, and there I found top reporting numerous processes each using well over 100% CPU. That struck me as impossible, but it is what was being reported.

Because it started with the backup script, I killed duplicity, which did not help.

Because of the massive apparent CPU overuse, I suspected there was a problem with how the numbers were being reported. On a hunch I powered down my running virtual machine, which caused the CPU load to drop, and then everything started responding again.
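A note on those percentages, which I only pieced together afterwards: top’s default (“Irix”) mode reports each process’s %CPU relative to a single core, so a multi-threaded process such as a VM can legitimately show several hundred percent. A quick sanity check of the ceiling on any host:

```shell
# In top's default Irix mode, a process with N busy threads can show up to
# N * 100% CPU, so the per-process ceiling is cores * 100%.
# (Pressing Shift+I in top toggles Solaris mode, which divides by core count.)
cores=$(nproc)
echo "cores: $cores -> per-process %CPU ceiling in top: $((cores * 100))%"
```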

It’s great that I’ve found the root of the problem! :smiley:

However, I do not understand it at all. I see no connection between duplicity and my virtual machine. The virtual machine is not being backed up.

I’d appreciate some suggestions for next steps on understanding what is going on and stopping the problem from recurring.

Thank you! :slight_smile:

FF

The VM consumes resources while it is running. You had to assign a certain number of CPU cores and an amount of memory to that VM, and those resources are no longer available to the host while the VM is running. Depending on how many CPU cores and how much memory remain, that could be triggering the slowdown with processes that are memory- or CPU-intensive.

Thanks for your input JV!

Yes, of course the VM is consuming resources while it is running. :slight_smile: The VM runs fine all day long while consuming resources.

I’m quite surprised to read that the resources assigned to the VM are not available to the host. Not only does VMware ESX work differently (which is where most of my virtualization experience lies), but I also noticed that the Red Hat documentation on KVM talks about over-committing resources. As far as I understand, over-committing requires precisely that those resources remain available to the host.
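For concreteness, this is roughly how I would compare the host’s capacity with what the guest holds while running. nproc and free are standard; the virsh lines are commented out because they assume a libvirt/KVM setup, and “myvm” is a placeholder guest name:

```shell
# Host-side view of capacity while the guest is up.
nproc         # host CPU threads
free -h       # host memory; note the "available" column
# The following assume libvirt/KVM; "myvm" is a placeholder guest name.
# virsh vcpucount myvm     # vCPUs assigned to the guest
# virsh dommemstat myvm    # guest memory statistics as libvirt sees them
```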

Regardless, this does not explain why processes which normally run all day long suddenly use vastly increased amounts of system resources during the backup.

I will try re-nicing the backup process; perhaps that will help.
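Concretely, I plan to launch it along these lines. The quoted echo is only a stand-in for my actual duplicity invocation, and ionice is Linux-specific (util-linux):

```shell
# Run the backup at the lowest CPU priority (nice 19) and in the idle I/O
# scheduling class (ionice -c 3) so it yields to interactive work.
# The quoted command is a stand-in for the real duplicity call.
nice -n 19 ionice -c 3 sh -c 'echo backup running at low priority'
```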

Although this article is old, the intricacies of over-commitment are discussed here. No matter how much is available for sharing, the difficulty is in actually doing so. You said the problem went away when you shut down the guest, so it seems reasonably certain the issue is a lack of resources to handle the task while the VM is running.


Thank you JV, I have bookmarked the article to read when I have more time available to do so.

I agree with your analysis that the problem is related to a lack of available resources. While this provides an immediate workaround, I am still not comfortable with my understanding of the details of this situation.

Even if all of the resources assigned to the guest are reserved, I am surprised and unhappy with how the system responds under load. This is certainly not the Red Hat-based kernel I remember, though I am aware that there are crucial differences between a Red Hat-based kernel and a Red Hat kernel.

I will read the article as well as investigate further on my end. Do you, or anyone, have a suggestion for a good place to leave my feedback on this issue?

I fear it may not be treated as important, but I would still like to share my findings about how the system behaves under load like this, in hopes it will be improved.

FF