Soft lockups reported after live migrating KVM guests

Problem

For non-idle QEMU/KVM guests running on a ppc64 LPAR with Fedora 40 (F40) as the QEMU/KVM host, soft lockups might be reported after live migration completes. The guest may or may not recover from these lockups.

Cause

The issue is primarily caused by some of the registers that make up the KVM guest's state not being saved and restored during guest migration. These are:

  • MMCR3
  • SDAR
  • DEXCR
  • HASHKEYR

Related Issues

Bugzilla report: #2293597

Workarounds

No workarounds are available at the moment; however, rebooting the QEMU/KVM guests after soft lockups are reported works.

A couple of kernel/QEMU patches have already been merged upstream and are mentioned in the associated Bugzilla report. Once these patches make it into F40's QEMU/kernel builds, the problem will be addressed.

Thanks for the writeup. I currently think that this doesn’t affect a large enough portion of our userbase to be included in Common Issues. It is a very concrete and specialized bug that most people won’t hit. But you can try to convince me :slight_smile:

From Proposed Common Issues to Ask Fedora

Kernel patches to fix this issue were proposed and merged via the patchset "[PATCH v2 0/8] KVM: PPC: Book3S HV: Nested guest migration fixes" and are available in Fedora kernel linux-6.11.11 and later.
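
If it helps, here is a rough way to check whether the kernel you are running is at or past that version. This is only a sketch: the helper names `parse_kernel_version` and `has_migration_fixes` are made up for illustration, and it assumes the release string starts with a dotted version, as in `6.11.11-300.fc41.ppc64le`. It compares version numbers only, so it says nothing about the QEMU side of the fix or about patches backported into an older build.

```python
import platform
import re

# First Fedora kernel version said to carry the fixes (from the note above).
FIXED_IN = (6, 11, 11)


def parse_kernel_version(release):
    """Return (major, minor, patch) parsed from a kernel release string.

    Assumes the string begins with a dotted version such as
    '6.11.11-300.fc41.ppc64le'; a missing patch level is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def has_migration_fixes(release):
    """True if the version number is at least FIXED_IN (version check only)."""
    return parse_kernel_version(release) >= FIXED_IN


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    if has_migration_fixes(release):
        print(f"{release}: at or above 6.11.11, kernel fixes should be present")
    else:
        print(f"{release}: older than 6.11.11, kernel fixes are likely missing")
```

Run it on the machine whose kernel you want to check; a "likely missing" result on a kernel older than 6.11.11 would be consistent with the soft lockups described above.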