Stack-protector: Kernel stack is corrupted in __x64_sys_poll

noblehatiki · July 18, 2025, 11:46am

In the last couple of days I’ve started having strange kernel issues. The first was about four days ago: an exception (IIRC a page fault) in a kernel thread that left it stuck in a spin loop. After updating and rebooting, I’ve had three panics with messages about a clobbered stack canary in __x64_sys_poll. The machine is currently running memtest, so the information I can provide is a bit limited at the moment, but here’s an overview:

OS: Fedora 42, kernel is whatever is currently in updates-testing
The system is about five months old
CPU: Intel Core Ultra 9 285k
Memory: 4x Crucial CP32G56C46U5. It was running on a 5200MT/s XMP profile, but the issue has persisted after disabling XMP
Motherboard: ASUS PRIME Z890M-PLUS WIFI. I tried updating to the latest firmware and the issue still persists.
GPU: AMD Radeon RX 7800 XT

Every occurrence so far has been whilst playing Kerbal Space Program in Wine, but the system has never previously had issues either running KSP or running other workloads, so I’m not sure what to make of that. I’m not aware of any obvious trigger; the last time it happened was after a couple of hours of running KSP.

Things I’ve tried:

Updating to the latest kernel
Digging around the forums and the Red Hat and kernel Bugzillas for reports of similar issues, but I haven’t found anything similar
Running memtest. No errors were reported in the full pass that I ran yesterday, and none so far in the one-and-a-half passes of the run that is currently in progress
Updating the BIOS
Disabling XMP

Theories:

Genuine stack-overflow bug in the kernel. It seems unlikely that I alone would be experiencing a serious bug in one of the most used syscalls in the kernel.
Canary overwritten by exploitation attempt. This also seems unlikely since there are surely simpler ways to exploit a single-user Linux system.
Hardware issue. Given the newness of the system and that I’m more used to working with older components, it seems pretty likely that I’ve misinstalled or misconfigured something that’s leading to memory corruption. But it seems very odd that it’s been running perfectly fine for months, that the issue that manifests is specifically a clobbered canary in poll, and that memtest isn’t showing any issues.

I’m at a loss for how to go about further diagnostics. Unfortunately semester starts next week, so a half-dozen day-long memtest runs after swapping out DIMMs or reseating components is not really ideal. Any assistance in narrowing down the issue would be greatly appreciated.

barryascott · July 18, 2025, 4:31pm

Does this mean you were over-clocking the memory?

It is possible that the code that detected the problem is not the code that caused the problem.

noblehatiki · July 18, 2025, 4:55pm

Yes, although it’s not like I was manually dialing in values to push beyond manufacturer specifications. The RAM kits are labelled on the box as operating at 5600MT/s, and the XMP profiles that allow such speeds are stored in on-DIMM ROMs which are automatically applied by the BIOS.

That’s true, but I don’t understand your point. Does that hint at something I should be trying?

barryascott · July 18, 2025, 5:00pm

A bug in component X can corrupt memory and crash component Y.
So if you ask why no one else sees the problem when everybody uses Y, the answer may be that others do not have X on their systems.
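To make that concrete, here is a toy sketch in Python. It is only an illustration of the idea, not how the kernel implements the stack protector: `ToyStack`, `toy_poll` and `buggy_component_x` are made-up names, and the "stack" is just a list of words. The point is that the check lives in Y (`toy_poll`), so Y is the name that ends up in the message, even though the bad write came from X.

```python
import secrets


class ToyStack:
    """Toy model of a stack as a flat list of words (not real kernel behaviour)."""

    def __init__(self, size=16):
        self.words = [0] * size
        # Random canary value; the top bit is forced on so it can never
        # collide with the small constant the buggy writer uses below.
        self.canary = secrets.randbits(64) | (1 << 63)


CANARY_SLOT = 8  # hypothetical position of the canary in toy_poll's frame


def toy_poll(stack, other_component=None):
    """Component Y: places a canary on entry and verifies it before returning."""
    stack.words[CANARY_SLOT] = stack.canary  # "prologue": store the canary
    if other_component is not None:
        other_component(stack)  # something else runs while Y's frame is live
    if stack.words[CANARY_SLOT] != stack.canary:  # "epilogue": verify the canary
        return "stack-protector: stack is corrupted in toy_poll"
    return "ok"


def buggy_component_x(stack):
    """Component X: a hypothetical bug that writes one word to the wrong place."""
    stack.words[CANARY_SLOT] = 0xDEADBEEF


if __name__ == "__main__":
    print(toy_poll(ToyStack()))                     # Y on its own
    print(toy_poll(ToyStack(), buggy_component_x))  # X clobbers Y's canary
```

In the second call, the report blames `toy_poll` because that is where the check runs. The same kind of stray write could equally come from a driver bug or from faulty hardware.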

Does that help?

noblehatiki · July 18, 2025, 5:13pm

I understood what you meant. It just sounded like I was meant to glean something about how to proceed from your observation, and I wasn’t sure what that was.

leorize · July 18, 2025, 8:27pm

Since the issue is reproducible without XMP, I believe at this point you should file a bug with Fedora. The Fedora kernel maintainers would be able to help you nail this down much faster (they usually aren’t on the forums afaict).

It would also be helpful for you to list out the kernel versions you have tried, and whether any of them did not exhibit the problem, given that you said it’s a recent occurrence.
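If it helps with putting that list together, here is a minimal sketch of one way to keep track. The names `BootRecord`, `summarize` and `clean_kernels` are made up for this example, and `kernel-A` / `kernel-B` are placeholders for whatever `uname -r` reported on each boot, not real version strings.

```python
from dataclasses import dataclass


@dataclass
class BootRecord:
    kernel: str     # placeholder for the `uname -r` string of that boot
    panicked: bool  # True if this boot ended in the stack-protector panic


def summarize(records):
    """Map each kernel version to a (boots, panics) tuple."""
    summary = {}
    for rec in records:
        boots, panics = summary.get(rec.kernel, (0, 0))
        summary[rec.kernel] = (boots + 1, panics + int(rec.panicked))
    return summary


def clean_kernels(summary):
    """Kernel versions that were booted at least once and never panicked."""
    return sorted(k for k, (boots, panics) in summary.items() if panics == 0)


if __name__ == "__main__":
    # Placeholder data only; fill in your own boots.
    records = [
        BootRecord("kernel-A", False),
        BootRecord("kernel-B", True),
        BootRecord("kernel-B", False),
    ]
    summary = summarize(records)
    for kernel, (boots, panics) in sorted(summary.items()):
        print(f"{kernel}: {boots} boot(s), {panics} panic(s)")
    print("never panicked:", clean_kernels(summary))
```

A kernel with only one or two clean boots is weak evidence, since the last panic took a couple of hours to show up, so it is worth noting roughly how long each boot ran as well.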

If you wanted to stress test your system further, stressapptest, mprime and y-cruncher are great tools for stressing both RAM and CPU. When it comes to stability testing, it’s recommended that you employ a wide range of tools, as certain faults might only show up in one tool but not another.
