F43 bricked by hardware issue (btrfs IO failure)

I was gaming last night when Steam crashed and I was forced to hard reboot the PC. A while later Firefox also crashed, and again I had to hard reboot. This second time I was also in the middle of updating my OS to Fedora 44 (it had downloaded 19% at the time; I don’t think it had started the actual install). After the second reboot I got a btrfs error on my main partition (sda3), reporting an IO failure.

I booted into an F43 live USB I had sitting around and tried following the instructions here. I first tried mounting the sda3 partition, which didn’t work: I got mount: /mnt: can't read superblock on /dev/sda3. Dmesg showed quite a few different errors. I’ve saved the dmesg and journalctl output to text files and can post parts of them, but I haven’t attached them here because I don’t know whether they are safe to share in full.

btrfs check /dev/sda3 got to section 3/8 and told me checksum verify failed, before spitting out huge amounts of errors, filling up the terminal. These couldn’t be piped into a file for some reason, so I’ve left them out.
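A likely reason the errors could not be piped into a file is that they were written to stderr, which a plain pipe or `>` redirect does not capture. A minimal sketch of capturing both streams, assuming Python 3 is available in the live session; `run_and_log` and the log filename are hypothetical names used only for illustration:

```python
import subprocess


def run_and_log(cmd, log_path):
    """Run cmd, writing stdout and stderr to the same log file.

    Returns the command's exit code.
    """
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT merges the two streams, like `2>&1` in a shell.
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode


# Example (read-only check; needs root):
# run_and_log(["btrfs", "check", "--readonly", "/dev/sda3"], "btrfs-check.log")
```

Writing the log to a different drive than the one being checked avoids adding writes to a filesystem that is already failing.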

I then tried restoring /dev/sda3 onto a spare 500GB HDD I had lying around, but this also didn’t work, because the computer froze. The live USB session then hit a kernel panic, which I didn’t think would even be possible. On the next boot the live USB said:

error: ../../grub-core/loader/i386/efi/linux.c:159:can’t allocate kernel. out of memory.
error: ../../grub-core/kern/mm.c:552:out of memory

So I downloaded F44 and updated the live USB; this has allowed me to boot back into a live session and continue the restore. So far it seems to be working. Fingers crossed that the process can finish before another kernel panic. I’m also hoping to be able to access /run/initramfs/rdsosreport.txt.

I do have snapshots on my NAS that I can restore most of my data from if worst comes to worst (I use syncthing to always have the NAS data identical to the PC, and QNAP’s snapshot function takes care of the rest). But this is not ideal; I would rather restore the existing data.

The SSD in question is being reported by my system as a Samsung SSD 750 EVO 250GB. These launched in October 2016 and are no longer in production. Is there an easy way I can check the state of the hardware? I’m still not exactly sure what caused the initial errors, and so far have just assumed that they were related to Firefox or Steam. But if it’s a failing SSD, that could explain the system freezes too.

I’ll keep this thread updated with a resolution, but for the time being I just wanted to get some advice on how to get my computer working again and prevent this happening in future; it’s a very disruptive event.

Update: I was able to check the drive health for the 250GB Samsung SSD. The overall assessment is that the disk is okay.

I think I’ve found the problem. The restore process hasn’t quite finished but I’ve already copied 254GB off the SSD. Given that it’s a 250GB SSD, I think it’s safe to say that it was completely full. That explains why my computer would be experiencing regular freezes as well as a kernel panic.
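The copied total is only a rough indicator: drives are sold in decimal gigabytes, and restored data can take more space than it did on disk if the filesystem used compression or shared extents between snapshots. A minimal sketch for checking this more directly, assuming Python 3; `gb_to_gib` and `percent_used` are hypothetical helper names:

```python
import shutil


def gb_to_gib(gb):
    """Convert decimal gigabytes (10**9 bytes) to binary gibibytes (2**30 bytes)."""
    return gb * 10**9 / 2**30


def percent_used(path):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


print(f"{gb_to_gib(250):.1f} GiB")  # a drive sold as 250 GB; prints 232.8 GiB
print(f"{percent_used('/'):.0f}% used on /")
```

On btrfs the figure from `shutil.disk_usage` is an approximation; `btrfs filesystem usage` on the mounted filesystem gives a more detailed breakdown.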

I will keep working on this issue but if a full disk is the issue, I can fix that. Will update later.

After further investigation, it appears that the SATA cable is also a culprit. A nonzero CRC error count is not great. Dmesg is also showing that there have been over 100 power loss events over the drive’s lifetime, which is apparently not good for an SSD. That probably reflects all the hard rebooting I’ve had to do with various issues over the years.
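For anyone wanting to check the same counters, here is a minimal sketch that reads them from smartctl's JSON output, assuming smartmontools 7.0 or later and root access. The attribute IDs are an assumption for this drive: 199 is commonly UDMA_CRC_Error_Count (link or cable errors), and 235 is usually labelled POR_Recovery_Count on Samsung SATA SSDs. `summarize_smart` and `read_smart` are hypothetical names:

```python
import json
import subprocess

# Assumption: these attribute IDs are the relevant ones for this drive.
WATCHED_IDS = (199, 235)


def summarize_smart(report, watched_ids=WATCHED_IDS):
    """Pick the overall verdict and selected raw counters out of a
    parsed `smartctl --json` report (a plain dict)."""
    passed = report.get("smart_status", {}).get("passed")
    table = report.get("ata_smart_attributes", {}).get("table", [])
    counters = {
        row["name"]: row["raw"]["value"]
        for row in table
        if row.get("id") in watched_ids
    }
    return {"passed": passed, "counters": counters}


def read_smart(device="/dev/sda"):
    """Run smartctl on `device` and summarize the result (needs root)."""
    # smartctl's exit status is a bitmask of conditions, so a nonzero
    # code is not treated here as a failure to run.
    proc = subprocess.run(
        ["smartctl", "--json", "-H", "-A", device],
        capture_output=True,
        text=True,
    )
    return summarize_smart(json.loads(proc.stdout))
```

A CRC counter that keeps climbing after the cable has been replaced would point away from the cable and towards the port or the drive itself.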

I have ordered a 2TB NVMe SSD and will copy over the data that I’ve restored from the old SSD.

Can you fix the title please as this is nothing to do with Firefox?

Sorry, I initially thought the Firefox crash was what caused the bricking, when the causal chain was actually in the opposite direction. Thanks to whoever updated the title.

I will mark this as resolved.