Frequent crashes due to SSD error on Lenovo ThinkPad

Hi,

In recent months, I’ve been experiencing system crashes. This usually happens after closing and reopening the laptop lid, but I’ve also had random crashes where the computer freezes or fails to save a document.

After these crashes, when trying to shut down, I often see a “failed to unmount” error. About 25% of the time, Fedora then fails to boot. In these cases, I need to plug the laptop in, wait a few minutes, and try again to get it to start.

Today’s crash was particularly severe. When I opened my laptop, I was confronted with a “2100: M.2 not detected” error, which occurred after a crash. I had to plug the laptop in and wait 15 minutes before it would boot.

I’ve updated my firmware to the latest version and kept Fedora up to date. The next step would be to reinstall Fedora, but these repeated issues make me consider switching OS—or even returning to Windows—which is not ideal. I find Fedora much faster and lighter, and I want to support it. However, as a writer, I rely on my computer every single day.

Thank you for your help.

Your next step should be running checks on the disk to make sure it’s not failing. If it is failing, it doesn’t matter which OS you reinstall or move to, because the drive is going to die eventually anyway.

2100 errors occur for a few reasons. If the laptop has been dropped, kicked, bumped, or looked at the wrong way, it’s possible that the M.2 drive has become unseated and/or is sitting crooked. Follow the guides to reseat the drive and make sure it has a solid connection in its slot. Then try again. If it’s still having issues, perform SMART testing on the drive to see if it’s throwing any errors.

A couple of entertaining solutions for fixing 2100 errors from Reddit are: dropping your laptop… and if that won’t do you can always try slamming it… which apparently works quite well.

Since yours seems to happen after closing and reopening the laptop lid, it is most likely related to the following issue found here in a post from 4 years ago that links to a now 404’d post on the Lenovo forums.

Personally as a Thinkpad user too, I would back up important stuff, perform the SMART tests and depending on the results replace the drive. I’m pretty sure mine is still under manufacturer warranty, so I’d actually call them to come out and replace it for me.


You should make sure you have a good backup of important data and then run S.M.A.R.T. diagnostics or the SSD vendor’s diagnostics. Switching distros is unlikely to “fix” SSD errors and, if the SSD is failing, may push it to an early death.
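For reference, here is a minimal sketch of the smartctl steps this usually means on an NVMe drive. The device path /dev/nvme0 and the DRY_RUN switch are assumptions for illustration, so check the real device name with lsblk first. The short self-test only works if both the drive and the installed smartmontools build support NVMe self-tests.

```shell
#!/usr/bin/env bash
# Sketch only: wrappers around the usual smartctl steps for an NVMe drive.
# DEV is an assumed device path; adjust it to match your system.
DEV="${DEV:-/dev/nvme0}"

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

smart_report()   { run sudo smartctl -x "$DEV"; }           # full report, including health log
smart_selftest() { run sudo smartctl -t short "$DEV"; }     # start a short device self-test
smart_testlog()  { run sudo smartctl -l selftest "$DEV"; }  # read the self-test log afterwards
```

Running DRY_RUN=1 smart_selftest prints the command without touching the drive, which is a safe way to check it before running it for real.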

I feel like my smart test is returning pretty good results:

Model Number:                       KXG6AZNV512G TOSHIBA
Serial Number:                      X92F71BTF9LL
Firmware Version:                   5108AGLA
PCI Vendor/Subsystem ID:            0x1179
IEEE OUI Identifier:                0x8ce38e
Total NVM Capacity:                 512 110 190 592 [512 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512 110 190 592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            8ce38e 05000e9142
Local Time is:                      Tue Sep 30 21:05:04 2025 EDT
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x001f):   Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     78 Celsius
Critical Comp. Temp. Threshold:     82 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.00W       -        -    0  0  0  0        1       1
 1 +     3.90W       -        -    1  1  1  1        1       1
 2 +     2.00W       -        -    2  2  2  2        1       1
 3 -   0.0500W       -        -    3  3  3  3     1500    1500
 4 -   0.0050W       -        -    4  4  4  4     6000   14000
 5 -   0.0030W       -        -    5  5  5  5    50000   80000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        27 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    23%
Data Units Read:                    87 060 836 [44,5 TB]
Data Units Written:                 87 084 234 [44,5 TB]
Host Read Commands:                 1 240 737 348
Host Write Commands:                2 012 538 056
Controller Busy Time:               4 458
Power Cycles:                       2 430
Power On Hours:                     20 783
Unsafe Shutdowns:                   251
Media and Data Integrity Errors:    0
Error Information Log Entries:      12
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               27 Celsius

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged

Self-test Log (NVMe Log 0x06, NSID 0xffffffff)
Self-test status: No self-test in progress
No Self-tests Logged

I edited your post and added the preformatted text tags to your smart output so it is much more readable.
Please always use those tags when copying data from your screen and pasting it here so it retains the on-screen formatting and is easily readable.

This is done by highlighting the text after pasting and then clicking the </> button on the toolbar above the text entry box.

There is much less data there than we usually see.
Did you run the command as smartctl -a or smartctl -x?
Some SSDs do not display more than you posted, but some show a lot more detail with the -x option.

It looks about right for nvme output from smartctl. He appears to have snipped the output from start of prompt to “START OF INFORMATION SECTION”. Below is my own output…

[brian@thinkpad ~]: sudo smartctl -x /dev/nvme0
[sudo] password for brian: 
smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.16.8-200.fc42.x86_64] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SKHynix_HFS001TEJ9X162N
Serial Number:                      SYCAN03471060AP2P
Firmware Version:                   51730A10
PCI Vendor/Subsystem ID:            0x1c5c
IEEE OUI Identifier:                0xace42e
Controller ID:                      0
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,024,209,543,168 [1.02 TB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            ace42e 003ae705c5
Local Time is:                      Wed Oct  1 03:10:59 2025 BST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x00df):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
Log Page Attributes (0x1e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     86 Celsius
Critical Comp. Temp. Threshold:     87 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.50W       -        -    0  0  0  0        5     305
 1 +   3.9000W       -        -    1  1  1  1       30     330
 2 +   1.5000W       -        -    2  2  2  2      100     400
 3 -   0.0500W       -        -    3  3  3  3      500    1500
 4 -   0.0050W       -        -    4  4  4  4     1000    9000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        46 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    10%
Data Units Read:                    11,913,106 [6.09 TB]
Data Units Written:                 44,041,534 [22.5 TB]
Host Read Commands:                 79,900,690
Host Write Commands:                728,969,641
Controller Busy Time:               17,276
Power Cycles:                       1,635
Power On Hours:                     4,320
Unsafe Shutdowns:                   58
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               40 Celsius
Temperature Sensor 2:               40 Celsius

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged

Self-test Log (NVMe Log 0x06, NSID 0xffffffff)
Self-test status: No self-test in progress
No Self-tests Logged

[brian@thinkpad ~]: 

And it returns much better results than the nvme tool does.

[brian@thinkpad ~]: sudo nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning                        : 0
temperature                             : 46 °C (319 K)
available_spare                         : 100%
available_spare_threshold               : 10%
percentage_used                         : 10%
endurance group critical warning summary: 0
Data Units Read                         : 11913178 (6.10 TB)
Data Units Written                      : 44041842 (22.55 TB)
host_read_commands                      : 79901279
host_write_commands                     : 728976844
controller_busy_time                    : 17276
power_cycles                            : 1635
power_on_hours                          : 4320
unsafe_shutdowns                        : 58
media_errors                            : 0
num_err_log_entries                     : 0
Warning Temperature Time                : 0
Critical Composite Temperature Time     : 0
Temperature Sensor 1                    : 40 °C (313 K)
Temperature Sensor 2                    : 40 °C (313 K)
Thermal Management T1 Trans Count       : 0
Thermal Management T2 Trans Count       : 0
Thermal Management T1 Total Time        : 0
Thermal Management T2 Total Time        : 0
[brian@thinkpad ~]: 

I have! Do you see anything alarming in my output?

The test ‘passed’, temps are low, hours are low. I don’t see any problem there.
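As a rough sketch of that kind of read-through, the filter below takes smartctl NVMe output on stdin and prints only the fields that tend to matter. The function name and the WARN/INFO labels are made up for illustration, and the checks are a judgment call rather than an official health rule. A non-zero error log count in particular is not proof of a failing drive, since some drives log entries for commands they simply do not support.

```shell
#!/usr/bin/env bash
# Sketch only: reads `smartctl -a` or `smartctl -x` NVMe output on stdin
# and prints one line per finding. check_nvme_health is a hypothetical name.
check_nvme_health() {
    awk -F':' '
        # Trimmed text after the first colon on every line.
        { val = $2; gsub(/^[ \t]+|[ \t]+$/, "", val) }
        /^Critical Warning:/                { if (val != "0x00") print "WARN critical warning " val }
        /^Media and Data Integrity Errors:/ { if (val != "0") print "WARN media errors " val }
        /^Available Spare:/                 { spare = val + 0; have_spare = 1 }
        /^Available Spare Threshold:/       { thresh = val + 0 }
        /^Unsafe Shutdowns:/                { print "INFO unsafe shutdowns " val }
        /^Error Information Log Entries:/   { if (val != "0") print "INFO error log entries " val }
        END { if (have_spare && spare <= thresh) print "WARN spare at or below threshold" }
    '
}
```

Typical use would be sudo smartctl -a /dev/nvme0 | check_nvme_health. On the output posted above it reports only the unsafe shutdown count and the error log entries, both as INFO, and raises no WARN lines.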

What are the contents of your /etc/fstab file?

When and how did you install Fedora (e.g., F41, Workstation install)?

Hi MatH,
Thanks for helping me:) The content of my /etc/fstab file is the following:


#
# /etc/fstab
# Created by anaconda on Tue Jun 11 14:18:20 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
UUID=5bd131d4-aea4-4e7e-b793-9456931f19f3 /                       btrfs   subvol=root,compress=zstd:1 0 0
UUID=2a3455c9-adaf-49f1-b507-0499f9595e14 /boot                   ext4    defaults        1 2
UUID=C407-0CA7          /boot/efi               vfat    umask=0077,shortname=winnt 0 2
UUID=5bd131d4-aea4-4e7e-b793-9456931f19f3 /home                   btrfs   subvol=home,compress=zstd:1 0 0

I installed Fedora from a USB key on my ThinkPad X1 Carbon Gen 7 laptop, immediately after buying the laptop refurbished a year and a half ago. It is now up to date on version 42.

Toshiba drives are branded KIOXIA. https://apac.kioxia.com/en-apac/personal/software/ssd-utility.html. They provide an SSD Utility with firmware update support:

Every now and then we recommend you update your SSD’s firmware to enhance performance and stability. Now you can easily update right here in SSD Utility.

Some people have used Dell’s updates on systems from other vendors.

Note that btrfs requires periodic maintenance. See comments in:

https://discussion.fedoraproject.org/t/unallocated-vs-free/146381/7
Fedora provides BTRFS Assistant that can be set to run balance and scrub. I use the default: weekly balance, monthly scrub. There is also btrfs dynamic reclaim.
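If you prefer the command line to BTRFS Assistant, the same checks can be run by hand. This is only a sketch: the mount point /, the DRY_RUN switch, and the usage=50 balance filter are example choices, not recommendations from the linked post.

```shell
#!/usr/bin/env bash
# Sketch only: manual btrfs checks for one mounted filesystem.
# MNT is an assumed mount point; adjust it as needed.
MNT="${MNT:-/}"

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

btrfs_check() {
    run sudo btrfs device stats "$MNT"                # per-device I/O and corruption error counters
    run sudo btrfs scrub start -B "$MNT"              # verify checksums; -B waits until the scrub finishes
    run sudo btrfs scrub status "$MNT"                # summary of the most recent scrub
    run sudo btrfs balance start -dusage=50 "$MNT"    # rewrite data chunks that are at most 50% full
}
```

Non-zero counters from device stats, or errors reported by the scrub, would point toward the drive or its connection rather than toward Fedora itself.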

How can you update SSD firmware when the vendor’s Utility is only available for Windows?

I’ve tried updating the firmware from the command line, but it didn’t find any available updates.

Was that with fwupdmgr or fwupdtool? Was your SSD listed under “Devices with no available firmware updates:”?
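One way to answer that is to filter the output of fwupdmgr get-updates for that heading. The sketch below assumes the devices are listed as indented lines directly under the heading, which is how it appears on my system but may differ between fwupd versions. The function name is made up.

```shell
#!/usr/bin/env bash
# Sketch only: prints the entries listed under the
# "Devices with no available firmware updates:" heading.
# Assumes the entries are indented lines directly below that heading.
no_update_devices() {
    awk '
        /^Devices with no available firmware updates:/ { in_list = 1; next }
        # An unindented line (the next heading) or a blank line ends the list.
        in_list && (/^[^ \t]/ || /^[ \t]*$/) { in_list = 0 }
        in_list { sub(/^[ \t]+/, ""); print }
    '
}
```

Usage would be fwupdmgr get-updates | no_update_devices. If the SSD model shows up in that list, fwupd can see the drive but has no firmware for it from LVFS.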

Check if KIOXIA participates in the Linux Vendor Firmware Service (LVFS) (if so, contact them about actually releasing firmware updates to LVFS). You can try searching for reports from Linux users who have installed the Dell firmware updates on non-Dell systems. Prior to LVFS, updates were sometimes installed by extracting the update file from a vendor package. I’ve done that myself, but before UEFI systems added complexity/security.

Note that fwupdtool --help lists:

install FILE [DEVICE-ID|GUID]              Install a specific firmware on a device, all possible devices will also be installed once the CAB matches
install-blob FILENAME DEVICE-ID [VERSION]  Install a raw firmware blob on a device

along with other options to manipulate firmware files that may allow you to install updates using the Dell packages. You may be able to find someone who has already done this with a web search.

compared to

UUID=dd0af1e8-e42e-4e82-8ce6-929ed3cde132 /                       btrfs   subvol=root,compress=zstd:1 0 0
UUID=18314680-e859-4fde-993f-bdb43ed3ee7e /boot                   ext4    defaults        1 2
UUID=A633-D429          /boot/efi               vfat    umask=0077,shortname=winnt 0 2
UUID=dd0af1e8-e42e-4e82-8ce6-929ed3cde132 /home                   btrfs   subvol=home,compress=zstd:1 0 0

Exactly the same. So I can’t see any issue there.