Problem with boot, kernel and Btrfs

dontech · June 3, 2024, 4:28pm

Hello everyone! I’ll describe the issue right away.
Fedora 39 is installed on a PC, and I recently installed the latest kernel, 6.8.11. After the installation completed I rebooted, but the kernel doesn’t load: after the GRUB screen the monitor shows only the loading message, and the disk activity LED shows no activity. I reinstalled the kernel but the problem persists. I did some research but didn’t find much.
Then I boot the latest working kernel and run the command sudo dmesg | grep -i failed and get this output:

[    7.035696] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[   12.617748] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36618240 csum 0xe8ad850d expected csum 0x75be784b mirror 1
[   12.617772] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36622336 csum 0x703efc6e expected csum 0xa68f5e81 mirror 1
[   12.617781] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36626432 csum 0xb202d783 expected csum 0x9d56c38b mirror 1
[   12.617789] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36630528 csum 0xe5548687 expected csum 0x3aaa0e25 mirror 1
[   46.505396] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36700160 csum 0xe8ad850d expected csum 0x75be784b mirror 1
[   46.505409] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36704256 csum 0x703efc6e expected csum 0xa68f5e81 mirror 1
[   46.505416] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36708352 csum 0xb202d783 expected csum 0x9d56c38b mirror 1
[   46.505421] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36712448 csum 0xe5548687 expected csum 0x3aaa0e25 mirror 1
[   46.505673] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36700160 csum 0xe8ad850d expected csum 0x75be784b mirror 1
[   46.505684] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36704256 csum 0x703efc6e expected csum 0xa68f5e81 mirror 1
[   46.505691] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36708352 csum 0xb202d783 expected csum 0x9d56c38b mirror 1
[   46.505697] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36712448 csum 0xe5548687 expected csum 0x3aaa0e25 mirror 1
[   46.505936] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36700160 csum 0xe8ad850d expected csum 0x75be784b mirror 1
[   46.505943] BTRFS warning (device nvme0n1p2): csum failed root 256 ino 6391800 off 36704256 csum 0x703efc6e expected csum 0xa68f5e81 mirror 1

From this I deduce that I have problems with the disk or the Btrfs filesystem. I do some more research and run some more terminal commands:

sudo btrfs device stats /
Output:

[/dev/nvme0n1p2].write_io_errs    0
[/dev/nvme0n1p2].read_io_errs     0
[/dev/nvme0n1p2].flush_io_errs    0
[/dev/nvme0n1p2].corruption_errs  20593
[/dev/nvme0n1p2].generation_errs  0

sudo btrfs scrub start -B /
Output:

Starting scrub on devid 1
scrub done for fa7f4375-9aca-4c2c-a122-8a0abaa53c18
Scrub started:    Sun Jun  2 11:50:42 2024
Status:           finished
Duration:         0:02:01
Total to scrub:   39.30GiB
Rate:             332.61MiB/s
Error summary:    csum=253252
    Corrected:      253232
    Uncorrectable:  20
    Unverified:     0
ERROR: there are 1 uncorrectable errors

From a live Fedora session: sudo btrfs check /dev/nvme0n1p2
Output:

Opening filesystem to check...
Checking filesystem on /dev/nvme0n1p2
UUID: fa7f4375-9aca-4c2c-a122-8a0abaa53c18
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 39998722048 bytes used, no error found
total csum bytes: 36739452
total tree bytes: 2296004608
total fs tree bytes: 2134851584
total extent tree bytes: 104857600
btree space waste bytes: 483752904
file data blocks allocated: 234120581120
  referenced 88654282752

From a live Fedora session: sudo btrfs check -p --readonly --check-data-csum /dev/nvme0n1p2
Output:

	Opening filesystem to check...
	Checking filesystem on /dev/nvme0n1p2
	UUID: fa7f4375-9aca-4c2c-a122-8a0abaa53c18
	[1/7] checking root items                      (0:00:01 elapsed, 937917 items checked)
	[2/7] checking extents                         (0:00:06 elapsed, 131284 items checked)
	[3/7] checking free space tree                 (0:00:00 elapsed, 53 items checked)
	[4/7] checking fs roots                        (0:00:07 elapsed, 120988 items checked)
	mirror 1 bytenr 12749418496 csum 0xe8ad850d expected csum 0x75be784b
	mirror 1 bytenr 12749422592 csum 0x703efc6e expected csum 0xa68f5e81
	mirror 1 bytenr 12749426688 csum 0xb202d783 expected csum 0x9d56c38b
	mirror 1 bytenr 12749430784 csum 0xe5548687 expected csum 0x3aaa0e25
	mirror 1 bytenr 20133556224 csum 0x9be6b08c expected csum 0xa1912793
	mirror 1 bytenr 20133560320 csum 0x1631ae43 expected csum 0x45bad2e7
	mirror 1 bytenr 20133564416 csum 0xa1567b66 expected csum 0x9d52b848
	mirror 1 bytenr 20133568512 csum 0xd199cb0a expected csum 0x5d2a2c8c
	mirror 1 bytenr 20133572608 csum 0x0166da36 expected csum 0x6cccacdb
	mirror 1 bytenr 20133576704 csum 0xec5680d3 expected csum 0x8b348092
	mirror 1 bytenr 20133580800 csum 0x22c9df3a expected csum 0x98ef5185
	mirror 1 bytenr 20133584896 csum 0x7eb0913e expected csum 0xa4f57dfa
	mirror 1 bytenr 27345760256 csum 0xf38bbef5 expected csum 0x055f2878
	mirror 1 bytenr 27345764352 csum 0xe5fb6267 expected csum 0x6ade60b2
	mirror 1 bytenr 27345768448 csum 0x1b64811b expected csum 0xf5aac402
	mirror 1 bytenr 27345772544 csum 0x26c53775 expected csum 0x9a100b20
	mirror 1 bytenr 27345776640 csum 0x6d9fba29 expected csum 0xa02c129c
	mirror 1 bytenr 27345780736 csum 0xa256887e expected csum 0x94099a0a
	mirror 1 bytenr 27345784832 csum 0x3d538e26 expected csum 0x8e7a2967
	mirror 1 bytenr 27345788928 csum 0xfa0c1ed5 expected csum 0xb2bb6f52
	[5/7] checking csums against data              (0:01:44 elapsed, 265872 items checked)
	ERROR: errors found in csum tree
	[6/7] checking root refs                       (0:00:00 elapsed, 11 items checked)
	[7/7] checking quota groups skipped (not enabled on this FS)
	found 39648686080 bytes used, error(s) found
	total csum bytes: 36539540
	total tree bytes: 2150678528
	total fs tree bytes: 1989967872
	total extent tree bytes: 104415232
	btree space waste bytes: 454815130
	file data blocks allocated: 224884305920
	 referenced 86042050560

From these outputs I think there is some problem with the Btrfs filesystem on the NVMe.
Are these problems serious? Can they compromise the data? Do they indicate that the NVMe is corrupted? Could they be the cause of the kernel not booting?

I also created a new Btrfs partition of the same size on the same disk and ran the same commands to see whether the same errors appear. No errors are reported there, so perhaps I can rule out NVMe hardware problems.
Could it just be related to the single partition on which Fedora is installed?

Finally, I realized that I occasionally have boot problems even with the 6.8.10 kernel: sometimes the system crashes during boot before GDM appears and I can only use a tty shell. I have to reboot to get a working GNOME Shell session.

Can anyone help me solve it?

barryascott · June 4, 2024, 2:39pm

I would suspect hardware error on the disk.
Are there disk errors reported in dmesg?

What is the output of smartctl -a /dev/nvme0n1p2?
How old is the SSD?

chrismurphy · June 4, 2024, 3:05pm

By default on Fedora, Btrfs keeps two copies of file system metadata in different locations. The large number of corrected errors suggests that most of the corruption was in one copy of the metadata, so it was possible to use the other copy to fix up the first. The uncorrectable errors are probably (guessing) 20 4KiB blocks in a single file, i.e. that’s data corruption, and since there’s only one copy of data, it’s not correctable. During the scrub, dmesg will show an error for each block, so it gets noisy quickly even if it’s just one file, but one of the info messages will also give the path of the affected file.
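
A rough way to test the "one file" guess is to collapse the repeated csum warnings into unique blocks per inode. This is only a sketch: `unique_csum_blocks` is a made-up helper name, and it assumes the warning format shown in the first post.

```shell
# Collapse repeated "csum failed" warnings into unique blocks per inode.
# Reads kernel log lines on stdin, prints: <root> <ino> <unique 4KiB blocks>
# (hypothetical helper; assumes the message format quoted in the first post)
unique_csum_blocks() {
    awk '/BTRFS warning/ && /csum failed/ {
        root = ino = off = ""
        for (i = 1; i < NF; i++) {
            if ($i == "root") root = $(i + 1)
            if ($i == "ino")  ino  = $(i + 1)
            if ($i == "off")  off  = $(i + 1)
        }
        key = root " " ino
        if (!((key, off) in seen)) { seen[key, off] = 1; blocks[key]++ }
    }
    END { for (k in blocks) print k, blocks[k] }'
}

# Usage (not run here):
#   sudo dmesg | unique_csum_blocks
# Then map an inode to a path. The path argument must be inside the
# subvolume with that root ID; the placeholder below is an assumption:
#   sudo btrfs inspect-internal inode-resolve 6391800 /path/inside/subvolume-256
```

Fed the 14 warning lines quoted in the first post, this should collapse them to 8 distinct 4KiB blocks, all in inode 6391800, which fits the idea of a single damaged file being read repeatedly.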

Btrfs will not hand over known corrupt data to user space. Instead it returns EIO (IO error) for each 4KiB block of data that doesn’t pass checksum verification, and what happens next depends on whatever program is trying to use that file. If it’s user data, the corruption wouldn’t affect boot. So yeah, we’d need to see the entire boot log without filtering to see if there’s something else going on now that the scrub has fixed up most of the issues.
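
For reference, one way to capture an unfiltered boot log is with `journalctl`. This is a sketch rather than a prescribed procedure: `save_boot_log` and the default file name are invented for illustration, and reading an earlier boot assumes the journal is stored persistently.

```shell
# Save the complete journal of one boot to a file for sharing.
# $1: boot offset (0 = current boot, -1 = previous boot, ...)
# $2: output file (the default name is a hypothetical choice)
save_boot_log() {
    boot="${1:-0}"
    out="${2:-boot-log.txt}"
    # -b selects the boot, -o short-monotonic prints seconds since boot,
    # --no-hostname keeps the hostname out of the shared log
    journalctl -b "$boot" -o short-monotonic --no-hostname > "$out"
}

# Usage (not run here), as a user allowed to read the system journal:
#   save_boot_log -1 failed-boot.txt
```

If the failed boot hangs very early, it may not have written anything to the journal, in which case only the boots that got further will show up.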

It’s probably easier for back and forth to just show up in https://matrix.to/#/#fedora:fedoraproject.org and ping cmurf but it sounds like the file system itself is OK, there’s just 1 file that’s corrupted and it’s bad luck that it’s (maybe) a file that’s needed for booting. But without complete logs I’m just guessing.

Corruption of this type is not typically a kernel bug; it’s very likely the SSD is dying. The typical way SSDs die, we see transient corruption of fs metadata (or data, which is more common because it’s a much much much bigger target) whereby the SSD returns garbage or zeros, and hence there’s a checksum mismatch error from Btrfs. The SSD itself doesn’t report an error in these cases; it just returns junk, and Btrfs is designed to complain about this.

dontech · June 5, 2024, 1:32pm

If the disk were damaged at the hardware level, shouldn’t I see problems on the other partitions as well? Instead, all other Btrfs partitions are healthy and error-free.
The errors in dmesg are the ones I posted in the initial post.

The disk is practically new, with only a few hours on it.
This is the output of smartctl:

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        23 Celsius
Available Spare:                    100%
Available Spare Threshold:          1%
Percentage Used:                    0%
Data Units Read:                    2.122.731 [1,08 TB]
Data Units Written:                 618.241 [316 GB]
Host Read Commands:                 16.347.254
Host Write Commands:                12.331.067
Controller Busy Time:               67
Power Cycles:                       19
Power On Hours:                     37
Unsafe Shutdowns:                   4
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               23 Celsius
Temperature Sensor 2:               30 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

Read Self-test Log failed: Invalid Field in Command (0x002)

Instead, this is the output of sudo nvme smart-log /dev/nvme0n1:

Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning            : 0
temperature                : 24 °C (297 K)
available_spare                : 100%
available_spare_threshold        : 1%
percentage_used                : 0%
endurance group critical warning summary: 0
Data Units Read                : 2122731 (1.09 TB)
Data Units Written            : 618069 (316.45 GB)
host_read_commands            : 16347252
host_write_commands            : 12326822
controller_busy_time            : 67
power_cycles                : 19
power_on_hours                : 37
unsafe_shutdowns            : 4
media_errors                : 0
num_err_log_entries            : 0
Warning Temperature Time        : 0
Critical Composite Temperature Time    : 0
Temperature Sensor 1           : 24 °C (297 K)
Temperature Sensor 2           : 30 °C (303 K)
Thermal Management T1 Trans Count    : 0
Thermal Management T2 Trans Count    : 0
Thermal Management T1 Total Time    : 0
Thermal Management T2 Total Time    : 0

Can it be ruled out that it is a disk hardware problem?

I should add that the Btrfs partition was cloned from another partition with Rescuezilla. Could this be the problem?

dontech · June 5, 2024, 1:41pm

Apparently the corruption still affects booting even now that the scrub has fixed most of the problems, since in some cases GNOME Shell gets stuck before the GDM login screen appears. When that happens I have to reboot from a tty.

How can one view the boot log without filters?

It is strange that the SSD is already dying since it is practically new. Is it possible that it is defective?
Is it possible to check the SSD hardware further?

barryascott · June 5, 2024, 4:08pm

I need to see the output of smartctl -a for your device.
It will list lots of info about the drive and may have evidence of failure.

gnwiii · June 7, 2024, 8:46pm

What version of Rescuezilla?
Rescuezilla 2.5 enhanced btrfs support mentions some problems with partclone and btrfs in previous versions.

dontech · June 14, 2024, 12:29pm

What I wrote in the reply to your message is all the output of smartctl -a.
If any details are missing the reason could be that the NVMe is connected to the motherboard via a PCIe adapter. This perhaps does not allow all the SMART data to be detected.

dontech · June 14, 2024, 12:32pm

From the Rescuezilla site, the latest version available is 2.4.2 and I used that.
Only now do I realize that version 2.5 is available on GitHub.

I also tried to restore the previously created backup, but unfortunately it also turns out to be corrupt. What other options do I have?

gnwiii · June 14, 2024, 1:25pm

Can you “redo from start” with the new version?

Look for more information about the Rescuezilla 2.4.2 issue with btrfs, in the hope that someone has a way to rescue data affected by the bug.

dontech · June 14, 2024, 1:50pm

I’ll try to look up the information you suggested.

However, after restoring the original backup made with Rescuezilla, the partition still reports the errors I described in the first post, which obviously did not exist before.

barryascott · June 14, 2024, 4:09pm

The output is missing all the drive attribute values. Which is very strange, never seen that before. You did run as root?

chrismurphy · June 18, 2024, 4:10pm

Not necessarily. And I’m not sure there’s a way to determine the cause because flash tends to just return garbage or zeros instead of your data - rather than issuing error codes back to the kernel. Ergo, the NVMe often won’t spit out any errors at all, it just returns junk. The only way to know it’s junk is if you’re using a file system that checksums everything, like Btrfs. The problem there is Btrfs only knows there’s a csum mismatch which indicates the data changed. We don’t know why. There isn’t enough information.

The typical case is memory bitflips but this tends to be very transient and not affect such a large number of blocks. It does make me think it’s a defective storage device, in particular if you weren’t having such problems with the storage device prior to replacement.

Sources of more frequent bitflips include CPU overclocking and power supply issues.

So the difficult task is isolating the cause. I would remove the drive, taking static precautions of course, and make sure the contacts on both logic board and drive are intact and clean. Reseat it, make sure it’s seated correctly. I expect an incompletely seated drive simply wouldn’t work at all rather than produce transient corruption but shrug it’s easier and faster to do this than to set up an RMA on the drive and do a return.

But given that the drive is under some kind of warranty, try doing a return. I would do the return with the manufacturer rather than the company you bought it from, just because what you get back will be a known quantity run through some kind of testing metric by that manufacturer. But it’s fine to go through the original purchase company too.

Another consideration is to check whether there are firmware updates for this drive. The smartctl -i information will show the firmware version, and you can check the manufacturer’s web site for something newer. I can’t say if that will fix this problem, but if there is a new version available it’s there for a reason.

Note that the device stats counter is very simple. It increments the count every time any such error is encountered. If there’s corruption in a single block, and that block is read 20,000 times in a row, the corruption_errs counter will increment to 20,000. So you can’t tell if this is 20,000 separate blocks each experiencing 1 error, or 1 block experiencing 20,000 failed attempts. It also doesn’t tell you what percent of the time a block does return valid data - which would go to whether this is a persistent error or a transient error. You can infer this from dmesg but as you can see, a tiny number of corruptions will produce a lot of lines in dmesg. Btrfs is pretty verbose when it encounters problems. Also note that the counters don’t reset on their own. They continue to count up indefinitely, unless you reset them to zero with the -z option of btrfs device stats.
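
Since the counter only ever goes up, a before/after comparison is more informative than the absolute value. A minimal sketch, assuming a single-device filesystem and the output format shown in the first post; the helper names and file names are made up:

```shell
# Print the corruption_errs value from `btrfs device stats` output on stdin.
# (assumes one device; a multi-device filesystem would print several lines)
corruption_errs() {
    awk '$1 ~ /\.corruption_errs$/ { print $2 }'
}

# Compare two saved snapshots of the stats output (hypothetical helper).
# Prints how many new corruption events were counted in between.
corruption_delta() {
    before=$(corruption_errs < "$1")
    after=$(corruption_errs < "$2")
    echo $((after - before))
}

# Usage (not run here):
#   sudo btrfs device stats / > before.txt
#   ...read the suspect files again, or run a scrub...
#   sudo btrfs device stats / > after.txt
#   corruption_delta before.txt after.txt
#   sudo btrfs device stats -z /     # print the stats, then reset them to zero
```

A delta of zero after re-reading the affected data would point to past, already handled corruption; a delta that keeps growing means something is still returning bad blocks.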

chrismurphy · June 18, 2024, 4:34pm

Yes but it’s process of elimination. So you just have to tackle things one at a time until it makes sense to move on.

I’m not super familiar with Rescuezilla, what kinds of modes it has. If it’s file based copy, then no I don’t think it’s possible. If it’s a bit for bit (block) copy then it’s possible.

If you run btrfs check --readonly --check-data-csum on the source partition, are there errors? Are they the same errors?

No errors suggests it’s something in the transfer which implicates everything except the source drive: cpu, memory, connections, and the new destination drive. So checking the source file system does help eliminate the source drive as the source of the corruption but it doesn’t tell us where the corruption is coming from.
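
To answer "are they the same errors?" without eyeballing long outputs, the failing logical addresses from two saved check runs can be compared. This is a sketch under assumptions: the helper names and file names are invented, and it relies on the `mirror N bytenr ...` line format shown in the first post.

```shell
# Extract the logical addresses (bytenr) of blocks that failed checksum
# verification from saved `btrfs check --check-data-csum` output on stdin.
bad_bytenrs() {
    awk '$1 == "mirror" && $3 == "bytenr" { print $4 }' | sort -u
}

# Print the addresses reported in both saved outputs (hypothetical helper).
common_bad_bytenrs() {
    a=$(mktemp); b=$(mktemp)
    bad_bytenrs < "$1" > "$a"
    bad_bytenrs < "$2" > "$b"
    comm -12 "$a" "$b"    # keep only lines present in both sorted lists
    rm -f "$a" "$b"
}

# Usage (not run here), with each check run saved to a file first:
#   common_bad_bytenrs source-check.txt clone-check.txt
```

An empty result with errors only on the clone would fit the "something in the transfer" explanation; matching addresses on both sides would mean the corruption was already in the source.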

Before I commit to using any flash, whether it’s USB thumb drive or an expensive NVMe drive, I run f3 on it. Nifty tool to qualify flash in general.
https://oss.digirati.com.br/f3/

The way I use it is a simple full-disk format (the whole disk, one partition or even no partition) with a coin toss for the file system. It can be btrfs or it can be FAT or f2fs, doesn’t matter. Mount the file system, e.g. to /mnt and the command is:

sudo f3write /mnt
(wait for it to finish)
sudo f3read /mnt

You can read the docs for what this is doing, but it will determine a few things about the flash if there are problems - corruption, is it really the size that it’s reporting (the fake flash problem), are there read or write errors, and maybe some other things.

The write will fill the entire drive full so it will take some time. And the read will likewise read every block of every test file previously written.

If there are errors, it could still be the connection between logic board and drive. But if you can eliminate that (another drive in the same slot doesn’t have the problem) then it’s the drive. The drive itself has multiple components but it isn’t your task to find out if it’s the drive controller, DRAM, or flash that’s corrupting the data. The entire drive is subject to warranty so at the point you’re confident it’s “something” related to the drive, just get it replaced under warranty.

If there are no errors, next I would use blkdiscard to wipe the entire drive (every single cell on the drive will be subject to the drive firmware’s garbage collection routine and will be wiped). And then start over.

If you’re looking for an alternative to cloning with Rescuezilla - maybe just rsync copy the files you want from the source to the Btrfs after you do a normal clean install of Fedora?

By the way, when you clone Btrfs file systems (block copy, bit for bit) the kernel will see two identical copies of the same Btrfs file system with the same UUID. They cannot be mounted at the same time or the kernel gets confused and it can cause corruption. Since a long time ago, Btrfs will not permit two Btrfs with the same fs UUID to be mounted at the same time. I’m not familiar with a work around that could compel the kernel to force the mounting of both file systems at the same time - but … shrug
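
A quick way to spot a cloned file system sitting next to its original is to look for UUIDs that show up on more than one block device. A minimal sketch; `duplicate_fs_uuids` is a made-up helper, and note that a multi-device Btrfs legitimately shares one UUID across its member devices:

```shell
# Print filesystem UUIDs that appear on more than one block device.
# Reads "<device> <uuid>" pairs on stdin (hypothetical helper);
# lines without a UUID column are ignored.
duplicate_fs_uuids() {
    awk 'NF >= 2 { count[$2]++; devs[$2] = devs[$2] " " $1 }
         END { for (u in count) if (count[u] > 1) print u ":" devs[u] }'
}

# Usage (not run here):
#   lsblk -rno NAME,UUID | duplicate_fs_uuids
# If a clone has to coexist with its source, the unmounted clone can be
# given a new random UUID (this rewrites metadata, so keep a backup):
#   sudo btrfstune -u /dev/<clone-partition>
```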

I personally don’t do block copies of Btrfs partitions or drives except as a rescue project, e.g. ddrescue. Instead I regularly use a Btrfs feature called btrfs seed. It makes the source file system read-only mountable; you add a 2nd writable device (partition), remount the file system rw, then remove the 1st (read-only) device. Btrfs then replicates the extents from the seed device to the sprout device. It’s usually faster than either file copy or block copy, and every single block (metadata and data) has its checksum verified on read, so the source is verified during the replication.

This doesn’t guarantee zero corruption cloning but it reduces the likelihood. To get really close to zero chance of corruption we also need ECC memory.
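
The seed/sprout steps described above can be sketched roughly as follows. Treat it as an outline, not a tested recipe: the device names and mount point are placeholders, `run` and `seed_replicate` are invented helpers, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
# Print (default) or run the seed/sprout replication steps.
# Set DRY_RUN=0 only after checking every device name.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: source partition (unmounted), $2: destination partition, $3: mount point
seed_replicate() {
    src="$1"; dst="$2"; mnt="$3"
    run btrfstune -S 1 "$src"              # mark the source as a seed device
    run mount "$src" "$mnt"                # a seed device mounts read-only
    run btrfs device add "$dst" "$mnt"     # add the writable sprout device
    run mount -o remount,rw "$mnt"         # make the file system writable
    run btrfs device remove "$src" "$mnt"  # moves the extents onto the sprout
}

# Usage (prints the commands only; real runs need root):
#   seed_replicate /dev/SRC /dev/DST /mnt
```

The dry-run default is a deliberate choice here, since a wrong device name in any of these steps can destroy data.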

dontech · June 20, 2024, 2:33pm

Yes, I run the command as root.
I did some research on this, and apparently the most likely reason for the reduced output is that the NVMe is connected to the motherboard via a PCIe adapter.

However below I report the output again:


smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.8.11-200.fc39.x86_64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Fanxiang S500Pro 256GB
Serial Number:                      FXS500Pro240511367
Firmware Version:                   SN12465
PCI Vendor/Subsystem ID:            0x1e4b
IEEE OUI Identifier:                0x000000
Total NVM Capacity:                 256.060.514.304 [256 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256.060.514.304 [256 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            000000 0240511367
Local Time is:                      Thu Jun 20 16:27:17 2024 CEST
Firmware Updates (0x1a):            5 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Log Page Attributes (0x06):         Cmd_Eff_Lg Ext_Get_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     90 Celsius
Critical Comp. Temp. Threshold:     95 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.50W       -        -    0  0  0  0        0       0
 1 +     5.80W       -        -    1  1  1  1        0       0
 2 +     3.60W       -        -    2  2  2  2        0       0
 3 -   0.7460W       -        -    3  3  3  3     5000   10000
 4 -   0.7260W       -        -    4  4  4  4     8000   45000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        29 Celsius
Available Spare:                    100%
Available Spare Threshold:          1%
Percentage Used:                    0%
Data Units Read:                    3.164.860 [1,62 TB]
Data Units Written:                 956.256 [489 GB]
Host Read Commands:                 22.456.734
Host Write Commands:                18.653.344
Controller Busy Time:               101
Power Cycles:                       39
Power On Hours:                     83
Unsafe Shutdowns:                   5
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               29 Celsius
Temperature Sensor 2:               34 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged

Read Self-test Log failed: Invalid Field in Command (0x002)

dontech · June 20, 2024, 3:56pm

Thank you for all the information you provided!

I have formatted the whole NVMe disk and run the various Btrfs commands for checksum checking and I do not encounter any errors.
I have also followed the other advice you suggested and tried to isolate the cause: the NVMe disk is healthy and working properly, the problem was caused by the backup mishandled by Rescuezilla.

In fact, I tried restoring the backup created with Rescuezilla to the original disk and partition (which were free of csum errors) and even there I encounter errors after restoring.

Regarding the device stats counter: after checking dmesg I found the corrupt files and deleted them, and the corruption_errs counter stopped increasing.
I also ran the btrfs commands to repair and rebuild checksums; the boot problems seem to be gone, but the csum errors remain.

dontech · June 20, 2024, 4:44pm

Rescuezilla makes a copy based on the blocks.

The source partition is error-free.

Thanks for suggesting F3, I was not familiar with it. I understand that running it will erase all the data on the disk, is that correct?
I will test the disk with F3.

At the moment I have only checked the disk by smartctl testing and connecting it to a Windows PC with CrystalDiskInfo. Both of these tools report no errors in the disk.

In future I will definitely avoid using Rescuezilla to clone Btrfs partitions and use the filesystem’s own commands instead.
However, with the corrupted filesystem I currently can’t complete commands like btrfs seed or send|receive without errors.

With rsync I copied the various system configuration files and the data I need, making an additional backup.

I am now left with two possible solutions:

1. Try sending a snapshot of the system to a new partition and make it bootable. That way I can then test whether it also has csum errors. I am testing this solution, but from a live session, after mounting the partition, I encounter an error with the GRUB installation: sudo: unable to allocate pty: No such device
2. Reinstall Fedora after formatting the NVMe. In this case I would like to migrate all my current system configuration. How would I be able to do this? Also, what Btrfs partitioning scheme would be recommended?

jrredho · June 20, 2024, 5:41pm

On the cloning business, I regularly use ddrescue for disk swaps and as my full backup method. For the latter, I know it’s crude, but it works for my dual-booted 1 NVMe SSD system with both a Bitlocker’d partition, and a LUKS-encrypted BTRFS partition.

The utility has been bomber both at whole disk and partition-only cloning jobs. Maybe try cloning your disk with that as a next step? I should note that ddrescue is not part of the default Workstation installation image.

dontech · June 22, 2024, 4:30pm

Thank you for the suggestion! On the next partition backup I will try ddrescue.

Now my problem is to try to repair the Btrfs volume/partition or restore the system to a new Btrfs partition.

computersavvy · June 23, 2024, 8:11pm

fsck is not a tool to use for btrfs AFAIK. There are btrfs tools to do similar tasks.
