Fedora CoreOS RAID1 Installation with Ignition Failing

I’ve been working on setting up Fedora CoreOS on my server and running into persistent issues that I haven’t been able to resolve. I’m hoping someone here can help me identify what I’m doing wrong and guide me in the right direction.

What I’m Trying to Achieve

I’m aiming to install Fedora CoreOS with the following setup:

  1. RAID1 Configuration for Boot and Root:
  • Mirrored boot, EFI, and root partitions across two SSDs for redundancy.
  • RAID1 for /boot, /, and the EFI System Partition.
  2. Separate /var Partition:
  • /var is configured on an NVMe drive (/dev/nvme0n1).
  • It is encrypted with LUKS; after installation I plan to format it with ZFS and move /var off the root drive onto the NVMe drive.
  3. Encrypted LUKS Containers for Additional Drives:
  • I plan to set up LUKS encryption on three additional drives (NAS-1, NAS-2, NAS-3) and arrange them in a ZFS RAID-Z configuration after installation (see the sketch after this list).
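
For context, the post-install ZFS step I have in mind is roughly the following (a sketch only: the pool name "tank" is a placeholder, and it assumes ZFS is already installed and the three LUKS containers are open under /dev/mapper):

# create a RAID-Z pool across the three opened LUKS containers
zpool create tank raidz /dev/mapper/NAS-1 /dev/mapper/NAS-2 /dev/mapper/NAS-3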

The partitions are defined in my Ignition configuration file, which I’ve attached to this post for reference.

The Problem I’m Facing

RAID Assembly Fails During Installation:

The installer logs show errors like:

sdb4: Process '/usr/sbin/mdadm -If sdb4 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
sdb3: Process '/usr/sbin/mdadm -If sdb3 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
sda4: Process '/usr/sbin/mdadm -If sda4 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
sda3: Process '/usr/sbin/mdadm -If sda3 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.

Then the installer times out waiting for devices that are never created:

Ignition failed: failed to create filesystems: failed to wait on filesystems devs: device unit dev-disk-by\x2dlabel-boot.device timeout

Some things I’ve tried

  • I compiled my Butane file with the --strict option and ran ignition-validate on the result to confirm there are no formatting errors (exact commands after this list).

  • I've adjusted my Ignition configuration in so many different ways that I've lost count.

  • I've also read the Fedora CoreOS and Ignition documentation end to end several times, but there has to be something I'm missing.

What I Need Help With

Correctly Configuring RAID in Ignition:

  • Is there something I’m missing in my Ignition file that’s causing the RAID setup to fail?

Here is my Butane configuration file (with mild redactions)

variant: fcos
version: 1.5.0

# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_HERE

# Specify boot device is mirrored between two disks
#boot_device:
  # Mirrored boot disk on boot disk SSD
  #mirror: 
    #devices:
      # Boot Disk 1
      #- /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730 
      # Boot Disk 2
      #- /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F  
storage:
  disks: 
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730  
      wipe_table: true
      partitions:
        # EFI partition 1
        - label: esp-1
          size_mib: 1024 
          #type_guid: "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"
        # Boot partition 1
        - label: boot-1
          size_mib: 1024
        # Root partition 1
        - label: root-1
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F  
      wipe_table: true
      partitions:
        # EFI partition 2
        - label: esp-2
          size_mib: 1024
          #type_guid: "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"  
        # Boot partition 2
        - label: boot-2
          size_mib: 1024
        # Root partition 2
        - label: root-2
    # NVME Fast Storage used for /var in CoreOS
    - device: /dev/disk/by-id/nvme-WD_Red_SN700_1000GB_220443800077 
      wipe_table: true
      partitions:
        - label: var      
    # Seagate NAS Drive
    - device: /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0TVG2 
      wipe_table: true
      partitions:
        - label: NAS-1
    # WD NAS Drive 1
    - device: /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WX62D4312RJ2 
      wipe_table: true
      partitions:
        - label: NAS-2
    # WD NAS Drive 2
    - device: /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WXH2D430RSK5 
      wipe_table: true
      partitions:
        - label: NAS-3
    # WD NVR Drive
    - device: /dev/disk/by-id/ata-WDC_WD22PURZ-85B4ZY0_WD-WX42AC2M58ZR 
      wipe_table: true
      partitions:
        - label: NVR-1
  raid:
    # Setup RAID device for boot partition
    - name: md-boot
      level: raid1
      devices:
        - /dev/disk/by-partlabel/boot-1
        - /dev/disk/by-partlabel/boot-2
    # Setup RAID device for root partition    
    - name: md-root
      level: raid1
      devices:
        - /dev/disk/by-partlabel/root-1
        - /dev/disk/by-partlabel/root-2       
  luks:
  - name: root
    device: /dev/md/md-root
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    #label: root
  - name: var
    device: /dev/disk/by-partlabel/var
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    label: var
  - name: NAS-1
    device: /dev/disk/by-partlabel/NAS-1
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    label: NAS-1
  - name: NAS-2
    device: /dev/disk/by-partlabel/NAS-2
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    label: NAS-2
  - name: NAS-3
    device: /dev/disk/by-partlabel/NAS-3
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    label: NAS-3
  - name: NVR-1
    device: /dev/disk/by-partlabel/NVR-1
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    label: NVR-1
  filesystems:
    # Primary EFI partition
    - path: /boot/efi
      device: /dev/disk/by-partlabel/esp-1
      format: vfat
      wipe_filesystem: true
      label: esp-1
      with_mount_unit: true
      mount_options: ["umask=0077"]

    # Secondary EFI partition for redundancy
    - path: /boot/efi2
      device: /dev/disk/by-partlabel/esp-2
      format: vfat
      wipe_filesystem: true
      label: esp-2
      with_mount_unit: true
      mount_options: ["umask=0077"]

    # Boot partition
    - path: /boot
      device: /dev/md/md-boot
      format: ext4
      wipe_filesystem: true
      label: boot
      with_mount_unit: true
    
    # Root partition
    - path: /
      device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root
      with_mount_unit: true    
      
  files:
    # Define Hostname as CoreOS_Server
    - path: /etc/hostname
      mode: 0644
      contents:
        inline: CoreOS_Server
    # Define Keyboard Layout as US keymap
    - path: /etc/vconsole.conf
      mode: 0644
      contents:
        inline: KEYMAP=us
    # Configure Swap on ZRAM as 8GB using lz4 compression
    - path: /etc/systemd/zram-generator.conf
      mode: 0644
      contents:
        inline: |
          # This config file enables a /dev/zram0 device with the default settings
          [zram0]
          # Set the fraction of total memory for ZRAM (e.g., 50% of total RAM)
          # Adjust to fit your memory requirements
          zram-size = ram / 2

          # Optionally set the compression algorithm (e.g., zstd, lz4)
          compression-algorithm = lz4

          # Define swap priority (higher number has higher priority)
          swap-priority = 100

          # Max zram device limit to avoid too large allocation, useful for systems with lots of RAM
          max-zram-size = 8192M 
# Configure Grub password
grub:
  users:
    - name: root
      # Specify grub password hash for grub user 'root'
      password_hash: grub.pbkdf2.sha512.GRUB_HASH
                

I’ve already spent an embarrassingly long amount of time on this, so any advice or guidance would be greatly appreciated!

Thanks for your help!

The RAID drivers will not be loaded while GRUB is loading the OS, so RAID for /boot and /boot/efi is not a good choice.


Hello @coreoshlpplz and welcome to :fedora: !

For simplicity and easier troubleshooting, I would suggest starting with just replicating the boot disk. With the following Butane config you should be able to mirror all the default partitions.

variant: fcos
version: 1.5.0

# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_HERE

# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F

storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          size_mib: 10240

For reference, see Reconfiguring the root filesystem:

If you want to replicate the boot disk across multiple drives for resiliency to drive failure, you need to mirror all the default partitions (root, boot, EFI System Partition, and bootloader code). There is special Butane config syntax for this:

And see the Mirroring the boot disk onto two drives example.

There is also an Advanced example on the same page of the docs.

I’m not sure why you commented out the boot_device object in your configuration.


Thanks everyone for the help! It goes a long way toward making the Fedora community feel friendly and inviting!

I greatly simplified my Butane config and implemented your suggestions, but I still keep getting errors…

Here is my NEW Butane configuration file

variant: fcos
version: 1.5.0

# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEY_GOES_HERE (real key in actual file)

# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F

storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          size_mib: 10240
  raid:
    # Setup RAID device for root partition    
    - name: md-root
      level: raid1
      devices:
        - /dev/disk/by-partlabel/root-1
        - /dev/disk/by-partlabel/root-2       
  luks:
  - name: root
    device: /dev/md/md-root
    clevis:
      custom:
        needs_network: false
        pin: tpm2
        config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
    wipe_volume: true
    #label: root
  
  filesystems:
    # Root partition
    - path: /
      device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root
      with_mount_unit: true    
      
  files:
    # Define Hostname as CoreOS_Server
    - path: /etc/hostname
      mode: 0644
      contents:
        inline: CoreOS_Server
    # Define Keyboard Layout as US keymap
    - path: /etc/vconsole.conf
      mode: 0644
      contents:
        inline: KEYMAP=us
    # Configure Swap on ZRAM as 8GB using lz4 compression
    - path: /etc/systemd/zram-generator.conf
      mode: 0644
      contents:
        inline: |
          # This config file enables a /dev/zram0 device with the default settings
          [zram0]
          # Set the fraction of total memory for ZRAM (e.g., 50% of total RAM)
          # Adjust to fit your memory requirements
          zram-size = ram / 2

          # Optionally set the compression algorithm (e.g., zstd, lz4)
          compression-algorithm = lz4

          # Define swap priority (higher number has higher priority)
          swap-priority = 100

          # Max zram device limit to avoid too large allocation, useful for systems with lots of RAM
          max-zram-size = 8192M 

Here is the output of the failures from the rdsosreport.txt

cat rdsosreport.txt | grep fail

[    4.525263] localhost multipathd[483]: _check_bindings_file: failed to read header from /etc/multipath/bindings
[    7.126730] localhost (udev-worker)[834]: sdb4: Process '/usr/sbin/mdadm -If sdb4 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
[    7.126730] localhost (udev-worker)[848]: sdb3: Process '/usr/sbin/mdadm -If sdb3 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
[   10.362811] localhost (udev-worker)[834]: sda4: Process '/usr/sbin/mdadm -If sda4 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
[   10.362961] localhost (udev-worker)[848]: sda3: Process '/usr/sbin/mdadm -If sda3 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
[   31.206943] localhost ignition[1017]: disks: createLuks: op(15): [failed]   Clevis bind: exit status 1: Cmd: "clevis" "luks" "bind" "-f" "-k" "/tmp/ignition-luks-1260801658" "-d" "/run/ignition/dev_aliases/dev/md/md-root" "tpm2" "{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}" Stdout: "" Stderr: "WARNING:esys:src/tss2-esys/api/Esys_Create.c:399:Esys_Create_Finish() Received TPM Error \nERROR:esys:src/tss2-esys/api/Esys_Create.c:134:Esys_Create() Esys Finish ErrorCode (0x00000921) \nERROR: Esys_Create(0x921) - tpm:warn(2.0): authorizations for objects subject to DA protection are not allowed at this time because the TPM is in DA lockout mode\nERROR: Unable to run tpm2_create\nCreating TPM2 object for jwk failed!\nUnable to perform encryption with PIN 'tpm2' and config '{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}'\nError adding new binding to /run/ignition/dev_aliases/dev/md/md-root\n"
[   31.207074] localhost ignition[1017]: Ignition failed: failed to create luks: binding clevis device: exit status 1: Cmd: "clevis" "luks" "bind" "-f" "-k" "/tmp/ignition-luks-1260801658" "-d" "/run/ignition/dev_aliases/dev/md/md-root" "tpm2" "{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}" Stdout: "" Stderr: "WARNING:esys:src/tss2-esys/api/Esys_Create.c:399:Esys_Create_Finish() Received TPM Error \nERROR:esys:src/tss2-esys/api/Esys_Create.c:134:Esys_Create() Esys Finish ErrorCode (0x00000921) \nERROR: Esys_Create(0x921) - tpm:warn(2.0): authorizations for objects subject to DA protection are not allowed at this time because the TPM is in DA lockout mode\nERROR: Unable to run tpm2_create\nCreating TPM2 object for jwk failed!\nUnable to perform encryption with PIN 'tpm2' and config '{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}'\nError adding new binding to /run/ignition/dev_aliases/dev/md/md-root\n"
[   31.212704] localhost ignition[1017]: disks failed
[   31.343781] localhost systemd[1]: Dependency failed for ignition-complete.target - Ignition Complete.
[   31.351915] localhost systemd[1]: Dependency failed for initrd.target - Initrd Default Target.
[   31.397126] localhost systemd[1]: initrd.target: Job initrd.target/start failed with result 'dependency'.
[   31.397563] localhost systemd[1]: ignition-complete.target: Job ignition-complete.target/start failed with result 'dependency'.

I’ve been really struggling with this for some reason so your help has been greatly appreciated! Let me know if there is any other information that I can provide to help!

Thanks again!

maybe a clue:

ERROR: Esys_Create(0x921) - tpm:warn(2.0): authorizations for objects subject to DA protection are not allowed at this time because the TPM is in DA lockout mode


Yeah I was about ready to post about that right before you posted haha.

I saw that error, reset the TPM, and ran the installation again. Still errors, but at least a different error this time, so progress, right? :smile:
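
For anyone hitting the same thing: the reset amounts to clearing the TPM's dictionary-attack (DA) lockout, which can be done from firmware setup or, roughly, with tpm2-tools (a sketch, assuming an empty lockout authorization):

# inspect the lockout counters
tpm2_getcap properties-variable | grep -i lockout

# clear the DA lockout state
tpm2_dictionarylockout --clear-lockout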

Here is the new rdsosreport.txt after clearing the TPM

cat rdsosreport.txt | grep fail

[    4.567999] localhost multipathd[479]: _check_bindings_file: failed to read header from /etc/multipath/bindings
[    7.090925] localhost (udev-worker)[831]: sdb4: Process '/usr/sbin/mdadm -If sdb4 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
[    7.090926] localhost (udev-worker)[848]: sdb3: Process '/usr/sbin/mdadm -If sdb3 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
[    9.232013] localhost (udev-worker)[831]: sda4: Process '/usr/sbin/mdadm -If sda4 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
[    9.232013] localhost (udev-worker)[848]: sda3: Process '/usr/sbin/mdadm -If sda3 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
[   28.748049] localhost ignition[1023]: disks: createFilesystems: op(19): op(1a): op(1b): op(1c): op(1e): [failed]   wiping filesystem signatures from "/run/ignition/dev_aliases/dev/md/md-root": exit status 1: Cmd: "wipefs" "-a" "/run/ignition/dev_aliases/dev/md/md-root" Stdout: "" Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy\n"
[   29.657728] localhost ignition[1023]: Ignition failed: failed to create filesystems: wipefs failed: exit status 1: Cmd: "wipefs" "-a" "/run/ignition/dev_aliases/dev/md/md-root" Stdout: "" Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy\n"
[   29.660652] localhost ignition[1023]: disks failed
[   29.772901] localhost systemd[1]: Dependency failed for ignition-complete.target - Ignition Complete.
[   29.804380] localhost systemd[1]: Dependency failed for initrd.target - Initrd Default Target.
[   29.827965] localhost systemd[1]: initrd.target: Job initrd.target/start failed with result 'dependency'.
[   29.828212] localhost systemd[1]: ignition-complete.target: Job ignition-complete.target/start failed with result 'dependency'.

Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy

This error line is a little curious, and I don't know why it's appearing. What could possibly be causing it?
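
In case it helps, here is roughly what I can run from the initramfs emergency shell next time to see what is holding the array (a sketch, not captured output):

# is md-root assembled, and from which members?
cat /proc/mdstat

# is anything (a LUKS mapping, a mount) stacked on top of it?
lsblk /dev/md/md-root
dmsetup ls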

Thanks for your continued help! :smile:

As mentioned, it will be much easier to troubleshoot if we go step by step.

Have you tried the suggested config and nothing more? Did it work? If so, what’s the next thing you want to set up?

So I just tried the suggested config and nothing more. The installer seems to boot up, with systemd services starting, but then it suddenly goes to a blank screen. I wondered if the installer was still running in the background despite the blank screen, so I waited a while for it to finish, but when I reboot there doesn't seem to be any sign that anything happened. Booting into a blank screen after the systemd services start hadn't occurred before.

Hmm, that’s strange. I tested the suggested config on an x86_64 bare metal machine with two SSDs attached. The only difference is /dev/disk/by-id unique names.

So I was able to get past that blank screen issue by editing the GRUB entry at boot to add nomodeset (see the sketch below). The installer boots and then ends at a login screen. I can ssh into the machine from there.
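
For anyone else who hits this, the same karg can be baked into the live ISO instead of editing the GRUB entry by hand on every boot (a sketch; the ISO file is the one mentioned just below):

coreos-installer iso kargs modify --append nomodeset fedora-coreos-41.20241027.3.0-live.x86_64.iso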

When I made the bootable CoreOS installer flash drive, I used the following command to embed the Ignition file into the ISO image, and then wrote that ISO to the flash drive.

coreos-installer iso ignition embed -f -i suggestedconfig.ign fedora-coreos-41.20241027.3.0-live.x86_64.iso

I haven’t gotten this far with the installer before, so I’m a little unsure how to proceed from here. Do I run coreos-installer again over ssh? Where was the Ignition file embedded?

I just realized something. Am I completely messing this process up by embedding the Ignition file into the ISO? Does that embedded Ignition file only apply to the live ISO environment, and would that explain why I am getting so many errors and struggling so much with this? :man_facepalming:

What machine are you installing Fedora CoreOS on?

The coreos-installer iso ignition embed command is primarily used for unattended installations by automatically starting coreos-installer on boot. For more information, see Customizing installation | CoreOS Installer and ISO-embedded Ignition configuration | CoreOS Installer.

Since this is a more advanced approach, I would suggest trying Installing from Live ISO. For reference, see Installing CoreOS on Bare Metal :: Fedora Docs.
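
A minimal sketch of that flow (the IP address and config file name are placeholders):

# on another machine on the same network, serve the Ignition config over HTTP
python3 -m http.server 8000

# from the booted live environment on the target, install to the first boot disk
sudo coreos-installer install /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730 \
  --ignition-url http://192.168.1.10:8000/config.ign --insecure-ignition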

I was able to get the suggested configuration installed by not embedding the Ignition file in the ISO and instead using an HTTP server as suggested in the documentation (as I probably should have from the beginning).

The next thing I want to set up is root disk encryption, automatically decrypted at boot by the TPM.

I tried with the following configuration, and it failed:

variant: fcos
version: 1.5.0

# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_GO_HERE

# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F

storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
        #  size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
        #  size_mib: 10240    
  
  luks:
    - name: root
      label: luks-root
      device: /dev/md/md-root
      clevis:
        tpm2: true
      wipe_volume: true
  
  filesystems:
    - device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root

I tried this configuration as well:

variant: fcos
version: 1.5.0

# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_GO_HERE

# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F

storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
        #  size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
        #  size_mib: 10240    
  
  luks:
    - name: root
      label: luks-root
      device: /dev/md/md-root
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha1","pcr_ids":"7"}'
      wipe_volume: true
  
  filesystems:
    - device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root

Here are the errors from the log data:

cat rdsosreport.txt | grep -E "error|warning|fail|sysroot"

[    4.394778] localhost kernel: GPT: Use GNU Parted to correct GPT errors.
[    4.408843] localhost multipathd[460]: _check_bindings_file: failed to read header from /etc/multipath/bindings
[   17.350916] localhost ignition-ostree-transposefs[1418]: Mounting /dev/disk/by-label/EFI-SYSTEM ro (/dev/sda2) to /sysroot/boot/efi
[   47.180454] localhost ignition[1476]: disks: createFilesystems: op(1b): op(1c): op(1d): op(1e): op(20): [failed]   wiping filesystem signatures from "/run/ignition/dev_aliases/dev/md/md-root": exit status 1: Cmd: "wipefs" "-a" "/run/ignition/dev_aliases/dev/md/md-root" Stdout: "" Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy\n"
[   54.812947] localhost ignition[1476]: Ignition failed: failed to create filesystems: wipefs failed: exit status 1: Cmd: "wipefs" "-a" "/run/ignition/dev_aliases/dev/md/md-root" Stdout: "" Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy\n"
[   54.815532] localhost ignition[1476]: disks failed
[   54.833067] localhost systemd[1]: Dependency failed for ignition-complete.target - Ignition Complete.
[   54.841013] localhost systemd[1]: Dependency failed for initrd.target - Initrd Default Target.
[   54.847564] localhost systemd[1]: initrd.target: Job initrd.target/start failed with result 'dependency'.
[   54.847832] localhost systemd[1]: ignition-complete.target: Job ignition-complete.target/start failed with result 'dependency'.
[   55.409483] localhost systemd[1]: sysroot.mount: Directory /sysroot to mount over is not empty, mounting anyway.
[   55.409957] localhost systemd[1]: sysroot.mount: Failed to load environment files: No such file or directory
[   55.409962] localhost systemd[1]: sysroot.mount: Failed to spawn 'mount' task: No such file or directory
[   55.409969] localhost systemd[1]: sysroot.mount: Failed with result 'resources'.
[   55.410171] localhost systemd[1]: Failed to mount sysroot.mount - /sysroot.

It all seems to work until I add the LUKS information to try to encrypt the drive, and then it fails.

Thanks again so much to everyone who has been helping me.

There is a simplified Butane config syntax for configuring root filesystem encryption and pinning.

The following configures a mirrored boot disk with a TPM2-encrypted root filesystem.

variant: fcos
version: 1.5.0

# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_HERE

# Specify boot device is mirrored between two disks
boot_device:
  # Encrypted root filesystem with a TPM2 Clevis pin
  luks:
    tpm2: true
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F

storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          size_mib: 10240

Awesome, that works the way I wanted! Thanks to everyone who has helped me with this issue. I now have CoreOS installed.

I have another issue I need help with, but I’m going to make a separate forum post because it’s a separate issue. Thanks again to everyone who helped me!


Hi,
I have a similar issue, although maybe not the same one. But the title is spot on, so I thought I would just continue this thread…

I’m trying to install a recent Fedora CoreOS on bare metal using mdraid-mirrored boot disks. Here’s the exact Butane config fragment that I use:

variant: fcos
version: 1.5.0
boot_device:
  # Mirrored boot disk
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/nvme-Samsung_SSD_990_PRO_with_Heatsink_1TB_S73JNJ0X904543N
      # Boot Disk 2
      - /dev/disk/by-id/ata-Samsung_SSD_870_EVO_2TB_S754NX0XB18434K

Note that these are not identical disks: the NVMe is 1 TB and the SATA SSD is 2 TB. I don’t know if this matters, but it may ring a bell for someone.

I’m using coreos-installer with the Ignition config pulled via --ignition-url. What happens is that upon reboot, the machine boots into one incarnation of CoreOS that executes part of the Ignition config, then automatically reboots into another that further sets up disks, etc., and ends up in a working system. /boot and root are mounted from /dev/md/md-boot and /dev/md/md-root, and the mdraid devices are OK, containing partitions from both disks.
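
The checks behind “mdraid devices are OK” were along these lines (a sketch, not captured output):

# both arrays present, each with two members
cat /proc/mdstat

# labels and mount points line up with md-boot/md-root
lsblk -o NAME,LABEL,FSTYPE,MOUNTPOINT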

All seems fine until the first manual reboot, which is unsuccessful. Investigating further, I boot from the Live ISO, mount the EFI vfat partition from the first boot disk (label esp-1), and see the following files in it:

./EFI
./EFI/BOOT
./EFI/BOOT/BOOTX64.EFI
./EFI/BOOT/grub.cfg
./EFI/BOOT/grubx64.efi
./EFI/BOOT/mmx64.efi
./EFI/BOOT/bootuuid.cfg

…but the EFI boot entry references a file in a directory that does not even exist (this output is from another system, but the path is the same):

root@nk8s1:~# efibootmgr | grep Fedora
Boot0000* Fedora        HD(2,GPT,628464b5-3b9a-4063-a4c8-90297f4994b1,0x1000,0x3f800)/\EFI\fedora\shimx64.efi

To fix this, I removed those files and copied the whole EFI directory from the booted Live ISO to the ESP partitions of both boot disks, which now have the following content:

./EFI
./EFI/BOOT
./EFI/BOOT/BOOTX64.EFI
./EFI/BOOT/fbx64.efi
./EFI/fedora
./EFI/fedora/BOOTX64.CSV
./EFI/fedora/grub.cfg
./EFI/fedora/grubx64.efi
./EFI/fedora/mmx64.efi
./EFI/fedora/shim.efi
./EFI/fedora/shimx64.efi

I also re-created the EFI boot entries with efibootmgr, so I now have two, one for each disk (see below).
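
The re-created entries were along these lines (disk and partition numbers are illustrative; adjust to where each ESP actually lives):

sudo efibootmgr --create --disk /dev/nvme0n1 --part 2 \
  --label "Fedora (boot disk 1)" --loader '\EFI\fedora\shimx64.efi'
sudo efibootmgr --create --disk /dev/sda --part 2 \
  --label "Fedora (boot disk 2)" --loader '\EFI\fedora\shimx64.efi'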

My questions:

  • is this a known bug perhaps?
  • by just copying the whole EFI directory from the Live CoreOS ISO’s EFI partition, did I do something that will bite me in the future when I upgrade or rebase the CoreOS system?

This looks like a bug indeed. Could you copy/paste your post into an issue in the tracker (GitHub - coreos/fedora-coreos-tracker: Issue tracker for Fedora CoreOS)? Thanks!

Here it is: Installing CoreOS on mdraid mirrored boot disks creates unbootable EFI boot partition content · Issue #1907 · coreos/fedora-coreos-tracker · GitHub

Regards, Peter

What I also noticed, though I don’t know whether it is expected or not, is that the coreos-installer install command does nothing to the EFI boot manager (it doesn’t add an entry to be used at reboot). Does it rely on default entries that are usually already populated (like boot from disk, CD-ROM, etc.)?

That’s by design. The entry is added on first boot by shim.
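
So after a successful first boot, the entry should show up, same as the check above:

efibootmgr | grep Fedora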