I’ve been working on setting up Fedora CoreOS on my server and running into persistent issues that I haven’t been able to resolve. I’m hoping someone here can help me identify what I’m doing wrong and guide me in the right direction.
What I’m Trying to Achieve
I’m aiming to install Fedora CoreOS with the following setup:
RAID1 Configuration for Boot and Root:
Mirrored boot, EFI, and root partitions across two SSDs for redundancy.
RAID1 configured for /boot, /, and the EFI partition.
Separate /var Partition:
/var is configured on an NVMe drive (/dev/nvme0n1).
It is encrypted with LUKS and will eventually be formatted with ZFS; after installation, I plan to move /var from the root drive onto the NVMe drive.
Encrypted LUKS Containers for Additional Drives:
I plan to set up LUKS encryption on some additional drives (NAS-1, NAS-2, NAS-3), which I will arrange in a ZFS RAID-Z configuration after installation.
The partitions are defined in my Ignition configuration file, which I’ve attached to this post for reference.
The Problem I’m Facing
RAID Assembly Fails During Installation:
The installer logs show errors like:
sdb4: Process '/usr/sbin/mdadm -If sdb4 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
sdb3: Process '/usr/sbin/mdadm -If sdb3 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
sda4: Process '/usr/sbin/mdadm -If sda4 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
sda3: Process '/usr/sbin/mdadm -If sda3 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
Then the installer times out waiting for devices that never get properly created:
Ignition failed: failed to create filesystems: failed to wait on filesystems devs: device unit dev-disk-by\x2dlabel-boot.device timeout
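For anyone debugging a similar failure, the RAID state can be inspected from the initramfs emergency shell with roughly the following commands (a sketch; the /dev/sdX names are taken from the udev messages above and may differ on other systems):

cat /proc/mdstat                          # see which md arrays, if any, were assembled
mdadm --examine /dev/sda3 /dev/sdb3       # look for stale or mismatched RAID superblocks on the members
lsblk -o NAME,SIZE,TYPE,LABEL,PARTLABEL   # confirm the partition labels Ignition is waiting for
journalctl -b --no-pager | grep -iE 'mdadm|ignition'   # pull the full mdadm/Ignition messages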
Some Things I’ve Tried
I ran my Butane file through butane with the --strict option, and I’ve used ignition-validate to confirm that there aren’t any formatting errors (roughly the commands sketched after this list).
I’ve adjusted my Ignition configuration in so many different ways that I’ve lost count.
I’ve also read the Fedora CoreOS and Ignition documentation up and down many times, but there has to be something I’m missing here.
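For reference, the validation steps above correspond roughly to the following commands (file names are placeholders):

butane --pretty --strict config.bu --output config.ign   # compile Butane to Ignition with strict checking
ignition-validate config.ign                             # sanity-check the resulting Ignition JSON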
What I Need Help With
Correctly Configuring RAID in Ignition:
Is there something I’m missing in my Ignition file that’s causing the RAID setup to fail?
Here is my Butane configuration file (with mild redactions)
variant: fcos
version: 1.5.0
# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_HERE
# Specify boot device is mirrored between two disks
#boot_device:
#  # Mirrored boot disk on boot disk SSD
#  mirror:
#    devices:
#      # Boot Disk 1
#      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
#      # Boot Disk 2
#      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      wipe_table: true
      partitions:
        # EFI partition 1
        - label: esp-1
          size_mib: 1024
          #type_guid: "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"
        # Boot partition 1
        - label: boot-1
          size_mib: 1024
        # Root partition 1
        - label: root-1
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      wipe_table: true
      partitions:
        # EFI partition 2
        - label: esp-2
          size_mib: 1024
          #type_guid: "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"
        # Boot partition 2
        - label: boot-2
          size_mib: 1024
        # Root partition 2
        - label: root-2
    # NVMe fast storage used for /var in CoreOS
    - device: /dev/disk/by-id/nvme-WD_Red_SN700_1000GB_220443800077
      wipe_table: true
      partitions:
        - label: var
    # Seagate NAS Drive
    - device: /dev/disk/by-id/ata-ST4000DM004-2CV104_ZFN0TVG2
      wipe_table: true
      partitions:
        - label: NAS-1
    # WD NAS Drive 1
    - device: /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WX62D4312RJ2
      wipe_table: true
      partitions:
        - label: NAS-2
    # WD NAS Drive 2
    - device: /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WXH2D430RSK5
      wipe_table: true
      partitions:
        - label: NAS-3
    # WD NVR Drive
    - device: /dev/disk/by-id/ata-WDC_WD22PURZ-85B4ZY0_WD-WX42AC2M58ZR
      wipe_table: true
      partitions:
        - label: NVR-1
  raid:
    # Setup RAID device for boot partition
    - name: md-boot
      level: raid1
      devices:
        - /dev/disk/by-partlabel/boot-1
        - /dev/disk/by-partlabel/boot-2
    # Setup RAID device for root partition
    - name: md-root
      level: raid1
      devices:
        - /dev/disk/by-partlabel/root-1
        - /dev/disk/by-partlabel/root-2
  luks:
    - name: root
      device: /dev/md/md-root
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      #label: root
    - name: var
      device: /dev/disk/by-partlabel/var
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      label: var
    - name: NAS-1
      device: /dev/disk/by-partlabel/NAS-1
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      label: NAS-1
    - name: NAS-2
      device: /dev/disk/by-partlabel/NAS-2
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      label: NAS-2
    - name: NAS-3
      device: /dev/disk/by-partlabel/NAS-3
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      label: NAS-3
    - name: NVR-1
      device: /dev/disk/by-partlabel/NVR-1
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      label: NVR-1
  filesystems:
    # Primary EFI partition
    - path: /boot/efi
      device: /dev/disk/by-partlabel/esp-1
      format: vfat
      wipe_filesystem: true
      label: esp-1
      with_mount_unit: true
      mount_options: ["umask=0077"]
    # Secondary EFI partition for redundancy
    - path: /boot/efi2
      device: /dev/disk/by-partlabel/esp-2
      format: vfat
      wipe_filesystem: true
      label: esp-2
      with_mount_unit: true
      mount_options: ["umask=0077"]
    # Boot partition
    - path: /boot
      device: /dev/md/md-boot
      format: ext4
      wipe_filesystem: true
      label: boot
      with_mount_unit: true
    # Root partition
    - path: /
      device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root
      with_mount_unit: true
  files:
    # Define hostname as CoreOS_Server
    - path: /etc/hostname
      mode: 0644
      contents:
        inline: CoreOS_Server
    # Define keyboard layout as US keymap
    - path: /etc/vconsole.conf
      mode: 0644
      contents:
        inline: KEYMAP=us
    # Configure swap on ZRAM as 8GB using lz4 compression
    - path: /etc/systemd/zram-generator.conf
      mode: 0644
      contents:
        inline: |
          # This config file enables a /dev/zram0 device with the default settings
          [zram0]
          # Set the fraction of total memory for ZRAM (e.g., 50% of total RAM)
          # Adjust to fit your memory requirements
          zram-size = ram / 2
          # Optionally set the compression algorithm (e.g., zstd, lz4)
          compression-algorithm = lz4
          # Define swap priority (higher number has higher priority)
          swap-priority = 100
          # Max zram device limit to avoid too large allocation, useful for systems with lots of RAM
          max-zram-size = 8192M
# Configure GRUB password
grub:
  users:
    - name: root
      # Specify GRUB password hash for GRUB user 'root'
      password_hash: grub.pbkdf2.sha512.GRUB_HASH
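As a side note, the GRUB_HASH placeholder above is the sort of string produced by grub2-mkpasswd-pbkdf2 on Fedora (a sketch; the tool prompts interactively for the password):

grub2-mkpasswd-pbkdf2
# Paste the emitted "grub.pbkdf2.sha512...." string into password_hash above.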
I’ve already spent an embarrassingly long time on this, so any advice or guidance would be greatly appreciated!
For simplicity and easier troubleshooting, I would suggest starting with just replicating the boot disk. With the following Butane config you should be able to mirror all the default partitions.
variant: fcos
version: 1.5.0
# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_HERE
# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          size_mib: 10240
If you want to replicate the boot disk across multiple drives for resilience to drive failure, you need to mirror all the default partitions (root, boot, EFI System Partition, and bootloader code). There is special Butane config syntax for this; the config above follows the “Mirroring the boot disk onto two drives” example from the Fedora CoreOS documentation.
Thanks everyone for the friendly help! It goes a long way to making the Fedora Community feel friendly and inviting!
I greatly simplified my butane config and implemented your suggestions, but I still keep getting errors…
Here is my NEW Butane configuration file
variant: fcos
version: 1.5.0
# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEY_GOES_HERE (real key in actual file)
# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          size_mib: 10240
  raid:
    # Setup RAID device for root partition
    - name: md-root
      level: raid1
      devices:
        - /dev/disk/by-partlabel/root-1
        - /dev/disk/by-partlabel/root-2
  luks:
    - name: root
      device: /dev/md/md-root
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
      wipe_volume: true
      #label: root
  filesystems:
    # Root partition
    - path: /
      device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root
      with_mount_unit: true
  files:
    # Define hostname as CoreOS_Server
    - path: /etc/hostname
      mode: 0644
      contents:
        inline: CoreOS_Server
    # Define keyboard layout as US keymap
    - path: /etc/vconsole.conf
      mode: 0644
      contents:
        inline: KEYMAP=us
    # Configure swap on ZRAM as 8GB using lz4 compression
    - path: /etc/systemd/zram-generator.conf
      mode: 0644
      contents:
        inline: |
          # This config file enables a /dev/zram0 device with the default settings
          [zram0]
          # Set the fraction of total memory for ZRAM (e.g., 50% of total RAM)
          # Adjust to fit your memory requirements
          zram-size = ram / 2
          # Optionally set the compression algorithm (e.g., zstd, lz4)
          compression-algorithm = lz4
          # Define swap priority (higher number has higher priority)
          swap-priority = 100
          # Max zram device limit to avoid too large allocation, useful for systems with lots of RAM
          max-zram-size = 8192M
Here is the output of the failures from the rdsosreport.txt
cat rdsosreport.txt | grep fail
[ 4.525263] localhost multipathd[483]: _check_bindings_file: failed to read header from /etc/multipath/bindings
[ 7.126730] localhost (udev-worker)[834]: sdb4: Process '/usr/sbin/mdadm -If sdb4 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
[ 7.126730] localhost (udev-worker)[848]: sdb3: Process '/usr/sbin/mdadm -If sdb3 --path pci-0000:01:00.1-ata-2.0' failed with exit code 1.
[ 10.362811] localhost (udev-worker)[834]: sda4: Process '/usr/sbin/mdadm -If sda4 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
[ 10.362961] localhost (udev-worker)[848]: sda3: Process '/usr/sbin/mdadm -If sda3 --path pci-0000:01:00.1-ata-1.0' failed with exit code 1.
[ 31.206943] localhost ignition[1017]: disks: createLuks: op(15): [failed] Clevis bind: exit status 1: Cmd: "clevis" "luks" "bind" "-f" "-k" "/tmp/ignition-luks-1260801658" "-d" "/run/ignition/dev_aliases/dev/md/md-root" "tpm2" "{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}" Stdout: "" Stderr: "WARNING:esys:src/tss2-esys/api/Esys_Create.c:399:Esys_Create_Finish() Received TPM Error \nERROR:esys:src/tss2-esys/api/Esys_Create.c:134:Esys_Create() Esys Finish ErrorCode (0x00000921) \nERROR: Esys_Create(0x921) - tpm:warn(2.0): authorizations for objects subject to DA protection are not allowed at this time because the TPM is in DA lockout mode\nERROR: Unable to run tpm2_create\nCreating TPM2 object for jwk failed!\nUnable to perform encryption with PIN 'tpm2' and config '{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}'\nError adding new binding to /run/ignition/dev_aliases/dev/md/md-root\n"
[ 31.207074] localhost ignition[1017]: Ignition failed: failed to create luks: binding clevis device: exit status 1: Cmd: "clevis" "luks" "bind" "-f" "-k" "/tmp/ignition-luks-1260801658" "-d" "/run/ignition/dev_aliases/dev/md/md-root" "tpm2" "{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}" Stdout: "" Stderr: "WARNING:esys:src/tss2-esys/api/Esys_Create.c:399:Esys_Create_Finish() Received TPM Error \nERROR:esys:src/tss2-esys/api/Esys_Create.c:134:Esys_Create() Esys Finish ErrorCode (0x00000921) \nERROR: Esys_Create(0x921) - tpm:warn(2.0): authorizations for objects subject to DA protection are not allowed at this time because the TPM is in DA lockout mode\nERROR: Unable to run tpm2_create\nCreating TPM2 object for jwk failed!\nUnable to perform encryption with PIN 'tpm2' and config '{\"pcr_bank\":\"sha256\",\"pcr_ids\":\"7\"}'\nError adding new binding to /run/ignition/dev_aliases/dev/md/md-root\n"
[ 31.212704] localhost ignition[1017]: disks failed
[ 31.343781] localhost systemd[1]: Dependency failed for ignition-complete.target - Ignition Complete.
[ 31.351915] localhost systemd[1]: Dependency failed for initrd.target - Initrd Default Target.
[ 31.397126] localhost systemd[1]: initrd.target: Job initrd.target/start failed with result 'dependency'.
[ 31.397563] localhost systemd[1]: ignition-complete.target: Job ignition-complete.target/start failed with result 'dependency'.
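For what it’s worth, the Esys_Create error code 0x921 in the log above indicates the TPM itself is in dictionary-attack (DA) lockout, which is independent of the Ignition config. A hedged sketch of clearing it with tpm2-tools from the live environment, assuming the lockout authorization value is still empty:

tpm2_getcap properties-variable         # inspect the current lockout counters
tpm2_dictionarylockout --clear-lockout  # reset the DA lockout (only works if lockoutAuth is empty or known)

Clearing the TPM from the firmware setup menu also resets the lockout, at the cost of discarding any previously sealed keys.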
I’ve been really struggling with this for some reason so your help has been greatly appreciated! Let me know if there is any other information that I can provide to help!
So I just tried the suggested config and nothing more, and the installer seems to boot up with systemd services starting, but then it suddenly goes to a blank screen. I wondered if the installer was still running in the background despite the blank screen, so I waited a while for it to finish, but after rebooting there doesn’t seem to have been any activity. Booting into a blank screen after the systemd services start hadn’t happened before.
Hmm, that’s strange. I tested the suggested config on an x86_64 bare metal machine with two SSDs attached. The only difference is /dev/disk/by-id unique names.
So I was able to get past that black screen issue by editing the GRUB kernel arguments at boot to include nomodeset. The installer boots and then ends at a login screen, and I can ssh into the machine from there.
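If it helps anyone else hitting the blank screen, the same kernel argument can also be baked into the live ISO ahead of time; a sketch, assuming coreos-installer’s iso kargs modify subcommand and the same ISO file name used below:

# Append nomodeset to the live ISO's default kernel arguments
coreos-installer iso kargs modify -a nomodeset fedora-coreos-41.20241027.3.0-live.x86_64.iso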
When I made the bootable CoreOS installer flash drive, I used the following command to embed the Ignition file into the ISO image, and then wrote that ISO to the flash drive.
coreos-installer iso ignition embed -f -i suggestedconfig.ign fedora-coreos-41.20241027.3.0-live.x86_64.iso
I haven’t gotten this far yet with the installer, so I’m a little unsure how to proceed from here. Do I run coreos-installer again from ssh? Where was the Ignition file embedded?
I just realized something. Am I completely messing this entire process up by embedding the Ignition file into the ISO? Does that Ignition file embedded into the ISO only apply to the first boot, and would that explain why I’m getting so many errors and struggling so much with this?
I was able to get the suggested configuration installed by not embedding the Ignition file within the ISO and just using the HTTP server as suggested in the documentation (as I probably should have from the beginning).
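For reference, that flow is roughly the documented one: serve the Ignition file over HTTP and point coreos-installer at it from the live environment (a sketch; the URL, port, and config file name are placeholders, and the target disk is the first boot SSD from the configs above):

# On another machine on the LAN, serve the directory containing config.ign:
python3 -m http.server 8000
# From the live environment over SSH, install and fetch the config over HTTP:
sudo coreos-installer install /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730 \
  --ignition-url http://192.168.1.10:8000/config.ign --insecure-ignition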
The next thing that I want to set up is having the root disk encrypted and automatically decrypted at boot by the TPM.
I tried the following configuration and it failed:
variant: fcos
version: 1.5.0
# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_GO_HERE
# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          # size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          # size_mib: 10240
  luks:
    - name: root
      label: luks-root
      device: /dev/md/md-root
      clevis:
        tpm2: true
      wipe_volume: true
  filesystems:
    - device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root
I also tried this configuration:
variant: fcos
version: 1.5.0
# Authentication
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 SSH_KEYS_GO_HERE
# Specify boot device is mirrored between two disks
boot_device:
  # Mirrored boot disk on boot disk SSD
  mirror:
    devices:
      # Boot Disk 1
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      # Boot Disk 2
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
storage:
  disks:
    # Boot Disk 1
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      partitions:
        # Override size of root partition on first disk, via the label
        # generated for boot_device.mirror
        - label: root-1
          # size_mib: 10240
    # Boot Disk 2
    - device: /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F
      partitions:
        # Similarly for second disk
        - label: root-2
          # size_mib: 10240
  luks:
    - name: root
      label: luks-root
      device: /dev/md/md-root
      clevis:
        custom:
          needs_network: false
          pin: tpm2
          config: '{"pcr_bank":"sha1","pcr_ids":"7"}'
      wipe_volume: true
  filesystems:
    - device: /dev/mapper/root
      format: ext4
      wipe_filesystem: true
      label: root
Here are the errors from the log data:
cat rdsosreport.txt | grep -E "error|warning|fail|sysroot"
[ 4.394778] localhost kernel: GPT: Use GNU Parted to correct GPT errors.
[ 4.408843] localhost multipathd[460]: _check_bindings_file: failed to read header from /etc/multipath/bindings
[ 17.350916] localhost ignition-ostree-transposefs[1418]: Mounting /dev/disk/by-label/EFI-SYSTEM ro (/dev/sda2) to /sysroot/boot/efi
[ 47.180454] localhost ignition[1476]: disks: createFilesystems: op(1b): op(1c): op(1d): op(1e): op(20): [failed] wiping filesystem signatures from "/run/ignition/dev_aliases/dev/md/md-root": exit status 1: Cmd: "wipefs" "-a" "/run/ignition/dev_aliases/dev/md/md-root" Stdout: "" Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy\n"
[ 54.812947] localhost ignition[1476]: Ignition failed: failed to create filesystems: wipefs failed: exit status 1: Cmd: "wipefs" "-a" "/run/ignition/dev_aliases/dev/md/md-root" Stdout: "" Stderr: "wipefs: error: /run/ignition/dev_aliases/dev/md/md-root: probing initialization failed: Device or resource busy\n"
[ 54.815532] localhost ignition[1476]: disks failed
[ 54.833067] localhost systemd[1]: Dependency failed for ignition-complete.target - Ignition Complete.
[ 54.841013] localhost systemd[1]: Dependency failed for initrd.target - Initrd Default Target.
[ 54.847564] localhost systemd[1]: initrd.target: Job initrd.target/start failed with result 'dependency'.
[ 54.847832] localhost systemd[1]: ignition-complete.target: Job ignition-complete.target/start failed with result 'dependency'.
[ 55.409483] localhost systemd[1]: sysroot.mount: Directory /sysroot to mount over is not empty, mounting anyway.
[ 55.409957] localhost systemd[1]: sysroot.mount: Failed to load environment files: No such file or directory
[ 55.409962] localhost systemd[1]: sysroot.mount: Failed to spawn 'mount' task: No such file or directory
[ 55.409969] localhost systemd[1]: sysroot.mount: Failed with result 'resources'.
[ 55.410171] localhost systemd[1]: Failed to mount sysroot.mount - /sysroot.
It all seems to work until I add the LUKS information to try to encrypt the drive, and then it fails.
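One observation on the wipefs “Device or resource busy” error: with boot_device.mirror in place, the generated config already creates the root filesystem on /dev/md/md-root, so a hand-written storage.luks entry on that same device appears to conflict with it. The Butane spec also provides a boot_device.luks section for encrypting the root filesystem that the mirror sugar creates; a hedged sketch of that approach, assuming fcos 1.5.0 and TPM2 pinning:

variant: fcos
version: 1.5.0
boot_device:
  # Let the boot_device sugar handle both the root LUKS layer and the mirroring,
  # instead of a manual storage.luks entry on /dev/md/md-root.
  luks:
    tpm2: true
  mirror:
    devices:
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD8730
      - /dev/disk/by-id/ata-CT250MX500SSD1_2307E6AD876F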
Thanks again so much to everyone who has been helping me.
Awesome, that works the way I was wanting it! Thanks to everyone who has helped me in this issue. I now have CoreOS installed.
I have another issue I need help with, but I’m going to make a separate forum post because it’s a separate issue. Thanks again to everyone who helped me!