PXE network boot with Fedora Server

The only thing that gets me into trouble is that Fedora Server is set to dynamic IP resolution, while the router’s DHCP server was disabled and replaced by the Fedora Workstation DHCP server. The router’s DHCP server has no iPXE facilities.

Correct me if I’m wrong, but I think this is a better solution than any virtual machine would ever be.

I see no attach button, so I will just copy / paste the script “Fedora Server Chrooted.sh”:

ROOT=/mnt/dev/ecf8f3c4-fe33-44f8-97c3-a9e111f4d7af

cp -p /etc/resolv.conf ${ROOT}/etc/resolv.conf

mount -t proc /proc ${ROOT}/proc
mount -B /sys ${ROOT}/sys
mount -B /run ${ROOT}/run
mount -B /dev ${ROOT}/dev
mount -t tmpfs -m tmpfs ${ROOT}/dev/shm
mount -t devpts -m devpts ${ROOT}/dev/pts -o gid=5,mode=620

chroot ${ROOT} /bin/bash

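# /tmp is not mounted above, so this umount only reports an error; drop it or add a matching bind mount.
umount ${ROOT}/tmp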
umount ${ROOT}/dev/pts
umount ${ROOT}/dev/shm
umount ${ROOT}/dev
umount ${ROOT}/run
umount ${ROOT}/sys
umount ${ROOT}/proc

rm ${ROOT}/etc/resolv.conf

This looks sensible. I assume the unmounts run after all chroot activities are finished?
A DHCP client does not immediately lose its IP if the DHCP server goes down, so it might not be a big issue, unless you defined a very short lease time in the DHCP server.

Resolv.conf might be an issue if it is a symbolic link managed by systemd-resolved. Just create an /etc/resolv.conf within the chroot with a manually defined nameserver.
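A minimal sketch of that suggestion, under assumptions: `write_static_resolv` is a hypothetical helper name, and the chroot path and nameserver address are placeholders to adapt. It replaces a symlinked resolv.conf with a plain file, so that writing to it cannot follow the link.

```shell
# Sketch: make sure the chroot has a plain /etc/resolv.conf.
# write_static_resolv is a hypothetical helper; the root directory and the
# nameserver address are placeholders.
write_static_resolv() (
    root=$1
    nameserver=$2
    target="$root/etc/resolv.conf"
    # A symlink (for example one pointing into /run/systemd/resolve) may not
    # resolve inside the chroot, so remove it before writing a regular file.
    if [ -L "$target" ]; then
        rm -f "$target"
    fi
    mkdir -p "$root/etc"
    printf 'nameserver %s\n' "$nameserver" > "$target"
)

# Usage (as root, before entering the chroot):
#   write_static_resolv /path/to/chroot 192.168.2.1
```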

I thought so too. The mtab files look the same before and after. I’m a little worried about Fedora Workstation interfering with Fedora Server because of the bind mounts, but we’ll see.

The other problem is resolved by cloning the wired connection (nmcli connection clone), configuring one as static, one as dynamic:

nmcli connection modify "Wired connection static" 802-3-ethernet.wake-on-lan "magic"

nmcli connection modify "Wired connection static" ipv4.method "manual"
nmcli connection modify "Wired connection static" ipv4.dns "192.168.2.1"
nmcli connection modify "Wired connection static" ipv4.addresses "192.168.2.7/32"
nmcli connection modify "Wired connection static" ipv4.gateway "192.168.2.1"

followed by:

nmcli connection modify "Wired connection static" connection.autoconnect-priority "0"
nmcli connection modify "Wired connection static" connection.autoconnect-retries "1"

nmcli connection modify "Wired connection dynamic" connection.autoconnect-priority "1"
nmcli connection modify "Wired connection dynamic" connection.autoconnect-retries "1"

Now we can choose, either chroot or reboot :)
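The clone step itself is not shown above. A minimal sketch of it, assuming the original profile is called "Wired connection 1" (a guess at the default name, check `nmcli connection show`); the helper `print_clone_commands` is hypothetical and only prints the commands rather than running them:

```shell
# Sketch: print (not run) the clone commands that would precede the
# "nmcli connection modify" calls. The base profile name is an assumption.
print_clone_commands() (
    base=$1
    for suffix in static dynamic; do
        printf 'nmcli connection clone "%s" "Wired connection %s"\n' "$base" "$suffix"
    done
)

print_clone_commands "Wired connection 1"
```

With the priorities set as above, the dynamic profile (priority 1) is tried first and the static one (priority 0) serves as the fallback.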

I hope I understand you correctly: you have two installations, Server and Workstation. You start Server and chroot into Workstation (or the other way around). No Workstation is running; there is only one kernel, the Server one. With chroot you start a bash shell and pretend that the Workstation directory is the root directory. From that moment on, all files accessed or started from this bash shell come from the Workstation installation, but still run under the Server kernel. This includes “dnf update”: it is a Workstation update. But if you do not bind mount /dev, /proc and /sys in the chroot, these are just empty directories, because the kernel has not populated them. /dev, /proc and /sys are not directories with ordinary files; they are mount points for virtual filesystems that the kernel provides at startup. So you do not have to worry about interference between Server and Workstation. On the contrary, any program accessing devices in the chroot will fail if it cannot reach the host’s /dev through the bind mount. Hope this clarifies.
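To check whether those virtual filesystems are actually present under a chroot directory, something like the following could be used. It is only a sketch: `missing_chroot_mounts` is a hypothetical helper, and it reads a mounts table (by default /proc/self/mounts) instead of probing the directories.

```shell
# Sketch: list which of proc, sys and dev are NOT mounted under a chroot
# directory, according to a mounts table. Paths containing spaces are not
# handled (the kernel escapes them in /proc/self/mounts).
missing_chroot_mounts() (
    root=${1%/}
    mounts_file=${2:-/proc/self/mounts}
    for dir in proc sys dev; do
        # Field 2 of each line in the mounts table is the mount point.
        if ! awk -v mp="$root/$dir" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"; then
            printf '%s\n' "$root/$dir"
        fi
    done
)

# Usage: missing_chroot_mounts /path/to/chroot
# Prints one line per missing mount point, nothing if all three are mounted.
```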

In the final situation, server is running and providing DHCP and TFTP or HTTP, and the clients get their own copy of the kernel and initramfs. Via NFS they get the workstation partition of the server. The kernel distributed to the clients takes care of the local dev,proc,sys and then there should be no bind mounts on the server.

IIRC that could also include a simpler bind mount for /proc. Either way should work.
I may have been using it wrong, but I have also never mounted /dev/shm or /dev/pts since by default they are already mounted when you mount /dev.

This idea came from Installing the Gentoo base system - Gentoo Wiki, as I knew this to work.

If you say it’s necessary to mount only /proc, /sys and /dev, and not necessary to mount /run, /tmp, /dev/shm and /dev/pts, I can try that. No problem. I’m not sure either for now.

If you ask me, it’s indeed not necessary to mount /dev/pts, since it has to do with opening pseudo-terminals and is mostly gnome-terminal related.

Something like that. Two installations in dual-boot configuration. I boot Fedora Workstation by default, then chroot into Fedora Server to configure and update. Fedora Workstation is running the DHCP / TFTP server, Fedora Server serves as a template for all the network nodes.

I think /proc, /sys, /dev, and /run are needed, but not the others.
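A sketch of the script with that reduced mount set, with the unmounts running in reverse order once the chroot shell exits. Assumptions: the helper names `run` and `enter_chroot` are made up, plain bind mounts are used as in the original script, and DRY_RUN defaults to 1 so the commands are only printed.

```shell
# Sketch: chroot with /proc, /sys, /dev and /run only.
# Set DRY_RUN=0 and run as root to execute instead of print.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

enter_chroot() (
    root=$1
    mounted=""
    for dir in proc sys dev run; do
        if [ "$dir" = proc ]; then
            run mount -t proc proc "$root/proc"
        else
            run mount --bind "/$dir" "$root/$dir"
        fi
        # Prepend, so the list ends up in reverse mount order.
        mounted="$root/$dir $mounted"
    done
    run chroot "$root" /bin/bash
    # Reached only after the chroot shell has exited.
    # Note: word splitting here assumes the root path has no spaces.
    for target in $mounted; do
        run umount "$target"
    done
)

# Usage: enter_chroot /path/to/chroot
```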

OK, I understand, but this confuses me a little bit, because I would do it the other way around. Maybe Server and Workstation only differ in default package selection: Workstation with GNOME, a web browser and so on, Server a bare command line with Cockpit for configuration… There could be some optimizations in the default setup under the hood, I do not know.

This is probably because you are used to serving Fedora Workstation network nodes to a classroom full of students, while I use Fedora Server network nodes in a High Performance Cluster to do a single computational task in parallel.

Well, I’ve just tried with mounting /proc, /sys, /run, /dev and /tmp and executed ‘dnf distro-sync’ from within the chrooted shell.

Fedora Workstation still boots up to the login prompt, but I’m no longer able to log in, neither with the root account nor with the user account. Any idea why that is?

OK, I understand. Number crunching on the server nodes booted diskless from the workstation. But no login, that’s bad. I have been in that situation, and SELinux was unhappy after restoring a backup. But I’ve no idea why, if you use it, SELinux would be unhappy after distrosync. Anyhow, you can try to boot with selinux disabled with some kernel argument. Otherwise the only possibility, I think, is inspecting logs chrooted from the other OS or live cd.
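The kernel arguments in question are `enforcing=0` (boot permissive, so denials are logged but not enforced) and `selinux=0` (disable SELinux entirely). For a one-off test they can be appended to the `linux` line in the GRUB editor. The sketch below only illustrates the edit on a command-line string; `add_karg` is a hypothetical helper, not a Fedora tool.

```shell
# Sketch: append a kernel argument to a command line unless it is already
# present as a separate word.
add_karg() (
    cmdline=$1
    arg=$2
    case " $cmdline " in
        *" $arg "*) printf '%s\n' "$cmdline" ;;
        *)          printf '%s\n' "$cmdline $arg" ;;
    esac
)

add_karg "ro rhgb quiet" "enforcing=0"
```

If the login works in permissive mode, mislabeled files are a likely cause; creating an empty /.autorelabel file asks the system to relabel the filesystem on the next boot.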

Alternative: you can boot into GRUB and add rd.break to the kernel options, so that the system stops in the initial ramdisk. It asks for the root password, then you can create /mnt, mount one of the system partitions read-write and chroot into it. There may be another problem with the chroot update method: some RPMs have scripts that manipulate systemd services, and maybe those fail.
By the way: what about managing the Server installation from one of the clients with read-write NFS access? An initramfs created that way reflects the client hardware, Workstation keeps running, and no chroot or QEMU tricks are needed.

Good thinking! SELinux was enabled on Fedora Server while running ‘dnf distro-sync’ from the chrooted shell.

The Gentoo Installation Handbook mentioned this as well. I’ll check the details.

I’d already thought that one up for myself. I can reboot Fedora Workstation into Fedora Server and then run ‘dnf distro-sync’, but that does mean taking down the cluster momentarily. The connection.autoconnect-priority automatically switches Fedora Server to static IP when Fedora Workstation is not running. Alternatively, I can open an SSH shell on one of the network nodes and run ‘dnf distro-sync’ on that client’s NFS mounted root.

Mr. Janssen,

I wanted to ask. Have you ever SEEN it running, one server and a number of diskless clients with their root directories mounted over NFS?

Whatever I do, I can’t get it to work. Perhaps you can have a look at this thread: systemd-resolved.service ignores information from both DHCP server and nfsroot kernel parameter when booting with iPXE · Issue #24095 · systemd/systemd · GitHub, and let me know what you think.

Best regards,
Mischa Baars.

No, only one client and one server, for lack of hardware. But for sure there are issues to be solved, like hostname, /var/log and home directories. I did not use systemd-networkd, as it was a Fedora client. If I remember correctly, I even switched off NetworkManager to prevent it from fiddling with IP addresses and breaking the NFS connection, because the connection is already established in the initrd. In that case, systemd-resolved does not get the DHCP info from NetworkManager, which is easily solved by entering a fixed DNS server in /etc/systemd/resolved.conf.
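A sketch of that fixed DNS entry, written as a drop-in file instead of an edit to resolved.conf itself (both carry the same `[Resolve]` section). The helper `write_resolved_dropin`, the file name and the address are illustrative assumptions.

```shell
# Sketch: write a systemd-resolved drop-in with a fixed DNS server.
# conf_dir would normally be /etc/systemd/resolved.conf.d on the client root.
write_resolved_dropin() (
    conf_dir=$1
    dns=$2
    mkdir -p "$conf_dir"
    printf '[Resolve]\nDNS=%s\n' "$dns" > "$conf_dir/10-static-dns.conf"
)

# Usage (as root), followed by a restart of the service:
#   write_resolved_dropin /etc/systemd/resolved.conf.d 192.168.1.1
#   systemctl restart systemd-resolved
```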

I’m only a simple user, so analyzing all those logs is too complicated for me. One thing strikes me: you compiled a kernel and used the standard way to generate an initrd. Are you sure the necessary dracut modules are in the initrd? When I looked into it, these contain not only the kernel modules, but also scripts to mount the network root file system and so on. I got it running with the standard kernel and additional dracut modules, only with PXE and not iPXE, but I do not think this makes a difference once kernel and initrd are up.

Was “root=/dev/nfs” not obsolete and replaced by nfsroot? I hope it does not interfere.
Correction: I’ve only (from the journal)

fedora kernel: Command line: BOOT_IMAGE=vmlinuz-5.18.9-200.fc36.x86_64 ip=autoconf root=nfs:192.168.1.2:/mnt/sdb1 initrd=initramfs-5.18.9-200.fc36.x86_64.img
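For reference, a PXELINUX entry that would produce a command line of this shape might look like the output of the sketch below (PXELINUX adds BOOT_IMAGE= by itself). The helper `pxe_entry` and the label are made up, and `ip=dhcp` is used here instead of the `ip=autoconf` seen in the journal, since `dhcp` is one of the values dracut documents.

```shell
# Sketch: print a PXELINUX config stanza for an NFS-root boot via dracut.
pxe_entry() (
    kernel=$1
    initrd=$2
    nfs_server=$3
    nfs_path=$4
    printf 'DEFAULT fedora\n'
    printf 'LABEL fedora\n'
    printf '  KERNEL %s\n' "$kernel"
    printf '  APPEND initrd=%s ip=dhcp root=nfs:%s:%s\n' "$initrd" "$nfs_server" "$nfs_path"
)

pxe_entry vmlinuz-5.18.9-200.fc36.x86_64 initramfs-5.18.9-200.fc36.x86_64.img 192.168.1.2 /mnt/sdb1
```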

Obviously, your server is booting the wrong kernel, one that does not recognize the hardware.

you can try below code in terminal:

sudo ls /boot | grep vmlinuz
sudo grubby --set-default /boot/vmlinuz-<version>.<release>.<arch>

My F36 laptop uses EFI, so this requires other setup, but:

I copied the /tftpboot folder from Debian to the laptop (F36 kernel and initrd included, adapted PXE config file).
I copied the /etc/dnsmasq.conf from debian to the laptop with some adaptations.
Made a rw btrfs snapshot of the laptop root.
Exported the snapshot in nfs
Exported the home subvolume in nfs.

The laptop is now the “server”, and I could get the laptop login screen via PXE on two clients, with the snapshot as rootfs via NFS. There were still some things to fix, e.g. I have to remove the hostname via the client and check /var/log, but in principle it works. It should be possible!

Details: because there were upgrades including kernel I started from the beginning:
My laptop now provides the F36 workstation OS. It’s a normal F36 workstation with BTRFS.

  1. Created a snapshot, mounted on /export. /export is exported via NFS. Effectively, this is a copy of the whole OS but takes no additional space on disk.
  2. Exported /home via NFS
  3. pxelinux and TFTP setup was from the debian system, including dnsmasq config.
  4. Created an initrd with the NFS module included and the -N dracut option, which means that it contains all drivers, a.k.a. a rescue initrd. This goes into the tftpboot folder.
  5. Appropriate kernel in the tftpboot, adapted default PXE config file.
  6. Replaced in the exported filesystem /etc/fstab with a new one containing the nfs mounted / and /home, and /var/log on tmpfs.
  7. Booted a workstation via PXE
  8. On the workstation, used hostnamectl to set the hostname blank.
  9. Created a new initrd ON the workstation and transferred it to the tftpboot folder.
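Step 6 could look roughly like the output of this sketch. The helper `client_fstab`, the server address, the export paths and the mount options are all assumptions, not the values actually used above.

```shell
# Sketch of step 6: print an fstab for the exported root filesystem, with
# / and /home on NFS and /var/log on tmpfs.
client_fstab() (
    server=$1
    root_export=$2
    home_export=$3
    printf '%s:%s / nfs defaults 0 0\n' "$server" "$root_export"
    printf '%s:%s /home nfs defaults 0 0\n' "$server" "$home_export"
    printf 'tmpfs /var/log tmpfs defaults,mode=0755 0 0\n'
)

client_fstab 192.168.1.2 /export /home

# Step 4 would then be along the lines of (image name and kernel version
# left as placeholders):
#   dracut -N --add nfs <initramfs-image> <kernel-version>
```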

Remaining Problem: /var/log is not initialized causing auditd service to fail. No /var/log/audit
In principle the big initrd could be left in place permanently, but it is 3 times larger; if all workstations are equivalent, the optimized one would be better.
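For the missing /var/log/audit on a tmpfs /var/log, one possible approach (untested on this setup) is a systemd-tmpfiles rule that recreates the directory at every boot. The helper `write_audit_tmpfiles`, the file name and the 0700 mode are assumptions.

```shell
# Sketch: write a tmpfiles.d rule that creates /var/log/audit at boot.
# tmpfiles_dir would normally be /etc/tmpfiles.d in the exported root.
write_audit_tmpfiles() (
    tmpfiles_dir=$1
    mkdir -p "$tmpfiles_dir"
    # Fields: type, path, mode, user, group, age ("-" means no cleanup).
    printf 'd /var/log/audit 0700 root root -\n' > "$tmpfiles_dir/audit-log.conf"
)

# Usage (as root, on the exported filesystem):
#   write_audit_tmpfiles /export/etc/tmpfiles.d
```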

Final result: effectively the classroom situation, tested with two physical workstations and a VM. So I’m on 3 clients and it seems to work fine.

In your case, all work has to be done on the SERVER os. If the system where the workstation is running is different from the client systems, creation of the initrd should be done with the “-N” dracut option. If the clients contain a HDD, it could be an option to quickly install Fedora server on it, and prepare the initrd including the dracut nfs module there, and transfer the kernel and initramfs to the workstation’s tftpboot.

Remaining: trying to chainload via ipxe cdrom.
Tuning dnsmasq to offer the correct initrd for the VM or Physical.

I find it strange that your configuration does anything at all.

If I boot with the same kernel parameters the screen just blanks out and then nothing.

If you look at the kernel documentation’s admin-guide/nfs/nfsroot.rst, you’ll see that ‘autoconf’ is not a valid setting for ‘ip’. It might fall back to the default, which is ‘on’, but nothing is mentioned about falling back to the default.
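The autoconfiguration keywords that the kernel documentation lists for ‘ip’ can be captured in a small check. This is only a sketch with a made-up helper name, and it covers the in-kernel parser, not dracut’s own handling of ‘ip=’ in the initramfs.

```shell
# Sketch: succeed if the argument is one of the autoconf keywords listed in
# the kernel's nfsroot documentation, fail otherwise.
is_kernel_ip_autoconf() {
    case $1 in
        off|none|on|any|dhcp|bootp|rarp|both) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   is_kernel_ip_autoconf dhcp     && echo "listed"
#   is_kernel_ip_autoconf autoconf || echo "not listed"
```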

If I google for root=nfs:// I get nothing. The kernel documentation says that NFS diskless nodes are booted with root=/dev/nfs and nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>], without reference to any protocol.

I have it running now, without initrd. With initrd, I run into problems similar to:

What did help me out was your advice to disable NetworkManager. I can now run a ‘systemctl suspend’ and the machine comes back to life when I push the power button (or perhaps even when I send some magic wake-on-lan packets).

I needed to take care of some settings manually, like /etc/profile.d/ethtool.sh, and /etc/profile.d/route.sh, but things are really beginning to look the way they should now.

[root@tp07 profile.d]# mv /etc/resolv.conf /etc/resolv.conf.resolvectl && echo "nameserver 192.168.2.1" > /etc/resolv.conf

[root@tp07 profile.d]# cat ethtool.sh
ethtool -s eth0 wol g

[root@tp07 profile.d]# cat ./route.sh
route add -host 192.168.2.1 eth0
route del -net 192.168.2.0 netmask 255.255.255.0 eth0
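The `route` command comes from the older net-tools package. A sketch of the iproute2 equivalents follows, with a hypothetical `mask2cidr` helper that converts the dotted netmask into the prefix length `ip route` expects. The helper does not check that the mask bits are contiguous.

```shell
# Sketch: convert a dotted netmask to a prefix length.
mask2cidr() (
    bits=0
    IFS=.
    # Split the netmask on dots into the positional parameters.
    set -- $1
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; exit 1 ;;
        esac
    done
    printf '%s\n' "$bits"
)

# Print (not run) the iproute2 forms of the two route commands.
printf 'ip route add 192.168.2.1 dev eth0\n'
printf 'ip route del 192.168.2.0/%s dev eth0\n' "$(mask2cidr 255.255.255.0)"
```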

Thanks for all the help!

Best regards,
Mischa Baars.

Apologies for the nfs://; it’s the ubiquitous URL syntax, but incorrect here.
The syntax is root=nfs:192.168.1.2:/mnt/sdb1. I also remember that I had to look at ip=autoconf, but it works somehow. I have the impression that there are two ways: the kernel one with /dev/nfs and nfsroot, and the initramfs one through the dracut nfs module with root=nfs:. With the dracut modules I have both NFS and live CD booting via PXE. Anyhow, good that your cluster finally works!
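The two styles side by side, as a sketch: `nfs_root_args` is a hypothetical helper that prints the kernel arguments for either style, and `ip=dhcp` is an assumed choice for the network setup.

```shell
# Sketch: print NFS-root kernel arguments in the in-kernel style (no initrd
# needed) or in the dracut style (handled by the nfs module in the initrd).
nfs_root_args() (
    style=$1
    server=$2
    path=$3
    case $style in
        kernel) printf 'root=/dev/nfs nfsroot=%s:%s ip=dhcp\n' "$server" "$path" ;;
        dracut) printf 'root=nfs:%s:%s ip=dhcp\n' "$server" "$path" ;;
        *) echo "unknown style: $style" >&2; exit 1 ;;
    esac
)

nfs_root_args kernel 192.168.1.2 /mnt/sdb1
nfs_root_args dracut 192.168.1.2 /mnt/sdb1
```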

Thanks! Looking forward to programming it :)