Ignition file is only partially read. How do I debug?

I’m installing CoreOS on bare metal. It installs from the live-iso and boots the new system happily. However, most of my ignition file gets ignored. What’s really confusing is that some of the file gets read – it creates my user account and adds my SSH keys. But I also provide a password-hash for that user and it does not get copied. I also define some disks, filesystems, files, directories, and systemd units, and none of them appear.

As far as I can tell, there’s no error, either. I look at the journald lines for ignition and it says that it created the user, set the ssh keys, and now it’s done.

I’m really puzzled, and I’m not sure what else I can investigate. I can see that /boot/ignition/config.ign matches what I expected (has all my configuration).

Can anyone suggest what I could do to figure out why most of the ignition file seems to be silently ignored?

My ignition file (with a few values masked):

/boot/ignition/config.ign

{"ignition":{"version":"3.1.0"},"passwd":{"users":[{"groups":["sudo","docker"],"name":"waisbrot","passwordHash":"$1$k3TL2wq/$a...","sshAuthorizedKeys":["ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDyQC...","ssh-rsa AAAAB3NzaC1yc2EAAAADAQ...","ssh-rsa AAAAB3NzaC1yc2EAAAADAQ...","ssh-rsa AAAAB3NzaC1yc2EAAAADAQ..."]}]},"storage":{"directories":[{"group":{"id":0},"overwrite":false,"path":"/k8s","user":{"id":0},"mode":777},{"group":{"id":0},"overwrite":false,"path":"/install","user":{"id":0},"mode":777}],"disks":[{"device":"/dev/disk/by-id/ata-ST750LX003...","partitions":[{"label":"var","number":1,"shouldExist":true,"sizeMiB":358400,"startMiB":0,"wipePartitionEntry":true},{"label":"local-config","number":2,"shouldExist":true,"sizeMiB":0,"startMiB":0,"wipePartitionEntry":true}],"wipeTable":true}],"files":[{"group":{"id":0},"overwrite":false,"path":"/etc/sudoers.d/00-waisbrot.conf","user":{"id":0},"contents":{"source":"data:,waisbrot%20%20%20%20ALL%3D(ALL%3AALL)%20NOPASSWD%3A%20ALL%0A"},"mode":600},{"group":{"id":0},"overwrite":false,"path":"/install/hello","user":{"id":0},"contents":{"source":"data:,hello%20world%0A"},"mode":666},{"group":{"id":0},"overwrite":false,"path":"/install/cni-plugins","user":{"id":0},"contents":{"source":"https://github.com/containernetworking/plugins/releases/download/v0.8.2/cni-plugins-linux-amd64-v0.8.2.tgz"},"mode":666}],"filesystems":[{"device":"/dev/disk/by-partlabel/var","format":"ext4","label":"var","path":"/var"},{"device":"/dev/disk/by-partlabel/local-config","format":"ext4","label":"k8sConfig","path":"/k8s"}]},"systemd":{"units":[{"contents":"[Unit]\nBefore=local-fs.target\n[Mount]\nWhere=/var\nWhat=/dev/disk/by-partlabel/var\n[Install]\nWantedBy=local-fs.target\n","enabled":true,"name":"var.mount"},{"contents":"[Unit]\nBefore=local-fs.target\n[Mount]\nWhere=/k8s\nWhat=/dev/disk/by-partlabel/local-config\n[Install]\nWantedBy=local-fs.target\n","enabled":true,"name":"k8s.mount"}]}}

I see at least a few problems with your config.

  1. File permission mode values in Ignition aren’t what you would expect: the JSON takes them as decimal integers, so 777 is not read as octal 0777.
  2. you can’t create directories under / (see migration notes). You’ll most likely want to put them in a subdirectory under /var/.
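To make the first point concrete, here is a small Python sketch. It assumes that Ignition's JSON expects `mode` as a decimal integer; the helper name `ignition_mode` is made up for this example and is not part of any tool.

```python
def ignition_mode(octal_text: str) -> int:
    """Convert a chmod-style octal string (e.g. "644") to a decimal integer."""
    return int(octal_text, 8)


if __name__ == "__main__":
    # The three permission sets used in the config above.
    for perm in ("777", "666", "600"):
        print(f"chmod {perm} -> mode {ignition_mode(perm)}")
    # Reading the literal decimal 777 back as permission bits shows what
    # the posted config actually asked for.
    print(oct(777))
```

Writing the literal 777 into the JSON therefore requests a different set of permission bits than `chmod 777` would.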

That being said, if you first create your config using fcct, it will probably make your life easier. Could you do that and see if it helps? If it doesn’t get you all the way there, paste your fcct YAML here and we can investigate further.

TIP: for filesystem mounts, use with_mount_unit: true from FCCT. We have an open docs request to add it, but there is an example in that issue: “Document with_mount_unit from Butane” (Issue #125, coreos/fedora-coreos-docs on GitHub).

Those were some great tips. Thanks! Sadly, it doesn’t seem to have changed the behavior of ignition (it still doesn’t do most of what I asked and it still doesn’t seem to produce any errors).

I’m actually writing my FCCT configuration in Jsonnet, but here’s a YAML translation (sorry for making you read JSON before), having fixed the mode-bug, moved the directories, and added with_mount_unit:

fcc.yaml
---
ignition: {}
passwd:
  groups: []
  users:
  - groups:
    - sudo
    - docker
    name: waisbrot
    password_hash: "$1$k3TL2wq/$..."
    ssh_authorized_keys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDyQCCAQ6SvwZ2BBgt8nOC34LPqJTPxtJ7GI61Z+...
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCvz6Mg3jaIxcP3hXXrQFkasLTKa+...
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDIBJhTuBaYdX3KHEGmamfk5jADrYQXsb/...
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDRoE+rth05uvzQ8nKcT2RWn+...=
storage:
  directories:
  - group:
      id: 0
    mode: 511
    overwrite: false
    path: "/var/docker"
    user:
      id: 0
  - group:
      id: 0
    mode: 511
    overwrite: false
    path: "/var/local-config"
    user:
      id: 0
  - group:
      id: 0
    mode: 511
    overwrite: false
    path: "/var/install"
    user:
      id: 0
  disks:
  - device: "/dev/disk/by-id/ata-ST750LX003-..."
    partitions:
    - label: docker
      number: 1
      should_exist: true
      size_mib: 358400
      start_mib: 0
      wipe_partition_entry: true
    - label: local-config
      number: 2
      should_exist: true
      size_mib: 0
      start_mib: 0
      wipe_partition_entry: true
    wipe_table: true
  files:
  - contents:
      inline: 'waisbrot    ALL=(ALL:ALL) NOPASSWD: ALL

'
    group:
      id: 0
    mode: 384
    overwrite: false
    path: "/etc/sudoers.d/00-waisbrot.conf"
    user:
      id: 0
  - contents:
      inline: 'hello world

'
    group:
      id: 0
    mode: 438
    overwrite: false
    path: "/var/install/hello"
    user:
      id: 0
  - contents:
      source: https://github.com/containernetworking/plugins/releases/download/v0.8.2/cni-plugins-linux-amd64-v0.8.2.tgz
    group:
      id: 0
    mode: 438
    overwrite: false
    path: "/var/install/cni-plugins"
    user:
      id: 0
  filesystems:
  - device: "/dev/disk/by-partlabel/docker"
    format: ext4
    label: docker
    path: "/var/docker"
    with_mount_unit: true
  - device: "/dev/disk/by-partlabel/local-config"
    format: ext4
    label: local-config
    path: "/var/local-config"
    with_mount_unit: true
  links: []
  raid: []
  trees: []
systemd:
  units: []
variant: fcos
version: 1.1.0

In case you’re curious, this is what I’m actually editing (I have verified that the above and below result in exactly the same .ign file from fcct):

fcc.jsonnet
local file = {
  overwrite: false,
  mode: std.parseOctal('666'),
  user: { id: 0 },
  group: { id: 0 },
};
local dir = {
  overwrite: false,
  mode: std.parseOctal('777'),
  user: { id: 0 },
  group: { id: 0 },
};

[
  {
    variant: 'fcos',
    version: '1.1.0',
    ignition: {
    },
    storage: {
      disks: [
        // Main HD: /dev/disk/by-id/ata-ST3200822A_...
        // 200G
        // This is the only one that can boot
        {
          // Secondary HD. Larger, so store files
          // 700G
          device: '/dev/disk/by-id/ata-ST750LX003...',
          wipe_table: true,
          partitions: [
            {
              number: 1,
              label: 'docker',
              start_mib: 0,
              size_mib: 358400,
              wipe_partition_entry: true,
              should_exist: true,
            },
            {
              number: 2,
              label: 'local-config',
              start_mib: 0,
              size_mib: 0,
              wipe_partition_entry: true,
              should_exist: true,
            },
          ],
        },
      ],
      raid: [],
      filesystems: [
        {
          path: '/var/docker',
          device: '/dev/disk/by-partlabel/docker',
          format: 'ext4',
          label: 'docker',
          with_mount_unit: true,
        },
        {
          path: '/var/local-config',
          device: '/dev/disk/by-partlabel/local-config',
          format: 'ext4',
          label: 'local-config',
          with_mount_unit: true,
        },
      ],
      files: [
        file {
          path: '/etc/sudoers.d/00-waisbrot.conf',
          contents: {
            inline: |||
              waisbrot    ALL=(ALL:ALL) NOPASSWD: ALL
            |||,
          },
          mode: std.parseOctal('600'),
        },
        file {
          path: '/var/install/hello',
          contents: {
            inline: |||
              hello world
            |||,
          },
        },
        file {
          path: '/var/install/cni-plugins',
          overwrite: false,
          contents: {
            local CNI_VERSION = 'v0.8.2',
            source: std.format('https://github.com/containernetworking/plugins/releases/download/%s/cni-plugins-linux-amd64-%s.tgz', [CNI_VERSION, CNI_VERSION]),
          },
        },
      ],
      directories: [
        dir {
          path: '/var/docker',
        },
        dir {
          path: '/var/local-config',
        },
        dir {
          path: '/var/install',
        },
      ],
      links: [],
      trees: [],
    },
    systemd: {
      units: [],
    },
    passwd: {
      users: [
        {
          local ssh_keys = importstr 'github.keys',
          name: 'waisbrot',
          ssh_authorized_keys: std.split(std.stripChars(ssh_keys, ' \n'), '\n'),
          password_hash: '$1$k3TL2wq/$...',
          groups: ['sudo', 'docker'],
        },
      ],
      groups: [],
    },
  },
]

Any suggestions for how I could trace ignition’s behavior to see why it only does the SSH keys? Can I run it manually on the installed system to get more output? Can I add some verbosity flags to it somewhere?

Another tip: fcct mode values are in the form you would expect as a user. So you can use something like 755 and it will get translated into decimal when you create the Ignition json with the fcct tool.

Maybe make that change and then run it again. After your system comes up, share your Ignition logs. You can grab most of what’s relevant with sudo journalctl -t ignition.

fcct mode values are in the form you would expect as a user. So you can use something like 755 and it will get translated into decimal when you create the Ignition json with the fcct tool

No, as you pointed out above, you have to use a leading zero so YAML will interpret it as an octal number. I prefer Ansible’s approach – they require the mode as a string and either interpret '755' as an octal number or recognize 'a=rx,u+w'. I like the symbolic mode, since I can never remember which bit does what without looking it up…

Anyway, looking at the logs for ignition, I see:

`sudo journalctl -t ignition`
-- Logs begin at Mon 2020-08-03 20:17:35 UTC, end at Mon 2020-08-03 22:01:30 UTC. --
Aug 03 20:17:38 localhost ignition[429]: Ignition 2.4.1
Aug 03 20:17:38 localhost ignition[429]: Stage: fetch-offline
Aug 03 20:17:38 localhost ignition[429]: reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:38 localhost ignition[429]: parsing config with SHA512: ff6a5153be363997e4d5d3ea8cc4048373a457c48c4a5b134a08a30aacd167c1e0f099f0bdf1e24c99ad180628cd02b767b863b5fe3a8fce3fe1886847eb8e2e
Aug 03 20:17:38 localhost ignition[429]: parsed url from cmdline: ""
Aug 03 20:17:38 localhost ignition[429]: no config URL provided
Aug 03 20:17:38 localhost ignition[429]: reading system config file "/usr/lib/ignition/user.ign"
Aug 03 20:17:38 localhost ignition[429]: parsing config with SHA512: 3b41f32d85ec65bb58d87a6e111b59e3dcb966d2aebba6135c8cd5be01931232e332e1ea04044eb7f02206b04b2bc098fe6c9ccd87ab355a8ea1ef70ed304480
Aug 03 20:17:38 localhost ignition[429]: fetch-offline: fetch-offline passed
Aug 03 20:17:38 localhost ignition[429]: Ignition finished successfully
Aug 03 20:17:38 localhost ignition[446]: Ignition 2.4.1
Aug 03 20:17:38 localhost ignition[446]: Stage: disks
Aug 03 20:17:38 localhost ignition[446]: reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:38 localhost ignition[446]: parsing config with SHA512: ff6a5153be363997e4d5d3ea8cc4048373a457c48c4a5b134a08a30aacd167c1e0f099f0bdf1e24c99ad180628cd02b767b863b5fe3a8fce3fe1886847eb8e2e
Aug 03 20:17:38 localhost ignition[446]: disks: disks passed
Aug 03 20:17:38 localhost ignition[446]: Ignition finished successfully
Aug 03 20:17:53 localhost ignition[531]: INFO     : Ignition 2.4.1
Aug 03 20:17:53 localhost ignition[531]: INFO     : Stage: mount
Aug 03 20:17:53 localhost ignition[531]: INFO     : reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:53 localhost ignition[531]: DEBUG    : parsing config with SHA512: ff6a5153be363997e4d5d3ea8cc4048373a457c48c4a5b134a08a30aacd167c1e0f099f0bdf1e24c99ad180628cd02b767b863b5fe3a8fce3fe1886847eb8e2e
Aug 03 20:17:53 localhost ignition[531]: INFO     : mount: mount passed
Aug 03 20:17:53 localhost ignition[531]: INFO     : Ignition finished successfully
Aug 03 20:17:55 localhost ignition[561]: INFO     : Ignition 2.4.1
Aug 03 20:17:55 localhost ignition[561]: INFO     : Stage: files
Aug 03 20:17:55 localhost ignition[561]: INFO     : reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:55 localhost ignition[561]: DEBUG    : parsing config with SHA512: ff6a5153be363997e4d5d3ea8cc4048373a457c48c4a5b134a08a30aacd167c1e0f099f0bdf1e24c99ad180628cd02b767b863b5fe3a8fce3fe1886847eb8e2e
Aug 03 20:17:55 localhost ignition[561]: INFO     : files: createUsers: op(1): [started]  creating or modifying user "core"
Aug 03 20:17:55 localhost ignition[561]: DEBUG    : files: createUsers: op(1): executing: "useradd" "--root" "/sysroot" "--create-home" "--password" "*" "--comment" "CoreOS Admin" "--groups" "adm,sudo,systemd-journal,wheel" "core"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: createUsers: op(1): [finished] creating or modifying user "core"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: createUsers: op(2): [started]  creating or modifying user "waisbrot"
Aug 03 20:17:57 localhost ignition[561]: DEBUG    : files: createUsers: op(2): executing: "useradd" "--root" "/sysroot" "--create-home" "--password" "*" "waisbrot"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: createUsers: op(2): [finished] creating or modifying user "waisbrot"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: createUsers: op(3): [started]  adding ssh keys to user "waisbrot"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: createUsers: op(3): [finished] adding ssh keys to user "waisbrot"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: op(4): [started]  relabeling 15 patterns
Aug 03 20:17:57 localhost ignition[561]: DEBUG    : files: op(4): executing: "setfiles" "-vF0" "-r" "/sysroot" "/sysroot/etc/selinux/targeted/contexts/files/file_contexts" "-f" "-"
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: op(4): [finished] relabeling 15 patterns
Aug 03 20:17:57 localhost ignition[561]: INFO     : files: files passed
Aug 03 20:17:57 localhost ignition[561]: INFO     : Ignition finished successfully
Aug 03 20:17:58 localhost ignition[657]: INFO     : Ignition 2.4.1
Aug 03 20:17:58 localhost ignition[657]: INFO     : Stage: umount
Aug 03 20:17:58 localhost ignition[657]: INFO     : reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:58 localhost ignition[657]: DEBUG    : parsing config with SHA512: ff6a5153be363997e4d5d3ea8cc4048373a457c48c4a5b134a08a30aacd167c1e0f099f0bdf1e24c99ad180628cd02b767b863b5fe3a8fce3fe1886847eb8e2e
Aug 03 20:17:58 localhost ignition[657]: INFO     : umount: umount passed
Aug 03 20:17:58 localhost ignition[657]: INFO     : Ignition finished successfully

The only thing I notice here is that it says it’s reading from /usr/lib/ignition/, but I don’t appear to have that directory (anymore?), so I can’t confirm that the file in question is the one that I generated.
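A log like the one above is easier to scan when grouped by stage. The Python sketch below does that for the "reading system config file" lines. The regular expressions are written against the line format shown in this thread only, and the `SAMPLE` text is a shortened copy of those lines. It reports what each stage logged and does not say whether a missing entry is significant.

```python
import re

# Both patterns tolerate the optional "INFO     : " style prefix seen in later stages.
STAGE = re.compile(r"ignition\[(\d+)\]: (?:\w+\s*: )?Stage: (\S+)")
CONFIG = re.compile(
    r'ignition\[(\d+)\]: (?:\w+\s*: )?reading system config file "([^"]+)"'
)

SAMPLE = """\
Aug 03 20:17:38 localhost ignition[429]: Stage: fetch-offline
Aug 03 20:17:38 localhost ignition[429]: reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:38 localhost ignition[429]: reading system config file "/usr/lib/ignition/user.ign"
Aug 03 20:17:38 localhost ignition[446]: Stage: disks
Aug 03 20:17:38 localhost ignition[446]: reading system config file "/usr/lib/ignition/base.ign"
Aug 03 20:17:53 localhost ignition[531]: INFO     : Stage: mount
Aug 03 20:17:53 localhost ignition[531]: INFO     : reading system config file "/usr/lib/ignition/base.ign"
"""


def configs_read_by_stage(journal_text):
    """Map each Ignition stage name to the config files it logged reading."""
    stage_of_pid = {}
    result = {}
    for line in journal_text.splitlines():
        stage = STAGE.search(line)
        if stage:
            stage_of_pid[stage.group(1)] = stage.group(2)
            result.setdefault(stage.group(2), [])
            continue
        config = CONFIG.search(line)
        if config and config.group(1) in stage_of_pid:
            result[stage_of_pid[config.group(1)]].append(config.group(2))
    return result


if __name__ == "__main__":
    for stage, files in configs_read_by_stage(SAMPLE).items():
        print(stage, files)
```

To run it on real output, feed it the saved text of `sudo journalctl -t ignition` instead of `SAMPLE`.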

You’re right. In the example I should have used 0755. I might try to give your config a spin tomorrow and see what behavior I see.

I did some more debugging, trying to run smaller and smaller ignition files. What I discovered is that the file that the installed system is running at startup is not the file that I’m passing to coreos-installer install!

Here’s what I’m doing to try to validate my install:

  1. Boot from the live-CD
  2. Use fdisk to delete each partition from the destination device
  3. Fetch my ignition file with curl
  4. Use less tularemia.ign to verify that the file I’ve fetched (tularemia.ign) is the one that I expect (it is)
  5. Run sudo coreos-installer install -i tularemia.ign -n /dev/disk/by-id/ata-ST32...
  6. After install completes, sudo mount /dev/disk/by-id/ata-ST32...-part1 /mnt
  7. Use less /mnt/ignition/config.ign to verify that it’s the file I fetched
  8. Reboot into the new system
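Steps 4 and 7 compare the files by eye in less. A digest comparison is harder to get wrong. The Python sketch below rests on two assumptions. First, that the "parsing config with SHA512:" journal line is a digest of the raw config bytes; this has not been confirmed against the Ignition source. Second, that the files can be copied to a machine that has Python, which the live ISO may not. The function names are invented for this example.

```python
import hashlib
import sys


def config_sha512(raw: bytes) -> str:
    """Hex SHA-512 digest of the raw bytes of a config file."""
    return hashlib.sha512(raw).hexdigest()


def same_config(raw: bytes, logged_digest: str) -> bool:
    """Compare a file's digest with a digest copied from the journal."""
    return config_sha512(raw) == logged_digest.strip().lower()


# Usage (script name is a placeholder): python3 check.py tularemia.ign <digest-from-journal>
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as handle:
        print(same_config(handle.read(), sys.argv[2]))
```

If the digest of the file passed to coreos-installer differs from the one logged for user.ign, that would support the idea that a different file is being applied.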

In the above case, I created the ignition file with a single SSH key. When the new system boots, I see that it creates my user with 4 SSH keys! This is my data, but from a previous installation attempt.

What I don’t understand is where Ignition is reading this data from. Journald, as I posted above, mentions reading /usr/lib/ignition/user.ign, but I can’t find this file.
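When two configs are suspected of being mixed up, a short summary of each is quicker to compare than the full JSON. The Python sketch below counts SSH keys per user and lists the storage and systemd entries. It uses the field names from the .ign file posted earlier in this thread (`sshAuthorizedKeys`, `passwordHash`). The `EXAMPLE` config is a hypothetical, heavily shortened stand-in.

```python
import json


def summarize(config):
    """Return a compact overview of an Ignition JSON config."""
    storage = config.get("storage", {})
    users = {}
    for user in config.get("passwd", {}).get("users", []):
        users[user["name"]] = {
            "ssh_keys": len(user.get("sshAuthorizedKeys", [])),
            "has_password_hash": "passwordHash" in user,
            "groups": user.get("groups", []),
        }
    return {
        "users": users,
        "disks": len(storage.get("disks", [])),
        "filesystems": len(storage.get("filesystems", [])),
        "directories": [d["path"] for d in storage.get("directories", [])],
        "files": [f["path"] for f in storage.get("files", [])],
        "units": [u["name"] for u in config.get("systemd", {}).get("units", [])],
    }


# Hypothetical config with placeholder values, shaped like the one in this thread.
EXAMPLE = json.loads("""
{"ignition": {"version": "3.1.0"},
 "passwd": {"users": [{"name": "waisbrot", "groups": ["sudo"],
                       "passwordHash": "placeholder",
                       "sshAuthorizedKeys": ["ssh-rsa placeholder"]}]},
 "storage": {"files": [{"path": "/var/install/hello"}]}}
""")

if __name__ == "__main__":
    print(json.dumps(summarize(EXAMPLE), indent=2))
```

A key count of 1 in the summary of the fetched file, against 4 keys on the booted system, would show the mismatch described above without reading either file in full.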

What do the following commands show?

journalctl -q MESSAGE_ID=57124006b5c94805b77ce473e92a8aeb IGNITION_CONFIG_TYPE=base
journalctl -q MESSAGE_ID=57124006b5c94805b77ce473e92a8aeb IGNITION_CONFIG_TYPE=user
journalctl -t ignition

In the output of

sudo lsblk -o PATH,LABEL

do you have any partitions on other disks with the filesystem label boot? From another Fedora CoreOS install, maybe?

Good idea! Sadly, I checked and I do not.

So, I ended up doing the following kludge path:

  1. After the new system boots, boot back into the installer
  2. Mount the filesystem and edit /etc/group (under /sysroot/ostree/boot.1/...) to give my user (the only thing ignition creates for me) sudo powers
  3. Boot back into the installed system
  4. I can’t find the ignition binary on the installed system either, so I fetch that
    a. On another machine, clone it from github
    b. Modify the code so that it looks under /opt/ignition instead of /usr/lib/ignition (the latter doesn’t exist)
    c. Compile and copy it over
  5. Now I can run ignition interactively

Doing this, I got an error because I was missing wipe_filesystem: true. I also got tons of output that was not shown by any journalctl command. Once I added wipe_filesystem, ignition was able to run interactively. I found that the systemd units produced by with_mount_unit: true didn’t work: the systemd-fsck template unit they call fails because the template doesn’t get instantiated. Maybe that’s caused by the filesystem being read-only?
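A missing wipe_filesystem can be spotted before booting by checking the generated Ignition JSON. The Python sketch below lists filesystem entries where the flag is not set. It assumes the JSON spelling is `wipeFilesystem`. Whether the flag is actually needed depends on what is already on the partition, so the output is a list of entries to look at, not a list of errors. The example config is hypothetical.

```python
def filesystems_without_wipe(config):
    """Return the devices of filesystem entries that do not set wipeFilesystem."""
    filesystems = config.get("storage", {}).get("filesystems", [])
    return [
        fs.get("device", "<no device>")
        for fs in filesystems
        if not fs.get("wipeFilesystem", False)
    ]


# Hypothetical fragment using the partition labels from this thread.
EXAMPLE = {
    "storage": {
        "filesystems": [
            {"device": "/dev/disk/by-partlabel/docker", "format": "ext4"},
            {
                "device": "/dev/disk/by-partlabel/local-config",
                "format": "ext4",
                "wipeFilesystem": True,
            },
        ]
    }
}

if __name__ == "__main__":
    for device in filesystems_without_wipe(EXAMPLE):
        print("no wipeFilesystem:", device)
```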

In summary, I’m puzzled that there isn’t a better way to understand what happened with ignition. It ran, but I can’t find the binary. It read a file (and logged the filename) but I can’t find the file. My mental point of comparison is cloud-init, which writes all of its logs to a dedicated file and keeps its binary on the system.

Ignition runs from the initramfs (and /usr/lib/ignition is a directory in the initramfs). Ignition isn’t in $PATH in the final booted system because running it by hand is potentially dangerous.

The problem you’re describing is an unusual one. I’d debug it like this:

  1. On the first boot of the system, add rd.break to the kernel command line. If you’re on the VGA console rather than serial, you’ll also need to remove console=ttyS0,115200 from the kernel command line (#567).
  2. Once you’re prompted for an emergency shell, hit Enter twice to get a shell prompt.
  3. cat /run/ignition.json to see the rendered config calculated by Ignition. /usr/lib/ignition/user.ign should have the unrendered config that matches what coreos-installer copied into /boot/ignition/config.ign.
  4. If the configs don’t line up, investigate the behavior of ignition-setup-user.service, which copies the config from /boot to /usr/lib/ignition, and the underlying script /usr/sbin/ignition-setup-user.
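For the comparison in steps 3 and 4, a rough check is whether everything in the user config also appears in the rendered config. The Python sketch below assumes both files can be copied to a machine with Python. It also assumes the rendered config is a merge that may contain extra entries, such as the base config's core user, so it tests containment, not equality. Fields that legitimately differ between the two, possibly the version field, will show up as false positives. The function name and example data are invented for illustration.

```python
def missing_paths(expected, actual, path="$"):
    """List JSON paths that are in `expected` but absent or different in `actual`."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return [path]
        missing = []
        for key, value in expected.items():
            if key not in actual:
                missing.append(f"{path}.{key}")
            else:
                missing.extend(missing_paths(value, actual[key], f"{path}.{key}"))
        return missing
    if isinstance(expected, list):
        if not isinstance(actual, list):
            return [path]
        missing = []
        for index, item in enumerate(expected):
            # A list item counts as present if any item in `actual` contains it.
            if not any(not missing_paths(item, candidate) for candidate in actual):
                missing.append(f"{path}[{index}]")
        return missing
    return [] if expected == actual else [path]


# Hypothetical data: the rendered config lacks the storage section entirely.
USER = {
    "passwd": {"users": [{"name": "waisbrot", "sshAuthorizedKeys": ["k1"]}]},
    "storage": {"files": [{"path": "/var/install/hello"}]},
}
RENDERED = {
    "ignition": {"version": "3.1.0"},
    "passwd": {
        "users": [
            {"name": "core"},
            {"name": "waisbrot", "sshAuthorizedKeys": ["k0", "k1"]},
        ]
    },
}

if __name__ == "__main__":
    print(missing_paths(USER, RENDERED))
```

An empty result means the rendered config contains everything the user config asked for. A non-empty result names the sections that went missing.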