Can no longer log into machine via SSH after upgrade from Fedora 42 to 43

After upgrading my virtual server (effectively headless) from Fedora 42 to Fedora 43, I can no longer log in via SSH. When I try, the attempt takes a very long time, then I get the error message:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:<censored>.
Please contact your system administrator.
Add correct host key in /home/guido/.ssh/known_hosts to get rid of this message.
Offending ED25519 key in /home/guido/.ssh/known_hosts:62
Host key for <censored> has changed and you have requested strict checking.
Host key verification failed.

Is there any reason Fedora would silently change the host key during this upgrade? Because this is exactly what it would look like if someone were to try to man-in-the-middle my connection. Also, why would there suddenly be a multi-minute delay?

As far as I can tell, the other services on the host have come up as usual.

While I can get a rescue console for this server, I have never set up a user with a known password, always just relying on ssh pubkey authentication.

Edit: Possibly relevant: before the upgrade, I used to get this warning when logging in via SSH:

** WARNING: connection is not using a post-quantum key exchange algorithm.
** This session may be vulnerable to "store now, decrypt later" attacks.
** The server may need to be upgraded. See https://openssh.com/pq.html

Edit 2: I just noticed that the server lost its IPv6 address during the upgrade for some reason. That would explain the login delay: ssh takes some time to fall back to IPv4.
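
A quick way to confirm this theory is to force the client onto IPv4 and, once there is a shell on the server, check whether it still holds a global IPv6 address. The following is a minimal sketch: `server.example.org` is a placeholder host name, and `count_inet6` and `global_ipv6_count` are helper names made up for this example.

```shell
# Count the "inet6" address lines in `ip -6 addr show` output read from stdin.
# count_inet6 is a hypothetical helper, not part of any tool.
count_inet6() {
    awk '$1 == "inet6" { n++ } END { print n + 0 }'
}

# On the server: number of global (non link-local) IPv6 addresses.
global_ipv6_count() {
    ip -6 addr show scope global | count_inet6
}

# Usage:
#   global_ipv6_count                # on the server: 0 means no global IPv6 address
#   ssh -4 user@server.example.org   # on the client: force IPv4, skip the IPv6 attempt
```

If `ssh -4` connects without the delay, the slow part is most likely the IPv6 attempt. Setting `AddressFamily inet` for that host in `~/.ssh/config` should have the same effect until the address is restored.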

At this point, I would probably edit the kernel command line and add init=/bin/bash to get a root shell without needing a password for any user, then retrieve the new public host key. Next, remove the old key from known_hosts (save a copy, in case you need it later to understand what happened), restart the server, and on the next SSH connection compare the key the server presents with the one you retrieved via the rescue console. If they match, the connection is secure (no man in the middle), and you can comfortably inspect the logs to figure out what triggered the key regeneration.
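
The comparison described above could look roughly like this. It is a sketch under a few assumptions: the host key lives at the usual `/etc/ssh/ssh_host_ed25519_key.pub` path, `server.example.org` stands in for the real host name, and the function names are invented for illustration.

```shell
# Print only the fingerprint field (e.g. "SHA256:...") from `ssh-keygen -l`
# output, whose format is "<bits> <fingerprint> <comment> (<type>)".
fingerprint_field() {
    awk '{ print $2 }'
}

# On the server (rescue console): fingerprint of the current ED25519 host key.
local_host_fingerprint() {
    ssh-keygen -lf "${1:-/etc/ssh/ssh_host_ed25519_key.pub}" | fingerprint_field
}

# On the client: fingerprint of the key the server currently offers.
# ssh-keyscan output is unauthenticated; it is only used for the comparison.
offered_fingerprint() {
    ssh-keyscan -t ed25519 "$1" 2>/dev/null | ssh-keygen -lf - | fingerprint_field
}

# Succeeds only if both fingerprints are non-empty and identical.
fingerprints_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (client side), with the value copied by hand from the rescue console:
#   cp ~/.ssh/known_hosts ~/.ssh/known_hosts.before-upgrade   # keep the old entry
#   ssh-keygen -R server.example.org                          # drop the stale entry
#   fingerprints_match "SHA256:<from console>" \
#       "$(offered_fingerprint server.example.org)" && echo "host key matches"
```

The important part is that the reference fingerprint comes from the console, not from the network. Only then does a match rule out a man in the middle.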

Okay, I did that. Turns out it really was a changed host key.

Now I’m curious as to why that happened. Changing a host key is a big deal, and I don’t think it should ever be done without conscious input from the system’s operator. I can see this throwing some security-conscious workflows for a loop while rewarding bad practices (i.e. just deleting the old host key when you see this error; I cannot count the times I’ve seen companies do that as SOP…).

Next step is to find out why the machine just completely dropped its IPv6 address…

According to journalctl, it was a service called “cloud-init” that did it, but it doesn’t seem to have logged a reason.
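
To dig further, cloud-init keeps its own log in addition to the journal, and its configuration can be searched for settings that affect host keys. One candidate worth checking (this is an assumption, not something confirmed in this thread) is the `ssh_deletekeys` option of cloud-init's SSH module, which removes existing host keys so that fresh ones are generated when cloud-init believes it is running on a new instance. A minimal sketch, with helper names made up for illustration:

```shell
# Lines from the current boot's journal that mention cloud-init.
cloud_init_journal() {
    journalctl -b --no-pager | grep -i 'cloud-init'
}

# Search a cloud-init config directory for the ssh_deletekeys setting.
# Prints matching lines with file name and line number; returns non-zero
# if nothing matches (the option may simply be at its built-in default).
find_deletekeys_setting() {
    grep -rn 'ssh_deletekeys' "${1:-/etc/cloud}" 2>/dev/null
}

# Usage (as root on the server):
#   cloud_init_journal
#   less /var/log/cloud-init.log
#   find_deletekeys_setting
```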

For my VM in the cloud, cloud-init is provided by my hosting company, not Fedora.
They do all sorts of things that are hosting-company related in cloud-init.

EDIT: the hosting company stuff may not be in cloud init after all…

Is this the case for you as well?

I wonder if the host key was using a weak cipher and got replaced during the upgrade?

Unfortunately, this is all too common when security starts getting in the way. As soon as there are all these annoying error messages, users either dismiss them without thinking or actively work around security. :frowning:

I guess that is it.

Just remember: if you work as root on your local machine, the known SSH host keys are stored in /root/.ssh/known_hosts.

Just in case you mixed up your local home directory and your local /root directory.

Honestly, no idea. This is the first time I’ve heard of cloud-init, and I’m still trying to figure out what exactly it is. It seems to come from Canonical. So far, I have had no luck even finding documentation for it.

It sounds like something you would use to set up the internals of a cloud-based virtual server, except Hetzner (my hosting provider) already has their own solutions for that.

Edit: The docs are at cloud-init 26.1 documentation

Systemd services have docs and point to them from the unit files.
For example:

systemctl cat cloud-init.target
# /usr/lib/systemd/system/cloud-init.target
# cloud-init.target is enabled by cloud-init-generator
# To disable it you can either:
#  a.) boot with kernel cmdline of 'cloud-init=disabled'
#  b.) touch a file /etc/cloud/cloud-init.disabled
#
# cloud-init.target is a  synchronization point when all cloud-init's initial
# system configuration tasks have completed. To order a service after cloud-init
# is done, add the directives as applicable:
#  After=cloud-init.target and Wants=cloud-init.target

There are a lot of cloud-init-*.service files installed on my VM.
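
To see which of those units are present and, if you decide you do not want cloud-init touching the machine, to disable it the way the unit file comment describes, something like the following should work. This is only a sketch: the function names are invented here, and whether disabling cloud-init is safe depends on whether your provider relies on it (for example for network setup), which this thread has not established.

```shell
# List the cloud-init related unit files that are installed.
list_cloud_init_units() {
    systemctl list-unit-files 'cloud-*' --no-pager
}

# Create the marker file from option b.) of the unit file comment.
# The optional directory argument exists only so the function can be tried
# on a scratch directory; the default is the location named in the comment.
disable_cloud_init() {
    touch "${1:-/etc/cloud}/cloud-init.disabled"
}

# Usage (as root on the server):
#   list_cloud_init_units
#   disable_cloud_init
```

Removing the marker file again should re-enable cloud-init on the next boot.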

This is a warning message that was implemented in OpenSSH 10.1. From the text, I am pretty sure this is client side. Also, because F43 has sshd 10.0 (not 10.1 or higher), this message must be from your >=10.1 client:

Potentially-incompatible changes
--------------------------------

 * ssh(1): add a warning when the connection negotiates a non-post
   quantum key agreement algorithm.

   This warning has been added due to the risk of "store now, decrypt
   later" attacks. More details at https://openssh.com/pq.html

   This warning may be controlled via a new WarnWeakCrypto ssh_config
   option, defaulting to on. This option is likely to control
   additional weak crypto warnings in the future.
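
To check which side lacks post-quantum key exchange, you can list the algorithms your client supports and look at what a connection actually negotiates. A small sketch follows. The name patterns (`mlkem`, `sntrup`) are an assumption based on the post-quantum hybrid key exchanges OpenSSH ships at the time of writing, `pq_kex_only` and `client_pq_kex` are made-up helper names, and `server.example.org` is a placeholder.

```shell
# Keep only post-quantum hybrid key exchange names from a list on stdin.
# Prints nothing (and still succeeds) if none are present.
pq_kex_only() {
    grep -E 'mlkem|sntrup' || true
}

# Post-quantum key exchanges supported by the local ssh client.
client_pq_kex() {
    ssh -Q kex | pq_kex_only
}

# Usage:
#   client_pq_kex
#   # Algorithm negotiated for a real connection (runs `exit` remotely):
#   ssh -v user@server.example.org exit 2>&1 | grep 'kex: algorithm'
#   # On the server, as root: the key exchange list sshd is configured to offer.
#   sshd -T | grep -i '^kexalgorithms'
```

If the client lists post-quantum algorithms but the negotiated one is a classical algorithm, the server side is the one that does not offer them.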