Posted by JoachimSchipper 3 days ago
Maybe I'm missing something, but SSH already has a built-in solution for this: key certificates. Just sign the server key with a private CA key you trust.
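For illustration, a minimal host-certificate sketch using only `ssh-keygen` (the file names and the `myhost.example.com` principal are placeholders):

```shell
# Create a CA key (keep this offline) and a host key, then sign the host key.
ssh-keygen -q -t ed25519 -N '' -f ./ca_key
ssh-keygen -q -t ed25519 -N '' -f ./ssh_host_ed25519_key
# -s: sign with the CA key; -h: host (not user) certificate; -n: allowed principals
ssh-keygen -s ./ca_key -I "myhost" -h -n myhost.example.com ./ssh_host_ed25519_key.pub
# Produces ./ssh_host_ed25519_key-cert.pub. Clients then need one line in
# known_hosts instead of trusting each host individually:
#   @cert-authority *.example.com <contents of ca_key.pub>
```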
If the DNS record for the host has an SSHFP (SSH fingerprint) record, SSH will compare it to the retrieved public key(s) and refuse the connection if there is a mismatch. It can be configured to require DNSSEC for this, or to reject only if it gets a secure (DNSSEC-validated) mismatch (to prevent DoS).
It works perfectly, has no notable downsides (just add a DNS record when you generate the host's SSH key), and has been around for many years.
It just means an attacker who MITMs the host also needs to MITM DNS. Not trivial, but depending on the setup it might not be much harder.
If `VerifyHostKeyDNS` is set to `yes`, you get automatic trust-on-first-use (no user prompt) when DNSSEC validates the record, and you fall back to the current ask-the-user behavior if your DNSSEC is broken or you are under attack.
Obviously it's more secure if you use DNSSEC, because that way you can reflexively deny any request to manually verify a host key, but it provides value regardless.
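For reference, the moving parts are small (the hostname below is a placeholder): on the server, `ssh-keygen -r myhost.example.com` prints the SSHFP records to publish in DNS; on the client, verification is enabled in `~/.ssh/config`:

```
# ~/.ssh/config
Host myhost.example.com
    # yes = accept a DNSSEC-validated SSHFP match without prompting
    # ask = show the lookup result but still prompt the user
    VerifyHostKeyDNS yes
```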
You usually need some form of trusted communication with a new server until you can give it its real identity, in the form of host names and cryptographic keys. In an enterprise environment this can usually be done with some sort of isolated management or provisioning vlan. In a cloud environment, I've seen all sorts of more or less hacky solutions but since it depends a lot on specific details of your networks and APIs, bespoke solutions are fine.
A big way to deter them is to keep remote log files which, if analyzed, will reveal any attack.
For example, if both ssh-client and ssh-server kept a fingerprint of the session key in some append-only logfile, then a later administrator could compare the logfiles to know if an MITM happened.
Suddenly, nation state attackers won't be interested in MITM-ing at all.
Unfortunately it appears openssh doesn't even have an option to create such a logfile!! Why not??
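To make the proposal concrete, here's a toy sketch (not an OpenSSH feature; all names invented) of what such append-only log entries could look like and why comparing them exposes a MITM:

```python
import hashlib

# Toy sketch of the proposed scheme: each side appends a fingerprint of the
# session's negotiated key to an append-only log, and an auditor later diffs
# the two logs. A MITM must run two separate key exchanges, so the client's
# and server's entries for that session can't match.

def session_fingerprint(shared_secret: bytes, session_id: bytes) -> str:
    # Log a hash rather than the secret itself, so the log leaks nothing useful.
    return hashlib.sha256(shared_secret + session_id).hexdigest()

# Direct connection: both sides derive the same secret, so the entries match.
assert session_fingerprint(b"k1", b"session-1") == session_fingerprint(b"k1", b"session-1")

# MITM: the client<->MITM and MITM<->server legs have different secrets,
# so the two logs disagree on this session.
client_log_entry = session_fingerprint(b"client-mitm-secret", b"session-1")
server_log_entry = session_fingerprint(b"mitm-server-secret", b"session-1")
assert client_log_entry != server_log_entry
```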
If so, the legitimate server wouldn’t have anything in their logs that would help detect such an attack.
OpenSSH does log other telemetry though.
The client sends not only the public key, but also a signature, and that signature depends on the output from the key exchange, so it's "bound" to the shared keys negotiated between the client and the server. If the MITM server does separate key exchanges with the client (pretending to be the real server) and the server (pretending to be the real client), the signature won't match; if it forwards the key exchange between the real client and the real server, it won't be able to decrypt the packets.
That's the best thing about SSH public key authentication (and HTTPS client certificates): even when MITM can impersonate the server to the client (because the client didn't verify the host key), it can't impersonate the client to the real server.
MITM can take its public key and the client's public key and send the resulting signature to the server instead of forwarding what it received from the client.
Do pretty much the same exact thing: MITM PK + Server's PK -> Client. Now client has a signature as well. The signatures that client and server have are different but that is OK as long as MITM can see and change all communication.
It has been a while since I went through the details of the protocol, so I must be missing something. What is it?
> Client takes its own public key and the server's public key and creates this signature.
According to https://www.rfc-editor.org/rfc/rfc4252#section-7 client takes its own public key, the "session identifier", and a few other things, and creates this signature (using the private key corresponding to that public key). According to https://www.rfc-editor.org/rfc/rfc4253#section-7.2 that "session identifier" is a byproduct of the key exchange.
> MITM can take its public key and the client's public key and send the resulting signature to the server instead of forwarding what it received from the client.
That's not possible, since the MITM doesn't know the client's private key (and using a different public key will be rejected by the server).
> Do pretty much the same exact thing: MITM PK + Server's PK -> Client. Now client has a signature as well. The signatures that client and server have are different but that is OK as long as MITM can see and change all communication.
You're confusing the Diffie-Hellman Key Exchange with the Public Key Authentication Method. When you MITM the key exchange, the shared secrets the client and server have are different (one side has a secret derived from the client and MITM keys, the other side has a secret derived from the MITM and server keys), but that works as long as the MITM can see and change all communication (basically, decrypting it and encrypting it again).
But since the secrets are different, the session identifier is also different. The MITM can't forward the signature from the client since the server will fail to verify it due to the mismatch in the session identifier; the MiTM can't create a new signature with the client public key since it doesn't have the corresponding private key; and the MITM can't create a valid signature with its own public key (and the corresponding private key) since that key won't be in the authorized keys list for that user account in the server.
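A toy model of that binding (HMAC with an invented key stands in for the real public-key signature; this is not the SSH wire format) shows why the MITM can't forward the client's signature:

```python
import hashlib, hmac

# Per RFC 4252 §7, the client's signature covers the session identifier,
# which is derived from the key exchange.
CLIENT_KEY = b"client-private-key"  # stand-in for the client's signing key

def sign(session_id: bytes, username: bytes, pubkey_blob: bytes) -> bytes:
    return hmac.new(CLIENT_KEY, session_id + username + pubkey_blob, hashlib.sha256).digest()

def verify(sig: bytes, session_id: bytes, username: bytes, pubkey_blob: bytes) -> bool:
    return hmac.compare_digest(sig, sign(session_id, username, pubkey_blob))

# The two legs of a MITM'd connection end up with different session
# identifiers, because each key exchange produces a different shared secret.
sid_client_mitm = hashlib.sha256(b"client+mitm exchange").digest()
sid_mitm_server = hashlib.sha256(b"mitm+server exchange").digest()

sig = sign(sid_client_mitm, b"alice", b"alice-pubkey")
# Forwarding the client's signature fails: the server checks its own session id.
assert not verify(sig, sid_mitm_server, b"alice", b"alice-pubkey")
# On a direct connection there is a single shared session id, so it verifies.
assert verify(sig, sid_client_mitm, b"alice", b"alice-pubkey")
```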
Fingerprints are derived from the certificates/private keys. Unless I don't understand some basic crypto, or SSH works in some obtuse way, I do not think it would be possible for the MITM attacker to present the server with the true client's fingerprint unless they also had the client's private key.
If they forward the real key, so it matches the fingerprint, and you use it, they can't MITM the request because they can't read the contents.
See also: rsyslogd
I was answering this question from GP:
> Unfortunately it appears openssh doesn't even have an option to create such a logfile!! Why not??
The answer is because in Linux systems the logging logistics are handled at the system level, just like starting and running openssh itself. The answer to "why not" is because that's the logging system's job, not openssh's.
rsyslogd is one simple and direct way to distribute logs to remote machines.
syslog > /dev/lpt0 printer?
Assuming you trust that the host's control panel (or API server) hasn't been hacked (which you are assuming anyway if you trust a host fingerprint given to you that way), this should be just as secure. For a small bit of extra assurance against a very unlikely attack, generate a new key pair just for this VM's creation, so that you know you aren't connecting to some other VM that happens to have one of your known public keys and that you've been redirected to by DNS poisoning.
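Sketched with `ssh-keygen` (the file name and address are placeholders):

```shell
# A throwaway key pair used only for this VM's first contact, so an attacker
# can't lure you to some other VM already authorized for an old key of yours.
ssh-keygen -q -t ed25519 -N '' -f ./bootstrap_key -C "bootstrap-$(date +%s)"
# ssh -i ./bootstrap_key root@NEW_VM_IP   # NEW_VM_IP is a placeholder
```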
However, an easy attack in the same ballpark is to accept the connection without any password or public key auth, then accept agent forwarding and ask that agent connection to authorize a connection to a target server with the user's keys. Never forward your agent connection to an untrusted host. Though I imagine this pattern is common when setting up a new host: trust the first connection, and forward your agent so you can pull resources (like git repos) from the new host to set it up...
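One way to keep that habit safe, sketched for `~/.ssh/config` (the bastion hostname is a placeholder; ssh uses the first matching value, so the allow rule must come before `Host *`):

```
# ~/.ssh/config
Host trusted-bastion.example.com
    ForwardAgent yes
Host *
    ForwardAgent no
```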
It's a neat little trick if you're often deploying VPS in shared cloud environments.
How to deploy secrets during bootstrap to a new virtual machine running in the cloud, without leaving a trace in the infrastructure, and in a way that I can completely automate.
One answer is providing the secrets in cloud-init, but this leaves a trail on the host/provider's infrastructure; I don't know whether the configs I paste into the portal get saved off somewhere.
The other, more secure option is having the keys/secrets generated on the host itself at first boot. But then this is difficult to automate, as I would need to scrape them (even just the public parts) in a secure way. One option would be to have the public keys printed to the terminal/VNC, but this is much more trouble than it is worth to automate.
I'm not sure of a good solution. This is taking quite an adversarial security model, though, assuming the host/provider is not completely trustworthy. Of course, not owning the hardware means that the host/provider could be performing other attacks without my knowledge (copying memory, etc.).
2. Use certificates and your own CA.
3. Use the virtual serial console for first login.
4. Use cloud-init to add a custom software repo, then use that to install a custom package that does the initial work.
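A sketch of option 4 as cloud-init user-data (the repo URL, key id, and package name are all placeholders):

```yaml
#cloud-config
apt:
  sources:
    mycorp:
      source: "deb https://repo.example.com/apt stable main"
      keyid: "DEADBEEF"       # placeholder id of the repo signing key
packages:
  - mycorp-bootstrap          # placeholder package that does the initial work
```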
Yes you implicitly trust the public key on first login.... then just... immediately compare it with what's on your box?
Might as well seal your doors with duct tape to prevent ghosts from entering your home because this is equally effective.
I'm not confident you understand how crypto works.
You do realize the entire threat model here is a house of cards: it's perched atop someone else's software hosted on someone else's hardware, all of which you implicitly trust, and you discard that trust in favor of some unlikely cloak-and-dagger interception scheme.
So you log in the first time and they either match, or they don't. If they don't, you start over. The end.
Ignore the fact that most people will probably use the box to host a poorly coded vulnerable service anyway.
someone who definitely understands how crypto works, describing the most basic possible MITM
Also I don't see anywhere that the script reloads the sshd daemon, which AFAIK is necessary to get it to start using the new host keys and stop using the deleted initial host key.
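For completeness, the missing step would be something like this (a sketch, not part of the article's script; the service name varies by distro):

```shell
# Make sshd pick up the rotated host keys:
sudo systemctl restart sshd      # the service is "ssh" on Debian/Ubuntu
# or send SIGHUP, which re-execs sshd and re-reads config and host keys
# without dropping established sessions:
sudo kill -HUP "$(cat /var/run/sshd.pid)"
```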
Unfortunately, ansible does everything over.... SSH. So if I spin up a new VM or host, I have to manually trust the certificate for the first connection, which is the whole point of this article. I always have console access so I can log in and check the pub key.
There are various ways around this, including the author's suggestion of cloud-init. None are very helpful for new physical hosts, though. I'm leaning towards a feature that step-ca supports, which lets hosts authenticate themselves with an X.509 cert, which you can easily get with ACME. It's so easy that you could even do it over the console on a new physical host.
What I really wish is that there were something like the ACME TLS-ALPN challenge but for SSH servers. They can already present a self signed certificate, so it would just be a matter of connecting all the plumbing.
Specifically, if you bind authentication to the connection, then an attacker who impersonates the server (in this case because it's the first connection, but in other settings because they have a fake certificate) can't reuse the client's authentication on another connection, so the attacker can't mount a classic MITM attack. However, and this is a big however, that doesn't mean there aren't serious security problems. For example:
* If you use SSH to copy a secret such as an API key to the server, then the attacker still knows the API key.
* If you download some file (e.g., a script) from the server and then trust it, the attacker can use that to provide a malicious script.
That's much harder to pull off though, because you need to replicate the environment close enough so that the victim doesn't suspect anything. Do they put their config files in /var/lib or random docker volumes? Do they use docker compose or docker-compose, etc.
Human-checkable fingerprints for pubkeys/hashes don't really work. None of the schemes I've seen hold up against somebody willing to spend compute to get a near-enough collision to fool most people most of the time.
But we can take those random bits and transform them and feed them into a seeded image generation LLM, and then have a person remember/compare the deterministic output image.
You might even make the case that it's the perfect machine to create a memorable-to-humans image artifact from random data.
I can't offhand think of anything that an LLM image generator would do to improve the process; it'd be an interesting research task. You'd need a way to transform the 256-bit hash into LLM input in a way that would maximize the perceptual difference in generated images. The problem is that it's absolutely critical that two different implementations work the same, which means the spec would need to specify the exact set of model weights to use.
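The deterministic front half of such a scheme is easy to sketch; the hard part is pinning the model. Assuming a seeded generator, the mapping might look like this (the function name is invented):

```python
import hashlib

# Sketch: map a host key fingerprint to a fixed 64-bit generator seed.
# The model weights and sampler would also have to be pinned exactly, or
# two implementations would draw different images for the same key.
def fingerprint_to_seed(fingerprint: str) -> int:
    digest = hashlib.sha256(fingerprint.encode()).digest()
    return int.from_bytes(digest[:8], "big")

# Same fingerprint gives the same seed everywhere; any key change flips
# about half the hash bits, giving an unrelated seed (and hence image).
assert fingerprint_to_seed("SHA256:AAAA") == fingerprint_to_seed("SHA256:AAAA")
assert fingerprint_to_seed("SHA256:AAAA") != fingerprint_to_seed("SHA256:AAAB")
```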