Since Linux 6.9, LUKS suspend stopped wiping disk-encryption keys from memory

Posted by IngoBlechschmid 3 hours ago

Since Linux 6.9, LUKS suspend stopped wiping disk-encryption keys from memory (mathstodon.xyz)

242 points | 119 comments

kokada 1 hour ago|

While it is certainly an interesting bug, I kinda feel that the title is click bait? Because this `cryptsetup luksSuspend` from what I understood is not really officially supported but an extension done in Debian, so if anything this regression only affected Debian? I am not sure if you can blame the kernel for something that is not supported or even widely tested.

I still find this impressive, and it is nice that we now have a test (NixOSTests BTW are awesome, I agree with OP) to avoid this regression from coming back. But from the title it seems to be a widespread issue, not something that affects only one Distro.

IngoBlechschmid 1 hour ago|

Sorry, aimed for a technically precise title and didn't want to bait clicks.

Yes, this does not affect people on stock configurations for the plain reason that they wouldn't expect the volume key to be safe during suspend anyway.

Debian's solution was ported to several (most?) other distributions and I guess quite a few people maintained private ports.

The thread-keyring(7) manpage promises: "A thread keyring is destroyed when the thread that refers to it terminates." For their key upload (from userspace to kernelspace) mechanism, the cryptsetup project relied on this property; but kernel 6.9 introduced a regression invalidating this property.
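To make the promised lifetime concrete, here is a minimal toy model in Python. It is not kernel code, and the reference-counting mechanism and the names `ToyKeyring`, `run_thread` and `leak_reference` are assumptions made for illustration only; it merely shows one hypothetical way a "destroyed when the thread terminates" promise can silently stop holding.

```python
class ToyKeyring:
    """Toy stand-in for a keyring: a payload plus a reference count."""

    def __init__(self, payload: bytes) -> None:
        self.payload = bytearray(payload)
        self.refs = 0

    def get(self) -> "ToyKeyring":
        self.refs += 1
        return self

    def put(self) -> None:
        self.refs -= 1
        if self.refs == 0:
            # Last reference gone: wipe the payload in place.
            for i in range(len(self.payload)):
                self.payload[i] = 0


def run_thread(keyring: ToyKeyring, leak_reference: bool) -> None:
    """Simulate a helper thread that uploads a key and then terminates."""
    keyring.get()          # the thread's own reference
    if leak_reference:
        keyring.get()      # hypothetical extra reference that nobody drops
    keyring.put()          # thread terminates and drops its reference


def key_survives(leak_reference: bool) -> bool:
    """Return True if key material is still readable after the thread ended."""
    keyring = ToyKeyring(b"volume-key")
    run_thread(keyring, leak_reference)
    return any(keyring.payload)


print(key_survives(False))  # False
print(key_survives(True))   # True
```

In the second case nothing fails visibly: the thread ends as usual, and only an explicit look at the payload reveals that the key is still there.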

bitbasher 2 hours ago||

I don't see any other way? When you sleep (suspend to RAM), everything is stored in RAM and is encrypted but the master key is present in kernel memory (if I recall correctly).

However, if you hibernate (suspend to disk) the entire contents of RAM (including the master key) is written/encrypted to disk and the RAM is cleared.

When you wake the machine up you have to re-enter the passphrase to decrypt the master key to re-load disk contents back to memory.

IngoBlechschmid 2 hours ago||

Yes, if you simply suspend your laptop on most stock Linux distributions, then everything including the master key is still kept in memory. But Debian pioneered the (optional) cryptsetup-suspend addon. This issues a luksSuspend command which is supposed to wipe the key from memory, and on resume asks you to resupply your passphrase.
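As a rough sketch of the two commands involved (the real addon does considerably more around them), the following Python helper only builds and optionally runs `cryptsetup luksSuspend` and `cryptsetup luksResume`. The mapping name `cryptroot` and the `dry_run` switch are assumptions for illustration.

```python
import subprocess
from typing import List


def suspend_argv(mapping: str) -> List[str]:
    """Command that suspends the mapping and asks the kernel to wipe its key."""
    return ["cryptsetup", "luksSuspend", mapping]


def resume_argv(mapping: str) -> List[str]:
    """Command that prompts for the passphrase and reinstates the key."""
    return ["cryptsetup", "luksResume", mapping]


def run(argv: List[str], dry_run: bool = True) -> str:
    """Return the command line; execute it only when dry_run is False."""
    line = " ".join(argv)
    if not dry_run:
        subprocess.run(argv, check=True)
    return line


print(run(suspend_argv("cryptroot")))  # cryptsetup luksSuspend cryptroot
print(run(resume_argv("cryptroot")))   # cryptsetup luksResume cryptroot
```

The default is a dry run on purpose: suspending the mapping that backs the root filesystem blocks all I/O to it, so this is not something to try casually from a normal session.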

Up to kernel 6.8, this worked as described; starting with kernel 6.9, it silently didn't.

herywort 1 hour ago|||

So you would still be asked for a passphrase, even though it's already available?

IngoBlechschmid 1 hour ago||

Exactly. Cryptsetup wouldn't know about the extra copy of the volume key in kernel memory. Which is why, dramatically, it appeared secure ("surely I wouldn't be asked to resupply the passphrase if the volume key is still in memory, right?").

pedrocr 1 hour ago||

It was still more secure than the default if I understand this correctly. On resume from suspend the laptop would still be locked by the encryption key and without access to the disk even if you can somehow circumvent the lock. The only insecurity was that somewhere in the kernel memory the key still exists so if you can somehow extract that from the live system you can unlock it.

IngoBlechschmid 48 minutes ago||

Yes, you are right: LUKS encryption protects your data at rest. An attacker who steals your disk can gain only little, like the information that you have used LUKS (unless you put your LUKS headers elsewhere, separated from the disk) and perhaps disk and disk sector usage statistics.

naturalmovement 2 hours ago|||

FYI: VeraCrypt is not the de facto encryption software for Windows.

IngoBlechschmid 1 hour ago||

Oh, which one is it?

(You don't mean BitLocker, right?)

naturalmovement 1 hour ago||

It absolutely is and they have most of the enterprise market.

IngoBlechschmid 1 hour ago|||

Okay, yes, sure. It definitely is the most-used encryption software for Windows.

But I would never trust it a second, being proprietary and known for issues. You likely know that, but for the benefit of others:

38C3 - Windows BitLocker: Screwed without a Screwdriver https://media.ccc.de/v/38c3-windows-bitlocker-screwed-withou... https://www.youtube.com/watch?v=5eNtT2p12cM

noinsight 1 hour ago|||

If you’re at all serious about security and not user convenience, you deploy BitLocker with a PIN instead of TPM only. And then a whole class of vulnerabilities goes away.

bri3d 1 hour ago||||

The issues you linked with BitLocker are obvious properties of BitLocker-with-SecureBoot-only architecture. If you configure Linux that way, you get similar issues (for example, it's pretty easy to mis-configure TPM sealed disk encryption on Linux to still allow a recovery shell, which will run with the disk unsealed).

BitLocker with a password (the equivalent of the LUKS configuration in question) does not share these issues.

veeti 12 minutes ago||

Bitlocker with a password has always felt like a second class citizen to me. You have to dig into a bunch of group policies to use it. Maybe most people don't even realize it exists.

saidnooneever 1 hour ago|||

veracrypt lost their drivers license so afaik you should avoid it since it cannot update its drivers any longer. didn't see any news about them reacquiring that license

snailmailman 1 hour ago||

Assuming this is what you are referring to, it was resolved within a few days. The incident being resolved just didn't make headlines. https://sourceforge.net/p/veracrypt/discussion/general/threa...

nacs 1 hour ago|||

Reminder that by using Bitlocker, you're using a closed source encryption for which Microsoft will happily hand out your recovery key on request.

https://www.forbes.com/sites/thomasbrewster/2026/01/22/micro...

andrewpiroli 1 hour ago|||

Only if you store your key with Microsoft, which is not required or the default if you're using a local account which I assume most privacy sensitive people are.

gruez 1 hour ago||

Not to mention that unless the bitlocker activation flow changed recently, it specifically asks you how to store your backup keys, with a choice given between local options (e.g. usb drive, printing it off, etc.) and saving it to your microsoft account.

briHass 1 hour ago||||

Bitlocker can use keys that are local only, but the default for home editions of Windows was to use the online account to back it up.

'Happily' is also a stretch, as they really don't have a choice if served a valid court order.

If you want encryption that is safe from the US government, keys need to be stored in your head. Anything physical is subject to court orders.

john_strinlai 1 hour ago||||

for enterprises, where this doesn't really matter, bitlocker is great.

dijit 1 hour ago||

if by "great" you really mean "fine".

It's still brittle, awkward and puzzlingly awful UX despite being the literal standard for the platform.

Compare it to any of the actively maintained alternatives, Filevault for MacOS (which is wonderful and never sends your key to be kept somewhere else) or LUKS on Linux.. heck, even Veracrypt is actually easier to understand and more robust.

IrishTechie 1 minute ago|||

We have more issues with FileVault than we do with BitLocker, the latter being a fleet 5 times larger than the former. I find both “fine” for enterprise.

john_strinlai 1 hour ago||||

>if by "great" you really mean "fine".

no, i mean great.

managing a fleet of 100+ laptops with bitlocker is a breeze. it's so seamless that the users don't even realize it's enabled (i.e. no UX issues, at all).

on the other hand, i am not managing 100+ laptops that use veracrypt. sounds absolutely awful. i've never managed an apple fleet, so i can't speak to that, and will take your word on it.

for personal use, i do not recommend bitlocker (or windows, really), but for already-windows enterprises? absolutely

dijit 1 hour ago|||

Flicking a button to turn something on is not what I'm talking about, that's normally the easy part of any setup, and I judge people harshly who only take that aspect of something into consideration when discussing systems.

Brittle is what happens when you haven't logged on to the machine in 60 days, trust with AD is broken, TPM has a glitch and wipes the in device key and forces you into recovery... or god forbid you service the laptop and now you have to enter recovery mode.

Then you're in a nightmare; trying to give someone a super long passphrase over the phone is a not-too-uncommon occurrence.

That's assuming you have a good policy for storing the recovery keys. Too loose and they're handed out to everyone, sort of defeating the purpose; too strict and you need the IT department (or specific members), and it's still predicated on the notion that you have a policy for it... Given that Admins are a dying breed... I don't think this is workable.

If you compare with Filevault on MacOS: which tracks the credentials of the logged in user; there's no "issue" if the device loses trust because ultimately you always use the real unlock key: not something cached in a "secure storage".

bri3d 1 hour ago|||

Having dealt with FileVault in this context, it's also frustrating; it's really common to have it fail to follow the logged-in user's credentials, and if you use any kind of federated login, you will frequently get users with FileVault passwords that are either ahead of or behind their system login password.

I think both approaches are valid trade-offs and I think that the default Secure Boot BitLocker configuration, for all its architectural tradeoffs, can probably be credited for an enormous amount of data loss mitigation originating from used hard drives alone.

john_strinlai 1 hour ago|||

maybe i am missing something, but how did veracrypt solve all of the admin and policy issues you’re bringing up? (specifically for large enterprise fleets)

dijit 56 minutes ago||

If you use your key every day you tend not to forget it.

If I as an admin give you your key: it is “leaked” effectively.

john_strinlai 54 minutes ago|||

>If you use your key every day you tend not to forget it.

hoping users don’t forget their password is a very weak policy.

specifically, the policy and admin points you brought up above, how does veracrypt solve them?

dcrazy 50 minutes ago|||

Have you never gone on vacation and forgotten your daily-use password upon return?

akerl_ 1 hour ago|||

Managing an Apple fleet is similarly fine, and that includes using any of the MDM tooling that also does key escrow on enterprise Filevault devices.

dcrazy 54 minutes ago||||

FileVault absolutely has an optional iCloud Keychain escrow. That’s how the “unlock with Apple Account” feature works. Apple doesn’t have the keys for iCloud Keychain, but it is still stored in iCloud.

j16sdiz 1 hour ago||||

> Filevault for MacOS (which is wonderful and never sends your key to be kept somewhere else)

Did you read the documentation?

https://support.apple.com/guide/mac-help/protect-data-on-you...

"iCloud account: Click “Allow my iCloud account to unlock my disk” if you already use iCloud. Click “Set up my iCloud account to reset my password” if you don’t already use iCloud."

https://developer.apple.com/documentation/devicemanagement/f...

"FileVault Full Disk Encryption (FDE) recovery keys are, by default, sent to Apple if the user requests them. Only one payload of this type is allowed per system."

dijit 1 hour ago||

Can.

If you click "Allow my iCloud account to unlock my disk", your recovery key is escrowed to Apple, tied to your Apple Account.

If you don't select that option it never does.

I should have said "without your explicit permission", but I assumed we were all adults and understood that.

The main point is that it's using your account password to unlock, the recovery key is for if you forget your account password.

dcrazy 52 minutes ago||

No, you were just plain wrong. You said “never”, when in reality BitLocker and FileVault both have optional escrow.

Arainach 1 hour ago|||

Veracrypt is more difficult to set up - whether on one machine or a fleet. Bitlocker is a few buttons in the UI, configurable via Group Policy, and so much more.

What is brittle or awkward?

dijit 1 hour ago||

"PLEASE ENTER YOUR BITLOCKER RECOVERY KEY"

Where is it?

A) Uploaded to microsoft

B) Somewhere in EntraID?

C) Somewhere in our onprem AD?

D) Written down on a scrap of paper when I set up the laptop

the fact that they never ask for the passphrase is a weakness of the system. Because now you have an extremely difficult situation as soon as you're off the happy path.

It's also like 64 characters alphanumeric with no capability to copy/paste.

Compare it to Vera/Filevault where the access key is the users passphrase. In MacOS it's literally your account password, which follows along with your in-OS account credentials.

philipallstar 1 hour ago||||

Does that mean it's not the de facto standard on Windows?

naturalmovement 1 hour ago|||

So exactly like FileVault?

dist-epoch 1 hour ago|||

Both Intel/AMD CPUs produced in the last 5 years or so support full transparent (to the OS) memory encryption. So cold boot attacks are a thing of the past if you enable this feature (it's typically disabled because it reduces RAM speed by about 0.5%).
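A quick way to see what a given Linux machine reports is to look at the CPU flags. The sketch below is a minimal Python parser for `/proc/cpuinfo` text; the choice of flag names (`sme`, `sev`, `tme`) is an assumption about which flags are relevant here, and a listed flag only suggests that the feature is exposed to the OS, not that memory is actually being encrypted right now.

```python
from typing import Set

# Flag names as they may appear in /proc/cpuinfo on x86 Linux (assumed relevant):
# "sme"/"sev" for AMD memory encryption, "tme" for Intel Total Memory Encryption.
MEMORY_ENCRYPTION_FLAGS = {"sme", "sev", "tme"}


def memory_encryption_flags(cpuinfo_text: str) -> Set[str]:
    """Return the memory-encryption flags found on any 'flags' line."""
    found: Set[str] = set()
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            found |= MEMORY_ENCRYPTION_FLAGS & set(value.split())
    return found


sample = "processor\t: 0\nflags\t\t: fpu vme sse2 sme sev\n"
print(sorted(memory_encryption_flags(sample)))  # ['sev', 'sme']
```

On a real system the input would be the contents of `/proc/cpuinfo`; whether the transparent mode is switched on is usually a firmware setting that this check cannot see.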

m3047 3 minutes ago|||

Recent news is that this isn't shipping on some consumer-grade CPUs from AMD. There, made it explicit enough there's no room for conversation. Here's the link:

https://arstechnica.com/security/2026/06/users-cry-foul-afte...

tredre3 7 minutes ago|||

The impact on performance is more along the lines of 1-2% on AMD, though it likely varies by generation (I did extensive benchmarking on Renoir wrt throughput/latency/gpu). But yes, small enough to be insignificant unless you run LLMs or game on the iGPU. I imagine that it also uses marginally more power.

AMD also has a second encryption mode where the OS decides what gets transparently encrypted, it doesn't have to be everything. But that mode is poorly documented (or at least the documentation isn't accessible to peasants like me)

johnathan101 1 hour ago||

This is one of those regressions that's easy to miss because everything still "works." Security bugs often don't announce themselves.

IngoBlechschmid 1 hour ago|

Right! Which is why integration tests for these kinds of features are all the more important.

It was also fun to write, and enabled git-bisecting to isolate the specific kernel refactoring which introduced this bug: https://github.com/NixOS/nixpkgs/pull/532499
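The idea behind bisecting with an automated test can be sketched in a few lines of Python. This is a simplified stand-in, not how `git bisect` is implemented: it searches a linear list of kernel versions rather than a commit graph, and `leaks_key` is a hypothetical placeholder for "boot this kernel, run the suspend test, look for the key in memory".

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first version for which is_bad is true.

    Assumes versions[0] is good, versions[-1] is bad, and that every
    version after the first bad one is bad as well.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] good, versions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


versions = ["6.6", "6.7", "6.8", "6.9", "6.10", "6.11"]


def leaks_key(version: str) -> bool:
    # Placeholder verdict matching the thread: fine up to 6.8, broken from 6.9.
    return versions.index(version) >= versions.index("6.9")


print(first_bad(versions, leaks_key))  # 6.9
```

With a reliable pass/fail test, the number of kernels to build and boot grows only logarithmically with the size of the range.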

CodesInChaos 2 hours ago||

I don't have to re-enter my boot password after Sleep, so obviously the encryption key is still in memory.

wrs 2 hours ago|

Obviously your distro isn’t using cryptsetup-luksSuspend.

unethical_ban 1 hour ago||

Correct.

The point being made is: If one isn't re-entering their passphrase after suspend, how are they surprised that the encryption keys are somewhere in memory during suspend?

edit: I see now that the prompt was being given and the keys still resided in memory.

ksbd-pls-finish 1 hour ago|||

Because debian users with luks-suspend have to re-enter their boot password after sleep.

weaksauce 1 hour ago||||

> The point being made is: If one isn't re-entering their passphrase after suspend, how are they surprised that the encryption keys are somewhere in memory during suspend?

If that was the case for the people using the debian extra secure extension that should have wiped the memory clean then someone would have found this bug much earlier than two years. Their password was required to be re-entered even though the key was still in memory somewhere.

akerl_ 1 hour ago||||

The reason this bug is unexpected is that the user is expecting to have to enter their password (because they expect the key to be wiped on suspend), and then _they are_ asked for their password. But there was a copy of the key elsewhere in kernel memory that was never cleared.

unethical_ban 14 minutes ago||

Ah, my bad. Yes, if the user was being presented with the prompt on wake, I see the problem.

I have never had that setup so I was confused.

killerstorm 1 hour ago|||

Well, potentially a key might be stored in TPM. But I don't think that's better

joshuaissac 9 minutes ago||

I would hope that it is harder to get into the TPM than into the RAM.

tombert 40 minutes ago||

I don't think this bothers me.

The only reason that I do the disk encryption is so that I don't have to worry about people going through my laptop to steal tax documents and/or credit card stuff when I sell the laptop. I of course also wipe the laptop too, but I figure that if the data is encrypted at the drive level then there's very little risk of anyone being able to use some kind of forensics tool and recover data.

chazeon 9 minutes ago||

But if you do this, don't you have to enter two passwords each time you wake? One for LUKS, one for the system login?

polotics 7 minutes ago|

Well yes and I don't see how this can be avoided.

boutell 2 minutes ago||

https://xkcd.com/538/

(No, no, I take this stuff seriously too, but it had to be said)

bbminner 1 hour ago||

I am far from a security expert, but from the number of "we missed a single line C check across files during refactoring" critical security bugs discovered on a regular basis these days, the whole premise of a "giant secure open source C codebase" seems questionable. It is not specific to C of course, but invariants are arguably even harder to enforce and track consistently (esp under changes to code) in C. Unsure if FP with invariants encoded in types is a practically feasible scalable solution either. Model checking? [LLM] fuzzing? Fewer primitives with clear boundaries? Is that how seL4 was "checked"?

fsddfsdfssdf 42 minutes ago||

While I can see the shortcomings of C and generally don't recommend it for new projects I don't see this particular bug as a good example of something Rust's borrow checker or some other language's type system will catch. I don't think even static analyzers can catch this.

It's basically something like this:

original: DoTheThing()

new: DoTheThingSlightlyDifferentButKeepMyCredentialsAlive()

fix: DoTheThingSlightlyDifferentButDoInFactNOTKeepMyCredentialsAlive()

In my experience a substantial portion of gnarly bugs come down to a violation of a high-level system invariant and those do not strike me as something that can be automated. Even with something like Lean you can prove your program satisfies certain properties but you need to have thought about those properties in the first place. The proof doesn't discover the invariant for you.

If you had thought about the relevant security property you could have written a regression test for it, which is not hard. IMO the really hard part isn't expressing the implementation safely, but it's the realization that this was a property the implementation needed to preserve.
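For illustration, such a regression test can boil down to "the key bytes must not occur anywhere in memory after suspend". The Python sketch below assumes the test harness can obtain a raw memory dump and knows the volume key; the names `find_all` and `assert_key_wiped` and the 32-byte sample key are hypothetical, and this is not taken from any actual test suite.

```python
from typing import List


def find_all(haystack: bytes, needle: bytes) -> List[int]:
    """Return every offset at which needle occurs in haystack."""
    offsets: List[int] = []
    start = 0
    while True:
        pos = haystack.find(needle, start)
        if pos == -1:
            return offsets
        offsets.append(pos)
        start = pos + 1


def assert_key_wiped(memory_dump: bytes, volume_key: bytes) -> None:
    """Fail loudly if the key can still be found in the dump."""
    hits = find_all(memory_dump, volume_key)
    assert not hits, f"volume key still present at offsets {hits}"


key = bytes(range(1, 33))                    # hypothetical 32-byte key
clean = b"\x00" * 64                         # dump after a correct wipe
dirty = b"\x00" * 16 + key + b"\x00" * 16    # dump with a forgotten copy

assert_key_wiped(clean, key)                 # passes silently
print(find_all(dirty, key))                  # [16]
```

The check itself is trivial; as the comment says, the difficult step is deciding that this is a property worth asserting at all.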

bbminner 9 minutes ago|||

I agree re Rust vs C - this is not (only) a language issue. What would (roughly) the invariant be here?

In another thread comment below i argue that maybe the system (OS) itself is so complex that it lacks clear contract / the contract evolves too quickly over time (as other parts of the code need to change the given piece of code to extend it to their use case) and that defies clear encoding?

Or we lack easy enough means to describe specs? I tried reading jepsen spec earlier today and despite it being an "integration test" of sorts, it is far from "simple".

Can an entire OS or a system of comparable complexity be decomposed into objects simple enough that their entire intended behavior (with all edge cases) can be explained in a paragraph of human text + half a screen of dense behavioral "spec" - if i do X and do Y, Z should come out / hold _no matter what happens in-between_. Or that's what asserts + fuzzing is effectively supposed to do? Is there a clear distinction between invalid input and failed invariant in typical C code? I guess error code vs seg fault?

estebank 26 minutes ago|||

This is in effect a state machine, and when you have a type system more complex than C's you can encode state transitions in the type system (either by having state transitions explicitly return a new return type or by using sum types). You still need to architect the system to encode the invariants in types. No language will fix all logic bugs for free. But you can leverage language features to reduce their number.
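A minimal sketch of that idea, written in Python with type hints: each state is its own type, and the suspended state simply has no field that could hold a key. The class names and the `cryptroot` mapping name are hypothetical, and Python only checks this with an external type checker rather than at compile time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuspendedVolume:
    name: str  # deliberately no key field in this state

    def resume(self, key: bytes) -> "ActiveVolume":
        """Transition back to the active state; the key must be supplied again."""
        return ActiveVolume(self.name, key)


@dataclass(frozen=True)
class ActiveVolume:
    name: str
    key: bytes

    def suspend(self) -> SuspendedVolume:
        """Transition to a state whose type cannot carry the key."""
        return SuspendedVolume(self.name)


active = ActiveVolume("cryptroot", b"secret")
suspended = active.suspend()
print(hasattr(suspended, "key"))  # False
print(active.key)                 # b'secret' (the old object still holds the key)
```

The last line is the catch: the types make it impossible to read the key through the suspended value, but they say nothing about other copies that are still alive elsewhere, which is the kind of invariant that still has to be designed and tested for.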

fsddfsdfssdf 18 minutes ago||

> You still need to architect the system to encode the invariants in types.

That's the problem though, right? If it's pointed out we all agree the "do not keep credentials alive" is a property that should hold and we can leverage whatever the environment offers to help preserve it. I fully agree modern languages have amazing support for this, but in C you can still run tests. Let's just say I don't think the language's inability to express logic of this kind held all those involved back from testing for it. I personally find "we just didn't think of it" much more likely.

That said, I am not a fan of C and recommend leveraging whatever fantastic modern tooling is available to you.

WhitneyLand 48 minutes ago|||

The premise of a secure open codebase is fine.

The problem is being more auditable does not automatically make it more audited.

There have to be enough people with skill taking enough time to work on it.

pixl97 38 minutes ago||

If you think open source is bad, wait till you see enterprise code. I'm talking full auth bypass due to the stupidest crap. You can do that in any language if you have fools working on the code base.

danudey 28 minutes ago|||

Even security code. Fortinet, a vendor whose entire thing is security for your network, is consistently getting caught out with default passwords, backdoors, etc.

https://community.spiceworks.com/t/hard-coded-password-backd...

This sort of thing leads to every kind of exploit, like

https://www.linkedin.com/pulse/half-worlds-fortinet-firewall...

620gelato 27 minutes ago|||

I explicitly make sure services I lead have Integration tests in CI pipeline to validate the "negative paths" against all APIs with missing, invalid, un-authorised identities, expired, un-authenticated tokens. Of course that still doesn't cover every surface, but even that gets sideways glances from some folks who think we should just test happy paths and why we're testing for access controls in Integration tests.
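A toy version of such negative-path checks, in Python: the in-memory token table and the `status_for` function are hypothetical stand-ins for a real service and its identity provider, and the status codes follow the usual 401/403 convention.

```python
import time
from typing import Dict, Optional, Tuple

# Hypothetical token store: token -> (user, expiry timestamp, may_read).
TOKENS: Dict[str, Tuple[str, float, bool]] = {
    "good": ("alice", time.time() + 3600, True),
    "expired": ("alice", time.time() - 1, True),
    "no-access": ("bob", time.time() + 3600, False),
}


def status_for(token: Optional[str]) -> int:
    """Return the HTTP status a protected endpoint would answer with."""
    if not token:
        return 401                    # missing or empty credential
    entry = TOKENS.get(token)
    if entry is None:
        return 401                    # unknown or malformed token
    _user, expires_at, may_read = entry
    if expires_at <= time.time():
        return 401                    # expired token
    if not may_read:
        return 403                    # authenticated but not authorised
    return 200


# Negative paths first, the happy path last.
for token in (None, "", "garbage", "expired"):
    assert status_for(token) == 401
assert status_for("no-access") == 403
assert status_for("good") == 200
print("negative paths rejected")
```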

pjdesno 26 minutes ago|||

To translate to Rust, it would have been "we missed a single line Rust check"...

This is a bug involving intersecting concerns and a deficit of cross-domain knowledge. It probably would have been the same in Lisp or assembly language.

russdill 43 minutes ago|||

The lesson here is that if a feature (at a minimum) does not have an associated test case, it is not actually a feature.

fsddfsdfssdf 30 minutes ago||

Yes, I agree. I find the addition of the regression test the true long-term fix. The code is just an opaque incantation that may or may not preserve some property we find worth preserving and we have no way of knowing it keeps preserving it over time as other parts of the system change.

The test actually proves it and while it too can change it has more staying power because it's expressed at a higher level of abstraction ("random arcane weird C shit" in the case of code versus "does this property hold" in the case of a regression test).

bbminner 2 minutes ago||

I have not looked into this specific issue, but are we sure that a regression here could have been avoided via a localized test? Many issues seem to arise from A implementing a feature with tests, B seeing that A lacks some functionality and adding it (potentially with tests), C seeing this (extra) functionality in A and using it in unintended ways not covered by tests (or in an unintended environment) + multiply by many layers of this A-B-C story up and down the stack.

moritzwarhier 1 hour ago|||

> The whole premise of a "giant secure open source C codebase" seems questionable

Because code review is sometimes not much different from an idealized version of the halting problem, where you would have access to a formalized version of a specification.

In other words, there is no strict definition of what is a security issue.

bbminner 22 minutes ago||

On the other hand, both (halting and spec adherence) are checkable under compute and space constraints though? :) I'd say the biggest hurdle is the means to describe the spec in a way that is easy enough for a human to produce to make it feasible.

Not a DB person either, but things like TLA+ seem very hard to write even with LLMs. Behavioral tests with an enumerable number of random paths to take (aka model checking - eg jepsen) seem more feasible. Although you can't check internal properties of the system (string `pass` or any of its copies or parts is not held anywhere in memory at any point between lines A and B) unless we can check that two memory dumps are indistinguishable with different pass strings (assuming we abstracted away storage devices in a test environment).. Also not sure if it's "easy enough" to write such tests either.
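The "two dumps with different pass strings" idea can at least be sketched on a toy system. In the Python below, `memory_after_suspend` is a made-up 96-byte "machine" whose key is just a SHA-256 of the passphrase, and `leak_copy` is a hypothetical switch that leaves an untracked second copy behind; real dumps would differ for many unrelated reasons, which is exactly the abstraction problem mentioned above.

```python
import hashlib


def memory_after_suspend(passphrase: str, leak_copy: bool) -> bytes:
    """Return the toy machine's memory image after a suspend."""
    memory = bytearray(96)
    key = hashlib.sha256(passphrase.encode()).digest()  # 32 bytes
    memory[0:32] = key            # the copy the tooling knows about
    if leak_copy:
        memory[48:80] = key       # a second copy that nobody tracks
    memory[0:32] = bytes(32)      # suspend: wipe the known copy
    return bytes(memory)


def leaks_secret(leak_copy: bool) -> bool:
    """Two runs that differ only in the secret must leave identical memory."""
    a = memory_after_suspend("correct horse", leak_copy)
    b = memory_after_suspend("battery staple", leak_copy)
    return a != b


print(leaks_secret(False))  # False
print(leaks_secret(True))   # True
```

Unlike a plain search for the key bytes, this comparison would also flag derived or partial copies, as long as everything else in the two runs is deterministic.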

Maybe the reason is that OS domain objects / primitives are too complex and not "isolatable" enough / lack a clear contract at all? (Hence multi file refactorings that break invariants.)

lazide 1 hour ago|||

In open source, someone (many, many someones) can at least check.

Closed source…..

Twirrim 51 minutes ago||

Not sure why you're getting downvoted, this is the entire point of open source.

Does such a bug exist in Windows? OSX? Who checks? If someone finds the key in memory, can they tell what conditions might be causing it and where?

Their only recourse under those situations is to hand it off to the OS Vendor and trust that what they implement does solve the problem, and trust that it wasn't a deliberate back-door that is now being replaced by another back-door.

charcircuit 48 minutes ago||

Security researchers find security bugs in closed source operating systems all of the time.

lazide 43 minutes ago||

Yup, it’s just harder to know for sure.

pixl97 36 minutes ago||

Oh, and large companies quite often fix these horrific issues silently, especially in online services where the customer can't diff bins. We're talking auth bypasses and RCE's that you'll never know about.

deepsun 26 minutes ago||

"Million eyeballs" argument was always kinda meh.

fpoling 1 hour ago||

On my laptop with Fedora I just configured Linux to hibernate to disk after 15 minutes of suspend. Powering memory off ensures that bugs like this Debian-specific one would not matter.

Plus, what the Debian extension to the Linux tooling does is nice in theory, but in practice, if one really worries about cold-boot attacks, then all keys and important documents have to be wiped from memory, not only LUKS keys.

So hibernating is really the only proper way to protect against cold boot.

IngoBlechschmid 1 hour ago||

> So hibernating is really the only proper way to protect against cold boot.

I agree; or resurrecting FridgeLock: https://www.sec.in.tum.de/i20/publications/fridgelock-preven...

killerstorm 1 hour ago||

Hmm, where does it get a key to decrypt memory on resume?

AFAIK it's practical only if you make use of TPM. And if you do, you're basically at the mercy of TPM.

teravor 1 hour ago||

> where does it get a key to decrypt memory on resume?

you enter it...

teravor 1 hour ago|

on the subject of encryption keys and memory there is something you can do:

- if your CPU supports it, enable memory encryption.

- if your TPM module supports this look for MemoryOverwriteRequestControl & MemoryOverwriteRequestControlLock (/sys/firmware/efi/efivars/) and toggle them. make sure that your computer always reboots and never powers off. memory will always be wiped on boot.
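Before toggling anything, it may help to just look at what the firmware exposes. The Python sketch below only reads: it lists entries whose names start with `MemoryOverwriteRequestControl` in a given efivars directory and splits each file into the 4-byte attribute word that efivarfs puts first and the variable data that follows. The function name is hypothetical, writing is deliberately left out, and interpreting the data bytes is left to the relevant specification.

```python
import struct
from pathlib import Path
from typing import Dict, Tuple


def read_mor_variables(efivars_dir: str) -> Dict[str, Tuple[int, bytes]]:
    """Map each MemoryOverwriteRequestControl* entry to (attributes, data)."""
    result: Dict[str, Tuple[int, bytes]] = {}
    pattern = "MemoryOverwriteRequestControl*"
    for path in sorted(Path(efivars_dir).glob(pattern)):
        raw = path.read_bytes()
        if len(raw) < 4:
            continue  # too short to contain the attribute word
        (attributes,) = struct.unpack("<I", raw[:4])
        result[path.name] = (attributes, raw[4:])
    return result
```

Calling `read_mor_variables("/sys/firmware/efi/efivars")` on an EFI system should return one entry per variable the firmware provides, or an empty dict if there are none.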

someothherguyy 48 minutes ago|

https://trustedcomputinggroup.org/wp-content/uploads/TCG-PC-...
