Since Linux 6.9, LUKS suspend stopped wiping disk-encryption keys from memory

2 hours ago (mathstodon.xyz)

I am far from a security expert, but from the number of "we missed a single line C check across files during refactoring" critical security bugs discovered on a regular basis these days, the whole premise of a "giant secure open source C codebase" seems questionable. It is not specific to C of course, but invariants are arguably even harder to enforce and track consistently (esp under changes to code) in C. Unsure if FP with invariants encoded in types is a practically feasible scalable solution either. Model checking? [LLM] fuzzing? Fewer primitives with clear boundaries? Is that how seLinux was "checked"?

  • The whole premise of a "giant secure open source C codebase" seems questionable

    Because code review is sometimes not much different from an idealized version of the halting problem, where you would have access to a formalized version of a specification.

    In other words, there is no strict definition of what is a security issue.

  • In open source, many, many someones can at least check.

    Closed source...

    • Not sure why you're getting downvoted, this is the entire point of open source.

      Does such a bug exist in Windows? OSX? Who checks? If someone finds the key in memory, can they tell what conditions might be causing it and where?

      Their only recourse under those situations is to hand it off to the OS Vendor and trust that what they implement does solve the problem, and trust that it wasn't a deliberate back-door that is now being replaced by another back-door.

While it is certainly an interesting bug, I kinda feel that the title is click bait? Because this `cryptsetup luksSuspend` from what I understood is not really officially supported but an extension done in Debian, so if anything this regression only affected Debian? I am not sure if you can blame the kernel for something that is not supported or even widely tested.

I still find this impressive, and it is nice that we now have a test (NixOSTests BTW are awesome, I agree with OP) to avoid this regression from coming back. But from the title it seems to be a widespread issue, not something that affects only one Distro.

  • Sorry, aimed for a technically precise title and didn't want to bait clicks.

    Yes, this does not affect people on stock configurations for the plain reason that they wouldn't expect the volume key to be safe during suspend anyway.

    Debian's solution was ported to several (most?) other distributions and I guess quite a few people maintained private ports.

    The thread-keyring(7) manpage promises: "A thread keyring is destroyed when the thread that refers to it terminates." For their key upload (from userspace to kernelspace) mechanism, the cryptsetup project relied on this property; but kernel 6.9 introduced a regression invalidating this property.
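A minimal sketch for checking which side of the 6.9 boundary a running kernel is on. The helper names are hypothetical, and the check only compares version numbers: it cannot tell whether a particular kernel build carries a fix, so treat a positive result as "worth testing", not as proof.

```python
import platform
import re


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '6.9.3-arch1-1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def at_or_after_6_9(release: str) -> bool:
    """True if the release is 6.9 or newer, where the thread says the regression began."""
    return parse_kernel_release(release) >= (6, 9)


if __name__ == "__main__":
    release = platform.release()
    try:
        if at_or_after_6_9(release):
            print(f"{release}: 6.9 or newer, verify that luksSuspend really wipes the key")
        else:
            print(f"{release}: older than 6.9, before the regression described here")
    except ValueError as err:
        print(f"could not interpret kernel release: {err}")
```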

I don't see any other way? When you sleep (suspend to RAM), everything is stored in RAM and is encrypted but the master key is present in kernel memory (if I recall correctly).

However, if you hibernate (suspend to disk) the entire contents of RAM (including the master key) is written/encrypted to disk and the RAM is cleared.

When you wake the machine up you have to re-enter the passphrase to decrypt the master key to re-load disk contents back to memory.

  • Yes, if you simply suspend your laptop on most stock Linux distributions, then everything including the master key is still kept in memory. But Debian pioneered the (optional) cryptsetup-suspend addon. This issues a luksSuspend command which is supposed to wipe the key from memory, and on resume asks you to resupply your passphrase.

    Up to kernel 6.8, this worked as described; starting with kernel 6.9, it silently didn't.

  • Both Intel and AMD CPUs produced in the last 5 years or so support full transparent (to the OS) memory encryption. So cold boot attacks are a thing of the past if you enable this feature (it's typically disabled because it reduces RAM speed by about 0.5%).
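A rough way to see whether the kernel reports a memory-encryption capability is to look at the CPU flags in /proc/cpuinfo. This sketch assumes the flag names `sme` (AMD) and `tme` (Intel); the function name is hypothetical. A reported flag only indicates CPU capability as seen by the kernel. Whether encryption is actually switched on depends on firmware settings, which this does not check.

```python
from pathlib import Path

# Assumed flag names as they appear in /proc/cpuinfo on x86 Linux.
MEMORY_ENCRYPTION_FLAGS = {
    "sme": "AMD Secure Memory Encryption",
    "tme": "Intel Total Memory Encryption",
}


def memory_encryption_flags(cpuinfo_text: str) -> set[str]:
    """Return the memory-encryption flags found in any 'flags' line."""
    found: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Compare whole tokens so 'sme' does not match a longer flag name.
            found |= MEMORY_ENCRYPTION_FLAGS.keys() & set(value.split())
    return found


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        flags = memory_encryption_flags(cpuinfo.read_text())
        for flag in sorted(flags):
            print(f"{flag}: {MEMORY_ENCRYPTION_FLAGS[flag]}")
        if not flags:
            print("no memory-encryption flag reported")
    else:
        print("/proc/cpuinfo not available on this system")
```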

I don't have to re-enter my boot password after Sleep, so obviously the encryption key is still in memory.

  • Obviously your distro isn’t using cryptsetup-luksSuspend.

    • Correct.

      The point being made is: If one isn't re-entering their passphrase after suspend, how are they surprised that the encryption keys are somewhere in memory during suspend?

On my laptop with Fedora I just configured Linux to hibernate to disk after 15 minutes of suspend. Powering memory off ensures that bugs like this Debian-specific one would not matter.

Plus, what the Debian extension to the Linux tooling does is nice in theory, but in practice, if one really worries about cold-boot attacks, then all keys and important documents have to be wiped from memory, not only the LUKS keys.

So hibernating is really the only proper way to protect against cold boot.
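The comment above does not say how the 15-minute hibernate was configured. One plausible way on a systemd-based distribution is the suspend-then-hibernate sleep mode together with `HibernateDelaySec` in a sleep.conf drop-in. The sketch below only renders such a drop-in as text; the function name and the suggested file name are hypothetical, and a working setup also needs a swap area and resume configuration, which are not covered here.

```python
def render_sleep_dropin(delay_minutes: int) -> str:
    """Render a [Sleep] drop-in that sets the delay before hibernation."""
    if delay_minutes <= 0:
        raise ValueError("delay_minutes must be positive")
    return f"[Sleep]\nHibernateDelaySec={delay_minutes}min\n"


if __name__ == "__main__":
    # Assumed destination: a file such as /etc/systemd/sleep.conf.d/hibernate-delay.conf
    print(render_sleep_dropin(15), end="")
```

With this in place, the machine would still have to be put to sleep via `systemctl suspend-then-hibernate` (or the equivalent logind setting) for the delay to apply.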

on the subject of encryption keys and memory there is something you can do:

- if your CPU supports it, enable memory encryption.

- if your TPM module supports this, look for MemoryOverwriteRequestControl & MemoryOverwriteRequestControlLock (/sys/firmware/efi/efivars/) and toggle them. Make sure that your computer always reboots and never powers off, so that memory is always wiped on boot.
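To inspect the two variables named above before toggling anything, here is a read-only sketch. It relies on the efivarfs layout, where each file starts with a 4-byte little-endian attribute word followed by the variable's data. The helper names are hypothetical, the variables are matched by name prefix only, and the raw bytes are printed without interpretation.

```python
import struct
from pathlib import Path

EFIVARS_DIR = Path("/sys/firmware/efi/efivars")


def parse_efivar(raw: bytes) -> tuple[int, bytes]:
    """Split an efivarfs file into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("efivarfs entry shorter than the 4-byte attribute header")
    (attributes,) = struct.unpack_from("<I", raw)
    return attributes, raw[4:]


def find_mor_variables(directory: Path = EFIVARS_DIR) -> list[Path]:
    """List MemoryOverwriteRequestControl* entries, or nothing if efivarfs is absent."""
    if not directory.is_dir():
        return []
    return sorted(directory.glob("MemoryOverwriteRequestControl*"))


if __name__ == "__main__":
    entries = find_mor_variables()
    if not entries:
        print("no MemoryOverwriteRequestControl variables found")
    for entry in entries:
        try:
            attributes, data = parse_efivar(entry.read_bytes())
        except (OSError, ValueError) as err:
            print(f"{entry.name}: unreadable ({err})")
            continue
        print(f"{entry.name}: attributes=0x{attributes:08x} data={data.hex()}")
```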

> Except that, for more than two years, the encryption key remained resident in memory across suspend, leaving it there for the taking by anyone who seized the still-powered laptop.

I don't get it. Obviously, the laptop is locked when it resumes, how is that key "for the taking by anyone"? I'm not saying it is impossible to read out RAM from a locked laptop, but surely not by "anyone".

  • Anyone with physical access. I think it is understandable from the phrase.

    There is a common misconception about how lock screens in general work: they usually just prevent using the current hardware and software as it is to access the current OS. But the disk encryption is the main thing that prevents modification and other kinds of access to the actual data. And if the disk encryption key is lying in memory, then the disk encryption is effectively bypassed if someone can access the machine physically, assuming that there are no sufficient tamper protections in place for that machine.

    • Anyone with physical access, significant tools, and experience. The FBI has people who can pull data out of memory after freezing the RAM, but the average laptop thief doesn’t, so how serious this is depends significantly on your threat model. If you’re not a major criminal, bitcoin whale, or intelligence target, this is almost certainly academic.

    • > Anyone with physical access. I think it is understandable from the phrase.

      Sorry, I'm probably dense, I still don't get it. You steal a laptop, you open it, the screen is locked with a password/fingerprint whatever. How do you read out the RAM from that laptop?

  • There are attacks that allow dumping RAM if the device is powered on and you have physical access, though. Depending on config it may be very easy (just plug in a dumper over Thunderbolt on USB-C and do direct memory access) or hard (freeze and swap physical RAM to an unlocked machine), but the idea was defense-in-depth here; a well-configured device should both be hard to dump RAM on and not give up encryption keys if an attacker succeeds.

[flagged]

  • And I don't use GUIs, but it doesn't mean I have to be a jerk to people who are happy when their GUI gets better :-).

  • > That's a you problem. I shutdown my machine when I'm not using it.

    "We designed the antennas correctly, you're holding the phone the wrong way."

    • It's not a good analogy. Something is still on in suspend. It's good that you can control the Linux kernel, but what about all the other chips which may be an attack vector?

    • Except shutting down and hibernate are two actions the user can literally select from the same menu.

  • I shut down mine too, but only because suspend is still a crapshoot on Linux

    • There will always be more suspend/resume bugs to work through. It varies a lot per device. I feel it's necessary to paint the picture for people who are curious what it means for it to be a crapshoot, so indulge me while I share my experiences.

      For work I have a ThinkPad T16 Gen 4 with the newer AMD gfx1151 iGPU. Works great. I have yet to witness any issues with suspend/resume. I suspect this is the case because it is running Ubuntu with Lenovo's own support package. Theoretically, from firmware to kernel, this is all tested and validated by Lenovo, like what certainly happens with every Windows laptop and all of the components that go into them.

      I also have a gen 1 Framework 16. I have seen it crash on suspend, but it is pretty rare, so I've just shrugged it off for now. It would be hard to debug, I don't see it every month despite using the thing every day.

      All of my desktops currently have perfectly reliable suspend/resume; you can slam it all day and all night. The last time I ran into issues was a use-after-free issue in AMDGPU. Pretty alarming, although to be clear it never hit any LTS or vendor kernels that I am aware of. I hit it because I prefer to run the latest kernel on my personal machines.

      I have certainly owned laptops where suspend basically didn't work, or it would not stay suspended. I think this mainly went away when I started specifically picking laptops for Linux support.

      For Intel iGPUs and dGPUs, the track record has been flawless for me. I have a few of the new Battlemage cards that default to the xe kernel driver and those have been working very well as expected. So that's nice.

      I don't think this situation will be fixed until more hardware vendors are taking part in validating their stuff on desktop Linux and keeping track of the kernels. The current Linux model seems to be just dealing with whatever the vendors crap out for Windows, often full of weird ACPI behaviors and buggy firmware. That's not to say that the fault doesn't often lie with code in the Linux kernel, but they do not seem to wish to be bug-compatible with Windows, and I think that is perfectly reasonable. So for problems that come from essentially broken firmware, it is simply going to need vendors to actually fix their shit.

      (And that includes AMD. The drivers are good in some regards, but it's hard to ignore AMD's stability issues even still. At this rate, more of the long outstanding AMD driver issues will get resolved by Claude than AMD engineers... Like with Panel Self Refresh on 7040 iGPU, apparently.)

  • I am too lazy for that, and I hate that after boot you need to launch everything again.

    • Suspend to (encrypted) swap might be a good middle ground between you and grandparent. Suspend to memory will (at best) protect your LUKS volume key, but other sensitive data remains.

      A couple of years ago, three security researchers from the TU Munich implemented a prototype for also encrypting (most) parts of the memory just before suspend, to address this limitation; but as far as I know, it was not upstreamed or developed further: https://www.sec.in.tum.de/i20/publications/fridgelock-preven...

Definitely not a symptom of Linux being a hodgepodge of code thrown together from a thousand different sources and no one person could tell you how it all fits.

  • Bugs happen in all code. The difference is, anyone can fix stuff in open source. Closed source bugs are out of control and must be worked around. Usually by switching to OSS

  • I wonder if you think other OSes are any different?

    TempleOS is the only thing that comes to mind that doesn't fit your description and it's not practically useful.

    Any sufficiently large codebase is a mix of ideas and concepts implemented by different people with different priorities over a large timespan and if you can fit the entire thing in your head it's not very interesting or complex.

    • The *BSDs, Mac, and Windows all keep critical code in the same tree as the OS.

      Something like disk encryption would be immediately visible.

      So you don't have this mess of 80 different distros with 60 different versions of systemd, 20 that don't use it, a million kernel versions and it's all thrown together in a Costco-sized trash bag and we call the output "Linux".

  • Of course it's (indirectly) a symptom of that.

    What's the alternative? Proprietary closed-source operating systems owned by corps who can be compelled to insert covert backdoors?

    If BSD was as popular as Linux it would have the exact same problems.