Comment by OtherShrezzing

2 years ago

I believe that is the generalisable version of the attack. You're not looking to learn the sound of arbitrary keyboards with this attack, rather you're looking to learn the sound of specific targets.

For example, a Twitch streamer enters responses into their stream-chat with a live mic. Later, the streamer enters their Twitch password. Someone employing this technique could reasonably be able to learn the audio from the first scenario, and apply the findings in the second scenario.

Finally, a real security weakness to cite when making fun of people for their mechanical keyboard. Time to start recording the audio of Zoom calls with some particularly loud typers...

  • I used to work in an office space with an independent contractor whose schtick was that he was a genius. The affectations around his genius-ness included casually bringing up Mensa meetings, dropping magazines like Foreign Affairs and academic journals around the office, and his fucking keyboard.

    The keyboard had custom switches that were very loud. And he typed fast - it was like living on a gun range. Everyone in the office probably would have chipped in for a hitman, but alas, the CTO, whose office had a solid door, was “inspired” that the mechanical feedback helped fuel inspiration in boy wonder.

    Had we thought of the security risks of the keyboard, I would have brought good scotch to the infosec dude while expressing my concerns.

    • Somewhat tangential: clicky switches, like Cherry Blues, tend to click twice for each stroke. I think this leads to people assuming there are twice as many strokes going on. Tactile switches tend to only click once (when they bottom out). So, fancy keyboards can make people sound faster than they are.

    • > it was like living on a gun range

      Thanks for this metaphor. I know of at least one guy to whom it could be applied as well.

  • Mechanical keyboard user here. Most of us use mechanical keyboards because they're a lot more fun to type on. That's it. Because if you're not having fun, what's the point?

  • Not according to the article. Microphones are sensitive enough to mount the attack on quieter keyboards.

    • Microphones are surprisingly sensitive. I can listen to music in my closed-back headset at a regular volume. My desk mic can pick this up. Without boosting the audio it's barely audible that there's music, but after adding some gain you get almost the full song profile (and background noise).

      I can even pick out some of my breathing from the recording.

      If I turn on noise suppression and noise gate it's fine.

  • I'll just have to add significantly more background clickity clacks as obfuscation.

    • My thought was to run psyops all the time.

      "Just need to type in my password," he says a little too loudly to nobody. Then just type in the honeypot password and log in with the real one that you entered with a virtual keyboard a few minutes ago.

      Meanwhile you've got a prerecorded keyboard going concurrently that decodes to "I know what you're trying to do. Clever but not clever enough."

      And I guess you might as well have a special keyboard that you only use for typing in passwords while you're at it.

  • It’s so fascinating to watch this play out live. Once again, an ambitious kid can implement software hacks that are very funny when used for a joke, but also have massive real-world implications.

I guess more reason to just use a password manager to autofill your password?

  • Only if it doesn't only rely on a master password

    • A nice thing about master passwords though is that since you don't have to type them in as often, they can be very long. 95% accuracy probably isn't good enough to reliably reproduce a sentence-length master password, at least if it's only captured once.

    • Doesn't every password manager require more than just a password?

      Offline you need the database which isn't public.

      Online you usually need something else on new machines to get at the true master password.

    • [insert yubikey plug]

      I don't use one but I know people who swear by them.

      Also, this is an extremely obvious result. Typing is obviously a form of "penmanship": it was well known in the 1800s that telegraph operators could identify each other by how they tapped out Morse code.

      People have been able to do this based upon key stroke latency and even identify people based on habitual mouse patterns for decades.

      Audio recordings work as yet another reliable proxy? Shocked!!

      I am amazed that people can do such obvious things and get published, have articles written on them... I need to get in on that, sounds easy

      I can make a web demo. You turn on the microphone and type a couple of things into a box in the web browser.

      Then you go to a different window and continue typing, and the model predicts what you are typing. As long as it's proper grammar you can get to effectively 100% accuracy. It'll appear to be spooky magic.

      I just might take the time.

  • Or just use 2fa

    • If you have 2FA and one part of it is easily figured out, then you have one factor authentication.

      If you cared enough about the authentication in the first place to bother with 2FA, then I guess it seems like the reduction there is still something to be worried about, right?

      Lots of “two factor authentication” schemes seem to involve just getting a text or something, so, not very secure at all. Of course, this is bad 2FA, but it is popular.

    • Now that I know this generation of acoustic attacks exists, I would like to be able to set a second "master password", different from the main one, that doesn't give direct access to my passwords but only lets me use my fingerprint to get them. I wonder if that's already possible.
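
To put a rough number on the "95% accuracy" point raised in this subthread: if each keystroke is recovered correctly with probability 0.95, and errors are assumed to be independent (an assumption made here for illustration, not something the thread or the article establishes), the chance of getting a whole password right from a single capture falls off quickly with length. A minimal sketch, where `p_all_correct` is a hypothetical helper name:

```python
def p_all_correct(per_key_accuracy: float, length: int) -> float:
    """Probability that every keystroke of a `length`-character password
    is classified correctly, ASSUMING independent per-key errors."""
    return per_key_accuracy ** length


# Compare a short password with a sentence-length master password.
for n in (8, 16, 40):
    print(f"{n:>2} chars: {p_all_correct(0.95, n):.3f}")
```

Under that assumption an 8-character password comes out intact about two times in three, while a 40-character passphrase survives only about 13% of the time. An attacker who hears the password typed more than once, or who can correct errors against a dictionary, would do better than this simple estimate suggests.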

I think maybe you wouldn't even need to see the keystrokes. Given enough examples of just audio, I wonder if you could work out the keys using the statistical letter patterns in language.
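
One way to picture the idea above is as a substitution cipher: if keystroke sounds could first be grouped into anonymous clusters (a step assumed here and not shown), the most frequent cluster is probably the space bar, the next probably "e", and so on. The sketch below is only that crude first guess. The function name `first_guess`, the integer cluster IDs, and the approximate frequency ordering are all illustrative assumptions, and a real attempt would need language-model refinement on top.

```python
from collections import Counter

# Approximate ordering of English characters by frequency, most common first.
# Treating space as the most common key is an assumption for typed prose.
ENGLISH_ORDER = " etaoinshrdlcumwfgypbvkjxqz"


def first_guess(cluster_ids):
    """Map each anonymous sound cluster to a character by frequency rank.

    `cluster_ids` is a sequence of hashable labels, one per keystroke,
    produced by some earlier (hypothetical) audio clustering step.
    """
    ranked = [label for label, _ in Counter(cluster_ids).most_common()]
    mapping = {
        label: ENGLISH_ORDER[rank]
        for rank, label in enumerate(ranked[: len(ENGLISH_ORDER)])
    }
    # Clusters beyond the known alphabet are left as "?".
    return "".join(mapping.get(label, "?") for label in cluster_ids)


# Toy input: cluster 0 is most frequent, then 1, then 2.
print(repr(first_guess([0, 1, 0, 2, 0, 1])))
```

On short inputs the guess will be mostly wrong; frequency statistics only become useful with a lot of recorded typing, which is the "given enough examples" condition in the comment.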

And there are therefore millions of hours of video that could be attack surface area already in the wild

For a few years I've used RTX Voice to remove keyboard typing and other background noise.