Deep Learning Acoustic Attacks: Can AI Hear Your Keystrokes?
Can machine learning algorithms detect which keys you're pressing on a laptop based on their sound?
What researchers have named acoustic side-channel attacks caught my eye recently, and I needed to read more about what this could mean going forward.
I remember thinking we would live in a future where our passwords would be cadence- or sound-based instead of a typed sequence of letters, numbers, and special characters (think playing a sequence in a specific cadence in Guitar Hero). This definitely bears a resemblance.
Let’s get into the study.
Maryam Mehrnezhad, researcher and professor at Royal Holloway, University of London, Ehsan Toreini of the University of Surrey, and Joshua Harrison of Durham University authored this study. The study was conducted using a MacBook Pro as the test device, pressing each of its 36 keys (0-9, a-z).
The sounds of the keys were recorded both with a smartphone placed near the keyboard and over a Zoom call.
At the conclusion of the study, the models detected the keys being pressed on the laptop keyboard with more than 90% accuracy (crazy), based on sound recordings alone.
When testing with a smartphone’s microphone, the researchers classified laptop keystrokes with an accuracy of 95%. When using Zoom for recording, the accuracy was 93%.
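The paper's actual pipeline trains a deep classifier on spectrogram images of isolated keystrokes. As a much-simplified sketch of the core idea, here is a toy version: simulated keystroke audio (every signal value here is made up), a log-magnitude spectrum as the feature, and nearest-centroid matching standing in for the real CNN.

```python
import numpy as np

SR = 16_000   # sample rate in Hz; all values here are illustrative assumptions
DUR = 0.2     # length of each keystroke snippet, in seconds

def fake_keystroke(key_id: int, rng) -> np.ndarray:
    """Simulate a keystroke: each key gets its own decaying resonance plus noise."""
    t = np.arange(int(SR * DUR)) / SR
    freq = 400 + 50 * key_id  # hypothetical per-key resonant frequency
    return np.sin(2 * np.pi * freq * t) * np.exp(-20 * t) + 0.05 * rng.standard_normal(t.size)

def features(snippet: np.ndarray) -> np.ndarray:
    """Log-magnitude spectrum: the kind of feature a real attack feeds to a CNN."""
    return np.log1p(np.abs(np.fft.rfft(snippet)))

rng = np.random.default_rng(0)
keys = range(10)  # stand-in for the 36 keys used in the study

# "Training": average the spectra of a few labeled recordings per key.
centroids = {k: np.mean([features(fake_keystroke(k, rng)) for _ in range(5)], axis=0)
             for k in keys}

# "Attack": classify fresh recordings by nearest centroid.
hits = 0
for k in keys:
    f = features(fake_keystroke(k, rng))
    guess = min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))
    hits += guess == k
print(f"accuracy: {hits}/{len(keys)}")
```

On clean, simulated audio this toy classifier is near-perfect; the impressive part of the study is getting comparable numbers from real recordings of real typing.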
An interesting tidbit: as a defense mechanism against these kinds of attacks, they recommend leveraging the shift key, since the shift key is quieter than other keys on standard keyboards.
“It’s very hard to work out when someone lets go of a shift key,” said Harrison.
Another interesting piece is the 2% difference in accuracy between the two methods.
According to the paper, keystroke isolation was more challenging over Zoom due to its noise suppression. Think about all the times you've used noise-suppression features to dim out your kids in the other room, your dog, and so on.
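Before any classification happens, the attack has to find where each keystroke sits in the recording; this is the isolation step that noise suppression degrades. A crude, hedged sketch of that step (simulated audio, made-up thresholds and frame sizes):

```python
import numpy as np

SR = 16_000  # sample rate in Hz; illustrative assumption

# Simulate a 2-second recording: quiet background noise plus three keystroke "thumps".
rng = np.random.default_rng(1)
audio = 0.01 * rng.standard_normal(SR * 2)
for onset in (0.3, 0.9, 1.5):                  # keystroke onsets, in seconds
    i = int(onset * SR)
    audio[i:i + 800] += np.hanning(800) * 0.5  # short energy burst per keystroke

def count_keystrokes(audio, frame=400, threshold=0.05):
    """Flag frames whose RMS energy exceeds a threshold: crude keystroke detection."""
    frames = audio[: len(audio) // frame * frame].reshape(-1, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    active = rms > threshold
    # Count rising edges = number of detected keystrokes.
    return int(np.sum(active[1:] & ~active[:-1]) + active[0])

print(count_keystrokes(audio))
```

Noise suppression flattens exactly the energy spikes this kind of detector relies on, which is one plausible reading of why the Zoom accuracy came in a couple of points lower.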
One thing to note is that this kind of research is not new; similar work dates back to 2004. What is novel here is the specifics (no use of language models) and the accuracy.
This really makes you think from a threat model perspective.
When typing a password, say at an office or any public place, people will generally guard their keyboard or use password masking with *, but will do little to obfuscate the keyboard's sound.
Mitigations
Since the attack vector is your typed password, you can remove any chance of falling victim to this attack by removing the password altogether. You can do this at login by leveraging Touch ID/Windows Hello.
We have yet another reason to adopt the strongest form of two factor authentication.
FIDO, for example, is becoming the standard at the enterprise level, with many companies having migrated over to this model of two-factor authentication.
Here are some examples of companies doing so and their learnings. If billion dollar companies are doing this to protect their employees, don’t you think you should as well?
Discord
Figma
Google
Aside from using security keys, use strong passwords. Until passwords are fully phased out, it is important not to use simple, guessable ones.
In summary, put the following in place to protect yourself:
Use Touch ID/Windows Hello to log in to your laptop
Use security keys as your 2nd factor when possible
Use strong passwords/passphrases
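For the last point, a quick sketch of generating a strong passphrase with Python's standard library. The word list here is a tiny placeholder; a real generator would draw from a large dictionary such as the EFF diceware list (~7,776 words).

```python
import secrets

# Placeholder word list for illustration only; use a large dictionary in practice.
WORDS = ["anchor", "breeze", "cobalt", "dune", "ember", "fjord", "glade", "harbor"]

def passphrase(n_words: int = 4) -> str:
    """Pick words with a CSPRNG; never use random.choice for anything secret."""
    return "-".join(secrets.choice(WORDS) for _ in range(n_words))

print(passphrase())
```

The `secrets` module exists precisely for this: unlike `random`, it draws from the operating system's cryptographically secure randomness source.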
Here’s a link to the study in its entirety:
https://arxiv.org/abs/2308.01074
That being said, not all is bad when it comes to AI and cybersecurity, and these don’t always have to be opposing forces.
Here’s an example of a use case that is in production.
Think of it as threat analysis and response being supercharged. Although performing these kinds of automated actions is not new, doing so in this manner is.
As we tread into this interconnected future, we need to stay on top of technological advances and leverage the technology to safeguard our data.