This is a revisit to the previous posts about YARA and its capabilities.
Tools Deep Dive: YARA Part II
This is a follow-up to the previous post about YARA, the versatile pattern-matching tool.
If you haven’t read Parts I through III, a good mental model for what YARA can do is the following:
What regex is for text, YARA is for files
What Sigma is for logs, YARA is for files
It really is a versatile tool that can do much more than scan for malware.
You can use YARA for any of the following:
detecting a specific sample of malware
detecting a specific class of malware
scanning email attachments
testing for certain strings in memory
utilities that need to verify file formats
Even when it’s not about malware, YARA can be useful for most things file related.
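As a small illustration of the file-format use case, here is a minimal sketch of a rule that only checks whether a file starts with the PNG signature. The rule name is hypothetical and PNG is just an assumed example format; nothing in it is malware-specific.

```yara
rule Sketch_Is_PNG
{
    meta:
        description = "Illustrative only: file begins with the 8-byte PNG signature"

    condition:
        // PNG signature bytes: 89 50 4E 47 0D 0A 1A 0A
        // uint32be reads 4 bytes at the given offset as a big-endian integer
        uint32be(0) == 0x89504E47 and
        uint32be(4) == 0x0D0A1A0A
}
```

A utility could run a rule like this against an upload and reject anything that does not match, without writing its own parser for the signature check.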
When it comes to doing any meaningful amount of testing or running YARA at scale, you’re only as good as your data.
As mentioned previously, here are some ideas for building a malware corpus:
VirusTotal
AnyRun
Vx-Underground
This last option has some large datasets if you go to the Yearly Archives.
Building “Goodware” AKA Clean samples
Microsoft or Ubuntu ISOs
Empty files
These are just some ideas to form your datasets.
Here’s an example rule written in YARA-L, the log-focused adaptation of YARA used in Google Security Operations:
rule UserCreationThenDeletion {
  meta:

  events:
    $create.target.user.userid = $user
    $create.metadata.event_type = "USER_CREATION"
    $delete.target.user.userid = $user
    $delete.metadata.event_type = "USER_DELETION"
    $create.metadata.event_timestamp.seconds <=
      $delete.metadata.event_timestamp.seconds

  match:
    $user over 4h

  condition:
    $create and $delete
}
This rule looks for users that have been created and then deleted within a 4-hour window.
Deep Dive on a Rule
Let’s take a closer look at a sample rule.
This rule matches malware that masquerades as a legitimate Windows system binary, which maps to the MITRE ATT&CK technique Masquerading (T1036).
In this case, the rule is looking to match a specific sample attributed to the threat actor Ferocious Kitten.
The rule does the following:
defines strings for bitsadmin commands
requires a signature count of 0 (the file is unsigned)
matches on any of the defined bitsadmin strings
In the strings section, the rule defines two bitsadmin commands to search for in the contents of the file; either one is enough to match.
The condition section then requires the file to be unsigned.
Now, there could be signed malware of course, but this is a simple example rule to demonstrate the concept.
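The rule described above could look roughly like the sketch below. This is not the original rule: the rule name and the two bitsadmin command strings are hypothetical placeholders chosen for illustration. Only the overall shape (two bitsadmin strings, a signature count of 0, and an any-of match) comes from the description.

```yara
import "pe"

rule Sketch_Unsigned_Bitsadmin_Masquerade
{
    meta:
        description = "Illustrative sketch: unsigned PE that embeds bitsadmin commands"

    strings:
        // Hypothetical command strings; substitute the ones from the real sample
        $bits1 = "bitsadmin /transfer" ascii wide nocase
        $bits2 = "bitsadmin /addfile" ascii wide nocase

    condition:
        // No Authenticode signatures present (the file is unsigned)
        pe.number_of_signatures == 0 and
        // Either bitsadmin string is enough
        any of ($bits*)
}
```

Because the strings are tied to one sample's commands, a rule of this shape is narrow by design: it is meant to find that sample and close variants, not masquerading in general.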
Let’s take a look at a different rule, this time one that detects behavior rather than matching a specific sample.
This one has a bit more content to it, but in a nutshell it does the following:
defines strings found in legitimately Microsoft-signed files
uses a regex pattern to match on a given file path
performs a PE header check
prints the file name to stdout
excludes files that contain the previously defined Microsoft-signed strings
Is this rule perfect? No, but it will catch many instances of masquerading malware rather than only the single sample targeted by the previous rule.
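A behavior-oriented rule along these lines might be sketched as follows. Everything specific here is an assumption made for illustration: the rule name, the two Microsoft marker strings, and the list of system binary names. Stock YARA does not expose the scanned file's path to a rule unless one is passed in as an external variable, so this sketch stands in for the path regex with a regex over the `OriginalFilename` version field. It also assumes a YARA version that ships the `console` module (4.2 or later).

```yara
import "pe"
import "console"

rule Sketch_Masquerading_System_Binary
{
    meta:
        description = "Illustrative sketch: PE named like a system binary but lacking Microsoft signing markers"

    strings:
        // Hypothetical markers assumed to appear in legitimately Microsoft-signed files
        $ms1 = "Microsoft Windows Production PCA" ascii
        $ms2 = "Microsoft Code Signing PCA" ascii

    condition:
        // PE header check (MZ magic)
        uint16(0) == 0x5A4D and
        // Stand-in for the path regex: the file claims a system binary name
        pe.version_info["OriginalFilename"] matches /^(svchost|lsass|services)\.exe/i and
        // Exclude files carrying the Microsoft signing markers
        not any of ($ms*) and
        // Print the claimed name to stdout; console.log always evaluates to true
        console.log("masquerade candidate: ", pe.version_info["OriginalFilename"])
}
```

Placing `console.log` last means it only prints for files that already passed the other checks, which keeps the output readable on a large corpus.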
Best Practices
Some good practices to keep in mind.
A good way to limit the scope of your rule to a specific file type is to check the file header.
Take this condition from the above example rule:
condition:
uint16(0) == 0x5a4d
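This checks for the MZ magic at the start of Windows PE files. uint16(0) reads the two bytes at offset 0 as a little-endian integer, so the ASCII bytes "MZ" (0x4D 0x5A) read as 0x5A4D.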
Or take this one, checking for the ELF header in Linux.
condition:
uint32(0) == 0x464C457F
For more on this, see this resource
https://www.optiv.com/insights/source-zero/blog/selective-yara-scanning-whats-your-type
This accomplishes two things: you narrow the scope of your rule for higher fidelity, and your rule doesn’t alert on itself (Yes, that’s a thing).
If you’re testing across a large dataset, you don’t want to be matching a string on filetypes that you are not even looking for.
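To make the self-matching point concrete, here is a hedged sketch with two hypothetical rules that share the same string. A rule file is plain text that contains its own search strings, so the ungated rule would match the rule file itself if the rules directory gets scanned, while the gated version only considers files that begin with the MZ magic.

```yara
rule Sketch_Ungated_String
{
    strings:
        $cmd = "bitsadmin /transfer" nocase   // hypothetical string

    condition:
        // Matches any file containing the string, including this rule file
        $cmd
}

rule Sketch_Gated_String
{
    strings:
        $cmd = "bitsadmin /transfer" nocase   // same hypothetical string

    condition:
        // Only files starting with "MZ" are considered, so text files
        // such as rule files and scripts are skipped
        uint16(0) == 0x5A4D and $cmd
}
```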
Another best practice is to use modules when possible. For example, in the rule above,
import "pe"
would be the first line of the rule.
Then, inside the rule, pe.version_info gives access to the version information of that PE (Portable Executable) file.
Using modules is a clean and effective way to get more of the capabilities out of YARA for your rules.
This makes rule writing easier.
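As a minimal sketch of module use, the hypothetical rule below relies on the `pe` module for both of its substantive checks and defines no strings at all. The rule name and the choice of the `CompanyName` field are assumptions for illustration.

```yara
import "pe"

rule Sketch_Claims_Microsoft_But_Unsigned
{
    meta:
        description = "Illustrative sketch: version info claims Microsoft, yet no signature is present"

    condition:
        uint16(0) == 0x5A4D and
        // version_info is a dictionary keyed by PE version resource field names
        pe.version_info["CompanyName"] contains "Microsoft" and
        pe.number_of_signatures == 0
}
```

Without the module, the same checks would mean hand-parsing the PE structures with offsets and raw byte reads, which is far harder to write and to review.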
New Beginnings
There are always new developments being made in Security, and YARA is no exception.
Here are two big adaptations of YARA.
YARA-L is a detection and threat hunting language used for Google Security Operations (formerly Chronicle).
It is used for analyzing large volumes of log data (rather than file data) through pattern matching.
It is then used to create detections from the findings in the log data.
It has a lot of rich features, but it is tied to Google Security Operations.
YARA-X was developed by the original YARA team at VirusTotal. It has been tested across millions of files and is ready for production use.
They’re encouraging users to switch to YARA-X as no new features will be added to YARA.
Improvements in YARA-X include better error reporting and performance, which will come in handy when scanning large files or recursing through whole filesystems.
What I Read This Week
Finding vulnerabilities in modern web apps using Claude Code and OpenAI Codex
Findings on how good these AI agents are at finding vulnerabilities, and on their false positive rates, across 11 open source projects
“Running the exact same prompt on the exact same codebase multiple times often yielded vastly different results”
Sounds about right. I’ve found that most people who praise AI tools as replacing humans haven’t actually gotten in the weeds with them
Malvertising Campaign on Meta Expands to Android, Pushing Advanced Crypto-Stealing Malware
75 localized ads were part of a mobile malware campaign
One of the ads was even paired with an image of a Labubu
The links redirect to a malicious cloned version of the app
Supply Chain Security Alert: Popular Nx Build System Package Compromised with Data-Stealing Malware
More of the same, with a new twist. Supply chain attacks, but this time with LLM agents
“The first known case where malware harnessed developer-facing AI CLI tools”
Threat Intelligence Case Study: Dissecting a Multi-Stage Phishing Campaign Against YouTube Creators
It’s always good to read how someone dealt with a scam attempt and get that perspective
One note from the walkthrough: sometimes you can’t replace intuition.
You can have the tools at your disposal but when something feels off, you have to recognize it, even if the phishing lure is convincing
The Ongoing Fallout from a Breach at AI Chatbot Maker Salesloft
More on last week’s story and its continued aftermath
The massive data haul is reported to include credentials such as AWS keys, VPN credentials, Salesforce and Snowflake creds
Wrapping Up
In this post, we refreshed on YARA’s capabilities, use cases for the tool, its new developments, and dove into the details behind a rule and what it would detect.
See you in the next one.