Nuts and Bolts of Detection Engineering: Open Source Edition

Sep 19, 2024

We’re going to walk through what Detection Engineering could look like, with open source options along the way.

Before diving into the technical aspects, let’s lay the groundwork. A strong foundation helps before getting too deep into the weeds.

Define clear objectives: Outline what you want to achieve with your program. This could include reducing false positives or expanding coverage of MITRE ATT&CK techniques.

Identify key stakeholders: Involve relevant teams such as security analysts, Incident Response, Threat Intelligence, and IT operations (for response processes) to ensure comprehensive input.

Create a detection lifecycle: Develop a process for creating, testing, implementing, and maintaining detections. This could look like the following.

Implement a Detection Development Process

A structured approach to developing detections is crucial for consistency and efficiency.

Threat modeling: Analyze potential threats specific to your organization and prioritize detection efforts accordingly.

Research and design: Investigate threat actor TTPs and design detections around what can effectively identify them.

Development: Write the actual detection logic, whether it's a SIEM rule, EDR query, or a custom script.

Testing: Validate the detection against both true positive and true negative scenarios to ensure accuracy.

Deployment: Implement the detection in your production environment.

Tuning and maintenance: Continuously monitor and adjust detections based on performance and emerging threats. This will be a continuous part of the journey.
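To make the Testing step above concrete, here is a minimal sketch in Python. The detection function, field names (process_name, command_line), and sample events are all hypothetical, made up for illustration; your own detections will live in your SIEM or EDR query language, but the idea of pairing each rule with events that should and should not fire carries over.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


def detect_encoded_powershell(event: Event) -> bool:
    """Hypothetical detection: PowerShell launched with an encoded command."""
    image = event.get("process_name", "").lower()
    cmdline = event.get("command_line", "").lower()
    return image.endswith("powershell.exe") and "-encodedcommand" in cmdline


def run_detection_tests(
    detection: Callable[[Event], bool],
    should_fire: List[Event],
    should_not_fire: List[Event],
) -> List[str]:
    """Return a list of readable failures; an empty list means every case passed."""
    failures = []
    for event in should_fire:
        if not detection(event):
            failures.append(f"missed true positive: {event}")
    for event in should_not_fire:
        if detection(event):
            failures.append(f"false positive: {event}")
    return failures


# Made-up sample events for both scenarios.
true_positives = [
    {"process_name": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
     "command_line": "powershell.exe -EncodedCommand SQBFAFgA"},
]
true_negatives = [
    {"process_name": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
     "command_line": "powershell.exe -File backup.ps1"},
    {"process_name": "C:\\Windows\\System32\\cmd.exe",
     "command_line": "cmd.exe /c dir"},
]

failures = run_detection_tests(detect_encoded_powershell, true_positives, true_negatives)
print("all tests passed" if not failures else failures)
```

Keeping the sample events next to the rule means the same test cases can be rerun every time the rule is tuned.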

There have been many pieces written on this and its benefits. Here’s one blog post for reference.

Leverage Open-Source Tools

Several open-source tools can significantly enhance your detection engineering efforts.

Let’s take a look at one for each step of a detection workflow, from data ingest to case management.

Data Quality

The effectiveness of your detections relies heavily on the quality and consistency of your log data. At the end of the day, if you don’t have the right data, the rest of the job is difficult or impossible to do.

Implement a centralized logging solution: When going the open-source route, you can use tools like ELK Stack (Elasticsearch, Logstash, Kibana) to aggregate and normalize logs from various sources.

Here’s what the flow of the data could look like.

https://www.elastic.co/guide/en/beats/libbeat/current/beats-reference.html

Data parsing and enrichment pipelines: Some options are to use Logstash or Apache NiFi to stream, parse, and normalize your log data before it reaches your SIEM or detection engine. You can also opt to normalize fields once in the SIEM through custom parsers. (Regex, anyone?)

Either way, one reason to do this is to be able to correlate different logs in detections.
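As a rough illustration of why normalization helps correlation, here is a small Python sketch. The source names, raw field names, and the shared schema (source_ip, destination_ip, user) are assumptions invented for this example; in practice this mapping would usually live in your pipeline or SIEM parser configuration.

```python
from typing import Dict

# Hypothetical per-source mappings from raw field names to a shared schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "firewall": {"src": "source_ip", "dst": "destination_ip", "usr": "user"},
    "vpn": {"client_ip": "source_ip", "username": "user"},
}


def normalize(source: str, raw: Dict[str, str]) -> Dict[str, str]:
    """Rename known fields to the shared schema; keep unknown fields unchanged."""
    mapping = FIELD_MAPS.get(source, {})
    normalized = {mapping.get(key, key): value for key, value in raw.items()}
    normalized["log_source"] = source
    return normalized


firewall_event = normalize(
    "firewall", {"src": "10.0.0.5", "dst": "203.0.113.7", "usr": "alice"}
)
vpn_event = normalize("vpn", {"client_ip": "10.0.0.5", "username": "alice"})

# Once both sources share field names, correlating them is a simple comparison.
same_actor = (
    firewall_event["source_ip"] == vpn_event["source_ip"]
    and firewall_event["user"] == vpn_event["user"]
)
print(same_actor)
```

Without the shared field names, the correlation logic would need to know every source's naming quirks, which gets unmanageable quickly.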

Detections

Sigma (https://github.com/SigmaHQ/sigma)

A generic signature format that allows you to describe relevant log events in a straightforward manner. Sigma rules can be converted into various SIEM or log management formats.

Here’s a blog post I wrote about Sigma and everything it can do for you.

How to use: Develop Sigma rules for common attack patterns and convert them to your specific SIEM or log management solution's format. Start with the Sigma repo’s “rules” directory for inspiration.
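To give a feel for what a Sigma rule describes, here is a toy Python sketch. The rule below is illustrative and not taken from the Sigma repo; real rules are written in YAML and converted with the Sigma project's own tooling. The matcher is a simplification I'm assuming for demonstration: it only handles a single selection with a few value modifiers, not full Sigma conditions.

```python
from typing import Any, Dict

# A rule shaped like a Sigma rule (normally written in YAML). Illustrative only.
rule: Dict[str, Any] = {
    "title": "Encoded PowerShell Command (illustrative)",
    "logsource": {"category": "process_creation", "product": "windows"},
    "detection": {
        "selection": {
            "Image|endswith": "\\powershell.exe",
            "CommandLine|contains": ["-EncodedCommand", "-enc "],
        },
        "condition": "selection",
    },
}


def field_matches(actual: str, modifier: str, expected: str) -> bool:
    """Compare one value case-insensitively, honoring a small set of modifiers."""
    actual, expected = actual.lower(), expected.lower()
    if modifier == "endswith":
        return actual.endswith(expected)
    if modifier == "contains":
        return expected in actual
    if modifier == "startswith":
        return actual.startswith(expected)
    return actual == expected  # no modifier: exact match


def selection_matches(selection: Dict[str, Any], event: Dict[str, str]) -> bool:
    """All fields must match (AND); a list of values matches if any does (OR)."""
    for key, expected in selection.items():
        field, _, modifier = key.partition("|")
        values = expected if isinstance(expected, list) else [expected]
        actual = event.get(field, "")
        if not any(field_matches(actual, modifier, str(v)) for v in values):
            return False
    return True


# Made-up process creation event.
event = {
    "Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
    "CommandLine": "powershell.exe -EncodedCommand SQBFAFgA",
}
print(selection_matches(rule["detection"]["selection"], event))
```

The point is the shape of the rule: a log source, one or more selections, and a condition. That shape is what lets the same rule be translated to different backends.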

MISP (https://www.misp-project.org/)

An open-source threat intelligence platform for sharing, storing, and correlating Indicators of Compromise (IOCs).

How to use: Integrate MISP with your SIEM or EDR solution to automatically update your detections with the latest threat intelligence.
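Here is a minimal Python sketch of what matching events against threat intelligence could look like once indicators are exported. The indicator values, event field names (destination_ip, dns_query), and the match_iocs helper are all hypothetical; this does not use the MISP API, and a real integration would pull indicators from your platform rather than hardcode them.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical indicators, as if exported from a threat intelligence platform.
ioc_ips: Set[str] = {"203.0.113.7", "198.51.100.23"}
ioc_domains: Set[str] = {"malicious.example"}


def match_iocs(events: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return events whose destination IP or DNS query matches a known indicator."""
    hits = []
    for event in events:
        ip = event.get("destination_ip", "")
        # Normalize case and a trailing dot so "Malicious.Example." still matches.
        domain = event.get("dns_query", "").lower().rstrip(".")
        if ip in ioc_ips:
            hits.append({**event, "matched_ioc": ip})
        elif domain in ioc_domains:
            hits.append({**event, "matched_ioc": domain})
    return hits


# Made-up events for illustration.
events = [
    {"destination_ip": "203.0.113.7", "user": "alice"},
    {"dns_query": "Malicious.Example.", "user": "bob"},
    {"destination_ip": "192.0.2.10", "dns_query": "intranet.example", "user": "carol"},
]

hits = match_iocs(events)
for hit in hits:
    print(hit["user"], "->", hit["matched_ioc"])
```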

There are arguments for having a threat feed directly feeding into your tools or not, but I’ll save that for another post.

TheHive (https://github.com/TheHive-Project/TheHive)

A scalable, open-source security incident response platform that can be integrated with other security tools. The project has changed in recent years to scale further, moving to a freemium model, but the previous version remains open source.

How to use: Once you have your data in, and are writing detections, you can use TheHive to manage your workflow, from initial alert triage to investigation and response. Case management is something every security team needs, and the security community has held TheHive in high regard for years on this.

Implement CI/CD Deployment

As alluded to earlier, treat your detection rules as code by implementing a Continuous Integration/Continuous Deployment (CI/CD) pipeline.

Version control: Use Git to manage your detection rules, allowing for collaboration and version tracking.
Deployment: Use a CI/CD tool like Jenkins, optionally alongside a configuration management tool like Puppet, to automate the deployment of new or updated detection rules across your environment.
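One check a pipeline could run before deploying rules is a simple validation pass. The Python sketch below assumes rules have already been parsed into dictionaries; the required field list is a team policy I made up for this example, not an official requirement, and the file names are hypothetical.

```python
from typing import Any, Dict, List

# Assumed team policy: every rule must carry these top-level fields.
REQUIRED_FIELDS = ("title", "description", "logsource", "detection")


def validate_rule(rule: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one parsed rule."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in rule]
    detection = rule.get("detection")
    if isinstance(detection, dict) and "condition" not in detection:
        errors.append("detection has no condition")
    return errors


def validate_all(rules: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map rule name to its problems, keeping only the rules that failed."""
    results = {name: validate_rule(rule) for name, rule in rules.items()}
    return {name: errors for name, errors in results.items() if errors}


# Hypothetical rules, as if loaded from files in the Git repository.
rules = {
    "good_rule.yml": {
        "title": "Encoded PowerShell Command",
        "description": "Detects PowerShell started with an encoded command.",
        "logsource": {"category": "process_creation"},
        "detection": {"selection": {"Image|endswith": "\\powershell.exe"},
                      "condition": "selection"},
    },
    "bad_rule.yml": {
        "title": "Unfinished Rule",
        "logsource": {"category": "process_creation"},
        "detection": {"selection": {"Image|endswith": "\\cmd.exe"}},
    },
}

problems = validate_all(rules)
exit_code = 1 if problems else 0  # a CI job would exit with this code
print(problems, exit_code)
```

A non-zero exit code is what lets the pipeline block a merge or deployment until the rule is fixed.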

Establish Metrics and Continuous Improvement

To ensure you’re tracking the right things, here are some ideas for metrics, with the objective of continuous improvement.

Define key metrics: Examples include false positive rate, mean time to detect (MTTD), or coverage of MITRE ATT&CK techniques.
Regular review sessions: Conduct periodic reviews of your detection performance and adjust your strategy accordingly. This could be bi-weekly, at the end of each on-call handoff, or whatever makes the most sense for your team.
Encourage feedback: Create a feedback loop with your analysts to continuously improve detection quality.
- Note: Some companies operate in a more flat structure, where everyone is both analyst and engineer. In that case, the feedback loop concept remains the same.
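As a rough sketch of how two of the metrics above could be computed, here is a Python example. The alert records and their fields (verdict, event_time, alert_time) are made up for illustration. False positive rate is defined here as the share of triaged alerts marked false positive, which is one working definition among several, and MTTD is taken as the average gap between the event and the alert for true positives.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

# Made-up triaged alerts; in practice these would come from your case management tool.
alerts: List[Dict[str, Any]] = [
    {"rule": "encoded_powershell", "verdict": "true_positive",
     "event_time": datetime(2024, 9, 1, 10, 0), "alert_time": datetime(2024, 9, 1, 10, 4)},
    {"rule": "encoded_powershell", "verdict": "false_positive",
     "event_time": datetime(2024, 9, 2, 9, 0), "alert_time": datetime(2024, 9, 2, 9, 1)},
    {"rule": "rare_parent_process", "verdict": "true_positive",
     "event_time": datetime(2024, 9, 3, 14, 0), "alert_time": datetime(2024, 9, 3, 14, 10)},
    {"rule": "rare_parent_process", "verdict": "false_positive",
     "event_time": datetime(2024, 9, 4, 8, 0), "alert_time": datetime(2024, 9, 4, 8, 2)},
]


def false_positive_rate(records: List[Dict[str, Any]]) -> float:
    """Share of triaged alerts that analysts marked as false positives."""
    if not records:
        return 0.0
    false_positives = sum(1 for r in records if r["verdict"] == "false_positive")
    return false_positives / len(records)


def mean_time_to_detect(records: List[Dict[str, Any]]) -> timedelta:
    """Average delay between the event and the alert, for true positives only."""
    delays = [
        r["alert_time"] - r["event_time"]
        for r in records
        if r["verdict"] == "true_positive"
    ]
    if not delays:
        return timedelta(0)
    return sum(delays, timedelta(0)) / len(delays)


print(f"False positive rate: {false_positive_rate(alerts):.0%}")
print(f"MTTD: {mean_time_to_detect(alerts)}")
```

Grouping the same calculation by rule name would show which detections need tuning first.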

Here’s a post on what kind of metrics we can focus on.

Documentation

This is one that isn’t always as popular, as some like to say certain things are self-documenting.

But we’ve all been there, working an incident, staring at unfamiliar data and systems, and piecing it together bit by bit. If only there were more documentation.

By having adequate documentation, you’ll save others and your future self a lot of time and effort. Detailed runbooks and repo READMEs fall under this category.

Conclusion

Building robust detection is an ongoing process that requires dedication, technical expertise, and a commitment to continuous improvement.

Leveraging open-source tools allows for flexibility and gives you the capabilities needed to detect and respond.

The key to success lies in the balance between automation and human expertise. The robots aren’t taking over just yet.

With persistence and the right approach, you'll be on your way to building a strong detection engineering program that keeps your organization ahead of potential threats.

Signal Edge

Sep 25 (edited)

Thanks for raising the point about building a DE framework around FOSS! There's a lot of good stuff out there that can be coupled into something custom and efficient. Some field notes:

- For CI/CD, Threat Modelling, Threat Intelligence Ingestion, Detection Modelling, and the actual framework for developing in detection-as-code, OpenTIDE is at the moment the only FOSS option; otherwise you need to build everything from the ground up. https://code.europa.eu/ec-digit-s2/opentide/coretide. Disclaimer: I am the maintainer of OpenTIDE.

- I would not recommend TheHive as an open source project. In the past it was a very easy recommendation, but they went full commercial with version 5, and the open source project is now effectively abandoned. In Western EU (since TheHive was very much a French/Belgian project), there has been interest in still developing a FOSS option, and there is https://dfir-iris.org/ as the main alternative. At this point however, I would probably recommend using a ticketing interface and whatever low/no-code open source framework possible. There is also Tracecat as an open source/commercial option (https://tracecat.com/), or https://github.com/Admyral-Security/admyral. Still very early though. Catalyst SOAR is another option https://github.com/SecurityBrewery/catalyst, which looks very promising if a little light on the actual automation.

- If you recommend Elastic, I would then not necessarily point just to Sigma, but also to https://github.com/elastic/detection-rules . A lot of it should be good starting content, even if it's not as turn key as paying for Elastic SIEM and enabling them.

- If you need a single platform, it's hard not to recommend Security Onion. https://github.com/Security-Onion-Solutions/securityonion . It comes with so many tools in a single system AND it actually has a case management interface.

- MISP is a fantastic project, but very hard to manage for many many teams. Most larger units will use a commercial TIP, which interfaces with MISP. Other FOSS TIPs like https://yeti-platform.io/ , https://github.com/OpenCTI-Platform/opencti or others may work better.


Danny's Newsletter
