Awesome

Detection and Response Pipeline

✨ A compilation of suggested tools for each component in a detection and response pipeline, along with real-world examples. The purpose is to create a reference hub for designing effective threat detection and response pipelines. 👷 🏗

Join us, explore the curated content, and contribute to this collaborative effort.

Contents

Main Components of a Detection & Response Pipeline:

  1. 📦 Detection-as-Code Pipeline
  2. 🪵 Data Pipeline
  3. ⚠️ Detection and Correlation Engine
  4. ⚙️ Response Orchestration and Automation
  5. 🔍 Investigation and Case Management

💡 Real-world Examples

📑 Additional Resources

Detection-as-Code Pipeline

| Tool / Service | Purpose |
| --- | --- |
| GitHub | Detection content development |
| GitLab | Detection content development |
| Gitea | Detection content development |
| AWS CodeCommit | Detection content development |
| GitHub Actions | CI/CD pipeline |
| GitLab Runner | CI/CD pipeline |
| Drone | CI/CD pipeline |
| AWS CodePipeline | CI/CD pipeline |
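
As a rough illustration of how these two groups of tools fit together, the sketch below keeps a detection rule and its test cases in the same Python module, so a CI job could run the tests on every commit before the rule is deployed. The rule logic, the field names (`event_name`, `mfa_used`), and the test events are hypothetical and are not taken from any of the tools listed above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Detection:
    """A detection rule kept under version control (hypothetical schema)."""
    name: str
    severity: str
    matches: Callable[[dict], bool]


def console_login_without_mfa(event: dict) -> bool:
    # Hypothetical field names; adapt them to your own log schema.
    return event.get("event_name") == "ConsoleLogin" and not event.get("mfa_used", False)


RULE = Detection(
    name="console_login_without_mfa",
    severity="medium",
    matches=console_login_without_mfa,
)

# Test cases stored next to the rule: (event, expected_match).
TEST_CASES = [
    ({"event_name": "ConsoleLogin", "mfa_used": False}, True),
    ({"event_name": "ConsoleLogin", "mfa_used": True}, False),
    ({"event_name": "ListBuckets"}, False),
]


def run_tests(rule: Detection, cases: list) -> list:
    """Return a list of failure messages; an empty list means every case passed."""
    failures = []
    for event, expected in cases:
        actual = rule.matches(event)
        if actual != expected:
            failures.append(f"{rule.name}: {event!r} -> {actual}, expected {expected}")
    return failures


failures = run_tests(RULE, TEST_CASES)
print("all test cases passed" if not failures else "\n".join(failures))
```

In a CI job, a non-empty `failures` list would typically be turned into a non-zero exit code so that a broken rule blocks the merge.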

Resources

Data Pipeline

| Tool / Service | Purpose | Deployment |
| --- | --- | --- |
| Substation | Data movement and transformation | Self-hosted (Open Source) |
| Vector | Data movement and transformation | Self-hosted (Open Source) |
| Tenzir | Data movement and transformation | Self-hosted (Open Source) |
| Fluent Bit | Data movement and transformation | Self-hosted (Open Source) |
| Logstash | Data movement and transformation | Self-hosted (Open Source) |
| Airbyte | Data movement and transformation | Self-hosted (Open Source) and Cloud |
| Cribl Stream | Data movement and transformation | Self-hosted (Free), Hybrid and Cloud |
| Tarsal | Data movement and transformation | Cloud |
| Kafka | Stream processing | Self-hosted (Open Source) and Cloud (Confluent) |
| Amazon Kinesis Data Streams | Stream processing | Cloud |
| Apache Spark | Stream and batch processing | Self-hosted (Open Source) |
| Databricks | Stream and batch processing | Cloud |
| Google Cloud Dataflow | Stream and batch processing | Cloud |
| Apache Flink | Stream and batch processing | Self-hosted (Open Source) |
| Apache NiFi | Stream and batch processing | Self-hosted (Open Source) |
| Apache Beam | Stream and batch processing | Open Source; Self-hosted or cloud-based runner |
| Faust | Stream and batch processing | Self-hosted (Open Source) |

Detection and Correlation Engine

In addition to the stream and batch processing tools mentioned in the data pipeline section, the following tools can be used for data analysis and detection.

| Tool / Service | Description |
| --- | --- |
| Elasticsearch | with ElastAlert2 or Elastic Security |
| OpenSearch | with ElastAlert2 or OpenSearch Security Analytics |
| Amazon Kinesis Data Analytics | Streaming data analysis in real time using Apache Flink |
| Matano | Open source security lake platform for AWS |
| ksqlDB | SQL-based streaming for Kafka |
| StreamAlert | Real-time data analysis and alerting framework |
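
To make the idea of correlation concrete, here is a minimal, engine-agnostic sketch of a threshold rule over a sliding time window: it raises an alert when one source produces too many failed logins within a short period. The class name, the event fields (`src_ip`, `ts`), and the thresholds are illustrative assumptions; the tools above express the same idea in their own rule or query languages.

```python
from collections import defaultdict, deque


class ThresholdCorrelator:
    """Alert when a key produces at least `threshold` events within `window_seconds`.

    Assumes events arrive in timestamp order.
    """

    def __init__(self, threshold: int, window_seconds: float):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)  # key -> timestamps inside the window

    def observe(self, key: str, timestamp: float) -> bool:
        """Record one event and return True if the threshold is reached."""
        window = self._timestamps[key]
        window.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


correlator = ThresholdCorrelator(threshold=3, window_seconds=60)
events = [
    {"src_ip": "203.0.113.7", "ts": 0},
    {"src_ip": "203.0.113.7", "ts": 20},
    {"src_ip": "198.51.100.2", "ts": 25},
    {"src_ip": "203.0.113.7", "ts": 45},  # third failure from this source within 60 seconds
]
alerts = [e for e in events if correlator.observe(e["src_ip"], e["ts"])]
print(alerts)
```

Keeping only the timestamps that are still inside the window bounds the memory used per key, which matters when the key space (for example, source IPs) is large.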

Response Orchestration and Automation

| Tool / Service | Description | Deployment |
| --- | --- | --- |
| n8n | A free and source-available workflow automation tool | Self-hosted (Source available) and Cloud |
| Shuffle | A general purpose security automation platform | Self-hosted (Open source) and Cloud |
| Tines | No-code automation for security workflows | Self-hosted ($) and Cloud |
| Torq | No-code hyperautomation for security workflows | Cloud |

Investigation and Case Management

| Tool / Service | Description | Deployment |
| --- | --- | --- |
| DFIR IRIS | Open-Source Collaborative Incident Response Platform | Self-hosted (Open source) |
| TheHive | Open Source and Free Security Incident Response Platform | Self-hosted (Open source) |
| GitHub | GitHub issues can be used for case management. Check out the video in the Resources section. | Cloud |
| Jira Service Management | IT service management platform with incident management features | Cloud |
| Tines Cases | | Cloud |
| Torq Case Management | | Cloud |

Resources:

Real-world Examples

Please note that this information is extracted from public blog posts and conference talks, and may not be comprehensive or reflect the current state of the companies' pipelines. Some examples may focus on specific components, such as the correlation engine, rather than covering the entire pipeline. These examples are intended as starting points, so please view them as informative rather than definitive solutions.

If you have additional information or insights about any of the examples included here and have permission to share them, we encourage you to contribute by sending a pull request to enhance or add more details.

| # | Technologies / Components | Note | References |
| --- | --- | --- | --- |
| 0 | • Databricks <br>• Apache Spark <br>• Delta Lake <br>• Scala | "Apple must detect a wide variety of security threats, and rises to the challenge using Apache Spark across a diverse pool of telemetry. Some of the home-grown solutions we’ve built to address complications of scale: <br>1. Notebook-based testing CI – Previously we had a hybrid development model for Structured Streaming jobs wherein most code would be written and tested inside of notebooks, but unit tests required export of the notebook into a user’s IDE along with JSON sample files to be executed by a local SparkSession. We’ve deployed a novel CI solution leveraging the Databricks Jobs API that executes the notebooks on a real cluster using sample files in DBFS. When coupled with our new test-generation library, we’ve seen 2/3 reduction in the amount of time required for testing and 85% less LoC. <br>2. Self-Tuning Alerts – Apple has a team of security analysts triaging the alerts generated by our detection rules. They annotate them as either ‘False Positive’ or ‘True Positive’ following the results of their analysis. We’ve incorporated this feedback into our Structured Streaming pipeline, so the system automatically learns from consensus and adjusts future behavior. This helps us amplify the signal from the rest of the noise. <br>3. Automated Investigations – There are some standard questions an analyst might ask when triaging an alert, like: what does this system usually do, where is it, and who uses it? Using ODBC and the Workspace API, we’ve been able to templatize many investigations and in some cases automate the entire process up to and including incident containment. <br>4. DetectionKit – We’ve written a custom SDK to formalize the configuration and testing of jobs, including some interesting features such as modular pre/post processor transform functions, and a stream-compatible exclusion mechanism using foreachBatch." | 1. Scaling Security Threat Detection with Apache Spark and Databricks by Josh Gillner (Apple Detection Engineering) <br>2. Threat Detection and Response at Scale by Dominique Brezinski (Apple) |
| 1 | • Kafka <br>• Apache Spark <br>• Apache Hive <br>• Elasticsearch <br>• GraphQL <br>• Amazon S3 <br>• Slack <br>• PagerDuty | | A SOCless Detection Team at Netflix by Alex Maestretti (Netflix) |
| 2 | • Kafka <br>• Apache Samza <br>• Microsoft Sentinel? <br>• KQL <br>• Azure Pipelines and Repos for CI/CD pipeline <br>• Jira <br>• ServiceNow <br>• Serverless functions | High-level strategy <br>Simplified data collection pipeline | (Re)building Threat Detection and Incident Response at LinkedIn by Sagar Shah and Jeff Bollinger (LinkedIn) |
| 3 | • go-audit <br>• Elasticsearch <br>• ElastAlert[0] | "We send the events to an Elasticsearch cluster. From there we use ElastAlert to query our incoming data continuously for alert generation and general monitoring." | Syscall Auditing at Scale by Ryan Huber (Slack) |
| 4 | • Kafka <br>• Jupyter notebook <br>• Python <br>• osquery, Santa, and OpenBSM/Audit for macOS monitoring | "Alertbox was the first project we built to start cutting down on our triage time. The goal was to move our alert response runbooks into code, and have them execute before we even begin the triage process. <br>Think of Forerunner as the glue between Alertbox and Covenant. When an alert fires, Alertbox calls out a RPC service called Forerunner. This service returns a Jupyter notebook corresponding to the alert. Alertbox then embeds the URL of this Jupyter notebook into the alert ticket. In the background, Forerunner also runs this alert notebook asynchronously." | 1. How Dropbox Security builds tools for threat detection and incident response by Dropbox DART <br>2. MacOS monitoring the open source way by Michael George (Dropbox) <br>3. [OLD] Meet Securitybot: Open Sourcing Automated Security at Scale by Alex Bertsch (Dropbox) and Distributed Security Alerting by Ryan Huber (Slack) |
| 5 | • StreamAlert <br>• BinaryAlert | - "StreamAlert is a serverless, real-time data analysis framework which empowers you to ingest, analyze, and alert on data from any environment, using data sources and alerting logic you define. Computer security teams use StreamAlert to scan terabytes of log data every day for incident detection and response." <br>- "BinaryAlert is an open-source serverless AWS pipeline where any file uploaded to an S3 bucket is immediately scanned with a configurable set of YARA rules. An alert will fire as soon as any match is found, giving an incident response team the ability to quickly contain the threat before it spreads." | 1. StreamAlert: Real-time Data Analysis and Alerting by Airbnb Eng <br>2. BinaryAlert: Real-time Serverless Malware Detection by Austin Byers (Airbnb) |
| 6 | • ELK stack <br>• Kafka <br>• KSQL <br>• ES-Hadoop <br>• ElastAlert[0] <br>• Apache Spark <br>• Jupyter notebook <br>• GraphFrames | "The Hunting ELK or simply the HELK is one of the first open source hunt platforms with advanced analytics capabilities such as SQL declarative language, graphing, structured streaming, and even machine learning via Jupyter notebooks and Apache Spark over an ELK stack. This project was developed primarily for research, but due to its flexible design and core components, it can be deployed in larger environments with the right configurations and scalable infrastructure." | The Hunting ELK project by Roberto Rodriguez |
| 7 | • AWS Kinesis Firehose <br>• AWS Kinesis Data Analytics Application <br>• AWS Lambda <br>• AWS S3 <br>• AWS Athena <br>• AWS Simple Notification Services | "In this example, various AWS serverless application services are used together to create a detection pipeline that is capable of near-realtime detection. The pipeline requires no administrative overhead of servers or container infrastructure, enabling a detection and response team to focus on threat detection capabilities." | Building a Serverless Detection Platform in AWS Pt. I: Endpoint Detection by Brendan Chamberlain |

[0] ElastAlert is no longer maintained. You can use ElastAlert2 instead.
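
The "Self-Tuning Alerts" idea quoted in example 0 (analysts label alerts as true or false positives, and the pipeline adjusts based on their consensus) can be sketched in plain Python. This is not Apple's implementation, which runs inside a Structured Streaming pipeline. The class, the consensus rule (a minimum number of verdicts plus a false-positive ratio), and the default thresholds are assumptions made purely for illustration.

```python
from collections import defaultdict


class FeedbackTuner:
    """Suppress alerts for (rule, entity) pairs that analysts consistently mark as false positives."""

    def __init__(self, min_verdicts: int = 3, fp_ratio: float = 0.8):
        self.min_verdicts = min_verdicts  # verdicts required before consensus counts
        self.fp_ratio = fp_ratio          # false-positive share that triggers suppression
        self._verdicts = defaultdict(lambda: {"fp": 0, "tp": 0})

    def record(self, rule: str, entity: str, true_positive: bool) -> None:
        """Store one analyst verdict for an alert."""
        counts = self._verdicts[(rule, entity)]
        counts["tp" if true_positive else "fp"] += 1

    def should_alert(self, rule: str, entity: str) -> bool:
        """Return False only when enough verdicts agree that this pair is noise."""
        counts = self._verdicts.get((rule, entity))
        if counts is None:
            return True
        total = counts["fp"] + counts["tp"]
        if total < self.min_verdicts:
            return True
        return counts["fp"] / total < self.fp_ratio


tuner = FeedbackTuner()
for _ in range(3):
    tuner.record("rare_parent_process", "build-server-01", true_positive=False)

print(tuner.should_alert("rare_parent_process", "build-server-01"))  # False: consensus says noise
print(tuner.should_alert("rare_parent_process", "laptop-42"))        # True: no feedback yet
```

Because suppression depends on a ratio and not on a permanent exclusion, a later true-positive verdict for the same pair lowers the false-positive share and can re-enable the alert.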

Additional Resources

License

CC0

To the extent possible under law, Adel "0x4D31" Karimi has waived all copyright and related or neighboring rights to this work.