Home

Awesome

ThreatHunting-Keywords

🎯 List of keywords for ThreatHunting sessions



What is Threat Hunting?

Threat hunting is a proactive and iterative approach to detecting malicious activity within an organization's network or systems that may have bypassed automated security measures. Unlike reactive investigations triggered by security alerts, threat hunting is driven by threat intelligence (TI) checks and by hypotheses derived from systematic and opportunistic analysis. These hypotheses help hunters uncover unknown, potential, or known threats that have evaded existing detections, as well as vulnerabilities or indicators of compromise (IoCs) that automated systems might miss or exclude. The process also focuses on identifying precursors to alerts and dashboards, improving SOC and triage workflows, contributing to shadow asset inventory management, and escalating low/mid-fidelity events that require further investigation. The primary goal is to identify the tactics, techniques, and procedures (TTPs) used by threat actors, enhancing the organization's ability to preemptively detect and mitigate potential attacks.

Threat Hunting Lifecycle in SOC Operations

My suggested process for organizing partially automated threat hunting sessions to maintain high-quality detection rules within a SOC:


Files

For the blueteam:

The ThreatHunting-Keywords lists can be valuable for Threat Hunters, SOC and CERT teams doing static analysis on a SIEM, as they help identify threat actors (or redteamers 😆) using the default configurations of well-known exploitation tools in logs. They differ from IOC feeds in their enduring relevance: the keywords here have no 'expiration date' and can detect threats years after their inclusion; they are flexible, accepting wildcard and case-insensitive matches, and focus only on default keywords.

Primarily designed for Threat Hunting, this list can be useful in complex scenarios. Whether you have access to a SIEM you don't manage, with unparsed data, or if you're part of a SOC team with a well-managed SIEM, the examples provided here can expedite the process of detecting malicious activity without the necessity to parse anything. If your logs are already parsed, this list can be used to match fields within your data, potentially transforming into a detection rule based on the keyword type category you select, provided the false positive rate is sufficiently low.
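A minimal Python sketch of the matching semantics described above (a wildcard `*` in the keyword plus case-insensitive comparison, mirroring Splunk's `WILDCARD(keyword)` match type with `case_sensitive_match = 0`); the keyword shown is illustrative:

```python
import re

def keyword_to_regex(keyword):
    # Escape the keyword literally, then turn the escaped "*" wildcards
    # back into ".*" so they match any run of characters
    return re.compile(re.escape(keyword).replace(r"\*", ".*"), re.IGNORECASE)

def matches(keyword, value):
    # Full match: a keyword without leading/trailing "*" must equal the
    # whole value, as with a wildcard lookup match
    return keyword_to_regex(keyword).fullmatch(value) is not None

print(matches("*mimikatz*", r"C:\Tools\MIMIKATZ.exe sekurlsa::logonpasswords"))  # True
```

This is only a sketch of the semantics, not the lookup implementation itself; in Splunk the matching is done by the lookup engine configured in transforms.conf.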

โš ๏ธ Not everything can be added in this list, we do not make complex behavior detections here, only simple keywords detections in fields or raw logs, aimed to detect default configurations

โš ๏ธ A lot of tools in the list have dedicated detection rules, correlating events with thresholds and unique process relations... we will not cover all the possible detections for a tool here, only keywords detections

If you're part of a Security Operations Center (SOC) and are managing hundreds of detection rules that rely solely on simple keyword detections without any field or event correlation, consider rethinking your approach. In my opinion, these should not constitute individual detection rules. Instead, they might be better suited to a consolidated list like this one, although implementation might be more challenging if you're not using a platform like Splunk.

This approach encourages the creation of high-quality, purposeful rules while keeping your simple field keyword detections organized and manageable in one place. The end result? One comprehensive detection rule that covers all of them. This streamlines your process and optimizes your detection capabilities.

For Incident Responders: you can use this list during your investigations on raw logs or files to quickly identify known exploitation tools with the Yara Rules, a PowerShell script, or by quickly ingesting your logs into Splunk with Splunk4DFIR

For the redteam:

To evade detection by simple keyword detection, it is critical to recompile and rename all custom strings, class or function names, variable names, argument names, executable names, default user-agents, certificates, or any other strings that could potentially be associated with the tools you are using during your operation. Employ the most common names for everything to blend in with normal traffic. Scripts located here can assist you in identifying some of these.

However, if you're developing public "red team tools", consider aiding the blue team by using distinct names. Employ a default configuration with an exotic port, custom certificates, unique user-agents, specific function names and arguments that aren't common. This assists in creating a clear signature that can be used for simple keyword detections, so the blueteam can at least easily detect the script kiddies.

Content of the Threat Hunting Keywords File:

Use the List to hunt with Splunk:

transforms.conf

[threathunting-keywords]
batch_index_query = 0
case_sensitive_match = 0
filename = threathunting-keywords.csv
match_type = WILDCARD(keyword)

Example use cases with threathunting-keywords:


Hunt all the keywords in raw logs 😱

`myendpointslogs` 
| lookup threathunting-keywords keyword as _raw OUTPUT keyword as keyword_detection metadata_keyword_type metadata_tool metadata_description metadata_tool_techniques metadata_tool_tactics metadata_malwares_name metadata_groups_name metadata_category metadata_link metadata_enable_endpoint_detection metadata_enable_proxy_detection metadata_comment
| search metadata_description!="" AND metadata_enable_endpoint_detection=1
| stats count earliest(_time) as firsttime latest(_time) as lasttime values(_raw) as raw by metadata_keyword_type keyword_detection index sourcetype 
| convert ctime(*time)

Send the job to background and keep the job ID.


Filter the result

| loadjob 1684146257.1495958
| search
  NOT (keyword_detection IN ("fixme","fixme","fixme")) 
  NOT (metadata_keyword_type IN ("fixme","fixme"))
  NOT (raw IN ("fixme","fixme","fixme"))

Exclude the required keywords, raw text or keyword types. If we decide to exclude the greyware tool keyword type (keywords for legitimate tools that are abused by attackers) because this environment produces too many results for this kind of tool, we have two options:

`myendpointslogs` 
| lookup threathunting-keywords keyword as _raw OUTPUT keyword as keyword_detection metadata_keyword_type metadata_tool metadata_description metadata_tool_techniques metadata_tool_tactics metadata_malwares_name metadata_groups_name metadata_category metadata_link metadata_enable_endpoint_detection metadata_enable_proxy_detection metadata_comment
| search metadata_description!="" metadata_keyword_type="offensive tool keyword" metadata_enable_endpoint_detection=1
| stats count earliest(_time) as firsttime latest(_time) as lasttime values(_raw) as raw by metadata_keyword_type keyword_detection index sourcetype 
| convert ctime(*time)

I added metadata_keyword_type="offensive tool keyword" to focus only on offensive tools that I am sure are used by malicious actors

| loadjob 1684146257.1495958
| search metadata_keyword_type="offensive tool keyword"

That was our use case for searching raw endpoint logs. If we want to hunt the keywords in network logs (anything that can log a query or a URL), we simply change it to:

`mynetworklogs` 
| lookup threathunting-keywords keyword as _raw OUTPUT keyword as keyword_detection metadata_keyword_type metadata_tool metadata_description metadata_tool_techniques metadata_tool_tactics metadata_malwares_name metadata_groups_name metadata_category metadata_link metadata_enable_endpoint_detection metadata_enable_proxy_detection metadata_comment
| search metadata_description!="" AND metadata_enable_proxy_detection=1
| stats count earliest(_time) as firsttime latest(_time) as lasttime values(_raw) as raw by metadata_keyword_type keyword_detection index sourcetype 
| convert ctime(*time)

This is the same as the first search, except that I changed the data source to mynetworklogs and added metadata_enable_proxy_detection=1 to match the keywords relevant to network logs (ideally with proxy and DNS logs for this)


Hunt the keywords in other fields 🙂 (url, process, commandline, query...):

Match only on url field:

`mynetworklogs` url=*
| lookup threathunting-keywords keyword as url OUTPUT keyword as keyword_detection metadata_keyword_type metadata_tool metadata_description metadata_tool_techniques metadata_tool_tactics metadata_malwares_name metadata_groups_name metadata_category metadata_link metadata_enable_endpoint_detection metadata_enable_proxy_detection metadata_comment
| search metadata_description!="" AND metadata_enable_proxy_detection=1
| stats count earliest(_time) as firsttime latest(_time) as lasttime values(url) as url by src_ip metadata_keyword_type keyword_detection index sourcetype 
| convert ctime(*time)

Match only on query field:

`mynetworklogs` query=*
| lookup threathunting-keywords keyword as query OUTPUT keyword as keyword_detection metadata_keyword_type metadata_tool metadata_description metadata_tool_techniques metadata_tool_tactics metadata_malwares_name metadata_groups_name metadata_category metadata_link metadata_enable_endpoint_detection metadata_enable_proxy_detection metadata_comment
| search metadata_description!="" AND metadata_enable_proxy_detection=1
| stats count earliest(_time) as firsttime latest(_time) as lasttime values(query) as query by src_ip metadata_keyword_type keyword_detection index sourcetype 
| convert ctime(*time)

Match on multiple fields at the same time, example for endpoint logs:

`myendpointslogs` 
| eval myfields=mvappend(service, process, process_command, parent_process, parent_process_command, grand_parent_process, grand_parent_process_command, file_path, file_name)
| lookup threathunting-keywords keyword as myfields OUTPUT keyword as keyword_detection metadata_keyword_type metadata_tool metadata_description metadata_tool_techniques metadata_tool_tactics metadata_malwares_name metadata_groups_name metadata_category metadata_link metadata_enable_endpoint_detection metadata_enable_proxy_detection metadata_comment
| search metadata_description!="" AND metadata_enable_endpoint_detection=1
| stats count earliest(_time) as firsttime latest(_time) as lasttime values(process) values(service) values(process_command) values(file_name) values(file_path) values(parent_process) values(parent_process_command) values(grand_parent_process) values(grand_parent_process_command) by metadata_keyword_type keyword_detection index sourcetype 
| convert ctime(*time)

Speed:

If speed is a concern, or you're planning to implement this as a scheduled detection rule, you might want to consider splitting the lookup into different lookups based on the metadata_keyword_type or metadata_tool column you want to use.

Note that filtering using the search command after the |lookup doesn't expedite the search process. If you want to concentrate on a specific portion of the lookup without dividing it, you should use the |inputlookup command along with the where clause. While this method may consume more CPU resources, it generally results in faster execution. For more details, check out the Splunk documentation on inputlookup: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Inputlookup
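The same idea sketched outside Splunk in Python: filter the keyword rows down to the subset you care about before compiling any patterns, instead of matching everything and discarding afterwards. Column names follow the lookup; the CSV content below is illustrative:

```python
import csv
import io
import re

def load_keywords(csv_text, keyword_type=None):
    # Filter rows *before* compiling patterns: the cheap string comparison
    # runs once per row, while each compiled regex later runs once per log line
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if keyword_type and row["metadata_keyword_type"] != keyword_type:
            continue
        pattern = re.compile(re.escape(row["keyword"]).replace(r"\*", ".*"), re.IGNORECASE)
        selected.append((row["keyword"], pattern))
    return selected

# illustrative rows following the lookup's column names
sample = """keyword,metadata_keyword_type
*mimikatz*,offensive tool keyword
*psexec*,greyware tool keyword
"""
offensive = load_keywords(sample, "offensive tool keyword")
```

The pre-filter plays the same role as `| inputlookup ... | where ...`: reduce the candidate set once, up front, rather than per event.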

With ELK:


If you are working with the Elastic Stack, there are many restrictions on lists (you cannot use special characters, spaces...); you have 3 options:

Dashboard Example


Splunk4DFIR

Another example of using the project CSV files with Splunk to hunt in DFIR artifacts and logs: https://github.com/mf1d3l/Splunk4DFIR

Other awesome lists for detection

I keep some relevant artifacts in separate lists. These lists are more precise and can be used in detection rules; they are available in this GitHub repo, where you will find:

Check out these Guides to use some of the lists:

... more here

DFIR Hunt for keywords in files (No SIEM)

After conducting a thorough review of various tools, I discovered that ripgrep significantly outperforms its competitors when it comes to rapidly matching an extensive list of regex patterns against each line of a large log file or even multiple files simultaneously. It proved to be the most efficient solution for handling massive amounts of data, providing unparalleled speed and flexibility.

Hunt for evil in log file(s) with Ripgrep and the 'only_keywords_regex.txt' list

rg.exe -f .\only_keywords_regex.txt .\EvtxECmd_Output.csv --multiline --ignore-case


Better option for very large files (on Windows):

DFIR_hunt_in_file.ps1

powershell -ep Bypass -File .\DFIR_hunt_in_file.ps1 -patternFile "only_keywords_regex.txt" -targetFile "C:\Users\mthcht\collection\20230406154410_EvtxECmd_Output.csv" -rgPath "C:\Users\mthcht\Downloads\ripgrep-13.0.0-x86_64-pc-windows-msvc\ripgrep-13.0.0-x86_64-pc-windows-msvc\rg.exe"

Content of the PowerShell script (included in the repo):

param (
    [Parameter(Mandatory=$true)]
    [string]$patternFile,
    [Parameter(Mandatory=$true)]
    [string]$targetFile,
    [Parameter(Mandatory=$true)]
    [string]$rgPath
)

Start-Transcript -Path "$PSScriptRoot\result_search.log" -Append -Force -Verbose

$totalLines = (Get-Content $patternFile | Measure-Object -Line).Lines
$currentLine = 0
Get-Content $patternFile | ForEach-Object {
    $currentLine++
    Write-Host "Searching for pattern $currentLine of $totalLines : $_"  
    & $rgPath --multiline --ignore-case $_ $targetFile | Write-Output 
}

Stop-Transcript -Verbose

The result of the search will be in result_search.log in the same directory as the script.


Better option for very large files (on Linux):

todo

Hunt for evil in a file with PowerShell only and the 'only_keywords.txt' list (slower, not recommended)

PowerShell is much slower, but if you still want to do it this way, you can use the script below; it will report the matched line number and the corresponding keyword:

powershell.exe -ep Bypass -File .\hunt_keywords_windows.ps1 -k .\only_keywords.txt -f .\EvtxECmd_Output.csv

<details>
param(
    [Parameter(Mandatory=$true)]
    [string]$file,
    
    [Parameter(Mandatory=$true)]
    [string]$kw
)

$Keywords = Get-Content $kw
$result = @()

foreach ($Keyword in $Keywords) {
    # Escape the keyword for regex matching, then turn the escaped "\*" wildcards back into ".*"
    $SearchTerm = [Regex]::Escape($Keyword).Replace("\*", ".*")
    
    $reader = New-Object System.IO.StreamReader($file)
    $lineNumber = 0
    while (($line = $reader.ReadLine()) -ne $null) {
        $lineNumber++
        if ($line -match $SearchTerm) {
            $result += New-Object PSObject -Property @{
                'Keyword' = $Keyword
                'LineNumber' = $lineNumber
                'Line' = $line
            }
        }
    }
    $reader.Close()
}

$result | Out-GridView
Read-Host -Prompt "Press Enter to exit"
</details>
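For comparison, a single-pass equivalent sketched in Python (hypothetical file names). It reads the target file once and tests every compiled keyword against each line, instead of re-reading the whole file for each keyword as the loop above does:

```python
import re

def compile_keywords(path):
    # One regex per keyword: wildcard "*" becomes ".*", matching is case-insensitive
    patterns = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            kw = line.strip()
            if kw:
                patterns.append((kw, re.compile(re.escape(kw).replace(r"\*", ".*"), re.I)))
    return patterns

def hunt(keywords_file, target_file):
    # Single pass over the target file; every keyword is tested per line
    patterns = compile_keywords(keywords_file)
    hits = []
    with open(target_file, encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, 1):
            for kw, pat in patterns:
                if pat.search(line):
                    hits.append((kw, number, line.rstrip("\n")))
    return hits
```

This trades the per-pattern speed of ripgrep for a single read of a very large file; for huge pattern lists, ripgrep with `-f` remains the faster option.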

YARA Rules


All the detection patterns of this project are automatically exported to yara rules in ThreatHunting-Keywords-yara-rules
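To illustrate the idea of that export, here is a minimal sketch of turning one keyword into a YARA rule. The rule layout, naming, and wildcard handling are assumptions for illustration; the real exporter in the ThreatHunting-Keywords-yara-rules repo may format its rules differently:

```python
import re

def keyword_to_yara(tool, keyword):
    # Hypothetical rule layout -- the real export may differ
    rule_name = re.sub(r"\W", "_", tool)
    # Strip the "*" wildcard markers: YARA string matching is substring-based anyway
    needle = keyword.strip("*")
    return (
        f"rule {rule_name}\n"
        "{\n"
        "    strings:\n"
        f'        $kw = "{needle}" nocase\n'
        "    condition:\n"
        "        $kw\n"
        "}\n"
    )

print(keyword_to_yara("mimikatz", "*sekurlsa::logonpasswords*"))
```

The `nocase` modifier plays the same role as `case_sensitive_match = 0` in the Splunk lookup.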

Some hunting examples with the yara rules:


Quick datatable to search for keywords

https://mthcht.github.io/ThreatHunting-Keywords/

False positives

Contribute and add your false positives to the expected false positives list

SIGMA Rules

Check out the lookup translated into SIGMA rules; I usually update it at the same time :)


MITRE ATT&CK technique mapping

With the Splunk add-on https://splunkbase.splunk.com/app/5742

Coverage for 2242 tools (updated 2024/08/30):

splunk search:

<details>
| inputlookup threathunting-keywords.csv
| stats count by metadata_tool metadata_tool_techniques
| makemv delim=" - " metadata_tool_techniques
| mvexpand metadata_tool_techniques
| stats count by metadata_tool_techniques

and use this splunk visualization: https://splunkbase.splunk.com/app/5742


</details>
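Outside Splunk, the " - " delimited metadata_tool_techniques field can be expanded the same way the makemv/mvexpand pipeline above does; a Python sketch with illustrative technique IDs:

```python
from collections import Counter

def technique_counts(technique_values):
    # Each metadata_tool_techniques value may hold several MITRE ATT&CK
    # technique IDs joined by " - " (the delimiter used by makemv above)
    counts = Counter()
    for value in technique_values:
        for technique in value.split(" - "):
            if technique:
                counts[technique] += 1
    return counts

print(technique_counts(["T1003 - T1059", "T1059"]))  # Counter({'T1059': 2, 'T1003': 1})
```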

Tools matrix

Splunk dashboards (this is just one example; a wide variety of filters can be applied using the available fields in the file):

Threat Actor Groups by tools in this project

splunk xml dashboard example:

<details>
<form version="1.1">
  <label>tools matrix</label>
  <description>tools_matrix</description>
  <fieldset submitButton="false" autoRun="true">
    <input type="multiselect" token="category" searchWhenChanged="true">
      <label>tool categories</label>
      <fieldForLabel>metadata_category</fieldForLabel>
      <fieldForValue>metadata_category</fieldForValue>
      <search>
        <query>| inputlookup threathunting-keywords.csv
| stats count by metadata_category
|  fields - count</query>
        <earliest>-24h@h</earliest>
        <latest>now</latest>
      </search>
      <choice value="*">all</choice>
      <prefix>metadata_category IN (</prefix>
      <suffix>)</suffix>
      <initialValue>*</initialValue>
      <valuePrefix>"</valuePrefix>
      <valueSuffix>"</valueSuffix>
      <delimiter> ,</delimiter>
    </input>
    <input type="multiselect" token="groups" searchWhenChanged="true">
      <label>groups name</label>
      <choice value="*">ALL</choice>
      <prefix>metadata_groups_name IN (</prefix>
      <suffix>)</suffix>
      <initialValue>*</initialValue>
      <valuePrefix>"</valuePrefix>
      <valueSuffix>"</valueSuffix>
      <delimiter>,</delimiter>
      <fieldForLabel>metadata_groups_name</fieldForLabel>
      <fieldForValue>metadata_groups_name</fieldForValue>
      <search>
        <query>| inputlookup threathunting-keywords.csv
| stats count by metadata_groups_name
|  fields - count
| eval metadata_groups_name = split(metadata_groups_name, " - ")
| mvexpand metadata_groups_name
|  dedup metadata_groups_name</query>
        <earliest>-24h@h</earliest>
        <latest>now</latest>
      </search>
    </input>
  </fieldset>
  <row>
    <panel>
      <viz type="sankey_diagram_app.sankey_diagram">
        <search>
          <query>| inputlookup threathunting-keywords.csv
| search metadata_groups_name!=N/A $category$
| stats count as detection_patterns by metadata_groups_name metadata_tool
| eval metadata_groups_name = split(metadata_groups_name, " - ")
| mvexpand metadata_groups_name
| search $groups$</query>
          <earliest>-24h@h</earliest>
          <latest>now</latest>
        </search>
        <option name="drilldown">none</option>
        <option name="height">728</option>
        <option name="refresh.display">progressbar</option>
        <option name="sankey_diagram_app.sankey_diagram.colorMode">categorical</option>
        <option name="sankey_diagram_app.sankey_diagram.maxColor">#3fc77a</option>
        <option name="sankey_diagram_app.sankey_diagram.minColor">#d93f3c</option>
        <option name="sankey_diagram_app.sankey_diagram.numOfBins">6</option>
        <option name="sankey_diagram_app.sankey_diagram.showBackwards">false</option>
        <option name="sankey_diagram_app.sankey_diagram.showLabels">true</option>
        <option name="sankey_diagram_app.sankey_diagram.showLegend">true</option>
        <option name="sankey_diagram_app.sankey_diagram.showSelf">false</option>
        <option name="sankey_diagram_app.sankey_diagram.showTooltip">true</option>
        <option name="sankey_diagram_app.sankey_diagram.styleBackwards">false</option>
        <option name="sankey_diagram_app.sankey_diagram.useColors">true</option>
      </viz>
    </panel>
  </row>
</form>
</details>


๐Ÿค Contributing

Contributions, issues and feature requests are welcome!

If you want me to add a tool to the list, create an issue with this template:

<details>

Tool Name:

``

Please provide the name of the tool.

Official Website or Source Code Link:

``

Provide a link to the tool's official website or source code repository (GitHub, GitLab, etc.). If documentation is available, please include it.

Tool Description:

``

Describe the tool's purpose, functionality, and notable features. If you're unsure, leave this blank, and I'll review the tool in more detail.

Known Usage by Malicious Actors (if applicable):

``

If you have information on known or potential misuse of this tool by malicious actors, please share it here.

Tool Classification:

Please choose the most appropriate category for the tool:


</details>

Propose changes to the list with a PR (provide false positive feedback and log samples if you can). If a keyword generates too many false positives across many environments, we can delete it.

I will decide whether a tool is worth adding to the list. Tools that are widely used and recognized in the community are more likely to be included than obscure or new ones