EQUEL - an Elasticsearch QUEry Language

The project was motivated by the use of Elasticsearch and Kibana for log analysis in incident response and as a tool in web application security testing. Both are great tools for this purpose, but Kibana exposes only a fraction of the power of Elasticsearch and lacks some features that would make log analysis much easier.

This project aims to create a query language for Elasticsearch with the following goals:

Note: EQUEL is neither Splunk SPL nor SQL, and it is not intended to emulate either of them.

Requirements

EQUEL was developed with Python 3. It depends on the following packages:

Tested versions are given in parentheses. Other versions may work, too. All these modules are installable via pip.

EQUEL Expressions

Overview

Elasticsearch DSL queries and aggregations are expressed in one EQUEL expression. Furthermore, post-processing and output formats can be added to such an expression.

Generally, EQUEL expressions are built as follows:

search | searchmodifier_1 | ... | aggregation_1 | ... | postprocessing_1 | ... | output_1 | ...

Each part of an EQUEL expression is called a subexpression. Every EQUEL expression begins with exactly one search subexpression, which may be an Elasticsearch query string. The search is followed by an arbitrary number of search modifiers (e.g. sorting, field filtering), aggregations, post-processing instructions and outputs. Searches, search modifiers and aggregations are handled entirely by Elasticsearch; post-processing and output are EQUEL functionality. The output of the last search, aggregation or post-processing module is fanned out to an arbitrary number of output modules, whose results can be stored or shown in parallel.
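To make the pipe-separated structure concrete, here is a minimal Python sketch that splits an expression into its subexpressions. It is an illustration only and not EQUEL's actual parser: the function name split_subexpressions is hypothetical, and the sketch assumes that a | inside quotes, parentheses or square brackets does not separate subexpressions. Escaped quote characters are not handled.

```python
def split_subexpressions(expression):
    """Split an expression on '|' characters outside quotes and brackets.

    Hypothetical helper for illustration; EQUEL's own parser may differ.
    """
    parts, current = [], []
    quote = None   # currently open quote character, if any
    depth = 0      # nesting depth of () and []
    for ch in expression:
        if quote:
            # Inside a quoted value: only the matching quote ends it.
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "|" and depth == 0:
            # Top-level pipe: close the current subexpression.
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    return parts


example = ':"EventID:4625" | sort [ComputerName,TargetDomainName,TargetUserName]'
subexpressions = split_subexpressions(example)
```

For the example above, subexpressions holds the search :"EventID:4625" followed by the sort modifier with its unnamed list.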

General Subexpression Syntax

Subexpressions are built as follows:

type verb switch_1 ... parameter_1=value_1 ...

The type defines the type of the rule: filter, agg, postproc or output. The first subexpression of a new type must be prefixed with this keyword for disambiguation. A verb refers to a plugin, which is a piece of code that follows some interface conventions. The first subexpression is recognized as an Elasticsearch query string if it doesn't start with a whitelisted verb or shortcut character (+ and - are not whitelisted, see below).

A subexpression can be written as a shortcut. Each rule type class can define a plugin that handles such shortcut expressions. A shortcut expression is prefixed with one of these characters: :&<>!#+- (the colon should be the default) and contains a single quoted or unquoted value. The prefix is passed to the plugin and can be used as a behavior modifier. Currently, shortcuts are defined as follows:

Shortcut example:

&"EventID:4624 LogonType:3" | agg :ComputerName | :TargetUserName

Parameter values may be quoted with single (') or double (") quotes. Unquoted values end at the next token separator character (space or newline). Some parameters support nested subexpressions, which are placed in parentheses. Examples:

query_string query=foo
query_string query="foo bar"
query_string query='foo bar'
nested path=nested.field query=(:"nested.field.foo:bar AND foobar")
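The quoting rules for flat parameters resemble shell-style tokenization, so they can be approximated with Python's standard shlex module. The following is a rough sketch under that assumption, not EQUEL's parser: parse_subexpression is a hypothetical helper, and it deliberately does not handle nested subexpressions in parentheses or lists.

```python
import shlex


def parse_subexpression(text):
    """Return (words, parameters) for a flat subexpression.

    Hypothetical helper: tokens of the form name=value become parameters,
    everything else (type, verb, switches) is kept in order as plain words.
    """
    words, params = [], {}
    for token in shlex.split(text):
        name, sep, value = token.partition("=")
        if sep and name:
            params[name] = value
        else:
            words.append(token)
    return words, params


unquoted = parse_subexpression("query_string query=foo")
double_quoted = parse_subexpression('query_string query="foo bar"')
single_quoted = parse_subexpression("query_string query='foo bar'")
```

Both quoted forms yield the same parameter value, foo bar, while the unquoted value ends at the first space.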

The list is another possible value type and can contain values or nested search expressions. Lists are enclosed in square brackets [ and ]. Search expressions inside a list must additionally be enclosed in parentheses.

Examples:

multi_match query=foobar fields=[foo, bar, bla, blubb]
filter terms field=[value, "multiple words", '"double quoted value"']
nested path=response.header query=(match response.header.name=content-security-policy)

Unnamed lists are also supported:

:"EventID:4625" | sort [ComputerName,TargetDomainName,TargetUserName]

Subexpression Types

Search Expression

The verbs correspond to the query clauses of the Elasticsearch Query DSL. Some shortcuts and syntactic elements were added to simplify EQUEL expressions. These are:

Aggregation Expression

Multiple aggregation expressions are applied to each other in order of their appearance. An aggregation expression follows the syntax stated above and can be prefixed with a nesting specification and end with a name assignment:

... | [agg <target>] ... [as <name>] | ...

The target name can be used to refer to a particular aggregation as nesting target from other aggregations or from postprocessing and output expressions. Further aggregations are nested into the previous aggregation. If the agg keyword is used in a later aggregation after the first one, it must be followed by an already defined aggregation name. The new aggregation is then nested into the specified aggregation instead of the default behavior.

If no name is specified, an aggregation is named aggi, where i is an incrementing counter starting at 1.
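The nesting and naming rules can be sketched as a small function that builds the nested "aggs" structure of the Elasticsearch DSL. This is an illustration under stated assumptions, not EQUEL's implementation: nest_aggregations is a hypothetical helper, the terms aggregations are chosen only as an example, and the counter is assumed to advance once per aggregation.

```python
def nest_aggregations(aggregations):
    """Build nested Elasticsearch "aggs" from (target, name, body) triples.

    target: name of an earlier aggregation to nest into, or None to nest
            into the previous aggregation (the default behavior).
    name:   explicit name, or None for the default agg<i>.
    """
    root = {}
    by_name = {}
    previous = root
    for i, (target, name, body) in enumerate(aggregations, start=1):
        name = name or f"agg{i}"   # assumed default naming: agg1, agg2, ...
        parent = by_name[target] if target else previous
        node = dict(body)
        parent.setdefault("aggs", {})[name] = node
        by_name[name] = node
        previous = node
    return root


query_aggs = nest_aggregations([
    (None, None, {"terms": {"field": "ComputerName"}}),
    (None, None, {"terms": {"field": "TargetUserName"}}),
    ("agg1", None, {"terms": {"field": "LogonType"}}),  # explicit nesting target
])
```

Here agg2 is nested into agg1 by default, while agg3 names agg1 as its target and therefore becomes a sibling of agg2 instead of a child of it.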

Postprocessing and Output Expressions

These expressions simply follow the syntax stated above.

Time Ranges

EQUEL can restrict the generated Elasticsearch query to a given time range. The Arrow library is used for date/time parsing and calculations, so all absolute dates and times must be given in formats supported by Arrow. In addition, EQUEL implements some useful relative expressions:

Time ranges can be specified in different ways:

Implementation Architecture

Plugins

EQUEL is implemented in a plugin architecture. Plugins can register for a verb which is then used in EQUEL expressions to address the plugin. The following plugin types are specified according to the expression syntax and semantics defined above:

Parameters are passed as follows to the plugin:

There are two special cases for plugin names:

Generally, a plugin object is instantiated when it is used, and its apply method is called as follows for the two known plugin types:

API

Usage

There are different ways to integrate EQUEL into an existing software project. Generally, the main class is equel.engine.EQUELEngine. It can be instantiated with the server name (by default localhost), an index pattern (by default *) and a timeout value as parameters. An EQUEL expression is then parsed with the instance method .parseEQUEL(equel), or a file with .parseEQUELFile(filename). This returns an EQUELRequest object that can be converted into Elasticsearch DSL with .jsonQuery(). .execute() performs the request against the server and index defined in the EQUELEngine object. Because post-processing and output are EQUEL functionality, .jsonQuery() covers only queries and aggregations.

The method .execute() returns an EQUELResult object (or throws an exception in case of an error), which contains EQUELOutput instances in the .outputs property. An EQUELOutput object behaves like a dictionary and contains all generated output streams. An output has a type (text, HTML or image) that can be used by further output processing to handle the output properly.
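The calls described above could be combined roughly as follows. This is a sketch, not verified against a particular EQUEL release: run_equel is a hypothetical wrapper, the positional order of the constructor arguments (server, index pattern, timeout) and the timeout value of 60 are assumptions, and running it requires EQUEL to be installed and an Elasticsearch server to be reachable.

```python
def run_equel(expression, host="localhost", index="*", timeout=60):
    """Parse and execute an EQUEL expression and return its outputs.

    Hypothetical wrapper; argument order and the timeout value are assumed.
    """
    # Imported lazily so this sketch can be loaded without EQUEL installed.
    from equel.engine import EQUELEngine

    engine = EQUELEngine(host, index, timeout)
    request = engine.parseEQUEL(expression)

    # Elasticsearch DSL for the search and aggregation parts only.
    print(request.jsonQuery())

    # Runs against the server and index of the engine; errors propagate
    # to the caller as exceptions.
    result = request.execute()
    return result.outputs
```

A caller would then inspect the returned outputs and handle each one according to its type (text, HTML or image).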

Extension

Credits