Awesome

Elasticsearch Exporter

Prometheus exporter for various metrics about Elasticsearch, written in Go.

Installation

For pre-built binaries please take a look at the releases. https://github.com/prometheus-community/elasticsearch_exporter/releases

Docker

docker pull quay.io/prometheuscommunity/elasticsearch-exporter:latest
docker run --rm -p 9114:9114 quay.io/prometheuscommunity/elasticsearch-exporter:latest

Example docker-compose.yml:

elasticsearch_exporter:
    image: quay.io/prometheuscommunity/elasticsearch-exporter:latest
    command:
     - '--es.uri=http://elasticsearch:9200'
    restart: always
    ports:
    - "127.0.0.1:9114:9114"

Kubernetes

You can find a helm chart in the prometheus-community charts repository at https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-elasticsearch-exporter

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install [RELEASE_NAME] prometheus-community/prometheus-elasticsearch-exporter

Configuration

NOTE: The exporter fetches information from an Elasticsearch cluster on every scrape, therefore having a too short scrape interval can impose load on ES master nodes, particularly if you run with --es.all and --es.indices. We suggest you measure how long fetching /_nodes/stats and /_all/_stats takes for your ES cluster to determine whether your scraping interval is too short. As a last resort, you can scrape this exporter using a dedicated job with its own scraping interval.

Below is the command line options summary:

elasticsearch_exporter --help

Argument	Introduced in Version	Description	Default
collector.clustersettings	1.6.0	If true, query stats for cluster settings (As of v1.6.0, this flag has replaced "es.cluster_settings").	false
es.uri	1.0.2	Address (host and port) of the Elasticsearch node we should connect to. This could be a local node (`localhost:9200`, for instance), or the address of a remote Elasticsearch server. When basic auth is needed, specify as: `<proto>://<user>:<password>@<host>:<port>`. E.G., `http://admin:pass@localhost:9200`. Special characters in the user credentials need to be URL-encoded.	http://localhost:9200
es.all	1.0.2	If true, query stats for all nodes in the cluster, rather than just the node we connect to.	false
es.indices	1.0.2	If true, query stats for all indices in the cluster.	false
es.indices_settings	1.0.4rc1	If true, query settings stats for all indices in the cluster.	false
es.indices_mappings	1.2.0	If true, query stats for mappings of all indices of the cluster.	false
es.aliases	1.0.4rc1	If true, include informational aliases metrics.	true
es.shards	1.0.3rc1	If true, query stats for all indices in the cluster, including shard-level stats (implies `es.indices=true`).	false
collector.snapshots	1.0.4rc1	If true, query stats for the cluster snapshots. (As of v1.7.0, this flag has replaced "es.snapshots").	false
es.slm		If true, query stats for SLM.	false
es.data_stream		If true, query state for Data Steams.	false
es.timeout	1.0.2	Timeout for trying to get stats from Elasticsearch. (ex: 20s)	5s
es.ca	1.0.2	Path to PEM file that contains trusted Certificate Authorities for the Elasticsearch connection.
es.client-private-key	1.0.2	Path to PEM file that contains the private key for client auth when connecting to Elasticsearch.
es.client-cert	1.0.2	Path to PEM file that contains the corresponding cert for the private key to connect to Elasticsearch.
es.clusterinfo.interval	1.1.0rc1	Cluster info update interval for the cluster label	5m
es.ssl-skip-verify	1.0.4rc1	Skip SSL verification when connecting to Elasticsearch.	false
web.listen-address	1.0.2	Address to listen on for web interface and telemetry.	:9114
web.telemetry-path	1.0.2	Path under which to expose metrics.	/metrics
aws.region	1.5.0	Region for AWS elasticsearch
aws.role-arn	1.6.0	Role ARN of an IAM role to assume.
version	1.0.2	Show version info on stdout and exit.

Commandline parameters start with a single - for versions less than 1.1.0rc1. For versions greater than 1.1.0rc1, commandline parameters are specified with --.

The API key used to connect can be set with the ES_API_KEY environment variable.

Logging

Logging by the exporter is handled by the log/slog package. The output format can be customized with the --log.format flag which defaults to logfmt. The log level can be set with the --log.level flag which defaults to info. The output can be set to either stdout (default) or stderr with the --log.output flag.

Elasticsearch 7.x security privileges

Username and password can be passed either directly in the URI or through the ES_USERNAME and ES_PASSWORD environment variables. Specifying those two environment variables will override authentication passed in the URI (if any).

ES 7.x supports RBACs. The following security privileges are required for the elasticsearch_exporter.

Setting	Privilege Required	Description
collector.clustersettings	`cluster` `monitor`
exporter defaults	`cluster` `monitor`	All cluster read-only operations, like cluster health and state, hot threads, node info, node and cluster stats, and pending cluster tasks.
es.indices	`indices` `monitor` (per index or `*`)	All actions that are required for monitoring (recovery, segments info, index stats and status)
es.indices_settings	`indices` `monitor` (per index or `*`)
es.indices_mappings	`indices` `view_index_metadata` (per index or `*`)
es.shards	not sure if `indices` or `cluster` `monitor` or both
collector.snapshots	`cluster:admin/snapshot/status` and `cluster:admin/repository/get`	ES Forum Post
es.slm	`manage_slm`
es.data_stream	`monitor` or `manage` (per index or `*`)

Further Information

Metrics

Name	Type	Cardinality	Help
elasticsearch_breakers_estimated_size_bytes	gauge	4	Estimated size in bytes of breaker
elasticsearch_breakers_limit_size_bytes	gauge	4	Limit size in bytes for breaker
elasticsearch_breakers_tripped	counter	4	tripped for breaker
elasticsearch_cluster_health_active_primary_shards	gauge	1	The number of primary shards in your cluster. This is an aggregate total across all indices.
elasticsearch_cluster_health_active_shards	gauge	1	Aggregate total of all shards across all indices, which includes replica shards.
elasticsearch_cluster_health_delayed_unassigned_shards	gauge	1	Shards delayed to reduce reallocation overhead
elasticsearch_cluster_health_initializing_shards	gauge	1	Count of shards that are being freshly created.
elasticsearch_cluster_health_number_of_data_nodes	gauge	1	Number of data nodes in the cluster.
elasticsearch_cluster_health_number_of_in_flight_fetch	gauge	1	The number of ongoing shard info requests.
elasticsearch_cluster_health_number_of_nodes	gauge	1	Number of nodes in the cluster.
elasticsearch_cluster_health_number_of_pending_tasks	gauge	1	Cluster level changes which have not yet been executed
elasticsearch_cluster_health_task_max_waiting_in_queue_millis	gauge	1	Max time in millis that a task is waiting in queue.
elasticsearch_cluster_health_relocating_shards	gauge	1	The number of shards that are currently moving from one node to another node.
elasticsearch_cluster_health_status	gauge	3	Whether all primary and replica shards are allocated.
elasticsearch_cluster_health_unassigned_shards	gauge	1	The number of shards that exist in the cluster state, but cannot be found in the cluster itself.
elasticsearch_clustersettings_stats_max_shards_per_node	gauge	0	Current maximum number of shards per node setting.
elasticsearch_clustersettings_allocation_threshold_enabled	gauge	0	Is disk allocation decider enabled.
elasticsearch_clustersettings_allocation_watermark_flood_stage_bytes	gauge	0	Flood stage watermark as in bytes.
elasticsearch_clustersettings_allocation_watermark_high_bytes	gauge	0	High watermark for disk usage in bytes.
elasticsearch_clustersettings_allocation_watermark_low_bytes	gauge	0	Low watermark for disk usage in bytes.
elasticsearch_clustersettings_allocation_watermark_flood_stage_ratio	gauge	0	Flood stage watermark as a ratio.
elasticsearch_clustersettings_allocation_watermark_high_ratio	gauge	0	High watermark for disk usage as a ratio.
elasticsearch_clustersettings_allocation_watermark_low_ratio	gauge	0	Low watermark for disk usage as a ratio.
elasticsearch_filesystem_data_available_bytes	gauge	1	Available space on block device in bytes
elasticsearch_filesystem_data_free_bytes	gauge	1	Free space on block device in bytes
elasticsearch_filesystem_data_size_bytes	gauge	1	Size of block device in bytes
elasticsearch_filesystem_io_stats_device_operations_count	gauge	1	Count of disk operations
elasticsearch_filesystem_io_stats_device_read_operations_count	gauge	1	Count of disk read operations
elasticsearch_filesystem_io_stats_device_write_operations_count	gauge	1	Count of disk write operations
elasticsearch_filesystem_io_stats_device_read_size_kilobytes_sum	gauge	1	Total kilobytes read from disk
elasticsearch_filesystem_io_stats_device_write_size_kilobytes_sum	gauge	1	Total kilobytes written to disk
elasticsearch_indices_active_queries	gauge	1	The number of currently active queries
elasticsearch_indices_docs	gauge	1	Count of documents on this node
elasticsearch_indices_docs_deleted	gauge	1	Count of deleted documents on this node
elasticsearch_indices_deleted_docs_primary	gauge	1	Count of deleted documents with only primary shards
elasticsearch_indices_docs_primary	gauge	1	Count of documents with only primary shards on all nodes
elasticsearch_indices_docs_total	gauge		Count of documents with shards on all nodes
elasticsearch_indices_fielddata_evictions	counter	1	Evictions from field data
elasticsearch_indices_fielddata_memory_size_bytes	gauge	1	Field data cache memory usage in bytes
elasticsearch_indices_filter_cache_evictions	counter	1	Evictions from filter cache
elasticsearch_indices_filter_cache_memory_size_bytes	gauge	1	Filter cache memory usage in bytes
elasticsearch_indices_flush_time_seconds	counter	1	Cumulative flush time in seconds
elasticsearch_indices_flush_total	counter	1	Total flushes
elasticsearch_indices_get_exists_time_seconds	counter	1	Total time get exists in seconds
elasticsearch_indices_get_exists_total	counter	1	Total get exists operations
elasticsearch_indices_get_missing_time_seconds	counter	1	Total time of get missing in seconds
elasticsearch_indices_get_missing_total	counter	1	Total get missing
elasticsearch_indices_get_time_seconds	counter	1	Total get time in seconds
elasticsearch_indices_get_total	counter	1	Total get
elasticsearch_indices_indexing_delete_time_seconds_total	counter	1	Total time indexing delete in seconds
elasticsearch_indices_indexing_delete_total	counter	1	Total indexing deletes
elasticsearch_indices_index_current	gauge	1	The number of documents currently being indexed to an index
elasticsearch_indices_indexing_index_time_seconds_total	counter	1	Cumulative index time in seconds
elasticsearch_indices_indexing_index_total	counter	1	Total index calls
elasticsearch_indices_mappings_stats_fields	gauge	1	Count of fields currently mapped by index
elasticsearch_indices_mappings_stats_json_parse_failures_total	counter	0	Number of errors while parsing JSON
elasticsearch_indices_mappings_stats_scrapes_total	counter	0	Current total Elasticsearch Indices Mappings scrapes
elasticsearch_indices_mappings_stats_up	gauge	0	Was the last scrape of the Elasticsearch Indices Mappings endpoint successful
elasticsearch_indices_merges_docs_total	counter	1	Cumulative docs merged
elasticsearch_indices_merges_total	counter	1	Total merges
elasticsearch_indices_merges_total_size_bytes_total	counter	1	Total merge size in bytes
elasticsearch_indices_merges_total_time_seconds_total	counter	1	Total time spent merging in seconds
elasticsearch_indices_query_cache_cache_total	counter	1	Count of query cache
elasticsearch_indices_query_cache_cache_size	gauge	1	Size of query cache
elasticsearch_indices_query_cache_count	counter	2	Count of query cache hit/miss
elasticsearch_indices_query_cache_evictions	counter	1	Evictions from query cache
elasticsearch_indices_query_cache_memory_size_bytes	gauge	1	Query cache memory usage in bytes
elasticsearch_indices_query_cache_total	counter	1	Size of query cache total
elasticsearch_indices_refresh_time_seconds_total	counter	1	Total time spent refreshing in seconds
elasticsearch_indices_refresh_total	counter	1	Total refreshes
elasticsearch_indices_request_cache_count	counter	2	Count of request cache hit/miss
elasticsearch_indices_request_cache_evictions	counter	1	Evictions from request cache
elasticsearch_indices_request_cache_memory_size_bytes	gauge	1	Request cache memory usage in bytes
elasticsearch_indices_search_fetch_time_seconds	counter	1	Total search fetch time in seconds
elasticsearch_indices_search_fetch_total	counter	1	Total number of fetches
elasticsearch_indices_search_query_time_seconds	counter	1	Total search query time in seconds
elasticsearch_indices_search_query_total	counter	1	Total number of queries
elasticsearch_indices_segments_count	gauge	1	Count of index segments on this node
elasticsearch_indices_segments_memory_bytes	gauge	1	Current memory size of segments in bytes
elasticsearch_indices_settings_creation_timestamp_seconds	gauge	1	Timestamp of the index creation in seconds
elasticsearch_indices_settings_stats_read_only_indices	gauge	1	Count of indices that have read_only_allow_delete=true
elasticsearch_indices_settings_total_fields	gauge		Index setting value for index.mapping.total_fields.limit (total allowable mapped fields in a index)
elasticsearch_indices_settings_replicas	gauge		Index setting value for index.replicas
elasticsearch_indices_shards_docs	gauge	3	Count of documents on this shard
elasticsearch_indices_shards_docs_deleted	gauge	3	Count of deleted documents on each shard
elasticsearch_indices_store_size_bytes	gauge	1	Current size of stored index data in bytes
elasticsearch_indices_store_size_bytes_primary	gauge		Current size of stored index data in bytes with only primary shards on all nodes
elasticsearch_indices_store_size_bytes_total	gauge		Current size of stored index data in bytes with all shards on all nodes
elasticsearch_indices_store_throttle_time_seconds_total	counter	1	Throttle time for index store in seconds
elasticsearch_indices_translog_operations	counter	1	Total translog operations
elasticsearch_indices_translog_size_in_bytes	counter	1	Total translog size in bytes
elasticsearch_indices_warmer_time_seconds_total	counter	1	Total warmer time in seconds
elasticsearch_indices_warmer_total	counter	1	Total warmer count
elasticsearch_jvm_gc_collection_seconds_count	counter	2	Count of JVM GC runs
elasticsearch_jvm_gc_collection_seconds_sum	counter	2	GC run time in seconds
elasticsearch_jvm_memory_committed_bytes	gauge	2	JVM memory currently committed by area
elasticsearch_jvm_memory_max_bytes	gauge	1	JVM memory max
elasticsearch_jvm_memory_used_bytes	gauge	2	JVM memory currently used by area
elasticsearch_jvm_memory_pool_used_bytes	gauge	3	JVM memory currently used by pool
elasticsearch_jvm_memory_pool_max_bytes	counter	3	JVM memory max by pool
elasticsearch_jvm_memory_pool_peak_used_bytes	counter	3	JVM memory peak used by pool
elasticsearch_jvm_memory_pool_peak_max_bytes	counter	3	JVM memory peak max by pool
elasticsearch_os_cpu_percent	gauge	1	Percent CPU used by the OS
elasticsearch_os_load1	gauge	1	Shortterm load average
elasticsearch_os_load5	gauge	1	Midterm load average
elasticsearch_os_load15	gauge	1	Longterm load average
elasticsearch_process_cpu_percent	gauge	1	Percent CPU used by process
elasticsearch_process_cpu_seconds_total	counter	1	Process CPU time in seconds
elasticsearch_process_mem_resident_size_bytes	gauge	1	Resident memory in use by process in bytes
elasticsearch_process_mem_share_size_bytes	gauge	1	Shared memory in use by process in bytes
elasticsearch_process_mem_virtual_size_bytes	gauge	1	Total virtual memory used in bytes
elasticsearch_process_open_files_count	gauge	1	Open file descriptors
elasticsearch_snapshot_stats_number_of_snapshots	gauge	1	Total number of snapshots
elasticsearch_snapshot_stats_oldest_snapshot_timestamp	gauge	1	Oldest snapshot timestamp
elasticsearch_snapshot_stats_snapshot_start_time_timestamp	gauge	1	Last snapshot start timestamp
elasticsearch_snapshot_stats_latest_snapshot_timestamp_seconds	gauge	1	Timestamp of the latest SUCCESS or PARTIAL snapshot
elasticsearch_snapshot_stats_snapshot_end_time_timestamp	gauge	1	Last snapshot end timestamp
elasticsearch_snapshot_stats_snapshot_number_of_failures	gauge	1	Last snapshot number of failures
elasticsearch_snapshot_stats_snapshot_number_of_indices	gauge	1	Last snapshot number of indices
elasticsearch_snapshot_stats_snapshot_failed_shards	gauge	1	Last snapshot failed shards
elasticsearch_snapshot_stats_snapshot_successful_shards	gauge	1	Last snapshot successful shards
elasticsearch_snapshot_stats_snapshot_total_shards	gauge	1	Last snapshot total shard
elasticsearch_thread_pool_active_count	gauge	14	Thread Pool threads active
elasticsearch_thread_pool_completed_count	counter	14	Thread Pool operations completed
elasticsearch_thread_pool_largest_count	gauge	14	Thread Pool largest threads count
elasticsearch_thread_pool_queue_count	gauge	14	Thread Pool operations queued
elasticsearch_thread_pool_rejected_count	counter	14	Thread Pool operations rejected
elasticsearch_thread_pool_threads_count	gauge	14	Thread Pool current threads count
elasticsearch_transport_rx_packets_total	counter	1	Count of packets received
elasticsearch_transport_rx_size_bytes_total	counter	1	Total number of bytes received
elasticsearch_transport_tx_packets_total	counter	1	Count of packets sent
elasticsearch_transport_tx_size_bytes_total	counter	1	Total number of bytes sent
elasticsearch_clusterinfo_last_retrieval_success_ts	gauge	1	Timestamp of the last successful cluster info retrieval
elasticsearch_clusterinfo_up	gauge	1	Up metric for the cluster info collector
elasticsearch_clusterinfo_version_info	gauge	6	Constant metric with ES version information as labels
elasticsearch_slm_stats_up	gauge	0	Up metric for SLM collector
elasticsearch_slm_stats_total_scrapes	counter	0	Number of scrapes for SLM collector
elasticsearch_slm_stats_json_parse_failures	counter	0	JSON parse failures for SLM collector
elasticsearch_slm_stats_retention_runs_total	counter	0	Total retention runs
elasticsearch_slm_stats_retention_failed_total	counter	0	Total failed retention runs
elasticsearch_slm_stats_retention_timed_out_total	counter	0	Total retention run timeouts
elasticsearch_slm_stats_retention_deletion_time_seconds	gauge	0	Retention run deletion time
elasticsearch_slm_stats_total_snapshots_taken_total	counter	0	Total snapshots taken
elasticsearch_slm_stats_total_snapshots_failed_total	counter	0	Total snapshots failed
elasticsearch_slm_stats_total_snapshots_deleted_total	counter	0	Total snapshots deleted
elasticsearch_slm_stats_total_snapshots_failed_total	counter	0	Total snapshots failed
elasticsearch_slm_stats_snapshots_taken_total	counter	1	Snapshots taken by policy
elasticsearch_slm_stats_snapshots_failed_total	counter	1	Snapshots failed by policy
elasticsearch_slm_stats_snapshots_deleted_total	counter	1	Snapshots deleted by policy
elasticsearch_slm_stats_snapshot_deletion_failures_total	counter	1	Snapshot deletion failures by policy
elasticsearch_slm_stats_operation_mode	gauge	1	SLM operation mode (Running, stopping, stopped)
elasticsearch_data_stream_stats_up	gauge	0	Up metric for Data Stream collection
elasticsearch_data_stream_stats_total_scrapes	counter	0	Total scrapes for Data Stream stats
elasticsearch_data_stream_stats_json_parse_failures	counter	0	Number of parsing failures for Data Stream stats
elasticsearch_data_stream_backing_indices_total	gauge	1	Number of backing indices for Data Stream
elasticsearch_data_stream_store_size_bytes	gauge	1	Current size of data stream backing indices in bytes

Alerts & Recording Rules

We provide examples for Prometheus alerts and recording rules as well as an Grafana Dashboard and a Kubernetes Deployment.

The example dashboard needs the node_exporter installed. In order to select the nodes that belong to the Elasticsearch cluster, we rely on a label cluster. Depending on your setup, it can derived from the platform metadata:

For example on GCE

- source_labels: [__meta_gce_metadata_Cluster]
  separator: ;
  regex: (.*)
  target_label: cluster
  replacement: ${1}
  action: replace

Please refer to the Prometheus SD documentation to see which metadata labels can be used to create the cluster label.

Credit & License

elasticsearch_exporter is maintained by the Prometheus Community.

elasticsearch_exporter was then maintained by the nice folks from JustWatch. Then transferred this repository to the Prometheus Community in May 2021.

This package was originally created and maintained by Eric Richardson, who transferred this repository to us in January 2017.

Maintainers of this repository:

Christoph Oelmüller christoph.oelmueller@justwatch.com @zwopir

Please refer to the Git commit log for a complete list of contributors.

Contributing

We welcome any contributions. Please fork the project on GitHub and open Pull Requests for any proposed changes.

Please note that we will not merge any changes that encourage insecure behaviour. If in doubt please open an Issue first to discuss your proposal.