Elasticsearch Exporter

Prometheus exporter for various metrics about Elasticsearch, written in Go.

Installation

For pre-built binaries, please take a look at the releases: https://github.com/prometheus-community/elasticsearch_exporter/releases

Docker

docker pull quay.io/prometheuscommunity/elasticsearch-exporter:latest
docker run --rm -p 9114:9114 quay.io/prometheuscommunity/elasticsearch-exporter:latest

Example docker-compose.yml:

elasticsearch_exporter:
    image: quay.io/prometheuscommunity/elasticsearch-exporter:latest
    command:
     - '--es.uri=http://elasticsearch:9200'
    restart: always
    ports:
    - "127.0.0.1:9114:9114"

Kubernetes

You can find a Helm chart in the prometheus-community charts repository at https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-elasticsearch-exporter

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install [RELEASE_NAME] prometheus-community/prometheus-elasticsearch-exporter

Configuration

NOTE: The exporter fetches information from the Elasticsearch cluster on every scrape, so a scrape interval that is too short can impose load on the ES master nodes, particularly if you run with --es.all and --es.indices. We suggest measuring how long fetching /_nodes/stats and /_all/_stats takes on your ES cluster to determine whether your scrape interval is too short. As a last resort, you can scrape this exporter using a dedicated job with its own scrape interval.
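One way to take the measurement suggested in the NOTE above is to time both endpoints with curl. The following is a minimal sketch, not part of the exporter: ES_URL and the function names are assumptions made for this example, and it assumes the cluster is reachable without authentication.

```shell
#!/bin/sh
# Sketch: time the two endpoints named in the NOTE above.
# ES_URL is an assumed variable name for this example, not an exporter setting.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the total time, in seconds, that curl needs to fetch one endpoint.
time_endpoint() {
  curl --silent --output /dev/null --max-time 60 \
       --write-out '%{time_total}' "${ES_URL}$1"
}

# Report the timing for both stats endpoints, one per line.
measure_stats_endpoints() {
  for path in /_nodes/stats /_all/_stats; do
    printf '%s %ss\n' "$path" "$(time_endpoint "$path")"
  done
}
```

With these functions defined in a shell session, run measure_stats_endpoints a few times. If the reported times come close to your scrape interval, or to the --es.timeout value, the interval is probably too short for your cluster.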

Below is a summary of the command-line options:

elasticsearch_exporter --help
| Argument | Introduced in Version | Description | Default |
| --- | --- | --- | --- |
| collector.clustersettings | 1.6.0 | If true, query stats for cluster settings (As of v1.6.0, this flag has replaced "es.cluster_settings"). | false |
| es.uri | 1.0.2 | Address (host and port) of the Elasticsearch node we should connect to. This could be a local node (localhost:9200, for instance), or the address of a remote Elasticsearch server. When basic auth is needed, specify as: <proto>://<user>:<password>@<host>:<port>, e.g., http://admin:pass@localhost:9200. Special characters in the user credentials need to be URL-encoded. | http://localhost:9200 |
| es.all | 1.0.2 | If true, query stats for all nodes in the cluster, rather than just the node we connect to. | false |
| es.indices | 1.0.2 | If true, query stats for all indices in the cluster. | false |
| es.indices_settings | 1.0.4rc1 | If true, query settings stats for all indices in the cluster. | false |
| es.indices_mappings | 1.2.0 | If true, query stats for mappings of all indices of the cluster. | false |
| es.aliases | 1.0.4rc1 | If true, include informational aliases metrics. | true |
| es.shards | 1.0.3rc1 | If true, query stats for all indices in the cluster, including shard-level stats (implies es.indices=true). | false |
| collector.snapshots | 1.0.4rc1 | If true, query stats for the cluster snapshots. (As of v1.7.0, this flag has replaced "es.snapshots"). | false |
| es.slm | | If true, query stats for SLM. | false |
| es.data_stream | | If true, query state for Data Streams. | false |
| es.timeout | 1.0.2 | Timeout for trying to get stats from Elasticsearch. (ex: 20s) | 5s |
| es.ca | 1.0.2 | Path to PEM file that contains trusted Certificate Authorities for the Elasticsearch connection. | |
| es.client-private-key | 1.0.2 | Path to PEM file that contains the private key for client auth when connecting to Elasticsearch. | |
| es.client-cert | 1.0.2 | Path to PEM file that contains the corresponding cert for the private key to connect to Elasticsearch. | |
| es.clusterinfo.interval | 1.1.0rc1 | Cluster info update interval for the cluster label | 5m |
| es.ssl-skip-verify | 1.0.4rc1 | Skip SSL verification when connecting to Elasticsearch. | false |
| web.listen-address | 1.0.2 | Address to listen on for web interface and telemetry. | :9114 |
| web.telemetry-path | 1.0.2 | Path under which to expose metrics. | /metrics |
| aws.region | 1.5.0 | Region for AWS elasticsearch | |
| aws.role-arn | 1.6.0 | Role ARN of an IAM role to assume. | |
| version | 1.0.2 | Show version info on stdout and exit. | |
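The es.uri description above asks for special characters in the credentials to be URL-encoded. As an illustration, the sketch below percent-encodes a value byte by byte using only POSIX tools (od, tr, printf). The urlencode helper and the credentials are hypothetical and are not provided by the exporter.

```shell
#!/bin/sh
# Percent-encode every byte of $1 except the RFC 3986 unreserved
# characters (A-Z, a-z, 0-9, "-", ".", "_", "~").
urlencode() {
  # od emits the input as hex bytes; tr puts one byte on each line.
  printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' | while read -r hex; do
    [ -n "$hex" ] || continue
    case "$hex" in
      3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e)
        # Unreserved byte: convert the hex value back to its character.
        printf "\\$(printf '%03o' "0x$hex")" ;;
      *)
        # Anything else becomes %XX with uppercase hex digits.
        printf '%%%s' "$(printf '%s' "$hex" | tr 'abcdef' 'ABCDEF')" ;;
    esac
  done
}

# Hypothetical credentials, for illustration only.
encoded_password=$(urlencode 'p@ss:w/rd')
printf '%s\n' "--es.uri=http://admin:${encoded_password}@localhost:9200"
# prints --es.uri=http://admin:p%40ss%3Aw%2Frd@localhost:9200
```

If jq or a scripting language is already available on the host, its built-in URL-encoding routine is a reasonable substitute for this helper.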

Command-line parameters start with a single - in versions earlier than 1.1.0rc1. In versions later than 1.1.0rc1, command-line parameters are specified with --.

The API key used to connect can be set with the ES_API_KEY environment variable.

Logging

Logging by the exporter is handled by the log/slog package. The output format can be customized with the --log.format flag which defaults to logfmt. The log level can be set with the --log.level flag which defaults to info. The output can be set to either stdout (default) or stderr with the --log.output flag.

Elasticsearch 7.x security privileges

Username and password can be passed either directly in the URI or through the ES_USERNAME and ES_PASSWORD environment variables. Specifying those two environment variables will override authentication passed in the URI (if any).

ES 7.x supports role-based access control (RBAC). The elasticsearch_exporter requires the following security privileges.

| Setting | Privilege Required | Description |
| --- | --- | --- |
| collector.clustersettings | cluster monitor | |
| exporter defaults | cluster monitor | All cluster read-only operations, like cluster health and state, hot threads, node info, node and cluster stats, and pending cluster tasks. |
| es.indices | indices monitor (per index or *) | All actions that are required for monitoring (recovery, segments info, index stats and status) |
| es.indices_settings | indices monitor (per index or *) | |
| es.indices_mappings | indices view_index_metadata (per index or *) | |
| es.shards | not sure if indices or cluster monitor or both | |
| collector.snapshots | cluster:admin/snapshot/status and cluster:admin/repository/get | ES Forum Post |
| es.slm | manage_slm | |
| es.data_stream | monitor or manage (per index or *) | |
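As an illustration of how some of the privileges listed above could be granted, the sketch below defines a role through the Elasticsearch security API (PUT /_security/role/<name>). It covers only the cluster monitor, indices monitor and view_index_metadata rows. The role name prometheus_exporter and the ES_URL, ES_ADMIN_USER and ES_ADMIN_PASSWORD variables are assumptions made for this example.

```shell
#!/bin/sh
# Assumed names for this sketch: ES_URL, ES_ADMIN_USER, ES_ADMIN_PASSWORD and
# the role name "prometheus_exporter" are placeholders, not exporter settings.
ES_URL="${ES_URL:-http://localhost:9200}"

# Emit the role definition: cluster "monitor" plus per-index "monitor" and
# "view_index_metadata" on all indices.
exporter_role_body() {
  cat <<'EOF'
{
  "cluster": ["monitor"],
  "indices": [
    {
      "names": ["*"],
      "privileges": ["monitor", "view_index_metadata"]
    }
  ]
}
EOF
}

# Create or update the role by sending the definition to the security API.
create_exporter_role() {
  exporter_role_body | curl --silent --show-error --fail \
    --user "${ES_ADMIN_USER}:${ES_ADMIN_PASSWORD}" \
    --request PUT "${ES_URL}/_security/role/prometheus_exporter" \
    --header 'Content-Type: application/json' \
    --data-binary @-
}
```

Calling create_exporter_role with admin credentials in the environment sends the request. Narrow "names" from * to specific index patterns if the exporter should only see part of the cluster, and add the snapshot or SLM privileges from the table separately if you enable those collectors.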

Further Information

Metrics

| Name | Type | Cardinality | Help |
| --- | --- | --- | --- |
| elasticsearch_breakers_estimated_size_bytes | gauge | 4 | Estimated size in bytes of breaker |
| elasticsearch_breakers_limit_size_bytes | gauge | 4 | Limit size in bytes for breaker |
| elasticsearch_breakers_tripped | counter | 4 | tripped for breaker |
| elasticsearch_cluster_health_active_primary_shards | gauge | 1 | The number of primary shards in your cluster. This is an aggregate total across all indices. |
| elasticsearch_cluster_health_active_shards | gauge | 1 | Aggregate total of all shards across all indices, which includes replica shards. |
| elasticsearch_cluster_health_delayed_unassigned_shards | gauge | 1 | Shards delayed to reduce reallocation overhead |
| elasticsearch_cluster_health_initializing_shards | gauge | 1 | Count of shards that are being freshly created. |
| elasticsearch_cluster_health_number_of_data_nodes | gauge | 1 | Number of data nodes in the cluster. |
| elasticsearch_cluster_health_number_of_in_flight_fetch | gauge | 1 | The number of ongoing shard info requests. |
| elasticsearch_cluster_health_number_of_nodes | gauge | 1 | Number of nodes in the cluster. |
| elasticsearch_cluster_health_number_of_pending_tasks | gauge | 1 | Cluster level changes which have not yet been executed |
| elasticsearch_cluster_health_task_max_waiting_in_queue_millis | gauge | 1 | Max time in millis that a task is waiting in queue. |
| elasticsearch_cluster_health_relocating_shards | gauge | 1 | The number of shards that are currently moving from one node to another node. |
| elasticsearch_cluster_health_status | gauge | 3 | Whether all primary and replica shards are allocated. |
| elasticsearch_cluster_health_unassigned_shards | gauge | 1 | The number of shards that exist in the cluster state, but cannot be found in the cluster itself. |
| elasticsearch_clustersettings_stats_max_shards_per_node | gauge | 0 | Current maximum number of shards per node setting. |
| elasticsearch_clustersettings_allocation_threshold_enabled | gauge | 0 | Is disk allocation decider enabled. |
| elasticsearch_clustersettings_allocation_watermark_flood_stage_bytes | gauge | 0 | Flood stage watermark in bytes. |
| elasticsearch_clustersettings_allocation_watermark_high_bytes | gauge | 0 | High watermark for disk usage in bytes. |
| elasticsearch_clustersettings_allocation_watermark_low_bytes | gauge | 0 | Low watermark for disk usage in bytes. |
| elasticsearch_clustersettings_allocation_watermark_flood_stage_ratio | gauge | 0 | Flood stage watermark as a ratio. |
| elasticsearch_clustersettings_allocation_watermark_high_ratio | gauge | 0 | High watermark for disk usage as a ratio. |
| elasticsearch_clustersettings_allocation_watermark_low_ratio | gauge | 0 | Low watermark for disk usage as a ratio. |
| elasticsearch_filesystem_data_available_bytes | gauge | 1 | Available space on block device in bytes |
| elasticsearch_filesystem_data_free_bytes | gauge | 1 | Free space on block device in bytes |
| elasticsearch_filesystem_data_size_bytes | gauge | 1 | Size of block device in bytes |
| elasticsearch_filesystem_io_stats_device_operations_count | gauge | 1 | Count of disk operations |
| elasticsearch_filesystem_io_stats_device_read_operations_count | gauge | 1 | Count of disk read operations |
| elasticsearch_filesystem_io_stats_device_write_operations_count | gauge | 1 | Count of disk write operations |
| elasticsearch_filesystem_io_stats_device_read_size_kilobytes_sum | gauge | 1 | Total kilobytes read from disk |
| elasticsearch_filesystem_io_stats_device_write_size_kilobytes_sum | gauge | 1 | Total kilobytes written to disk |
| elasticsearch_indices_active_queries | gauge | 1 | The number of currently active queries |
| elasticsearch_indices_docs | gauge | 1 | Count of documents on this node |
| elasticsearch_indices_docs_deleted | gauge | 1 | Count of deleted documents on this node |
| elasticsearch_indices_deleted_docs_primary | gauge | 1 | Count of deleted documents with only primary shards |
| elasticsearch_indices_docs_primary | gauge | 1 | Count of documents with only primary shards on all nodes |
| elasticsearch_indices_docs_total | gauge | | Count of documents with shards on all nodes |
| elasticsearch_indices_fielddata_evictions | counter | 1 | Evictions from field data |
| elasticsearch_indices_fielddata_memory_size_bytes | gauge | 1 | Field data cache memory usage in bytes |
| elasticsearch_indices_filter_cache_evictions | counter | 1 | Evictions from filter cache |
| elasticsearch_indices_filter_cache_memory_size_bytes | gauge | 1 | Filter cache memory usage in bytes |
| elasticsearch_indices_flush_time_seconds | counter | 1 | Cumulative flush time in seconds |
| elasticsearch_indices_flush_total | counter | 1 | Total flushes |
| elasticsearch_indices_get_exists_time_seconds | counter | 1 | Total time get exists in seconds |
| elasticsearch_indices_get_exists_total | counter | 1 | Total get exists operations |
| elasticsearch_indices_get_missing_time_seconds | counter | 1 | Total time of get missing in seconds |
| elasticsearch_indices_get_missing_total | counter | 1 | Total get missing |
| elasticsearch_indices_get_time_seconds | counter | 1 | Total get time in seconds |
| elasticsearch_indices_get_total | counter | 1 | Total get |
| elasticsearch_indices_indexing_delete_time_seconds_total | counter | 1 | Total time indexing delete in seconds |
| elasticsearch_indices_indexing_delete_total | counter | 1 | Total indexing deletes |
| elasticsearch_indices_index_current | gauge | 1 | The number of documents currently being indexed to an index |
| elasticsearch_indices_indexing_index_time_seconds_total | counter | 1 | Cumulative index time in seconds |
| elasticsearch_indices_indexing_index_total | counter | 1 | Total index calls |
| elasticsearch_indices_mappings_stats_fields | gauge | 1 | Count of fields currently mapped by index |
| elasticsearch_indices_mappings_stats_json_parse_failures_total | counter | 0 | Number of errors while parsing JSON |
| elasticsearch_indices_mappings_stats_scrapes_total | counter | 0 | Current total Elasticsearch Indices Mappings scrapes |
| elasticsearch_indices_mappings_stats_up | gauge | 0 | Was the last scrape of the Elasticsearch Indices Mappings endpoint successful |
| elasticsearch_indices_merges_docs_total | counter | 1 | Cumulative docs merged |
| elasticsearch_indices_merges_total | counter | 1 | Total merges |
| elasticsearch_indices_merges_total_size_bytes_total | counter | 1 | Total merge size in bytes |
| elasticsearch_indices_merges_total_time_seconds_total | counter | 1 | Total time spent merging in seconds |
| elasticsearch_indices_query_cache_cache_total | counter | 1 | Count of query cache |
| elasticsearch_indices_query_cache_cache_size | gauge | 1 | Size of query cache |
| elasticsearch_indices_query_cache_count | counter | 2 | Count of query cache hit/miss |
| elasticsearch_indices_query_cache_evictions | counter | 1 | Evictions from query cache |
| elasticsearch_indices_query_cache_memory_size_bytes | gauge | 1 | Query cache memory usage in bytes |
| elasticsearch_indices_query_cache_total | counter | 1 | Size of query cache total |
| elasticsearch_indices_refresh_time_seconds_total | counter | 1 | Total time spent refreshing in seconds |
| elasticsearch_indices_refresh_total | counter | 1 | Total refreshes |
| elasticsearch_indices_request_cache_count | counter | 2 | Count of request cache hit/miss |
| elasticsearch_indices_request_cache_evictions | counter | 1 | Evictions from request cache |
| elasticsearch_indices_request_cache_memory_size_bytes | gauge | 1 | Request cache memory usage in bytes |
| elasticsearch_indices_search_fetch_time_seconds | counter | 1 | Total search fetch time in seconds |
| elasticsearch_indices_search_fetch_total | counter | 1 | Total number of fetches |
| elasticsearch_indices_search_query_time_seconds | counter | 1 | Total search query time in seconds |
| elasticsearch_indices_search_query_total | counter | 1 | Total number of queries |
| elasticsearch_indices_segments_count | gauge | 1 | Count of index segments on this node |
| elasticsearch_indices_segments_memory_bytes | gauge | 1 | Current memory size of segments in bytes |
| elasticsearch_indices_settings_creation_timestamp_seconds | gauge | 1 | Timestamp of the index creation in seconds |
| elasticsearch_indices_settings_stats_read_only_indices | gauge | 1 | Count of indices that have read_only_allow_delete=true |
| elasticsearch_indices_settings_total_fields | gauge | | Index setting value for index.mapping.total_fields.limit (total allowable mapped fields in an index) |
| elasticsearch_indices_settings_replicas | gauge | | Index setting value for index.replicas |
| elasticsearch_indices_shards_docs | gauge | 3 | Count of documents on this shard |
| elasticsearch_indices_shards_docs_deleted | gauge | 3 | Count of deleted documents on each shard |
| elasticsearch_indices_store_size_bytes | gauge | 1 | Current size of stored index data in bytes |
| elasticsearch_indices_store_size_bytes_primary | gauge | | Current size of stored index data in bytes with only primary shards on all nodes |
| elasticsearch_indices_store_size_bytes_total | gauge | | Current size of stored index data in bytes with all shards on all nodes |
| elasticsearch_indices_store_throttle_time_seconds_total | counter | 1 | Throttle time for index store in seconds |
| elasticsearch_indices_translog_operations | counter | 1 | Total translog operations |
| elasticsearch_indices_translog_size_in_bytes | counter | 1 | Total translog size in bytes |
| elasticsearch_indices_warmer_time_seconds_total | counter | 1 | Total warmer time in seconds |
| elasticsearch_indices_warmer_total | counter | 1 | Total warmer count |
| elasticsearch_jvm_gc_collection_seconds_count | counter | 2 | Count of JVM GC runs |
| elasticsearch_jvm_gc_collection_seconds_sum | counter | 2 | GC run time in seconds |
| elasticsearch_jvm_memory_committed_bytes | gauge | 2 | JVM memory currently committed by area |
| elasticsearch_jvm_memory_max_bytes | gauge | 1 | JVM memory max |
| elasticsearch_jvm_memory_used_bytes | gauge | 2 | JVM memory currently used by area |
| elasticsearch_jvm_memory_pool_used_bytes | gauge | 3 | JVM memory currently used by pool |
| elasticsearch_jvm_memory_pool_max_bytes | counter | 3 | JVM memory max by pool |
| elasticsearch_jvm_memory_pool_peak_used_bytes | counter | 3 | JVM memory peak used by pool |
| elasticsearch_jvm_memory_pool_peak_max_bytes | counter | 3 | JVM memory peak max by pool |
| elasticsearch_os_cpu_percent | gauge | 1 | Percent CPU used by the OS |
| elasticsearch_os_load1 | gauge | 1 | Shortterm load average |
| elasticsearch_os_load5 | gauge | 1 | Midterm load average |
| elasticsearch_os_load15 | gauge | 1 | Longterm load average |
| elasticsearch_process_cpu_percent | gauge | 1 | Percent CPU used by process |
| elasticsearch_process_cpu_seconds_total | counter | 1 | Process CPU time in seconds |
| elasticsearch_process_mem_resident_size_bytes | gauge | 1 | Resident memory in use by process in bytes |
| elasticsearch_process_mem_share_size_bytes | gauge | 1 | Shared memory in use by process in bytes |
| elasticsearch_process_mem_virtual_size_bytes | gauge | 1 | Total virtual memory used in bytes |
| elasticsearch_process_open_files_count | gauge | 1 | Open file descriptors |
| elasticsearch_snapshot_stats_number_of_snapshots | gauge | 1 | Total number of snapshots |
| elasticsearch_snapshot_stats_oldest_snapshot_timestamp | gauge | 1 | Oldest snapshot timestamp |
| elasticsearch_snapshot_stats_snapshot_start_time_timestamp | gauge | 1 | Last snapshot start timestamp |
| elasticsearch_snapshot_stats_latest_snapshot_timestamp_seconds | gauge | 1 | Timestamp of the latest SUCCESS or PARTIAL snapshot |
| elasticsearch_snapshot_stats_snapshot_end_time_timestamp | gauge | 1 | Last snapshot end timestamp |
| elasticsearch_snapshot_stats_snapshot_number_of_failures | gauge | 1 | Last snapshot number of failures |
| elasticsearch_snapshot_stats_snapshot_number_of_indices | gauge | 1 | Last snapshot number of indices |
| elasticsearch_snapshot_stats_snapshot_failed_shards | gauge | 1 | Last snapshot failed shards |
| elasticsearch_snapshot_stats_snapshot_successful_shards | gauge | 1 | Last snapshot successful shards |
| elasticsearch_snapshot_stats_snapshot_total_shards | gauge | 1 | Last snapshot total shards |
| elasticsearch_thread_pool_active_count | gauge | 14 | Thread Pool threads active |
| elasticsearch_thread_pool_completed_count | counter | 14 | Thread Pool operations completed |
| elasticsearch_thread_pool_largest_count | gauge | 14 | Thread Pool largest threads count |
| elasticsearch_thread_pool_queue_count | gauge | 14 | Thread Pool operations queued |
| elasticsearch_thread_pool_rejected_count | counter | 14 | Thread Pool operations rejected |
| elasticsearch_thread_pool_threads_count | gauge | 14 | Thread Pool current threads count |
| elasticsearch_transport_rx_packets_total | counter | 1 | Count of packets received |
| elasticsearch_transport_rx_size_bytes_total | counter | 1 | Total number of bytes received |
| elasticsearch_transport_tx_packets_total | counter | 1 | Count of packets sent |
| elasticsearch_transport_tx_size_bytes_total | counter | 1 | Total number of bytes sent |
| elasticsearch_clusterinfo_last_retrieval_success_ts | gauge | 1 | Timestamp of the last successful cluster info retrieval |
| elasticsearch_clusterinfo_up | gauge | 1 | Up metric for the cluster info collector |
| elasticsearch_clusterinfo_version_info | gauge | 6 | Constant metric with ES version information as labels |
| elasticsearch_slm_stats_up | gauge | 0 | Up metric for SLM collector |
| elasticsearch_slm_stats_total_scrapes | counter | 0 | Number of scrapes for SLM collector |
| elasticsearch_slm_stats_json_parse_failures | counter | 0 | JSON parse failures for SLM collector |
| elasticsearch_slm_stats_retention_runs_total | counter | 0 | Total retention runs |
| elasticsearch_slm_stats_retention_failed_total | counter | 0 | Total failed retention runs |
| elasticsearch_slm_stats_retention_timed_out_total | counter | 0 | Total retention run timeouts |
| elasticsearch_slm_stats_retention_deletion_time_seconds | gauge | 0 | Retention run deletion time |
| elasticsearch_slm_stats_total_snapshots_taken_total | counter | 0 | Total snapshots taken |
| elasticsearch_slm_stats_total_snapshots_failed_total | counter | 0 | Total snapshots failed |
| elasticsearch_slm_stats_total_snapshots_deleted_total | counter | 0 | Total snapshots deleted |
| elasticsearch_slm_stats_snapshots_taken_total | counter | 1 | Snapshots taken by policy |
| elasticsearch_slm_stats_snapshots_failed_total | counter | 1 | Snapshots failed by policy |
| elasticsearch_slm_stats_snapshots_deleted_total | counter | 1 | Snapshots deleted by policy |
| elasticsearch_slm_stats_snapshot_deletion_failures_total | counter | 1 | Snapshot deletion failures by policy |
| elasticsearch_slm_stats_operation_mode | gauge | 1 | SLM operation mode (Running, stopping, stopped) |
| elasticsearch_data_stream_stats_up | gauge | 0 | Up metric for Data Stream collection |
| elasticsearch_data_stream_stats_total_scrapes | counter | 0 | Total scrapes for Data Stream stats |
| elasticsearch_data_stream_stats_json_parse_failures | counter | 0 | Number of parsing failures for Data Stream stats |
| elasticsearch_data_stream_backing_indices_total | gauge | 1 | Number of backing indices for Data Stream |
| elasticsearch_data_stream_store_size_bytes | gauge | 1 | Current size of data stream backing indices in bytes |
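To show how one of the metrics above might be consumed outside Prometheus, here is a small sketch that reads the exporter's text output and prints the current cluster health color. It assumes that elasticsearch_cluster_health_status is exposed as one sample per color with a color label, and that the active color has the value 1. The label name is an assumption of this sketch, so check your own /metrics output first. health_color is a hypothetical helper name.

```shell
#!/bin/sh
# Read Prometheus text-format metrics on stdin and print the value of the
# "color" label on the elasticsearch_cluster_health_status sample that is 1.
# The "color" label name is an assumption of this sketch.
health_color() {
  awk '/^elasticsearch_cluster_health_status[{]/ && $NF == 1 {
    if (match($0, /color="[a-z]+"/)) {
      # Skip the 7 characters of color=" and drop the closing quote.
      print substr($0, RSTART + 7, RLENGTH - 8)
    }
  }'
}
```

With the default listen address and telemetry path from the options table, a possible invocation is: curl --silent http://localhost:9114/metrics | health_color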

Alerts & Recording Rules

We provide examples for Prometheus alerts and recording rules, as well as a Grafana dashboard and a Kubernetes Deployment.

The example dashboard requires node_exporter to be installed. To select the nodes that belong to the Elasticsearch cluster, we rely on a label named cluster. Depending on your setup, it can be derived from the platform metadata.

For example, on GCE:

- source_labels: [__meta_gce_metadata_Cluster]
  separator: ;
  regex: (.*)
  target_label: cluster
  replacement: ${1}
  action: replace

Please refer to the Prometheus SD documentation to see which metadata labels can be used to create the cluster label.

Credit & License

elasticsearch_exporter is maintained by the Prometheus Community.

elasticsearch_exporter was previously maintained by the nice folks from JustWatch, who transferred this repository to the Prometheus Community in May 2021.

This package was originally created and maintained by Eric Richardson, who transferred this repository to JustWatch in January 2017.

Maintainers of this repository:

Please refer to the Git commit log for a complete list of contributors.

Contributing

We welcome any contributions. Please fork the project on GitHub and open Pull Requests for any proposed changes.

Please note that we will not merge any changes that encourage insecure behaviour. If in doubt please open an Issue first to discuss your proposal.