:plugin: elasticsearch :type: input :default_codec: json /////////////////////////////////////////// START - GENERATED VARIABLES, DO NOT EDIT! /////////////////////////////////////////// :version: %VERSION% :release_date: %RELEASE_DATE% :changelog_url: %CHANGELOG_URL% :include_path: ../../../../logstash/docs/include /////////////////////////////////////////// END - GENERATED VARIABLES, DO NOT EDIT! /////////////////////////////////////////// [id="plugins-{type}s-{plugin}"] === Elasticsearch input plugin include::{include_path}/plugin_header.asciidoc[] ==== Description Read from an Elasticsearch cluster, based on search query results. This is useful for replaying test logs, reindexing, etc. You can periodically schedule ingestion using a cron syntax (see `schedule` setting) or run the query one time to load data into Logstash. Example: [source,ruby] input { # Read all documents from Elasticsearch matching the given query elasticsearch { hosts => "localhost" query => '{ "query": { "match": { "statuscode": 200 } }, "sort": [ "_doc" ] }' } } This would create an Elasticsearch query with the following format: [source,json] curl 'http://localhost:9200/logstash-*/_search?&scroll=1m&size=1000' -d '{ "query": { "match": { "statuscode": 200 } }, "sort": [ "_doc" ] }' ==== Scheduling Input from this plugin can be scheduled to run periodically according to a specific schedule. This scheduling syntax is powered by https://github.com/jmettraux/rufus-scheduler[rufus-scheduler]. The syntax is cron-like with some extensions specific to Rufus (e.g. timezone support ). Examples: |========================================================== | `* 5 * 1-3 *` | will execute every minute of 5am every day of January through March. | `0 * * * *` | will execute on the 0th minute of every hour every day. | `0 6 * * * America/Chicago` | will execute at 6:00am (UTC/GMT -5) every day. |========================================================== Further documentation describing this syntax can be found https://github.com/jmettraux/rufus-scheduler#parsing-cronlines-and-time-strings[here]. [id="plugins-{type}s-{plugin}-auth"] ==== Authentication Authentication to a secure Elasticsearch cluster is possible using _one_ of the following options: * <> AND <> * <> * <> [id="plugins-{type}s-{plugin}-autz"] ==== Authorization Authorization to a secure Elasticsearch cluster requires `read` permission at index level and `monitoring` permissions at cluster level. The `monitoring` permission at cluster level is necessary to perform periodic connectivity checks. [id="plugins-{type}s-{plugin}-ecs"] ==== Compatibility with the Elastic Common Schema (ECS) When ECS compatibility is disabled, `docinfo_target` uses the `"@metadata"` field as a default, with ECS enabled the plugin uses a naming convention `"[@metadata][input][elasticsearch]"` as a default target for placing document information. The plugin logs a warning when ECS is enabled and `target` isn't set. TIP: Set the `target` option to avoid potential schema conflicts. [id="plugins-{type}s-{plugin}-options"] ==== Elasticsearch Input configuration options This plugin supports the following configuration options plus the <> and the <> described later. [cols="<,<,<",options="header",] |======================================================================= |Setting |Input type|Required | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> | <>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>, one of `["hits","aggregations"]`|No | <> | <>|No | <> |<>|No | <> |<>|No | <> |<>, one of `["auto", "search_after", "scroll"]`|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |list of <>|No | <> |list of <>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>|No | <> |<>, one of `["full", "none"]`|No | <> | <>|No | <> | {logstash-ref}/field-references-deepdive.html[field reference] | No | <> | <>|No | <> |<>|No |======================================================================= Also see <> for a list of options supported by all input plugins.   [id="plugins-{type}s-{plugin}-api_key"] ===== `api_key` * Value type is <> * There is no default value for this setting. Authenticate using Elasticsearch API key. Note that this option also requires enabling the <> option. Format is `id:api_key` where `id` and `api_key` are as returned by the Elasticsearch {ref}/security-api-create-api-key.html[Create API key API]. [id="plugins-{type}s-{plugin}-ca_trusted_fingerprint"] ===== `ca_trusted_fingerprint` * Value type is <>, and must contain exactly 64 hexadecimal characters. * There is no default value for this setting. * Use of this option _requires_ Logstash 8.3+ The SHA-256 fingerprint of an SSL Certificate Authority to trust, such as the autogenerated self-signed CA for an Elasticsearch cluster. [id="plugins-{type}s-{plugin}-cloud_auth"] ===== `cloud_auth` * Value type is <> * There is no default value for this setting. Cloud authentication string (":" format) is an alternative for the `user`/`password` pair. For more info, check out the {logstash-ref}/connecting-to-cloud.html[Logstash-to-Cloud documentation]. [id="plugins-{type}s-{plugin}-cloud_id"] ===== `cloud_id` * Value type is <> * There is no default value for this setting. Cloud ID, from the Elastic Cloud web console. If set `hosts` should not be used. For more info, check out the {logstash-ref}/connecting-to-cloud.html[Logstash-to-Cloud documentation]. [id="plugins-{type}s-{plugin}-connect_timeout_seconds"] ===== `connect_timeout_seconds` * Value type is <> * Default value is `10` The maximum amount of time, in seconds, to wait while establishing a connection to Elasticsearch. Connect timeouts tend to occur when Elasticsearch or an intermediate proxy is overloaded with requests and has exhausted its connection pool. [id="plugins-{type}s-{plugin}-docinfo"] ===== `docinfo` * Value type is <> * Default value is `false` If set, include Elasticsearch document information such as index, type, and the id in the event. It might be important to note, with regards to metadata, that if you're ingesting documents with the intent to re-index them (or just update them) that the `action` option in the elasticsearch output wants to know how to handle those things. It can be dynamically assigned with a field added to the metadata. Example [source, ruby] input { elasticsearch { hosts => "es.production.mysite.org" index => "mydata-2018.09.*" query => '{ "query": { "query_string": { "query": "*" } } }' size => 500 scroll => "5m" docinfo => true docinfo_target => "[@metadata][doc]" } } output { elasticsearch { index => "copy-of-production.%{[@metadata][doc][_index]}" document_type => "%{[@metadata][doc][_type]}" document_id => "%{[@metadata][doc][_id]}" } } If set, you can use metadata information in the <> common option. Example [source, ruby] input { elasticsearch { docinfo => true docinfo_target => "[@metadata][doc]" add_field => { identifier => "%{[@metadata][doc][_index]}:%{[@metadata][doc][_type]}:%{[@metadata][doc][_id]}" } } } [id="plugins-{type}s-{plugin}-docinfo_fields"] ===== `docinfo_fields` * Value type is <> * Default value is `["_index", "_type", "_id"]` If document metadata storage is requested by enabling the `docinfo` option, this option lists the metadata fields to save in the current event. See {ref}/mapping-fields.html[Meta-Fields] in the Elasticsearch documentation for more information. [id="plugins-{type}s-{plugin}-docinfo_target"] ===== `docinfo_target` * Value type is <> * Default value depends on whether <> is enabled: ** ECS Compatibility disabled: `"@metadata"` ** ECS Compatibility enabled: `"[@metadata][input][elasticsearch]"` If document metadata storage is requested by enabling the `docinfo` option, this option names the field under which to store the metadata fields as subfields. [id="plugins-{type}s-{plugin}-ecs_compatibility"] ===== `ecs_compatibility` * Value type is <> * Supported values are: ** `disabled`: CSV data added at root level ** `v1`,`v8`: Elastic Common Schema compliant behavior * Default value depends on which version of Logstash is running: ** When Logstash provides a `pipeline.ecs_compatibility` setting, its value is used as the default ** Otherwise, the default value is `disabled` Controls this plugin's compatibility with the {ecs-ref}[Elastic Common Schema (ECS)]. [id="plugins-{type}s-{plugin}-hosts"] ===== `hosts` * Value type is <> * There is no default value for this setting. List of one or more Elasticsearch hosts to use for querying. Each host can be either IP, HOST, IP:port, or HOST:port. The port defaults to 9200. [id="plugins-{type}s-{plugin}-index"] ===== `index` * Value type is <> * Default value is `"logstash-*"` The index or alias to search. Check out {ref}/api-conventions.html#api-multi-index[Multi Indices documentation] in the Elasticsearch documentation for info on referencing multiple indices. [id="plugins-{type}s-{plugin}-password"] ===== `password` * Value type is <> * There is no default value for this setting. The password to use together with the username in the `user` option when authenticating to the Elasticsearch server. If set to an empty string authentication will be disabled. [id="plugins-{type}s-{plugin}-proxy"] ===== `proxy` * Value type is <> * There is no default value for this setting. Set the address of a forward HTTP proxy. An empty string is treated as if proxy was not set, this is useful when using environment variables e.g. `proxy => '${LS_PROXY:}'`. [id="plugins-{type}s-{plugin}-query"] ===== `query` * Value type is <> * Default value is `'{ "sort": [ "_doc" ] }'` The query to be executed. Read the {ref}/query-dsl.html[Elasticsearch query DSL documentation] for more information. When <> resolves to `search_after` and the query does not specify `sort`, the default sort `'{ "sort": { "_shard_doc": "asc" } }'` will be added to the query. Please refer to the {ref}/paginate-search-results.html#search-after[Elasticsearch search_after] parameter to know more. [id="plugins-{type}s-{plugin}-response_type"] ===== `response_type` * Value can be any of: `hits`, `aggregations` * Default value is `hits` Which part of the result to transform into Logstash events when processing the response from the query. The default `hits` will generate one event per returned document (i.e. "hit"). When set to `aggregations`, a single Logstash event will be generated with the contents of the `aggregations` object of the query's response. In this case the `hits` object will be ignored. The parameter `size` will be always be set to 0 regardless of the default or user-defined value set in this plugin. [id="plugins-{type}s-{plugin}-request_timeout_seconds"] ===== `request_timeout_seconds` * Value type is <> * Default value is `60` The maximum amount of time, in seconds, for a single request to Elasticsearch. Request timeouts tend to occur when an individual page of data is very large, such as when it contains large-payload documents and/or the <> has been specified as a large value. [id="plugins-{type}s-{plugin}-retries"] ===== `retries` * Value type is <> * Default value is `0` The number of times to re-run the query after the first failure. If the query fails after all retries, it logs an error message. The default is 0 (no retry). This value should be equal to or greater than zero. NOTE: Partial failures - such as errors in a subset of all slices - can result in the entire query being retried, which can lead to duplication of data. Avoiding this would require Logstash to store the entire result set of a query in memory which is often not possible. [id="plugins-{type}s-{plugin}-schedule"] ===== `schedule` * Value type is <> * There is no default value for this setting. Schedule of when to periodically run statement, in Cron format for example: "* * * * *" (execute query every minute, on the minute) There is no schedule by default. If no schedule is given, then the statement is run exactly once. [id="plugins-{type}s-{plugin}-scroll"] ===== `scroll` * Value type is <> * Default value is `"1m"` This parameter controls the keepalive time in seconds of the scrolling request and initiates the scrolling process. The timeout applies per round trip (i.e. between the previous scroll request, to the next). [id="plugins-{type}s-{plugin}-search_api"] ===== `search_api` * Value can be any of: `auto`, `search_after`, `scroll` * Default value is `auto` With `auto` the plugin uses the `search_after` parameter for Elasticsearch version `8.0.0` or higher, otherwise the `scroll` API is used instead. `search_after` uses {ref}/point-in-time-api.html#point-in-time-api[point in time] and sort value to search. The query requires at least one `sort` field, as described in the <> parameter. `scroll` uses {ref}/paginate-search-results.html#scroll-search-results[scroll] API to search, which is no longer recommended. [id="plugins-{type}s-{plugin}-size"] ===== `size` * Value type is <> * Default value is `1000` This allows you to set the maximum number of hits returned per scroll. [id="plugins-{type}s-{plugin}-slices"] ===== `slices` * Value type is <> * There is no default value. * Sensible values range from 2 to about 8. In some cases, it is possible to improve overall throughput by consuming multiple distinct slices of a query simultaneously using {ref}/paginate-search-results.html#slice-scroll[sliced scrolls], especially if the pipeline is spending significant time waiting on Elasticsearch to provide results. If set, the `slices` parameter tells the plugin how many slices to divide the work into, and will produce events from the slices in parallel until all of them are done scrolling. NOTE: The Elasticsearch manual indicates that there can be _negative_ performance implications to both the query and the Elasticsearch cluster when a scrolling query uses more slices than shards in the index. If the `slices` parameter is left unset, the plugin will _not_ inject slice instructions into the query. [id="plugins-{type}s-{plugin}-ssl_certificate"] ===== `ssl_certificate` * Value type is <> * There is no default value for this setting. SSL certificate to use to authenticate the client. This certificate should be an OpenSSL-style X.509 certificate file. NOTE: This setting can be used only if <> is set. [id="plugins-{type}s-{plugin}-ssl_certificate_authorities"] ===== `ssl_certificate_authorities` * Value type is a list of <> * There is no default value for this setting The `.cer` or `.pem` files to validate the server's certificate. NOTE: You cannot use this setting and <> at the same time. [id="plugins-{type}s-{plugin}-ssl_cipher_suites"] ===== `ssl_cipher_suites` * Value type is a list of <> * There is no default value for this setting The list of cipher suites to use, listed by priorities. Supported cipher suites vary depending on the Java and protocol versions. [id="plugins-{type}s-{plugin}-ssl_enabled"] ===== `ssl_enabled` * Value type is <> * There is no default value for this setting. Enable SSL/TLS secured communication to Elasticsearch cluster. Leaving this unspecified will use whatever scheme is specified in the URLs listed in <> or extracted from the <>. If no explicit protocol is specified plain HTTP will be used. [id="plugins-{type}s-{plugin}-ssl_key"] ===== `ssl_key` * Value type is <> * There is no default value for this setting. OpenSSL-style RSA private key that corresponds to the <>. NOTE: This setting can be used only if <> is set. [id="plugins-{type}s-{plugin}-ssl_keystore_password"] ===== `ssl_keystore_password` * Value type is <> * There is no default value for this setting. Set the keystore password [id="plugins-{type}s-{plugin}-ssl_keystore_path"] ===== `ssl_keystore_path` * Value type is <> * There is no default value for this setting. The keystore used to present a certificate to the server. It can be either `.jks` or `.p12` NOTE: You cannot use this setting and <> at the same time. [id="plugins-{type}s-{plugin}-ssl_keystore_type"] ===== `ssl_keystore_type` * Value can be any of: `jks`, `pkcs12` * If not provided, the value will be inferred from the keystore filename. The format of the keystore file. It must be either `jks` or `pkcs12`. [id="plugins-{type}s-{plugin}-ssl_supported_protocols"] ===== `ssl_supported_protocols` * Value type is <> * Allowed values are: `'TLSv1.1'`, `'TLSv1.2'`, `'TLSv1.3'` * Default depends on the JDK being used. With up-to-date Logstash, the default is `['TLSv1.2', 'TLSv1.3']`. `'TLSv1.1'` is not considered secure and is only provided for legacy applications. List of allowed SSL/TLS versions to use when establishing a connection to the Elasticsearch cluster. For Java 8 `'TLSv1.3'` is supported only since **8u262** (AdoptOpenJDK), but requires that you set the `LS_JAVA_OPTS="-Djdk.tls.client.protocols=TLSv1.3"` system property in Logstash. NOTE: If you configure the plugin to use `'TLSv1.1'` on any recent JVM, such as the one packaged with Logstash, the protocol is disabled by default and needs to be enabled manually by changing `jdk.tls.disabledAlgorithms` in the *$JDK_HOME/conf/security/java.security* configuration file. That is, `TLSv1.1` needs to be removed from the list. [id="plugins-{type}s-{plugin}-ssl_truststore_password"] ===== `ssl_truststore_password` * Value type is <> * There is no default value for this setting. Set the truststore password. [id="plugins-{type}s-{plugin}-ssl_truststore_path"] ===== `ssl_truststore_path` * Value type is <> * There is no default value for this setting. The truststore to validate the server's certificate. It can be either .jks or .p12. NOTE: You cannot use this setting and <> at the same time. [id="plugins-{type}s-{plugin}-ssl_truststore_type"] ===== `ssl_truststore_type` * Value can be any of: `jks`, `pkcs12` * If not provided, the value will be inferred from the truststore filename. The format of the truststore file. It must be either `jks` or `pkcs12`. [id="plugins-{type}s-{plugin}-ssl_verification_mode"] ===== `ssl_verification_mode` * Value can be any of: `full`, `none` * Default value is `full` Defines how to verify the certificates presented by another party in the TLS connection: `full` validates that the server certificate has an issue date that’s within the not_before and not_after dates; chains to a trusted Certificate Authority (CA), and has a hostname or IP address that matches the names within the certificate. `none` performs no certificate validation. WARNING: Setting certificate verification to `none` disables many security benefits of SSL/TLS, which is very dangerous. For more information on disabling certificate verification please read https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf [id="plugins-{type}s-{plugin}-socket_timeout_seconds"] ===== `socket_timeout_seconds` * Value type is <> * Default value is `60` The maximum amount of time, in seconds, to wait on an incomplete response from Elasticsearch while no additional data has been appended. Socket timeouts usually occur while waiting for the first byte of a response, such as when executing a particularly complex query. [id="plugins-{type}s-{plugin}-target"] ===== `target` * Value type is {logstash-ref}/field-references-deepdive.html[field reference] * There is no default value for this setting. Without a `target`, events are created from each hit's `_source` at the root level. When the `target` is set to a field reference, the `_source` of the hit is placed in the target field instead. This option can be useful to avoid populating unknown fields when a downstream schema such as ECS is enforced. It is also possible to target an entry in the event's metadata, which will be available during event processing but not exported to your outputs (e.g., `target \=> "[@metadata][_source]"`). [id="plugins-{type}s-{plugin}-user"] ===== `user` * Value type is <> * There is no default value for this setting. The username to use together with the password in the `password` option when authenticating to the Elasticsearch server. If set to an empty string authentication will be disabled. [id="plugins-{type}s-{plugin}-deprecated-options"] ==== Elasticsearch Input deprecated configuration options This plugin supports the following deprecated configurations. WARNING: Deprecated options are subject to removal in future releases. [cols="<,<,<",options="header",] |======================================================================= |Setting|Input type|Replaced by | <> |a valid filesystem path|<> | <> |<>|<> | <> |<>|<> |======================================================================= [id="plugins-{type}s-{plugin}-ca_file"] ===== `ca_file` deprecated[4.17.0, Replaced by <>] * Value type is <> * There is no default value for this setting. SSL Certificate Authority file in PEM encoded format, must also include any chain certificates as necessary. [id="plugins-{type}s-{plugin}-ssl"] ===== `ssl` deprecated[4.17.0, Replaced by <>] * Value type is <> * Default value is `false` If enabled, SSL will be used when communicating with the Elasticsearch server (i.e. HTTPS will be used instead of plain HTTP). [id="plugins-{type}s-{plugin}-ssl_certificate_verification"] ===== `ssl_certificate_verification` deprecated[4.17.0, Replaced by <>] * Value type is <> * Default value is `true` Option to validate the server's certificate. Disabling this severely compromises security. When certificate validation is disabled, this plugin implicitly trusts the machine resolved at the given address without validating its proof-of-identity. In this scenario, the plugin can transmit credentials to or process data from an untrustworthy man-in-the-middle or other compromised infrastructure. More information on the importance of certificate verification: **https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf**. [id="plugins-{type}s-{plugin}-common-options"] include::{include_path}/{type}.asciidoc[] :no_codec!: