[![Unit Tests & Lint](https://github.com/scalyr/logstash-output-scalyr/actions/workflows/unit_tests.yml/badge.svg)](https://github.com/scalyr/logstash-output-scalyr/actions/workflows/unit_tests.yml) [![Smoke Tests](https://github.com/scalyr/logstash-output-scalyr/actions/workflows/smoke_tests.yml/badge.svg)](https://github.com/scalyr/logstash-output-scalyr/actions/workflows/smoke_tests.yml) [![Micro Benchmarks](https://github.com/scalyr/logstash-output-scalyr/actions/workflows/microbenchmarks.yml/badge.svg)](https://github.com/scalyr/logstash-output-scalyr/actions/workflows/microbenchmarks.yml) [![Gem Version](https://badge.fury.io/rb/logstash-output-scalyr.svg)](https://badge.fury.io/rb/logstash-output-scalyr)

# [Scalyr output plugin for Logstash]

This is a Logstash output plugin that uploads data to [Scalyr](http://www.scalyr.com).

You can view documentation for this plugin [on the Scalyr website](https://app.scalyr.com/solutions/logstash).

NOTE: If you are encountering connectivity issues and see SSL / TLS errors such as the example below, you should upgrade to version 0.2.6 or higher.

```javascript
{"message":"Error uploading to Scalyr (will backoff-retry)",
"error_class":"Manticore::ClientProtocolException","url":"https://agent.scalyr.com/addEvents",
"message":"PKIX path validation failed: java.security.cert.CertPathValidatorException: validity check failed"
}
```

# Quick start

1. Build the gem: run `gem build logstash-output-scalyr.gemspec`
2. Install the gem into a Logstash installation: run `/usr/share/logstash/bin/logstash-plugin install logstash-output-scalyr-0.2.6.gem` or follow the latest official instructions on working with plugins from Logstash. As an alternative, you can install the latest stable version directly from RubyGems: ``/usr/share/logstash/bin/logstash-plugin install --version 0.2.6 logstash-output-scalyr``
3. Configure the output plugin (e.g. add it to a pipeline .conf)
4. Restart Logstash

# Configuration

The Scalyr output plugin has a number of sensible defaults, so the minimum configuration only requires your `api_write_token` for upload access.

Plugin configuration is achieved by adding an output section to the appropriate config file for your Logstash event pipeline:

```
# my_pipeline.conf

input {
  file {
    path => "/var/log/messages"
  }
}

output {
  scalyr {
    api_write_token => 'SCALYR_API_KEY'
    serverhost_field => 'host'
    logfile_field => 'path'
  }
}
```

In the above example, the Logstash pipeline defines a file input that reads from `/var/log/messages`. Log events from this source have the `host` and `path` fields. The pipeline then outputs to the scalyr plugin, which in this example is configured to remap `host`->`serverHost` and `path`->`logfile`, thus facilitating filtering in the Scalyr UI.

## Notes on serverHost attribute handling

> Some of this functionality has been fixed and changed in the v0.2.0 release. In previous versions, the plugin added a ``serverHost`` attribute with a value of ``Logstash`` to each event, and this attribute was not handled correctly: it was treated as a regular event level attribute and not as a special attribute which can be used for Source functionality and filtering.

By default this plugin will set ``serverHost`` for all the events in a batch to match the hostname of the Logstash node where the output plugin is running.

You can change that either by setting the ``serverHost`` attribute in the ``server_attributes`` config option hash or by setting the ``serverHost`` attribute on the event level via a Logstash record attribute.

In both scenarios, you will be able to utilize this value for "Sources" functionality and filtering in the Scalyr UI.

For example:

1. Define a static value for all the events handled by a specific plugin instance

   ```
   output {
     scalyr {
       api_write_token => 'SCALYR_API_KEY'
       server_attributes => {'serverHost' => 'my-host-1'}
     }
   }
   ```

2. Define a static value on the event level which is set via a Logstash filter

   ```
   filter {
     mutate {
       add_field => { "serverHost" => "my hostname" }
     }
   }
   ```

3. Define a dynamic value on the event level which is set via a Logstash filter

   ```
   filter {
     mutate {
       add_field => { "serverHost" => "%{[host][name]}" }
     }
   }
   ```

## Notes on severity (sev) attribute handling

``sev`` is a special top level DataSet event field which denotes the event severity / log level.

To enable this functionality, you need to define the ``severity_field`` plugin config option. This config option tells the plugin which Logstash event field carries the value for the severity field.

The actual value needs to be an integer between 0 and 6 inclusive. Those values are mapped to different severity / log levels on the DataSet server side as shown below:

- 0 -> finest
- 1 -> trace
- 2 -> debug
- 3 -> info
- 4 -> warning
- 5 -> error
- 6 -> fatal / emergency / critical

```
output {
  scalyr {
    api_write_token => 'SCALYR_API_KEY'
    ...
    severity_field => 'severity'
  }
}
```

In the example above, the value for the DataSet severity field should be included in the ``severity`` Logstash event field.

If the field value doesn't contain a valid severity number (0 - 6), the ``sev`` field won't be set on the event object, to prevent the API from rejecting an invalid request.

## Note On Server SSL Certificate Validation

By default, when validating the DataSet endpoint's server SSL certificate, the plugin uses a combination of the system CA certs bundle from ``/etc/ssl/certs/ca-certificates.crt`` and the root CA certificates bundled with this plugin. The bundled certificates are the root certificates used to issue / sign the server certificates used by the DataSet API endpoint.

If you want to use only the system CA certs bundle (and not the certs bundled with the plugin), use the following config options:

```
output {
  scalyr {
    api_write_token => 'SCALYR_API_KEY'
    ...
    # You only need to set this config option in case default CA bundle path on your system is
    # different
    ssl_ca_bundle_path => "/etc/ssl/certs/ca-certificates.crt"
    append_builtin_cert => false
  }
}
```

If you want to use only the root CA certs bundled with the plugin (and not the system CA certs bundle), use the following config options:

```
output {
  scalyr {
    api_write_token => 'SCALYR_API_KEY'
    ...
    # nil means the system CA bundle is not used
    ssl_ca_bundle_path => nil
    append_builtin_cert => true
  }
}
```

## Options

- The Scalyr API write token, available at https://www.scalyr.com/keys. This is the only required configuration option.

`config :api_write_token, :validate => :string, :required => true`

---

- If you have an EU-based Scalyr account, please use https://eu.scalyr.com/

`config :scalyr_server, :validate => :string, :default => "https://agent.scalyr.com/"`

---

- Path to the SSL CA bundle file which is used to verify the server certificate.

`config :ssl_ca_bundle_path, :validate => :string, :default => "/etc/ssl/certs/ca-certificates.crt"`

If for some reason you need to disable server cert validation, you can do that by setting the ``ssl_verify_peer`` config option to false. You are strongly recommended not to disable it unless specifically instructed to do so or unless you have a valid reason for it.

---

- server_attributes is a dictionary of key value pairs that represents/identifies the Logstash aggregator server (where this plugin is running). Keys are arbitrary except for the 'serverHost' key which holds special meaning to Scalyr and is given special treatment in the Scalyr UI.
All of these attributes are optional (not required for logs to be correctly uploaded)

`config :server_attributes, :validate => :hash, :default => nil`

---

- Related to the server_attributes dictionary above: if this value is true and you do not define the 'serverHost' key in server_attributes, the plugin will automatically set it, using the aggregator hostname as the value.

`config :use_hostname_for_serverhost, :validate => :boolean, :default => true`

---

- Field that represents the origin of the log event. (Warning: if an event already has a 'serverHost' field, it will be overwritten)

`config :serverhost_field, :validate => :string, :default => 'serverHost'`

---

- The 'logfile' fieldname has special meaning for the Scalyr UI. Traditionally, it represents the origin logfile, which users can search for in a dedicated widget in the Scalyr UI. If your events capture this in a different field, you can specify that fieldname here and the Scalyr Output Plugin will rename it to 'logfile' before upload. (Warning: if an event already has a 'logfile' field, it will be overwritten)

`config :logfile_field, :validate => :string, :default => 'logfile'`

---

- The Scalyr Output Plugin expects the main log message to be contained in Event['message']. If your main log content is contained in a different field, specify it here. It will be renamed to 'message' before upload. (Warning: if an event already has a 'message' field, it will be overwritten)

`config :message_field, :validate => :string, :default => "message"`

---

- A list of fieldnames that are constant for any logfile. Any fields listed here will be sent to Scalyr as part of the `logs` array instead of inside every event, to save on transmitted bytes. What constitutes a single "logfile" for correctness is a combination of the logfile_field value and the serverhost_field value. Only events with a serverHost value will have fields moved.
`config :log_constants, :validate => :array, :default => nil`

---

- If true, nested values will be flattened (which changes keys to an underscore-separated concatenation of all nested keys).

`config :flatten_nested_values, :validate => :boolean, :default => false`

---

- If set, this will change the delimiter used when concatenating nested keys

`config :flatten_nested_values_delimiter, :validate => :string, :default => "_"`

---

- If true, the 'tags' field will be flattened into key-values where each key is a tag and each value is set to :flat_tag_value

`config :flatten_tags, :validate => :boolean, :default => false`

`config :flat_tag_prefix, :validate => :string, :default => 'tag_'`

`config :flat_tag_value, :default => 1`

---

- Initial interval in seconds between bulk retries. Doubled on each retry up to `retry_max_interval`

`config :retry_initial_interval, :validate => :number, :default => 1`

---

- Maximum interval in seconds between bulk retries.

`config :retry_max_interval, :validate => :number, :default => 64`

---

- Valid options are bz2, deflate, or none.

`config :compression_type, :validate => :string, :default => 'deflate'`

---

- An integer specifying the compression level to use, from 1 to 9. Defaults to 6

`config :compression_level, :validate => :number, :default => 6`

---

# Conceptual Overview

## Persistence

Logstash itself supports [Persistent Queues](https://www.elastic.co/guide/en/logstash/current/persistent-queues.html) with at-least-once delivery semantics. It expects output plugins to retry uploads until success or else to write failures into a Dead-Letter Queue (DLQ). Since Logstash offers Persistent Queues, the Scalyr plugin does not perform its own buffering or persistence. More specifically, each invocation of `multi_receive` is synchronously retried until it succeeds, or the events are written to the DLQ upon failure.

Note: the `multi_receive` interface does not provide a feedback mechanism (outcome codes etc).
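
To make the interaction between `retry_initial_interval` and `retry_max_interval` concrete, here is a minimal sketch of the wait schedule those options describe. It assumes the interval simply doubles after each failed attempt and is capped at the maximum; the helper name `backoff_intervals` is hypothetical and is not part of the plugin, whose actual retry loop may differ in details.

```ruby
# Hypothetical helper (not part of the plugin): returns the wait, in seconds,
# before each retry, assuming plain doubling capped at max_interval.
def backoff_intervals(initial_interval, max_interval, retries)
  intervals = []
  # Never start above the configured cap.
  interval = [initial_interval, max_interval].min
  retries.times do
    intervals << interval
    # Double the wait for the next retry, but never exceed the cap.
    interval = [interval * 2, max_interval].min
  end
  intervals
end

# Defaults from the Options section:
# retry_initial_interval = 1, retry_max_interval = 64
puts backoff_intervals(1, 64, 9).inspect
# prints [1, 2, 4, 8, 16, 32, 64, 64, 64]
```

With the default values, the wait reaches the 64 second cap on the seventh retry and stays there, so a long outage results in one upload attempt roughly every minute while the pipeline worker is blocked.
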
## Concurrency

The plugin does not manage its own internal concurrency: no threads are started to increase parallelism. To ensure correct ordering of events in Scalyr, configure your pipeline with `pipeline.workers: 1`.

## Data model

Logstash Events are arbitrary nested JSON. Scalyr, however, supports a flat key-value model. Users are encouraged to pay attention to the mapping of Logstash Events to Scalyr key-values.

### Special fields

Scalyr assigns semantics to certain fields. These semantics allow Scalyr to know which field contains the main message, and also facilitate searching of data. For example, a user may restrict searches to a specific combination of `serverHost` and `logfile` in the [Scalyr UI](https://www.scalyr.com/help/log-overview), where these two fields have dedicated input widgets.

Mapping/renaming of Logstash event fields to these special fields is an important configuration step. For example, if the main message is contained in a field named `text_msg`, then you should set the plugin's `message_field` parameter to `text_msg`. This instructs the plugin to rename the event's `text_msg` field to `message`, thus enabling the Scalyr backend to correctly receive the main log message.

Here is the Scalyr API data shape and a description of the special fields:

```
{
  "ts":