:plugin: dissect
:type: filter

///////////////////////////////////////////
START - GENERATED VARIABLES, DO NOT EDIT!
///////////////////////////////////////////
:version: %VERSION%
:release_date: %RELEASE_DATE%
:changelog_url: %CHANGELOG_URL%
:include_path: ../../../../logstash/docs/include
///////////////////////////////////////////
END - GENERATED VARIABLES, DO NOT EDIT!
///////////////////////////////////////////

[id="plugins-{type}s-{plugin}"]

=== Dissect filter plugin

include::{include_path}/plugin_header.asciidoc[]

==== Description

The Dissect filter is a kind of split operation. Unlike a regular split operation, where one delimiter is applied to
the whole string, Dissect applies a set of delimiters to a string value. +
Dissect does not use regular expressions and is very fast. +
However, if the structure of your text varies from line to line, Grok is more suitable. +
There is also a hybrid case: Dissect can de-structure the section of the line that is reliably repeated, and
Grok can then be applied to the remaining field values, where the regular expressions are more predictable and have less work to do. +
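
As a rough illustration of the hybrid case, the sketch below dissects a fixed prefix and hands the variable remainder to Grok. The field names `ts`, `host`, `rest`, `method` and `request`, and the assumed line layout (timestamp, host name, then free-form request text), are hypothetical. The `%{}` dissection syntax is explained in the paragraphs that follow.
....
filter {
  dissect {
    # Split off the reliably repeated prefix; the remainder lands in "rest".
    mapping => {
      "message" => "%{ts} %{host} %{rest}"
    }
  }
  grok {
    # Apply regular expressions only to the shorter, variable part.
    match => {
      "rest" => "%{WORD:method} %{URIPATHPARAM:request}"
    }
  }
}
....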

A set of fields and delimiters is called a *dissection*.

The dissection is described using a set of `%{}` sections:
....
%{a} - %{b} - %{c}
....

A *field* is the text from `%{` to `}` inclusive.

A *delimiter* is the text between a `}` and next `%{` characters.

[NOTE]
Any run of characters that does not fit the pattern `%{`, followed by anything other than `}`, followed by `}`, is a delimiter.

The config might look like this:
....
  filter {
    dissect {
      mapping => {
        "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
      }
    }
  }
....
When dissecting a string from left to right, text is captured up to the first delimiter, and this captured text is stored in the first field.
This is repeated for each field/delimiter pair thereafter until the last delimiter is reached, then *the remaining text is stored in the last field*. +
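
As a worked illustration, assume a hypothetical input line (made up for this example, not taken from a real log) such as:
....
Apr 26 12:20:02 localhost 4 systemd[1]: Starting system activity accounting tool...
....
With the mapping above, the resulting fields should be roughly:
....
{
  "ts": "Apr 26 12:20:02",
  "src": "localhost",
  "prog": "systemd",
  "pid": "1",
  "msg": "Starting system activity accounting tool..."
}
....
The three timestamp parts are joined by the `+` prefix, which is described below. The `4` is consumed by the empty `%{}` field and discarded, and `msg` receives everything after the final `]: ` delimiter, spaces included.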

*The Key:* +
The key is the text between the `%{` and `}`, exclusive of the ?, +, & prefixes and the ordinal suffix. +
`%{?aaa}` - key is `aaa` +
`%{+bbb/3}` - key is `bbb` +
`%{&ccc}` - key is `ccc` +

===== Normal field notation
The found value is added to the Event using the key. +
`%{some_field}` - a normal field has no prefix or suffix

===== Skip field notation
The found value is stored internally but not added to the Event. +
The key, if supplied, is prefixed with a `?`.

`%{}` is an empty skip field.

`%{?foo}` is a named skip field.
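
A minimal sketch of a skip field in a config, assuming a hypothetical line layout of date, level, process id and free text separated by single spaces. The field names are illustrative only.
....
filter {
  dissect {
    # "pid" is matched so that parsing stays aligned,
    # but it is not added to the Event.
    mapping => {
      "message" => "%{date} %{level} %{?pid} %{msg}"
    }
  }
}
....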

===== Append field notation
The value is appended to another value, or stored if it is the first field seen. +
The key is prefixed with a `+`. +
The final value is stored in the Event using the key. +

[NOTE]
====
The delimiter found before the field is appended with the value. +
If no delimiter is found before the field, a single space character is used.
====

`%{+some_field}` is an append field. +
`%{+some_field/2}` is an append field with an order modifier.

An order modifier, `/digits`, allows one to reorder the append sequence. +
e.g. for a text of `1 2 3 go`, this `%{+a/2} %{+a/1} %{+a/4} %{+a/3}` will build a key/value of `a => 2 1 go 3` +
Append fields without an order modifier will append in declared order. +
e.g. for a text of `1 2 3 go`, this `%{a} %{b} %{+a}` will build two key/values of `a => 1 3 go, b => 2` +
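
For instance, an order modifier can rearrange a day-first date into month-first order. This sketch assumes a hypothetical line such as `15 Jan 2024 myhost`. Following the rules above, it should yield `ts => Jan 15 2024` and `host => myhost`.
....
filter {
  dissect {
    # Day is captured first but placed second; month is captured second but placed first.
    mapping => {
      "message" => "%{+ts/2} %{+ts/1} %{+ts/3} %{host}"
    }
  }
}
....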

===== Indirect field notation
The found value is added to the Event using the found value of another field as the key. +
The key is prefixed with a `&`. +
`%{&some_field}` - an indirect field where the key is indirectly sourced from the value of `some_field`. +
e.g. for a text of `error: some_error, some_description`, this `error: %{?err}, %{&err}` will build a key/value of `some_error => some_description`.

[NOTE]
For append and indirect fields, the key can refer to a field that already exists in the event before dissection.

[NOTE]
Use a skip field if you do not want the indirection key/value stored.

e.g. for a text of `google: 77.98`, this `%{?a}: %{&a}` will build a key/value of `google => 77.98`.
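
In a config, the same pattern might look like the sketch below. It assumes a hypothetical `message` of the form name, colon, space, reading, for example `google: 77.98`. The key `name` is illustrative.
....
filter {
  dissect {
    # ?name captures the key without storing it;
    # &name stores the reading under the captured key.
    mapping => {
      "message" => "%{?name}: %{&name}"
    }
  }
}
....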

[NOTE]
===============================
Append and indirect cannot be combined and will fail validation. +
`%{+&something}` - will add a value to the `&something` key, probably not the intended outcome. +
`%{&+something}` will add a value to the `+something` key, again probably unintended. +
===============================

==== Multiple Consecutive Delimiter Handling

[IMPORTANT]
===============================
Starting from version 1.1.1 of this plugin, the handling of multiple consecutive delimiters has changed.
Multiple consecutive delimiters are now treated as missing fields by default, not as padding.
If you are already using Dissect and your source text has fields padded with extra delimiters,
you will need to change your config. Please read the section below.
===============================

===== Empty data between delimiters
Given this text as the sample used to create a dissection:
....
John Smith,Big Oaks,Wood Lane,Hambledown,Canterbury,CB34RY
....
The created dissection, with 6 fields, is:
....
%{name},%{addr1},%{addr2},%{addr3},%{city},%{zip}
....
When a line like this is processed:
....
Jane Doe,4321 Fifth Avenue,,,New York,87432
....
Dissect will create an event with empty fields for `addr2` and `addr3` like so:
....
{
  "name": "Jane Doe",
  "addr1": "4321 Fifth Avenue",
  "addr2": "",
  "addr3": "",
  "city": "New York",
  "zip": "87432"
}
....

===== Delimiters used as padding to visually align fields
*Padding to the right hand side*

Given these texts as the samples used to create a dissection:
....
00000043 ViewReceive     machine-321
f3000a3b Calc            machine-123
....
The dissection, with 3 fields, is:
....
%{id} %{function->} %{server}
....
Note, above, the second field has a `->` suffix which tells Dissect to ignore padding to its right. +
Dissect will create these events:
....
{
  "id": "00000043",
  "function": "ViewReceive",
  "server": "machine-321"
}
{
  "id": "f3000a3b",
  "function": "Calc",
  "server": "machine-123"
}
....
[IMPORTANT]
Always add the `->` suffix to the field on the left of the padding.

*Padding to the left hand side (to the human eye)*

Given these texts as the samples used to create a dissection:
....
00000043     ViewReceive machine-321
f3000a3b            Calc machine-123
....
The dissection, with 3 fields, is now:
....
%{id->} %{function} %{server}
....
Here the `->` suffix moves to the `id` field because Dissect sees the padding as being to the right of the `id` field. +
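
Put into a filter, the left-padded case might be configured as in this sketch, which assumes the sample lines arrive in the `message` field. For the first sample line, the expected result is `id => 00000043`, `function => ViewReceive` and `server => machine-321`.
....
filter {
  dissect {
    # The -> suffix on "id" absorbs the run of spaces that follows it.
    mapping => {
      "message" => "%{id->} %{function} %{server}"
    }
  }
}
....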

==== Conditional processing

You probably want to use this filter inside an `if` block. +
This ensures that the event contains a field value with a suitable structure for the dissection.

For example...
....
filter {
  if [type] == "syslog" or "syslog" in [tags] {
    dissect {
      mapping => {
        "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
      }
    }
  }
}
....

[id="plugins-{type}s-{plugin}-options"]
==== Dissect Filter Configuration Options

This plugin supports the following configuration options plus the <<plugins-{type}s-{plugin}-common-options>> described later.

[cols="<,<,<",options="header",]
|=======================================================================
|Setting |Input type|Required
| <<plugins-{type}s-{plugin}-convert_datatype>> |<<hash,hash>>|No
| <<plugins-{type}s-{plugin}-mapping>> |<<hash,hash>>|No
| <<plugins-{type}s-{plugin}-tag_on_failure>> |<<array,array>>|No
|=======================================================================

Also see <<plugins-{type}s-{plugin}-common-options>> for a list of options supported by all
filter plugins.

&nbsp;

[id="plugins-{type}s-{plugin}-convert_datatype"]
===== `convert_datatype`

  * Value type is <<hash,hash>>
  * Default value is `{}`

With this setting `int` and `float` datatype conversions can be specified. +
These will be done after all `mapping` dissections have taken place. +
Feel free to use this setting on its own without a `mapping` section. +

For example:
[source, ruby]
filter {
  dissect {
    convert_datatype => {
      cpu => "float"
      code => "int"
    }
  }
}
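
The setting can also be combined with `mapping`. The following sketch assumes a hypothetical `message` such as `web01 0.75 200`. The field names are illustrative.
....
filter {
  dissect {
    mapping => {
      "message" => "%{host} %{cpu} %{code}"
    }
    # Applied after the mapping: "0.75" becomes a float and "200" an integer.
    convert_datatype => {
      "cpu" => "float"
      "code" => "int"
    }
  }
}
....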

[id="plugins-{type}s-{plugin}-mapping"]
===== `mapping`

  * Value type is <<hash,hash>>
  * Default value is `{}`

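A hash of dissections in the form `field => value`, where `field` is the name of the source field and `value` is the dissection to apply to it. +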
A later dissection can be done on values from a previous dissection or they can be independent.

For example:
[source, ruby]
filter {
  dissect {
    mapping => {
      "message" => "%{field1} %{field2} %{description}"
      "description" => "%{field3} %{field4} %{field5}"
    }
  }
}

This is useful if you want to keep the field `description` but also
dissect it some more.

[id="plugins-{type}s-{plugin}-tag_on_failure"]
===== `tag_on_failure`

  * Value type is <<array,array>>
  * Default value is `["_dissectfailure"]`

Append values to the `tags` field when dissection fails.
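
One possible use is sketched here, under the assumption that failed events should be marked for later inspection. The tag name `unparsed_app_log`, the field name `parse_status` and the mapping itself are hypothetical. `add_field` is an option of the standard `mutate` filter.
....
filter {
  dissect {
    mapping => {
      "message" => "%{ts} %{level} %{msg}"
    }
    tag_on_failure => ["unparsed_app_log"]
  }
  # Events that could not be dissected carry the tag set above.
  if "unparsed_app_log" in [tags] {
    mutate {
      add_field => { "parse_status" => "failed" }
    }
  }
}
....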



[id="plugins-{type}s-{plugin}-common-options"]
include::{include_path}/{type}.asciidoc[]