# Overview of fluent-plugin-sanitizer
fluent-plugin-sanitizer is a [Fluentd](https://fluentd.org/) filter plugin that sanitizes sensitive information using custom rules. In addition to sanitizing values with custom regular expressions and keywords, it provides built-in options that make it easy to sanitize IP addresses and hostnames embedded in complex messages.
## Installation
When you are using OSS Fluentd:
```
fluent-gem install fluent-plugin-sanitizer
```
When you are using td-agent:
```
td-agent-gem install fluent-plugin-sanitizer
```
## Configuration
### Parameters
- hash_salt : Salt used when hashing the original values. (:string, default: nil)
- rule options
  - keys : Names of the keys whose values are to be sanitized. You can specify multiple keys. For nested keys, use {parent key}.{child key}, for example "kubernetes.master_url". (:array, default: [])
  - pattern_ipv4 : Sanitize if the value contains an IPv4 address. (:bool, default: false)
  - pattern_fqdn : Sanitize if the value contains a hostname in FQDN style. (:bool, default: false)
  - pattern_regex : Sanitize if the value matches a custom regular expression. (:regexp, default: /^$/)
  - pattern_keywords : Sanitize if the value matches custom keywords. You can specify multiple keywords. (:array, default: [])
You can specify multiple options in a single rule, as in the following sample configuration.
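```
# "**" matches every tag; narrow it to the tags you want to sanitize.
<filter **>
  @type sanitizer
  hash_salt mysalt
  <rule>
    keys source, kubernetes.master_url
    pattern_ipv4 true
  </rule>
  <rule>
    keys hostname, host
    pattern_fqdn true
  </rule>
  <rule>
    keys message, system.log
    pattern_regex /^Hello World!$/
    pattern_keywords password, passwd
  </rule>
</filter>
```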
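The dotted notation for nested keys (such as "kubernetes.master_url") can be pictured as a walk through nested maps. The following Python sketch is only an illustration of that idea, not the plugin's actual code; the helper names `get_nested` and `set_nested` and the sample record are hypothetical.

```python
from typing import Any


def get_nested(record: dict, dotted_key: str) -> Any:
    """Return the value addressed by a dotted key, or None if any level is missing."""
    current: Any = record
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def set_nested(record: dict, dotted_key: str, value: Any) -> bool:
    """Overwrite an existing nested value; return False if the path does not exist."""
    *parents, leaf = dotted_key.split(".")
    current: Any = record
    for part in parents:
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    if not isinstance(current, dict) or leaf not in current:
        return False
    current[leaf] = value
    return True


# Hypothetical record, shaped like the keys used in the sample configuration.
record = {"source": "10.0.0.1", "kubernetes": {"master_url": "https://10.0.0.2:443"}}
print(get_nested(record, "kubernetes.master_url"))
```

A missing path returns `None` (or `False` when setting) rather than raising, so records that lack a configured key would simply pass through unchanged in this sketch.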
## Use cases
### Sanitize IP address and hostname
Sample rule #1
```
<rule>
  keys ip
  pattern_ipv4 true
</rule>
<rule>
  keys host
  pattern_fqdn true
</rule>
```
Sample input #1
```
{
"ip":"192.168.10.10",
"host":"test01.demo.com"
}
```
Sample output #1
```
{
"ip":"IPv4_94712b06963e277fe28469388323665d",
"host":"FQDN_37de34e3d799de477c742d8d7bb35550"
}
```
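The sanitized values above consist of a type prefix followed by 32 hexadecimal characters. A minimal Python sketch of how such a token could be produced is shown below. It assumes an MD5 digest over the salt concatenated with the original value; this is a guess based on the token length, so the plugin's real hashing scheme may differ and the output is not expected to reproduce the sample values. The function name `make_token` is hypothetical.

```python
import hashlib


def make_token(prefix: str, value: str, salt: str = "") -> str:
    """Build a '<prefix>_<hex digest>' token for a sensitive value."""
    # Assumption: MD5 over salt + value. The plugin may combine them differently.
    digest = hashlib.md5((salt + value).encode("utf-8")).hexdigest()
    return f"{prefix}_{digest}"


token = make_token("IPv4", "192.168.10.10", salt="mysalt")
print(token)
```

Because the digest is deterministic, the same input and salt always map to the same token, which keeps sanitized records correlatable. Changing `hash_salt` changes every token.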
### Sanitize IP addresses and hostnames embedded in URLs and messages
IP addresses and hostnames often appear inside URLs and log messages rather than as standalone values. The "pattern_ipv4" and "pattern_fqdn" options sanitize only the matching part of the value and leave the rest intact.
Sample rule #2
```
<rule>
  keys system.url, system.log
  pattern_ipv4 true
  pattern_fqdn true
</rule>
```
Sample input #2
```
{
"tag":"test",
"system" :
{
"url":"https://test02.demo.com:8000/event",
"log":"access from 192.168.10.100 was blocked"
}
}
```
Sample output #2
```
{
"tag":"test",
"system":{
"url":"https://FQDN_e9a59624f555d02f06209c9942dded19:8000/event",
"log":"access from IPv4_f7374d61e6d21dc1105f70358a5f8e8f was blocked"
}
}
```
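Replacing only the matching part of a message, as in the output above, can be sketched with a regular-expression substitution. The Python example below is an illustration under stated assumptions: the IPv4 pattern is deliberately simplified (it does not check that each octet is at most 255), the token uses an MD5 digest of salt plus value, and the names `IPV4_RE`, `mask`, and `sanitize_ipv4` are hypothetical. None of this is taken from the plugin's source.

```python
import hashlib
import re

# Simplified IPv4 pattern: four dot-separated groups of 1-3 digits.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask(prefix: str, value: str, salt: str = "") -> str:
    """Return '<prefix>_<md5 hex>' for the value (hashing scheme is an assumption)."""
    return f"{prefix}_{hashlib.md5((salt + value).encode('utf-8')).hexdigest()}"


def sanitize_ipv4(text: str, salt: str = "") -> str:
    """Replace every IPv4-looking substring and leave the surrounding text as-is."""
    return IPV4_RE.sub(lambda m: mask("IPv4", m.group(0), salt), text)


out = sanitize_ipv4("access from 192.168.10.100 was blocked")
print(out)
```

A hostname rule would follow the same shape with a different pattern and an `FQDN` prefix.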
### Sanitize keywords within messages
When the "pattern_keywords" option is set, fluent-plugin-sanitizer splits the message into blocks and sanitizes the blocks that match a keyword.
Sample rule #3
```
<rule>
  keys message
  pattern_keywords user1, application1
</rule>
```
Sample input #3
```
{
"message":"user1 failed to login application1"
}
```
Sample output #3
```
{
"message":"Keyword_321865df6f0ce6bdf3ea16f74623534a failed to login Keyword_49006ff9b2ab584795e4cbb7636bd17c"
}
```
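The split-and-match behavior described in this section can be approximated as follows. This Python sketch assumes that blocks are separated by single spaces and that a block must equal a keyword exactly. Both are assumptions, as is the MD5-based token, and the function name `sanitize_keywords` is hypothetical.

```python
import hashlib
from typing import Iterable


def sanitize_keywords(message: str, keywords: Iterable[str], salt: str = "") -> str:
    """Replace each space-separated block that equals a keyword with a hashed token."""
    keyword_set = set(keywords)
    blocks = []
    for block in message.split(" "):
        if block in keyword_set:
            # Assumption: MD5 over salt + block, prefixed with "Keyword_".
            digest = hashlib.md5((salt + block).encode("utf-8")).hexdigest()
            blocks.append(f"Keyword_{digest}")
        else:
            blocks.append(block)
    return " ".join(blocks)


result = sanitize_keywords(
    "user1 failed to login application1", ["user1", "application1"]
)
print(result)
```

Under these assumptions a keyword attached to punctuation (for example "user1,") would not match, because the whole block is compared rather than a substring.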
### Sanitize all messages
Sample rule #4
```
<rule>
  keys message
  pattern_regex /^.*$/
</rule>
```
Sample input #4
```
{
"message":"user1 failed to login application1"
}
```
Sample output #4
```
{
"message":"Regex_70e9b833f5f00a4b0ab9fcf74af81f26"
}
```
## Contribute
Contributions to fluent-plugin-sanitizer are always welcome.
## Copyright
* Copyright(c) 2021- TK Kubota
* License
  * Apache License, Version 2.0