README.md in fluent-plugin-sanitizer-0.1.2 vs README.md in fluent-plugin-sanitizer-0.1.3
- old
+ new
@@ -16,14 +16,15 @@
- hash_salt (optional) : hash salt used when calculating hash value with original information.
- rule options :
- keys (mandatory) : Names of the keys whose values will be masked. You can specify multiple keys. For nested keys, use {parent key}.{child key}, for example "kubernetes.master_url".
- pattern_ipv4 (optional) : Mask IP addresses in IPv4 format. You can use “true” or “false”. (default: false)
- pattern_fqdn (optional) : Mask hostnames in FQDN format. You can use “true” or “false”. (default: false)
- - pattern_regex (optional) : Mask value mactches custom regular expression. You need to provide a regular expression in these options.
- - pattern_regex_prefix (optional) : Define prefix used for masking vales. (default: Regex)
+ - pattern_regex (optional) : Mask values that match a custom regular expression.
+ - regex_capture_group (optional) : If the regular expression defines a named capture group, you can specify the name of the capture group to be masked.
+ - pattern_regex_prefix (optional) : Define the prefix used for masked values. (default: Regex)
- pattern_keywords (optional) : Mask values that match custom keywords. You can specify multiple keywords.
- - pattern_keywords_prefix (optional) : Define prefix used for masking vales. (default: Keyword)
+ - pattern_keywords_prefix (optional) : Define the prefix used for masked values. (default: Keyword)
You can specify multiple rules in a single configuration. It is also possible to define multiple pattern options in a single rule like the following sample.
```
<filter **>
@@ -127,9 +128,37 @@
"ssn" : "SSN_f6b6430343a9a749e12db8a112ca74e9"
"phone" : "Phone_0a25187902a0cf755627397eb085d736"
}
}
```
+From v0.1.2, the "regex_capture_group" option is available. With this option, you can mask only a specific part of the original message.
+
+**Configuration sample**
+```
+<rule>
+ keys user.email
+ pattern_regex /(?<user>\w+)\@\w+\.\w+/
+ regex_capture_group "user"
+ pattern_regex_prefix "USER"
+</rule>
+```
+**Input sample**
+```
+ {
+ "user" : {
+ "email" : "user1@demo.com"
+ }
+ }
+```
+**Output sample**
+```
+ {
+ "user" : {
+ "email" : "USER_321865df6f0ce6bdf3ea16f74623534a@demo.com"
+ }
+ }
+```
+
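The capture-group masking shown above can be sketched in plain Ruby. This is a minimal illustration, not the plugin's actual implementation: the helper name `mask_capture_group`, its parameters, and the use of MD5 over `salt + value` are assumptions made for the example, so the hash it produces will not necessarily match the output sample.

```ruby
require "digest/md5"

# Hypothetical helper (not part of fluent-plugin-sanitizer):
# replaces only the named capture group with "<prefix>_<hash>".
# Assumption: the hash is MD5 of salt + captured text.
def mask_capture_group(value, pattern, group, prefix, salt = "")
  match = pattern.match(value)
  # Leave the value untouched if the pattern or the group did not match.
  return value unless match && match.names.include?(group) && match[group]

  hashed = "#{prefix}_#{Digest::MD5.hexdigest(salt + match[group])}"
  start, finish = match.offset(group) # character offsets of the group
  value[0...start] + hashed + value[finish..-1]
end

masked = mask_capture_group("user1@demo.com", /(?<user>\w+)@\w+\.\w+/, "user", "USER")
puts masked
```

Only the text captured by `user` is replaced; the rest of the value (here `@demo.com`) is kept as-is, which is the behavior the configuration sample relies on.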
### Tips : Debug how sanitizer works
When you design custom rules in a configuration file, you may need to see how Sanitizer maps original values to hash values. This information is available when you run td-agent/Fluentd with the debug option enabled. It is written to the td-agent/Fluentd log file, as in the following log message sample.
**Log message sample**
```