README.md in fluent-plugin-cat-sweep-0.1.3 vs README.md in fluent-plugin-cat-sweep-0.1.4

- old
+ new

@@ -35,10 +35,80 @@

 This plugin watches the directory (`file_path_with_glob tmp/test/access.log.*`), and reads the contents and sweep (deafault: remove) for files whose mtime are passed in 60 seconds (can be configured with `waiting_seconds`).

 Our assumption is that this mechanism should provide more durability than `in_tail` (batch read overcomes than streaming read).

+## Potential problem of in_tail
+
+Assume that an application outputs logs into `/tmp/test/access.log` and rotates it in every one minute interval as
+
+(initial state)
+
+```
+tmp/test
+└── accesss.log (i-node 4478316)
+```
+
+(one minute later)
+
+```
+tmp/test
+├── accesss.log (i-node 4478319)
+└── accesss.log.1 (i-node 4478316)
+```
+
+(two minutes later)
+
+```
+tmp/test
+├── accesss.log (i-node 4478322)
+├── accesss.log.1 (i-node 4478319)
+└── accesss.log.2 (i-node 4478316)
+```
+
+Your configuration of `in_tail` may become as followings:
+
+```apache
+<source>
+  @type tail
+  path tmp/test/access.log
+  pos_file /var/log/td-agent/access.log.pos
+  tag access
+  format none
+</source>
+```
+
+Now, imagine that the fluentd process dies (or manually stops for maintenance) just before the 2nd file of i-node 4478319 is generated, and you restart the fluentd process after two minutes passed. Then, you miss the 2nd file of i-node 4478319.
+
+(initial state)
+
+```
+tmp/test
+└── accesss.log (i-node 4478316) <= catch
+```
+
+(fluentd dies)
+
+(one minute later)
+
+```
+tmp/test
+├── accesss.log (i-node 4478319) <= miss
+└── accesss.log.1 (i-node 4478316)
+```
+
+(two minutes later)
+
+(fluentd restarts)
+
+```
+tmp/test
+├── accesss.log (i-node 4478322) <= catch
+├── accesss.log.1 (i-node 4478319) <= miss
+└── accesss.log.2 (i-node 4478316)
+```
+
 ## Configuration

 ```
 <source>
   type cat_sweep
@@ -49,35 +119,35 @@
   # Input pattern. It depends on Parser plugin
   format tsv
   keys xpath,access_time,label,payload

   # Required. process files that are older than this parameter(seconds).
-  # [WARNING!!]: this plugin move or remove files even if the files open,
-  # so this parameter is set as seconds that the application close files definitely.
+  # [WARNING!!]: this plugin moves or removes files even if the files are still open.
+  # make sure to set this parameter for seconds that the application closes files definitely.
   waiting_seconds 60

   # Optional. default is file.cat_sweep
   tag test.input

-  # Optional. processing files is renamed with this suffix. default is .processing
+  # Optional. processing files are renamed with this suffix. default is .processing
   processing_file_suffix .processing

-  # Optional. error files is renamed with this suffix. default is .error
+  # Optional. error files are renamed with this suffix. default is .error
   error_file_suffix .err

   # Optional. line terminater. default is "\n"
   line_terminated_by ,

-  # Optional. max bytes oneline can has. default 536870912 (512MB)
+  # Optional. max bytes oneline can have. default 536870912 (512MB)
   oneline_max_bytes 128000

-  # Optional. processed files are move to this directory.
+  # Optional. processed files are moved to this directory.
   # default '/tmp'
   move_to /tmp/test_processed

-  # Optional. this parameter indicated, `move_to` is ignored.
-  # files that is processed are removed.
+  # Optional. if this parameter is specified, `move_to` option is ignored.
+  # processed files are removed instead of being moved to `move_to` directory.
   # default is false.
   remove_after_processing true

   # Optional. default 5 seconds.
   run_interval 10
@@ -95,10 +165,9 @@
 ## Warning

 * This plugin supports fluentd from v0.10.45
 * The support for fluentd v0.10 will end near future
-* The support for fluentd v0.14 is not yet

 ## Contributing

 1. Fork it ( https://github.com/civitaspo/fluent-plugin-cat-sweep/fork )
 2. Create your feature branch (`git checkout -b my-new-feature`)