README.md in fluent-plugin-sql-0.2.1 vs README.md in fluent-plugin-sql-0.2.2
- old
+ new
@@ -1,21 +1,21 @@
# SQL input plugin for Fluentd event collector
## Overview
-This sql input plugin reads records from a RDBMS periodically. Thus you can replicate tables to other storages through Fluentd.
+This SQL input plugin periodically reads records from an RDBMS, so you can copy tables to other storage systems through Fluentd.
## How does it work?
-This plugin runs following SQL repeatedly every 60 seconds to *tail* a table like `tail` command of UNIX.
+This plugin runs the following SQL periodically:
SELECT * FROM *table* WHERE *update\_column* > *last\_update\_column\_value* ORDER BY *update_column* ASC LIMIT 500
-What you need to configure is *update\_column*. The column needs to be updated every time when you update the row so that this plugin detects newly updated rows. Generally, the column is a timestamp such as `updated_at`.
-If you omit to set the column, it uses primary key. And this plugin can't detect updated but it only reads newly inserted rows.
+What you need to configure is *update\_column*. The column should be an incrementing column (such as an AUTO\_INCREMENT primary key) so that this plugin reads newly INSERTed rows. Alternatively, you can use a column whose value increases every time you update the row (such as a `last_updated_at` column) so that this plugin reads UPDATEd rows as well.
+If you omit the *update\_column* parameter, the plugin uses the primary key.
-It stores last selected rows to a file named state\_file to not forget the last row when fluentd restarted.
+It stores the last selected rows in a file (named *state\_file*) so that it does not forget the last row when Fluentd restarts.
## Configuration
<source>
type sql
@@ -24,29 +24,29 @@
database rdb_database
adapter mysql2_or_postgresql_etc
user myusername
password mypassword
- tag_prefix my.rdb
+ tag_prefix my.rdb # optional, but recommended
- select_interval 60s
- select_limit 500
+ select_interval 60s # optional
+ select_limit 500 # optional
state_file /var/run/fluentd/sql_state
<table>
- tag table1
table table1
+ tag table1 # optional
update_column update_col1
- time_column time_col2
+ time_column time_col2 # optional
</table>
<table>
- tag table2
table table2
+ tag table2 # optional
update_column updated_at
- time_column updated_at
+ time_column updated_at # optional
</table>
# detects all tables instead of <table> sections
#all_tables
</source>
@@ -65,8 +65,13 @@
\<table\> sections:
* **tag** tag name of events (optional; default value is table name)
* **table** RDBMS table name
-* **update_column**
-* **time_column** (optional)
+* **update_column**: see above description
+* **time_column** (optional): if this option is set, this plugin uses this column's value as the event's time. Otherwise it uses the current time.
+## Limitation
+
+You should make sure target tables have an index (and/or partitions) on the *update\_column*. Otherwise the SELECT causes a full table scan and serious performance problems.
+
+You can't replicate DELETEd rows.
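
The polling behaviour described in the README can be illustrated with a minimal sketch. This is not the plugin's code (the plugin is written in Ruby); it is a Python stand-in that uses the standard-library `sqlite3` module in place of a real RDBMS. The function name `poll_once`, the JSON state-file layout, and the choice to read from the start of the table when no state exists are all assumptions made for illustration.

```python
import json
import os
import sqlite3
import tempfile


def poll_once(conn, table, update_column, state_path, limit=500):
    """Run one polling pass and return the newly seen rows as dicts.

    Hypothetical helper: mirrors the SELECT shown in the README, with a
    JSON file standing in for the plugin's state_file.
    """
    # Load the last seen update_column value, if any state was saved.
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    last_value = state.get(table)

    # Identifiers cannot be bound as parameters, so they are interpolated;
    # real code would have to validate or quote them.
    if last_value is None:
        # Assumption: with no saved state, start from the first row.
        sql = f"SELECT * FROM {table} ORDER BY {update_column} ASC LIMIT ?"
        params = (limit,)
    else:
        sql = (f"SELECT * FROM {table} WHERE {update_column} > ? "
               f"ORDER BY {update_column} ASC LIMIT ?")
        params = (last_value, limit)

    cur = conn.execute(sql, params)
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]

    # Remember the largest value seen so the next pass skips these rows.
    if rows:
        state[table] = rows[-1][update_column]
        with open(state_path, "w") as f:
            json.dump(state, f)
    return rows


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO table1 (body) VALUES (?)", [("a",), ("b",)])
    state_path = os.path.join(tempfile.mkdtemp(), "sql_state")

    first = poll_once(conn, "table1", "id", state_path)
    conn.execute("INSERT INTO table1 (body) VALUES ('c')")
    conn.execute("DELETE FROM table1 WHERE id = 1")  # never reported
    second = poll_once(conn, "table1", "id", state_path)
    return first, second
```

Calling `demo()` shows the limitation noted above: the second pass returns only the newly inserted row, because a deleted row no longer matches any SELECT and so cannot be observed by this kind of polling.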