README.md in fluent-plugin-bigquery-1.2.0 vs README.md in fluent-plugin-bigquery-2.0.0.beta
- old
+ new
@@ -1,15 +1,19 @@
# fluent-plugin-bigquery
+**This README is for v2.0.0.beta, which has not been released yet.**
+
[Fluentd](http://fluentd.org) output plugin to load/insert data into Google BigQuery.
-- **Plugin type**: BufferedOutput
+- **Plugin type**: Output
* insert data over streaming inserts
+ * plugin type is `bigquery_insert`
* for continuous real-time insertions
* https://developers.google.com/bigquery/streaming-data-into-bigquery#usecases
* load data
+ * plugin type is `bigquery_load`
* for data loading as batch jobs, for big amount of data
* https://developers.google.com/bigquery/loading-data-into-bigquery
The current version of this plugin supports the Google API with Service Account Authentication, but does not support
the OAuth flow for installed applications.
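
Both plugin types take the same common options (see the tables below) and differ mainly in how data reaches BigQuery. A rough sketch of the choice (the match tags and elided option bodies here are illustrative only):

```apache
# streaming inserts: low latency, row-by-row delivery
<match dummy.realtime>
  @type bigquery_insert
  ...
</match>

# load jobs: batched delivery, better suited to large volumes
<match dummy.batch>
  @type bigquery_load
  ...
</match>
```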
@@ -29,59 +33,64 @@
## Configuration
### Options
-| name | type | required? | placeholder? | default | description |
-| :------------------------------------- | :------------ | :----------- | :---------- | :------------------------- | :----------------------- |
-| method | string | no | no | insert | `insert` (Streaming Insert) or `load` (load job) |
-| auth_method | enum | yes | no | private_key | `private_key` or `json_key` or `compute_engine` or `application_default` |
-| email | string | yes (private_key) | no | nil | GCP Service Account Email |
-| private_key_path | string | yes (private_key) | no | nil | GCP Private Key file path |
-| private_key_passphrase | string | yes (private_key) | no | nil | GCP Private Key Passphrase |
-| json_key | string | yes (json_key) | no | nil | GCP JSON Key file path or JSON Key string |
-| project | string | yes | yes | nil | |
-| dataset | string | yes | yes | nil | |
-| table | string | yes (either `tables`) | yes | nil | |
-| tables | array(string) | yes (either `table`) | yes | nil | can set multi table names splitted by `,` |
-| template_suffix | string | no | yes | nil | can use `%{time_slice}` placeholder replaced by `time_slice_format` |
-| auto_create_table | bool | no | no | false | If true, creates table automatically |
-| skip_invalid_rows | bool | no | no | false | Only `insert` method. |
-| max_bad_records | integer | no | no | 0 | Only `load` method. If the number of bad records exceeds this value, an invalid error is returned in the job result. |
-| ignore_unknown_values | bool | no | no | false | Accept rows that contain values that do not match the schema. The unknown values are ignored. |
-| schema | array | yes (either `fetch_schema` or `schema_path`) | no | nil | Schema Definition. It is formatted by JSON. |
-| schema_path | string | yes (either `fetch_schema`) | no | nil | Schema Definition file path. It is formatted by JSON. |
-| fetch_schema | bool | yes (either `schema_path`) | no | false | If true, fetch table schema definition from Bigquery table automatically. |
-| fetch_schema_table | string | no | yes | nil | If set, fetch table schema definition from this table, If fetch_schema is false, this param is ignored |
-| schema_cache_expire | integer | no | no | 600 | Value is second. If current time is after expiration interval, re-fetch table schema definition. |
-| insert_id_field | string | no | no | nil | Use key as `insert_id` of Streaming Insert API parameter. |
-| add_insert_timestamp | string | no | no | nil | Adds a timestamp column just before sending the rows to BigQuery, so that buffering time is not taken into account. Gives a field in BigQuery which represents the insert time of the row. |
-| allow_retry_insert_errors | bool | no | no | false | Retry to insert rows when an insertErrors occurs. There is a possibility that rows are inserted in duplicate. |
-| request_timeout_sec | integer | no | no | nil | Bigquery API response timeout |
-| request_open_timeout_sec | integer | no | no | 60 | Bigquery API connection, and request timeout. If you send big data to Bigquery, set large value. |
-| time_partitioning_type | enum | no (either day) | no | nil | Type of bigquery time partitioning feature(experimental feature on BigQuery). |
-| time_partitioning_expiration | time | no | no | nil | Expiration milliseconds for bigquery time partitioning. (experimental feature on BigQuery) |
+#### common
-### Deprecated
+| name | type | required? | placeholder? | default | description |
+| :------------------------------------- | :------------ | :----------- | :---------- | :------------------------- | :----------------------- |
+| auth_method | enum | yes | no | private_key | `private_key` or `json_key` or `compute_engine` or `application_default` |
+| email | string | yes (private_key) | no | nil | GCP Service Account Email |
+| private_key_path | string | yes (private_key) | no | nil | GCP Private Key file path |
+| private_key_passphrase | string | yes (private_key) | no | nil | GCP Private Key Passphrase |
+| json_key | string | yes (json_key) | no | nil | GCP JSON Key file path or JSON Key string |
+| project | string | yes | yes | nil | |
+| dataset | string | yes | yes | nil | |
+| table | string | yes (either `tables`) | yes | nil | |
+| tables                                  | array(string) | yes (either `table`) | yes | nil | Can set multiple table names separated by `,` |
+| auto_create_table | bool | no | no | false | If true, creates table automatically |
+| ignore_unknown_values | bool | no | no | false | Accept rows that contain values that do not match the schema. The unknown values are ignored. |
+| schema                                  | array         | yes (either `fetch_schema` or `schema_path`) | no | nil | Schema definition. It is formatted as JSON. |
+| schema_path                             | string        | yes (either `fetch_schema`) | no | nil | Schema definition file path. It is formatted as JSON. |
+| fetch_schema                            | bool          | yes (either `schema_path`) | no | false | If true, fetch the table schema definition from the BigQuery table automatically. |
+| fetch_schema_table                      | string        | no           | yes         | nil                        | If set, fetch the table schema definition from this table. If `fetch_schema` is false, this param is ignored. |
+| schema_cache_expire                     | integer       | no           | no          | 600                        | Value in seconds. If the current time is past the expiration interval, the table schema definition is re-fetched. |
+| request_timeout_sec                     | integer       | no           | no          | nil                        | BigQuery API response timeout |
+| request_open_timeout_sec                | integer       | no           | no          | 60                         | BigQuery API connection and request timeout. If you send a large amount of data to BigQuery, set a larger value. |
+| time_partitioning_type                  | enum          | no           | no          | nil                        | Type of the BigQuery time partitioning feature (experimental feature on BigQuery); only `day` is available. |
+| time_partitioning_expiration            | time          | no           | no          | nil                        | Expiration milliseconds for BigQuery time partitioning (experimental feature on BigQuery). |
-| name | type | required? | placeholder? | default | description |
-| :------------------------------------- | :------------ | :----------- | :---------- | :------------------------- | :----------------------- |
-| replace_record_key | bool | no | no | false | Use other filter plugin. |
-| replace_record_key_regexp{1-10} | string | no | no | nil | |
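
These common options are shared by both `bigquery_insert` and `bigquery_load`. A minimal sketch combining a few of them (the key path and project/dataset/table names are placeholders):

```apache
<match dummy>
  @type bigquery_insert            # or bigquery_load

  auth_method json_key
  json_key /path/to/keyfile.json   # placeholder path

  project yourproject_id
  dataset yourdataset_id
  table   tablename

  fetch_schema true                # or provide schema / schema_path instead
</match>
```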
+#### bigquery_insert
+| name | type | required? | placeholder? | default | description |
+| :------------------------------------- | :------------ | :----------- | :---------- | :------------------------- | :----------------------- |
+| template_suffix                         | string        | no           | yes         | nil                        | Can use the `%{time_slice}` placeholder, which is replaced according to `time_slice_format`. |
+| skip_invalid_rows                       | bool          | no           | no          | false                      | If true, skip rows with invalid data and insert the remaining valid rows; otherwise the whole request fails when any row is invalid. |
+| insert_id_field                         | string        | no           | no          | nil                        | Use the value of this key as the `insertId` of the Streaming Insert API. Accepts the record_accessor format; see https://docs.fluentd.org/v1.0/articles/api-plugin-helper-record_accessor |
+| add_insert_timestamp                    | string        | no           | no          | nil                        | Adds a timestamp column just before sending the rows to BigQuery, so that buffering time is not taken into account. Gives a field in BigQuery which represents the insert time of the row. |
+| allow_retry_insert_errors               | bool          | no           | no          | false                      | Retry inserting rows when an insertErrors response occurs. Rows may be inserted in duplicate. |
+
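
A sketch of how the insert-specific options above might be combined (the field names `uuid` and `insert_time` are illustrative, assuming matching record fields and table columns):

```apache
<match dummy>
  @type bigquery_insert
  ...

  skip_invalid_rows true            # skip bad rows instead of failing the whole request
  insert_id_field uuid              # assumes each record carries a "uuid" field
  add_insert_timestamp insert_time  # adds an "insert_time" column holding the insert time
  allow_retry_insert_errors false   # enabling this may insert duplicate rows
</match>
```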
+#### bigquery_load
+
+| name | type | required? | placeholder? | default | description |
+| :------------------------------------- | :------------ | :----------- | :---------- | :------------------------- | :----------------------- |
+| source_format                           | enum          | no           | no          | json                       | Specify the source format: `json`, `csv`, or `avro`. If you change this parameter, you must also change the formatter plugin via the `<format>` config section. |
+| max_bad_records | integer | no | no | 0 | If the number of bad records exceeds this value, an invalid error is returned in the job result. |
+
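For example, a load configuration that tolerates a handful of bad records might look like this sketch (with the default `json` source format, no `<format>` section is required):

```apache
<match bigquery>
  @type bigquery_load
  ...

  source_format json    # default; csv/avro require a matching <format> section
  max_bad_records 10    # the load job fails only if more than 10 rows are rejected
</match>
```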
### Buffer section
| name | type | required? | default | description |
| :------------------------------------- | :------------ | :----------- | :------------------------- | :----------------------- |
| @type | string | no | memory (insert) or file (load) | |
| chunk_limit_size | integer | no | 1MB (insert) or 1GB (load) | |
| total_limit_size | integer | no | 1GB (insert) or 32GB (load) | |
| chunk_records_limit | integer | no | 500 (insert) or nil (load) | |
| flush_mode | enum | no | interval | default, lazy, interval, immediate |
-| flush_interval | float | no | 0.25 (insert) or nil (load) | |
-| flush_thread_interval | float | no | 0.05 (insert) or nil (load) | |
-| flush_thread_burst_interval | float | no | 0.05 (insert) or nil (load) | |
+| flush_interval | float | no | 1.0 (insert) or 3600 (load) | |
+| flush_thread_interval | float | no | 0.05 (insert) or 5 (load) | |
+| flush_thread_burst_interval | float | no | 0.05 (insert) or 5 (load) | |
Other parameters (defined by the base class) are also available.
See https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/output.rb
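
For instance, a buffer section tuned for streaming inserts might look like the following sketch (the values shown are illustrative, not recommendations):

```apache
<match dummy>
  @type bigquery_insert
  ...

  <buffer>
    @type memory              # default for insert
    flush_mode interval
    flush_interval 1.0        # default for insert
    chunk_limit_size 1m       # default for insert
    total_limit_size 1g       # default for insert
    chunk_records_limit 500   # default for insert
    flush_thread_count 4
  </buffer>
</match>
```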
@@ -140,14 +149,12 @@
Configure insert specifications with the target table schema and your credentials. This is the minimum configuration:
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
- method insert # default
-
auth_method private_key # default
email xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxx@developer.gserviceaccount.com
private_key_path /home/username/.keys/00000000000000000000000000000000-privatekey.p12
# private_key_passphrase notasecret # default
@@ -179,18 +186,16 @@
For high-rate inserts over streaming inserts, you should specify flush intervals and buffer chunk options:
```apache
<match dummy>
- @type bigquery
-
- method insert # default
+ @type bigquery_insert
<buffer>
flush_interval 0.1 # flush as frequent as possible
- buffer_queue_limit 10240 # 1MB * 10240 -> 10GB!
+ total_limit_size 10g
flush_thread_count 16
</buffer>
auth_method private_key # default
@@ -254,20 +259,16 @@
section in the Google BigQuery document.
### Load
```apache
<match bigquery>
- @type bigquery
+ @type bigquery_load
- method load
-
<buffer>
- @type file
- path bigquery.*.buffer
- flush_interval 1800
- flush_at_shutdown true
- timekey_use_utc
+ path bigquery.*.buffer
+ flush_at_shutdown true
+ timekey_use_utc
</buffer>
auth_method json_key
json_key json_key_path.json
@@ -300,11 +301,11 @@
You first need to create a service account (client ID),
download its JSON key and deploy the key with fluentd.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
auth_method json_key
json_key /home/username/.keys/00000000000000000000000000000000-jsonkey.json
project yourproject_id
@@ -317,11 +318,11 @@
You can also provide `json_key` as an embedded JSON string like this.
You only need to include the `private_key` and `client_email` keys from the JSON key file.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
auth_method json_key
json_key {"private_key": "-----BEGIN PRIVATE KEY-----\n...", "client_email": "xxx@developer.gserviceaccount.com"}
project yourproject_id
@@ -338,11 +339,11 @@
In this authentication method, you need to add the API scope "https://www.googleapis.com/auth/bigquery" to the scope list of your
Compute Engine instance, then you can configure fluentd like this.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
auth_method compute_engine
project yourproject_id
dataset yourdataset_id
@@ -380,11 +381,11 @@
For example, with the configuration below,
data is inserted into tables `accesslog_2014_08`, `accesslog_2014_09` and so on.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
project yourproject_id
dataset yourdataset_id
@@ -428,11 +429,11 @@
Use a placeholder.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
table accesslog$%Y%m%d
<buffer time>
@@ -451,11 +452,11 @@
NOTE: The `auto_create_table` option cannot be used with `fetch_schema`. You should create the table in advance to use `fetch_schema`.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
auto_create_table true
table accesslog_%Y_%m
@@ -475,11 +476,11 @@
The examples above use the first method. In this method,
you can also specify nested fields by prefixing them with the names of the record fields they belong to.
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
schema [
{"name": "time", "type": "INTEGER"},
@@ -526,11 +527,11 @@
The second method is to specify a path to a BigQuery schema file instead of listing fields. In this case, your fluent.conf looks like:
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
schema_path /path/to/httpd.schema
</match>
@@ -539,11 +540,11 @@
The third method is to set `fetch_schema` to `true` to fetch the schema using the BigQuery API. In this case, your fluent.conf looks like:
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
fetch_schema true
# fetch_schema_table other_table # if you want to fetch schema from other table
@@ -557,13 +558,15 @@
### Specifying insertId property
BigQuery uses the `insertId` property to detect duplicate insertion requests (see [data consistency](https://cloud.google.com/bigquery/streaming-data-into-bigquery#dataconsistency) in the Google BigQuery documentation).
You can set the `insert_id_field` option to specify the field to use as the `insertId` property.
+`insert_id_field` accepts the fluentd record_accessor format, such as `$['key1'][0]['key2']`
+(see https://docs.fluentd.org/v1.0/articles/api-plugin-helper-record_accessor for details).
```apache
<match dummy>
- @type bigquery
+ @type bigquery_insert
...
insert_id_field uuid
schema [{"name": "uuid", "type": "STRING"}]
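  # Note (illustrative): insert_id_field also accepts the record_accessor syntax for nested keys,
  # e.g. insert_id_field $['data']['uuid'] if the uuid were nested under a "data" field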