README.md in fluent-plugin-bigquery-0.0.2 vs README.md in fluent-plugin-bigquery-0.0.3

- old
+ new

@@ -7,26 +7,26 @@
   * https://developers.google.com/bigquery/streaming-data-into-bigquery#usecases
 * (NOT IMPLEMENTED) load data
   * for data loading as batch jobs, for big amount of data
   * https://developers.google.com/bigquery/loading-data-into-bigquery
 
-Current version of this plugin supports Google API with Service Account Authentication, and does not support OAuth.
+Current version of this plugin supports Google API with Service Account Authentication, but does not support
+OAuth flow for installed applications.
 
 ## Configuration
 
 ### Streming inserts
 
-For service account authentication, generate service account private key file and email key, then upload private key file onto your server.
-
 Configure insert specifications with target table schema, with your credentials. This is minimum configurations:
 
 ```apache
 <match dummy>
   type bigquery
 
   method insert    # default
 
+  auth_method private_key   # default
   email xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxx@developer.gserviceaccount.com
   private_key_path /home/username/.keys/00000000000000000000000000000000-privatekey.p12
   # private_key_passphrase notasecret # default
 
   project yourproject_id
@@ -56,10 +56,11 @@
   buffer_chunk_records_limit 300  # default rate limit for users is 100
   buffer_queue_limit 10240        # 1MB * 10240 -> 10GB!
 
   num_threads 16
 
+  auth_method private_key   # default
   email xxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxx@developer.gserviceaccount.com
   private_key_path /home/username/.keys/00000000000000000000000000000000-privatekey.p12
   # private_key_passphrase notasecret # default
 
   project yourproject_id
@@ -97,9 +98,44 @@
     * 10 or more threads seems good for inserts over internet
     * less threads may be good for Google Compute Engine instances (with low latency for BigQuery)
   * `flush_interval`
     * `1` is lowest value, without patches on Fluentd v0.10.41 or earlier
     * see `patches` below
+
+### Authentication
+
+There are two methods supported to fetch access token for the service account.
+1. Public-Private key pair
+2. Predefined access token (Compute Engine only)
+
+The examples above use the first one. You first need to create a service account (client ID),
+download its private key and deploy the key with fluentd.
+
+On the other hand, you don't need to explicitly create a service account for fluentd when you
+run fluentd in Google Compute Engine. In this second authentication method, you need to
+add the API scope "https://www.googleapis.com/auth/bigquery" to the scope list of your
+Compute Engine instance, then you can configure fluentd like this.
+
+```apache
+<match dummy>
+  type bigquery
+
+  auth_method compute_engine
+
+  project yourproject_id
+  dataset yourdataset_id
+  table   tablename
+
+  time_format %s
+  time_field  time
+
+  field_integer time,status,bytes
+  field_string  rhost,vhost,path,method,protocol,agent,referer
+  field_float   requestime
+  field_boolean bot_access,loginsession
+</match>
+```
+
 
 ### patches
 
 This plugin depends on `fluent-plugin-buffer-lightening`, and it includes monkey patch module for BufferedOutput plugin, to realize high rate and low latency flushing. With this patch, sub 1 second flushing available.
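
Reviewer's note on the new `auth_method` parameter: the two configurations in this diff suggest that `private_key` needs `email` and `private_key_path`, while `compute_engine` needs neither. The Ruby sketch below illustrates that rule as a standalone check. It is not code from the plugin: `REQUIRED_BY_AUTH_METHOD` and `missing_auth_params` are hypothetical names, and the required-key lists are an assumption inferred only from the README examples.

```ruby
# Hypothetical helper, NOT part of fluent-plugin-bigquery.
# The required keys per auth_method are inferred from the README examples
# in this diff and may not match the plugin's actual validation.
REQUIRED_BY_AUTH_METHOD = {
  "private_key"    => %w[email private_key_path],
  "compute_engine" => []
}.freeze

# Returns the names of parameters that the chosen auth_method appears to
# need but that are absent from conf (a Hash with String keys).
def missing_auth_params(conf)
  # The README marks private_key as the default auth_method.
  auth_method = conf.fetch("auth_method", "private_key")
  required = REQUIRED_BY_AUTH_METHOD.fetch(auth_method) do
    raise ArgumentError, "unknown auth_method: #{auth_method}"
  end
  required.reject { |key| conf.key?(key) }
end
```

For example, `missing_auth_params({ "auth_method" => "compute_engine" })` returns an empty array, while an empty configuration reports both `email` and `private_key_path` as missing because it falls back to `private_key`.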