README.md in embulk-output-redshift-0.4.0 vs README.md in embulk-output-redshift-0.4.1
- old
+ new
@@ -1,97 +1,104 @@
-# Redshift output plugins for Embulk
-
-Redshift output plugins for Embulk loads records to Redshift.
-
-## Overview
-
-* **Plugin type**: output
-* **Load all or nothing**: depnds on the mode. see bellow.
-* **Resume supported**: depnds on the mode. see bellow.
-
-## Configuration
-
-- **host**: database host name (string, required)
-- **port**: database port number (integer, default: 5439)
-- **user**: database login user name (string, required)
-- **password**: database login password (string, default: "")
-- **database**: destination database name (string, required)
-- **schema**: destination schema name (string, default: "public")
-- **table**: destination table name (string, required)
-- **options**: extra connection properties (hash, default: {})
-- **mode**: "replace" or "insert" (string, required)
-- **batch_size**: size of a single batch insert (integer, default: 16777216)
-- **default_timezone**: If input column type (embulk type) is timestamp, this plugin needs to format the timestamp into a SQL string. This default_timezone option is used to control the timezone. You can overwrite timezone for each columns using column_options option. (string, default: `UTC`)
-- **column_options**: advanced: a key-value pairs where key is a column name and value is options for the column.
- - **type**: type of a column when this plugin creates new tables (e.g. `VARCHAR(255)`, `INTEGER NOT NULL UNIQUE`). This used when this plugin creates intermediate tables (insert, truncate_insert and merge modes), when it creates the target table (insert_direct and replace modes), and when it creates nonexistent target table automatically. (string, default: depends on input column type. `BIGINT` if input column type is long, `BOOLEAN` if boolean, `DOUBLE PRECISION` if double, `CLOB` if string, `TIMESTAMP` if timestamp)
- - **value_type**: This plugin converts input column type (embulk type) into a database type to build a INSERT statement. This value_type option controls the type of the value in a INSERT statement. (string, default: depends on input column type. Available values options are: `byte`, `short`, `int`, `long`, `double`, `float`, `boolean`, `string`, `nstring`, `date`, `time`, `timestamp`, `decimal`, `null`, `pass`)
- - **timestamp_format**: If input column type (embulk type) is timestamp and value_type is `string` or `nstring`, this plugin needs to format the timestamp value into a string. This timestamp_format option is used to control the format of the timestamp. (string, default: `%Y-%m-%d %H:%M:%S.%6N`)
- - **timezone**: If input column type (embulk type) is timestamp, this plugin needs to format the timestamp value into a SQL string. In this cases, this timezone option is used to control the timezone. (string, value of default_timezone option is used by default)
-
-### Modes
-
-* **insert**:
- * Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs `INSERT INTO <target_table> SELECT * FROM <intermediate_table_1> UNION ALL SELECT * FROM <intermediate_table_2> UNION ALL ...` query.
- * Transactional: Yes. This mode successfully writes all rows, or fails with writing zero rows.
- * Resumable: Yes.
-* **insert_direct**:
- * Behavior: This mode inserts rows to the target table directly.
- * Transactional: No. If fails, the target table could have some rows inserted.
- * Resumable: No.
-* **truncate_insert**:
- * Behavior: Same with `insert` mode excepting that it truncates the target table right before the last `INSERT ...` query.
- * Transactional: Yes.
- * Resumable: Yes.
-* **merge**:
- * Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs `with updated AS (UPDATE .... RETURNING ...) INSERT INTO ....` query.
- * Transactional: Yes.
- * Resumable: Yes.
-* **replace**:
- * Behavior: Same with `insert` mode excepting that it truncates the target table right before the last `INSERT ...` query.
- * Transactional: Yes.
- * Resumable: No.
-
-### Example
-
-```yaml
-out:
- type: redshift
- host: myinstance.us-west-2.redshift.amazonaws.com
- user: pg
- password: ""
- database: my_database
- table: my_table
- access_key_id: ABCXYZ123ABCXYZ123
- secret_access_key: AbCxYz123aBcXyZ123
- s3_bucket: my-redshift-transfer-bucket
- iam_user_name: my-s3-read-only
- mode: insert
-```
-
-Advanced configuration:
-
-```yaml
-out:
- type: redshift
- host: myinstance.us-west-2.redshift.amazonaws.com
- user: pg
- password: ""
- database: my_database
- table: my_table
- access_key_id: ABCXYZ123ABCXYZ123
- secret_access_key: AbCxYz123aBcXyZ123
- s3_bucket: my-redshift-transfer-bucket
- iam_user_name: my-s3-read-only
- options: {loglevel: 2}
- mode: insert_direct
- column_options:
- my_col_1: {type: 'VARCHAR(255)'}
- my_col_3: {type: 'INT NOT NULL'}
- my_col_4: {value_type: string, timestamp_format: `%Y-%m-%d %H:%M:%S %z`, timezone: '-0700'}
- my_col_5: {type: 'DECIMAL(18,9)', value_type: pass}
-```
-
-### Build
-
-```
-$ ./gradlew gem
-```
+# Redshift output plugins for Embulk
+
+Redshift output plugins for Embulk loads records to Redshift.
+
+## Overview
+
+* **Plugin type**: output
+* **Load all or nothing**: depnds on the mode. see bellow.
+* **Resume supported**: depnds on the mode. see bellow.
+
+## Configuration
+
+- **host**: database host name (string, required)
+- **port**: database port number (integer, default: 5439)
+- **user**: database login user name (string, required)
+- **password**: database login password (string, default: "")
+- **database**: destination database name (string, required)
+- **schema**: destination schema name (string, default: "public")
+- **table**: destination table name (string, required)
+- **access_key_id**: access key id for AWS
+- **secret_access_key: secret access key for AWS
+- **iam_user_name: IAM user name for uploading temporary files to S3. The user should have permissions of `s3:GetObject`, `s3:PutObject`, `s3:ListBucket` and `sts:GetFederationToken`.
+- **s3_bucket: S3 bucket name for temporary files
+- **s3_key_prefix: S3 key prefix for temporary files (string, default:"")
+- **options**: extra connection properties (hash, default: {})
+- **mode**: "replace" or "insert" (string, required)
+- **batch_size**: size of a single batch insert (integer, default: 16777216)
+- **default_timezone**: If input column type (embulk type) is timestamp, this plugin needs to format the timestamp into a SQL string. This default_timezone option is used to control the timezone. You can overwrite timezone for each columns using column_options option. (string, default: `UTC`)
+- **column_options**: advanced: a key-value pairs where key is a column name and value is options for the column.
+ - **type**: type of a column when this plugin creates new tables (e.g. `VARCHAR(255)`, `INTEGER NOT NULL UNIQUE`). This used when this plugin creates intermediate tables (insert, truncate_insert and merge modes), when it creates the target table (insert_direct and replace modes), and when it creates nonexistent target table automatically. (string, default: depends on input column type. `BIGINT` if input column type is long, `BOOLEAN` if boolean, `DOUBLE PRECISION` if double, `CLOB` if string, `TIMESTAMP` if timestamp)
+ - **value_type**: This plugin converts input column type (embulk type) into a database type to build a INSERT statement. This value_type option controls the type of the value in a INSERT statement. (string, default: depends on input column type. Available values options are: `byte`, `short`, `int`, `long`, `double`, `float`, `boolean`, `string`, `nstring`, `date`, `time`, `timestamp`, `decimal`, `null`, `pass`)
+ - **timestamp_format**: If input column type (embulk type) is timestamp and value_type is `string` or `nstring`, this plugin needs to format the timestamp value into a string. This timestamp_format option is used to control the format of the timestamp. (string, default: `%Y-%m-%d %H:%M:%S.%6N`)
+ - **timezone**: If input column type (embulk type) is timestamp, this plugin needs to format the timestamp value into a SQL string. In this cases, this timezone option is used to control the timezone. (string, value of default_timezone option is used by default)
+
+### Modes
+
+* **insert**:
+ * Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs `INSERT INTO <target_table> SELECT * FROM <intermediate_table_1> UNION ALL SELECT * FROM <intermediate_table_2> UNION ALL ...` query.
+ * Transactional: Yes. This mode successfully writes all rows, or fails with writing zero rows.
+ * Resumable: Yes.
+* **insert_direct**:
+ * Behavior: This mode inserts rows to the target table directly.
+ * Transactional: No. If fails, the target table could have some rows inserted.
+ * Resumable: No.
+* **truncate_insert**:
+ * Behavior: Same with `insert` mode excepting that it truncates the target table right before the last `INSERT ...` query.
+ * Transactional: Yes.
+ * Resumable: Yes.
+* **merge**:
+ * Behavior: This mode writes rows to some intermediate tables first. If all those tasks run correctly, runs `with updated AS (UPDATE .... RETURNING ...) INSERT INTO ....` query.
+ * Transactional: Yes.
+ * Resumable: Yes.
+* **replace**:
+ * Behavior: Same with `insert` mode excepting that it truncates the target table right before the last `INSERT ...` query.
+ * Transactional: Yes.
+ * Resumable: No.
+
+### Example
+
+```yaml
+out:
+ type: redshift
+ host: myinstance.us-west-2.redshift.amazonaws.com
+ user: pg
+ password: ""
+ database: my_database
+ table: my_table
+ access_key_id: ABCXYZ123ABCXYZ123
+ secret_access_key: AbCxYz123aBcXyZ123
+ iam_user_name: my-s3-read-only
+ s3_bucket: my-redshift-transfer-bucket
+ s3_key_prefix: temp/redshift
+ mode: insert
+```
+
+Advanced configuration:
+
+```yaml
+out:
+ type: redshift
+ host: myinstance.us-west-2.redshift.amazonaws.com
+ user: pg
+ password: ""
+ database: my_database
+ table: my_table
+ access_key_id: ABCXYZ123ABCXYZ123
+ secret_access_key: AbCxYz123aBcXyZ123
+ iam_user_name: my-s3-read-only
+ s3_bucket: my-redshift-transfer-bucket
+ s3_key_prefix: temp/redshift
+ options: {loglevel: 2}
+ mode: insert_direct
+ column_options:
+ my_col_1: {type: 'VARCHAR(255)'}
+ my_col_3: {type: 'INT NOT NULL'}
+ my_col_4: {value_type: string, timestamp_format: `%Y-%m-%d %H:%M:%S %z`, timezone: '-0700'}
+ my_col_5: {type: 'DECIMAL(18,9)', value_type: pass}
+```
+
+### Build
+
+```
+$ ./gradlew gem
+```