# TD output plugin for Embulk

TODO: Write short description here

## Overview

* **Plugin type**: output
* **Load all or nothing**: yes
* **Resume supported**: no

## Configuration

- **apikey**: API key (string, required)
- **endpoint**: API endpoint hostname (string, default='api.treasuredata.com')
- **http_proxy**: HTTP proxy configuration (tuple of host, port and use_ssl, default is null)
- **use_ssl**: whether to connect over SSL (boolean, default=true)
- **auto_create_table**: whether to create the database and/or the table if they don't exist (boolean, default=true)
- **mode**: one of 'append', 'replace' or 'truncate' (string, default='append')
- **database**: database name (string, required)
- **table**: table name (string, required)
- **session**: bulk_import session name (string, optional)
- **time_column**: user-defined time column (string, optional)
- **unix_timestamp_unit**: if the type of the "time" column or of **time_column** is long, the value is treated as a UNIX timestamp. This option specifies its unit: sec, milli, micro or nano (enum, default: `sec`)
- **tmpdir**: temporary directory
- **upload_concurrency**: upload concurrency (int, default=2). The maximum concurrency is 8.
- **file_split_size**: split size in KB (long, default=16384 (16MB)).
- **stop_on_invalid_record**: stop the bulk load transaction if a file includes an invalid record (such as an invalid timestamp) (boolean, default=false).
- **displayed_error_records_count_limit**: maximum number of error records to display among those skipped by the perform job (int, default=10).
- **default_timestamp_type_convert_to**: output type of timestamp columns. Available options are "sec" (convert the timestamp to a UNIX timestamp in seconds) and "string" (convert the timestamp to a string). (string, default: `"string"`)
- **default_timezone**: default timezone (string, default='UTC')
- **default_timestamp_format**: default timestamp format (string, default=`%Y-%m-%d %H:%M:%S.%6N`)
- **column_options**: advanced: key-value pairs where the key is a column name and the value is the options for that column.
  - **timezone**: If the input column type (embulk type) is timestamp, this plugin needs to format the timestamp value into a SQL string. In this case, the timezone option controls the timezone. (string, value of default_timezone option is used by default)
  - **format**: If the input column type (embulk type) is timestamp, this plugin needs to format the timestamp value into a string. The format option controls the format of the timestamp. (string, value of default_timestamp_format option is used by default)

## Modes

* **append**:
  - Uploads data directly to the existing table.
* **replace**:
  - Creates a new temp table and uploads data to the temp table first.
  - After the upload finishes, the table specified by the 'table' option is replaced with the temp table.
  - The schema of the existing table is not migrated to the replaced table.
* **truncate**:
  - Creates a new temp table and uploads data to the temp table first.
  - After the upload finishes, the table specified by the 'table' option is replaced with the temp table.
  - The schema of the existing table is added to the replaced table.

## Example

Here is a sample configuration for the TD output plugin.

```yaml
out:
  type: td
  apikey:
  endpoint: api.treasuredata.com
  database: my_db
  table: my_table
  time_column: created_at
```

## Install

```
$ embulk gem install embulk-output-td
```

### HTTP Proxy Configuration

If you want to add your HTTP proxy configuration, you can use the `http_proxy` parameter:

```yaml
out:
  type: td
  apikey:
  endpoint: api.treasuredata.com
  http_proxy: {host: localhost, port: 8080, use_ssl: false}
  database: my_db
  table: my_table
  time_column: created_at
```

## Build

### Build by Gradle

```
$ git clone https://github.com/treasure-data/embulk-output-td.git
$ cd embulk-output-td
$ ./gradlew gem classpath
```

### Run on Embulk

    $ bin/embulk run -I embulk-output-td/lib/ config.yml
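
As a rough sketch of what the `out:` part of the `config.yml` passed to the command above might contain, the fragment below combines the `mode`, `default_timezone` and `column_options` settings described under Configuration. The `in:` section is omitted, and the API key placeholder, the database and table names, and the `updated_at` column are illustrative assumptions, not values from a real setup.

```yaml
# Sketch only: the in: section is omitted and all values are placeholders.
out:
  type: td
  apikey: YOUR_API_KEY            # hypothetical placeholder, not a real key
  endpoint: api.treasuredata.com
  database: my_db
  table: my_table
  mode: replace                   # upload to a temp table, then swap it in
  time_column: created_at
  default_timezone: UTC           # used for timestamp columns without an override
  column_options:
    updated_at:                   # hypothetical timestamp column
      timezone: Asia/Tokyo        # overrides default_timezone for this column
      format: '%Y-%m-%d %H:%M:%S' # overrides default_timestamp_format
```

With `mode: replace` the existing table's schema is not carried over, so `truncate` may be the better fit when the schema should be kept.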