Sha256: e89f1e68a0a3076926da83879574a86efbbc410694dee23f7e77a0de6001b7b9
Contents?: true
Size: 1.66 KB
Versions: 3
Compression:
Stored size: 1.66 KB
Contents
# HDFS file output plugin for Embulk

A file output plugin for Embulk that writes to HDFS.

## Overview

* **Plugin type**: file output
* **Load all or nothing**: yes
* **Resume supported**: no
* **Cleanup supported**: no

## Configuration

- **config_files** list of paths to Hadoop's configuration files (array of strings, default: `[]`)
- **config** overrides individual Hadoop configuration parameters (hash, default: `{}`)
- **path_prefix** prefix of target files (string, required)
- **file_ext** suffix of target files (string, required)
- **sequence_format** format for the sequence part of target file names (string, default: `'.%03d.%02d'`)
- **rewind_seconds** when `path_prefix` contains a date format (like `/tmp/embulk/%Y-%m-%d/out`), the format is evaluated using the current time minus this number of seconds (int, default: `0`)
- **overwrite** overwrite files when files with the same names already exist (boolean, default: `false`)
  - *caution*: setting this property to `true` does not by itself guarantee idempotence. To make a run idempotent, you need a procedure that removes the output files before or after running.

## Example

```yaml
out:
  type: hdfs
  config_files:
    - /etc/hadoop/conf/core-site.xml
    - /etc/hadoop/conf/hdfs-site.xml
  config:
    fs.defaultFS: 'hdfs://hdp-nn1:8020'
    fs.hdfs.impl: 'org.apache.hadoop.hdfs.DistributedFileSystem'
    fs.file.impl: 'org.apache.hadoop.fs.LocalFileSystem'
  path_prefix: '/tmp/embulk/hdfs_output/%Y-%m-%d/out'
  file_ext: 'txt'
  overwrite: true
  formatter:
    type: csv
    encoding: UTF-8
```

## Build

```
$ ./gradlew gem
```

## Development

```
$ ./gradlew classpath
$ bundle exec embulk run -I lib example.yml
```
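As a rough illustration of how `path_prefix`, `rewind_seconds`, `sequence_format` and `file_ext` might combine into a target file name, here is a minimal Python sketch. It is not the plugin's actual code. The helper `build_output_path` is hypothetical, and two details are assumptions rather than documented behavior: that the two numbers in `sequence_format` are a task index and a per-task file index, and that a `.` is inserted before `file_ext` when it is missing.

```python
from datetime import datetime, timedelta


def build_output_path(path_prefix, file_ext, task_index, file_index,
                      sequence_format=".%03d.%02d", rewind_seconds=0, now=None):
    """Hypothetical helper: compose a target file path from the plugin options."""
    # Evaluate the date format against "now minus rewind_seconds".
    base_time = (now or datetime.now()) - timedelta(seconds=rewind_seconds)
    prefix = base_time.strftime(path_prefix)

    # Assumption: the two sequence fields are the task index and the file index.
    sequence = sequence_format % (task_index, file_index)

    # Assumption: a dot separates the sequence part from the extension.
    suffix = file_ext if file_ext.startswith(".") else "." + file_ext
    return prefix + sequence + suffix


# A run at 00:30 on 2016-01-02 with a one-hour rewind lands in the previous day's directory.
path = build_output_path(
    "/tmp/embulk/hdfs_output/%Y-%m-%d/out", "txt",
    task_index=0, file_index=0,
    rewind_seconds=3600, now=datetime(2016, 1, 2, 0, 30),
)
print(path)
```

Under these assumptions, `rewind_seconds` is useful for jobs that start shortly after midnight but should still write into the directory for the day whose data they process.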
Version data entries
3 entries across 3 versions & 1 rubygems
Version | Path |
---|---|
embulk-output-hdfs-0.2.2 | README.md |
embulk-output-hdfs-0.2.1 | README.md |
embulk-output-hdfs-0.2.0 | README.md |