Sha256: 9a28550b53f85b23b736acb994a48647444b5657e9d3db0a1ef916865c9a1a75
Contents?: true
Size: 1.49 KB
Versions: 3
Compression:
Stored size: 1.49 KB
Contents
# HDFS file input plugin for Embulk

Reads files on HDFS.

## Overview

* **Plugin type**: file input
* **Resume supported**: not yet
* **Cleanup supported**: no

## Configuration

- **config_files**: list of paths to Hadoop's configuration files (array of strings, default: `[]`)
- **config**: overrides individual Hadoop configuration parameters (hash, default: `{}`)
- **input_path**: file path on HDFS. You can use glob patterns and date format directives such as `%Y%m%d/%s`.
- **rewind_seconds**: when `input_path` contains date format directives, they are expanded using the current time minus this number of seconds.

## Example

```yaml
in:
  type: hdfs
  config_files:
    - /opt/analytics/etc/hadoop/conf/core-site.xml
    - /opt/analytics/etc/hadoop/conf/hdfs-site.xml
  config:
    fs.defaultFS: 'hdfs://hdp-nn1:8020'
    dfs.replication: 1
    fs.hdfs.impl: 'org.apache.hadoop.hdfs.DistributedFileSystem'
    fs.file.impl: 'org.apache.hadoop.fs.LocalFileSystem'
  input_path: /user/embulk/test/%Y-%m-%d/*
  rewind_seconds: 86400
  decoders:
    - {type: gzip}
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: "\t"
    quote: ''
    escape: ''
    trim_if_not_quoted: true
    skip_header_lines: 0
    allow_extra_columns: true
    allow_optional_columns: true
    columns:
      - {name: c0, type: string}
      - {name: c1, type: string}
      - {name: c2, type: string}
      - {name: c3, type: long}
```

## Build

```
$ ./gradlew gem
```

## Development

```
$ ./gradlew classpath
$ bundle exec embulk run -I lib example.yml
```
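As a rough illustration of how `input_path` and `rewind_seconds` from the Configuration section interact, the following Python sketch expands the date directives using the current time minus `rewind_seconds`. The function name `resolve_input_path` is hypothetical and is not part of the plugin; the plugin's own implementation may differ in details such as time zone handling, and glob expansion (the trailing `*`) happens separately against HDFS.

```python
from datetime import datetime, timedelta


def resolve_input_path(input_path: str, rewind_seconds: int, now: datetime) -> str:
    """Expand strftime-style directives in input_path.

    Hypothetical helper for illustration only: the reference time is
    `now` minus `rewind_seconds`, as described for rewind_seconds.
    """
    target = now - timedelta(seconds=rewind_seconds)
    # Literal characters (including the glob '*') pass through unchanged.
    return target.strftime(input_path)


if __name__ == "__main__":
    # A run shortly after midnight with rewind_seconds: 86400
    # resolves to the previous day's directory.
    now = datetime(2015, 3, 2, 0, 30, 0)
    print(resolve_input_path("/user/embulk/test/%Y-%m-%d/*", 86400, now))
    # /user/embulk/test/2015-03-01/*
```

With the values from the example configuration, a job started just after midnight therefore reads the directory for the day that has just ended rather than the still-empty directory for the new day.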
Version data entries
3 entries across 3 versions & 1 rubygems
Version | Path |
---|---|
embulk-input-hdfs-0.0.3 | README.md |
embulk-input-hdfs-0.0.2 | README.md |
embulk-input-hdfs-0.0.1 | README.md |