# Csv2Hash [![Code Climate](https://codeclimate.com/github/joel/csv2hash.png)](https://codeclimate.com/github/joel/csv2hash) [![Dependency Status](https://gemnasium.com/joel/csv2hash.png)](https://gemnasium.com/joel/csv2hash) [![Build Status](https://travis-ci.org/joel/csv2hash.png?branch=master)](https://travis-ci.org/joel/csv2hash) (Travis CI) [![Coverage Status](https://coveralls.io/repos/joel/csv2hash/badge.png)](https://coveralls.io/r/joel/csv2hash) It is a DSL to validate and map a CSV to a Ruby Hash. ## Installation Add this line to your application's Gemfile: gem 'csv2hash' And then execute: $ bundle Or install it yourself as: $ gem install csv2hash ## Usage Parsing is based on rules, you must defined rules of parsing ### Rules You should declare a definition for you CSV, and then define for each cell what you would expect. Example : If you want the very first cell, located on the first line and on the first column to be a string with values are either 'yes' either 'no', you can write the following validation rule: { name: 'aswering', type: 'string', values: ['yes', 'no'], position: [0,0] } :type attribute has 'string' for default value, therefore you can just write this: { name: 'aswering', values: ['yes', 'no'], position: [0,0] } You can define you own message but default message is 'undefined :key on :position' { name: 'aswering', values: ['yes', 'no'], position: [0,0], message: 'this value is not supported' } You can also define Range of values { name: 'score', values: 0..5, position: [0,0] } The message is parsed: { ..., message: 'value of :name is not supported, please you one of :values' } It produces : value of answering is not supported, please use one of [yes, no] ### Default values Only position is required: * :position All remaining keys are optionals: * message: 'undefined :key on :position' * mappable: true * type: 'string' * values: nil * nested: nil * allow_blank: false * extra_validator: nil ## Define where your data is expected **IMPORTANT!** Position mean [Y, X], where Y is rows, X columns A definition should be provided. There are 2 types of definitions: * search for data at a precise position in the table: `y,x` * or search for data in a column of rows, where all the rows are the same: `x` (column index) ## Samples ### Validation of cells with defined precision Consider the following CSV: | Fields | Person Informations | Optional | |-------------|----------------------|----------| | Nickname | jo | no | | First Name | John | yes | | Last Name | Doe | yes | Precise position validation sample: class MyParser attr_accessor :file_path def initialize file_path @file_path = file_path end def data @data_wrapper ||= Csv2hash.new(definition, file_path).parse end private def rules [].tap do |mapping| mapping << { position: [2,1], key: 'first_name' } mapping << { position: [3,1], key: 'last_name' } end end def definition Csv2Hash::Definition.new(rules, type = Csv2Hash::Definition::MAPPING, header_size: 1) end end ### Validation of a collection (Regular CSV) Consider the following CSV: | Nickname | First Name | Last Name | |----------|------------|-----------| | jo | John | Doe | | ja | Jane | Doe | Collection validation sample: class MyParser attr_accessor :file_path def initialize file_path @file_path = file_path end def data @data_wrapper ||= Csv2hash.new(definition, file_path).parse end private def rules [].tap do |mapping| mapping << { position: 0, key: 'nickname' } mapping << { position: 1, key: 'first_name' } mapping << { position: 2, key: 'last_name' } end end def definition Csv2Hash::Definition.new(rules, type = Csv2Hash::Definition::COLLECTION, header_size: 1) end end ### Structure validation rules You may want to validate some structure, like min or max number of columns, definition accepts structure_rules as a key for the third parameter. Current validations are: MinColumn, MaxColumn class MyParser attr_accessor :file_path def initialize file_path @file_path = file_path end def data @data_wrapper ||= Csv2hash.new(definition, file_path).parse end private def rules [].tap do |mapping| mapping << { position: 0, key: 'nickname' } mapping << { position: 1, key: 'first_name' } mapping << { position: 2, key: 'last_name' } end end def definition Csv2Hash::Definition.new(rules, type = Csv2Hash::Definition::COLLECTION, structure_rules: { 'MinColumns' => 2, 'MaxColumns' => 3 }) end end ### CSV Headers You can define the number of rows to skip in the header of the CSV. Definition.new(rules, type, header_size: 0) ### Parser and configuration Pasrer can take several parameters like that: definition, file_path, exception_mode=true, data_source=nil, ignore_blank_line=false you can pass directly Array of data (Array at 2 dimensions) really useful for testing, if you don't care about blank lines in your CSV you can ignore them. ### Response The parser return values wrapper into DataWrapper Object, you can call ```.valid?``` method on this Object and grab either data or errors like that : response = parser.parse if response.valid? response.data else response.errors end data or errors are Array, but errors can be formatted on csv format with .to_csv call response.errors.to_csv ## Exception or Not ! You can choose into 2 different modes of parsing, either **exception mode** for raise exception in first breaking rules or **csv mode** for get csv original data + errors throwing into added columns. ### On **CSV MODE** you can choose different way for manage errors `.parse()` return `data_wrapper` if `.parse()` is invalid, you can code your own behavior: in your code parser = Csv2hash.new(definition, 'file_path').new response = parser.parse return response if response.valid? # Whatever In the same time Csv2hash call **notify(response)** method when CSV parsing fail, you can add your own Notifier: module Csv2hash module Plugins class Notifier def initialize csv2hash csv2hash.notifier.extend NotifierWithEmail end module NotifierWithEmail def notify response filename = 'issues_errors.csv' tempfile = Tempfile.new [filename, File.extname(filename)] File.open(tempfile.path, 'wb') { |file| file.write response.errors.to_csv } # Send mail with csv file + errors and free resource tempfile.unlink end end end end end Or other implementation ### Errors Format errors is a Array of Hash { y: 1, x: 0, message: 'message', key: 'key', value: '' } ## Sample ### Csv data | Fields | Person Informations | |-------------|----------------------| | Nickname | nil | ### Rule { position: [1,1], key: 'nickname', allow_blank: false } ### Error { y: 1, x: 1, message: 'undefined nikcname on [0, 0]', key: 'nickname', value: nil } ## Personal Validator Rule You can define your own Validator For downcase validation class DowncaseValidator < Csv2hash::ExtraValidator def valid? value !!(value.match /^[a-z]+$/) end end in your rule { position: [0,0], key: 'name', extra_validator: DowncaseValidator.new, message: 'your data should be written in lowercase only.' } Csv data [ [ 'Foo' ] ] # Upgrading from 0.1 to 0.2 The signature of Definition#new has changed, the last parameter is a configuration hash, while in versions prior to 0.2 it was an integer (header_size) consider upgrading your code : Prior to 0.2 : ``` Csv2Hash::Definition.new(rules, type = Csv2Hash::Definition::COLLECTION, 1) ``` Starting from 0.2 : ``` Csv2Hash::Definition.new(rules, type = Csv2Hash::Definition::COLLECTION, header_size: 1) ``` If no configuration is passed, header_size defaults remains to 0 ## Contributing 1. Fork it 2. Create your feature branch (`git checkout -b my-new-feature`) 3. Commit your changes (`git commit -am 'Added some feature'`) 4. Push to the branch (`git push origin my-new-feature`) 5. Create new Pull Request