# Burner

[![Gem Version](https://badge.fury.io/rb/burner.svg)](https://badge.fury.io/rb/burner)
[![Build Status](https://travis-ci.org/bluemarblepayroll/burner.svg?branch=master)](https://travis-ci.org/bluemarblepayroll/burner)
[![Maintainability](https://api.codeclimate.com/v1/badges/dbc3757929b67504f6ca/maintainability)](https://codeclimate.com/github/bluemarblepayroll/burner/maintainability)
[![Test Coverage](https://api.codeclimate.com/v1/badges/dbc3757929b67504f6ca/test_coverage)](https://codeclimate.com/github/bluemarblepayroll/burner/test_coverage)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

This library serves as the skeleton for a processing engine. It allows you to organize your code into jobs, then stitch those jobs together as steps.

## Installation

To install through Rubygems:

````bash
gem install burner
````

You can also add this to your Gemfile:

````bash
bundle add burner
````

## Examples

The purpose of this library is to provide a framework for creating highly de-coupled functions (known as jobs), and then allow for the stitching of them back together in any arbitrary order (known as steps). Although our example will be somewhat specific and contrived, the only limit to what jobs do, and the order they run in, is your imagination.

### JSON-to-YAML File Converter

All the jobs for this example ship with this library. In this example, we will write a pipeline that can read a JSON file and convert it to YAML. Pipelines are data-first, so we can represent a pipeline using a hash:

````ruby
pipeline = {
  jobs: [
    {
      name: :read,
      type: 'io/read',
      path: '{input_file}'
    },
    {
      name: :output_id,
      type: :echo,
      message: 'The job id is: {__id}'
    },
    {
      name: :output_value,
      type: :echo,
      message: 'The current value is: {__value}'
    },
    {
      name: :parse,
      type: 'deserialize/json'
    },
    {
      name: :convert,
      type: 'serialize/yaml'
    },
    {
      name: :write,
      type: 'io/write',
      path: '{output_file}'
    }
  ],
  steps: %i[
    read
    output_id
    output_value
    parse
    convert
    output_value
    write
  ]
}

params = {
  input_file: 'input.json',
  output_file: 'output.yaml'
}

payload = Burner::Payload.new(params: params)
````

Assuming we are running this script from a directory where an `input.json` file exists, we can then programmatically process the pipeline:

````ruby
Burner::Pipeline.make(pipeline).execute(payload: payload)
````

We should now see an `output.yaml` file created.

Some notes:

* Some values can be string-interpolated using the provided `Payload#params`. This allows for passing runtime configuration/data into pipelines/jobs.
* The job's ID can be accessed using the `__id` key.
* The current job's payload value can be accessed using the `__value` key.
* Jobs can be re-used (just like the `output_id` and `output_value` jobs).
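Because the pipeline is plain data, the same definition can be executed again with a different set of params. A minimal sketch, reusing the `pipeline` hash above and assuming a second, hypothetical `other.json` file exists:

````ruby
# Re-run the identical pipeline against a different pair of files by
# swapping out the interpolated params. (File names here are hypothetical.)
other_payload = Burner::Payload.new(
  params: {
    input_file: 'other.json',
    output_file: 'other.yaml'
  }
)

Burner::Pipeline.make(pipeline).execute(payload: other_payload)
````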
### Capturing Feedback / Output

By default, output will be emitted to `$stdout`. You can add or change listeners by passing optional values into `Pipeline#execute`. For example, say we wanted to capture the output from our JSON-to-YAML example:

````ruby
require 'stringio'

# Simple listener that buffers emitted lines in memory.
class StringOut
  def initialize
    @io = StringIO.new
  end

  def puts(msg)
    tap { io.write("#{msg}\n") }
  end

  def read
    io.rewind
    io.read
  end

  private

  attr_reader :io
end

string_out = StringOut.new
output     = Burner::Output.new(outs: string_out)
payload    = Burner::Payload.new(params: params)

Burner::Pipeline.make(pipeline).execute(output: output, payload: payload)

log = string_out.read
````

The value of `log` should now look similar to:

````bash
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] Pipeline started with 7 step(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] Parameters:
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - input_file: input.json
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - output_file: output.yaml
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] --------------------------------------------------------------------------------
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [1] Burner::Jobs::IO::Read::read
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Reading: spec/fixtures/input.json
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [2] Burner::Jobs::Echo::output_id
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - The job id is:
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [3] Burner::Jobs::Echo::output_value
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - The current value is:
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [4] Burner::Jobs::Deserialize::Json::parse
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [5] Burner::Jobs::Serialize::Yaml::convert
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [6] Burner::Jobs::Echo::output_value
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - The current value is:
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] [7] Burner::Jobs::IO::Write::write
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Writing: output.yaml
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] - Completed in: 0.0 second(s)
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] --------------------------------------------------------------------------------
[8bdc394e-7047-4a1a-87ed-6c54ed690ed5 | 2020-10-14 13:49:59 UTC] Pipeline ended, took 0.001 second(s) to complete
````

Notes:

* The Job ID is specified as the leading UUID in each line.
* `outs` can be provided an array of listeners, as long as each listener responds to `puts(msg)`.
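Since `outs` accepts an array and any object responding to `puts(msg)` qualifies, output can be fanned out to several destinations at once. A minimal sketch, combining the console with the `StringOut` buffer defined above (`$stdout` already responds to `#puts`):

````ruby
# Emit each log line to both the console and an in-memory buffer.
multi_output = Burner::Output.new(outs: [$stdout, StringOut.new])

Burner::Pipeline.make(pipeline).execute(output: multi_output, payload: payload)
````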
### Command Line Pipeline Processing

This library also ships with a built-in script, `exe/burner`, that illustrates using the `Burner::Cli` API. This class can take in an array of arguments (similar to a command line) and execute a pipeline. The first argument is the path to a YAML file with the pipeline's configuration, and each subsequent argument is a param in `key=value` form. Here is how the JSON-to-YAML example can utilize this interface:

#### Create YAML Pipeline Configuration File

Write the following `json_to_yaml_pipeline.yaml` file to disk:

````yaml
jobs:
  - name: read
    type: io/read
    path: '{input_file}'
  - name: output_id
    type: echo
    message: 'The job id is: {__id}'
  - name: output_value
    type: echo
    message: 'The current value is: {__value}'
  - name: parse
    type: deserialize/json
  - name: convert
    type: serialize/yaml
  - name: write
    type: io/write
    path: '{output_file}'

steps:
  - read
  - output_id
  - output_value
  - parse
  - convert
  - output_value
  - write
````

#### Run Using Script

From the command line, run:

````bash
bundle exec burner json_to_yaml_pipeline.yaml input_file=input.json output_file=output.yaml
````

The pipeline should be processed and `output.yaml` should be created.

#### Run Using Programmatic API

Instead of the script, you can invoke it using code:

````ruby
args = %w[
  json_to_yaml_pipeline.yaml
  input_file=input.json
  output_file=output.yaml
]

Burner::Cli.new(args).invoke
````

### Core Job Library

This library only ships with very basic, rudimentary jobs that are meant to serve as a baseline:

#### Collection

* **collection/arrays_to_objects** [mappings]: Convert an array of arrays to an array of objects.
* **collection/graph** [config, key]: Use [Hashematics](https://github.com/bluemarblepayroll/hashematics) to turn a flat array of objects into a deeply nested object tree.
* **collection/objects_to_arrays** [mappings]: Convert an array of objects to an array of arrays.
* **collection/shift** [amount]: Remove the first N elements from an array.
* **collection/transform** [attributes, exclusive, separator]: Iterate over all objects and transform each key per the attribute transformer specifications. If exclusive is set to false, then the current object will be overridden/merged. Separator can also be set for key path support. This job uses [Realize](https://github.com/bluemarblepayroll/realize), which provides its own extendable value-transformation pipeline.
* **collection/unpivot** [pivot_set]: Take an array of objects and unpivot specific sets of keys into rows. Under the hood it uses [HashMath's Unpivot class](https://github.com/bluemarblepayroll/hash_math#unpivot-hash-key-coalescence-and-row-extrapolation).
* **collection/values** [include_keys]: Take an array of objects and call `#values` on each object. If include_keys is true (it is false by default), then call `#keys` on the first object and inject that as a "header" object.

#### De-serialization

* **deserialize/csv** []: Take a CSV string and de-serialize it into object(s). Currently it will return an array of arrays, with each nested array representing one row.
* **deserialize/json** []: Treat input as a string and de-serialize it to JSON.
* **deserialize/yaml** [safe]: Treat input as a string and de-serialize it to YAML. By default it will try to [safely de-serialize](https://ruby-doc.org/stdlib-2.6.1/libdoc/psych/rdoc/Psych.html#method-c-safe_load) it (only using core classes). If you wish to de-serialize it to any class type, pass in `safe: false`.
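As a sketch of how these jobs compose, the pipeline below chains a few of them to turn a CSV file into YAML without the header row. It is illustrative only; the job names, file names, and the `amount: 1` setting are our assumptions based on the parameter lists above:

````ruby
# Hypothetical pipeline: read a CSV, parse it into an array of arrays,
# drop the header row, and write the result out as YAML.
csv_to_yaml_pipeline = {
  jobs: [
    { name: :read,        type: 'io/read',          path: '{input_file}' },
    { name: :parse,       type: 'deserialize/csv' },
    { name: :drop_header, type: 'collection/shift', amount: 1 },
    { name: :convert,     type: 'serialize/yaml' },
    { name: :write,       type: 'io/write',         path: '{output_file}' }
  ],
  steps: %i[read parse drop_header convert write]
}

csv_payload = Burner::Payload.new(
  params: { input_file: 'cars.csv', output_file: 'cars.yaml' }
)

Burner::Pipeline.make(csv_to_yaml_pipeline).execute(payload: csv_payload)
````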
#### IO

* **io/exist** [path, short_circuit]: Check to see if a file exists. The path parameter can be interpolated using `Payload#params`. If short_circuit is set to true (it defaults to false) and the file does not exist, then the pipeline will be short-circuited.
* **io/read** [binary, path]: Read in a local file. The path parameter can be interpolated using `Payload#params`. If the contents are binary, pass in `binary: true` to open it in binary+read mode.
* **io/write** [binary, path]: Write to a local file. The path parameter can be interpolated using `Payload#params`. If the contents are binary, pass in `binary: true` to open it in binary+write mode.

#### Serialization

* **serialize/csv** []: Take an array of arrays and create a CSV.
* **serialize/json** []: Convert value to JSON.
* **serialize/yaml** []: Convert value to YAML.

#### General

* **dummy** []: Do nothing.
* **echo** [message]: Write a message to the output. The message parameter can be interpolated using `Payload#params`.
* **set** [value]: Set the value to any arbitrary value.
* **sleep** [seconds]: Sleep the thread for X number of seconds.

### Adding & Registering Jobs

Where this library shines is when additional jobs are plugged in. Burner uses its `Burner::Jobs` class as its class-level registry, built with [acts_as_hashable](https://github.com/bluemarblepayroll/acts_as_hashable)'s acts_as_hashable_factory directive.

Let's say we would like to register a job to parse a CSV:

````ruby
require 'csv'

class ParseCsv < Burner::Job
  def perform(output, payload)
    # Convert the raw CSV string into an array of hashes keyed by the header row.
    payload.value = CSV.parse(payload.value, headers: true).map(&:to_h)

    nil
  end
end

Burner::Jobs.register('parse_csv', ParseCsv)
````

`parse_csv` is now recognized as a valid job and we can use it:

````ruby
pipeline = {
  jobs: [
    {
      name: :read,
      type: 'io/read',
      path: '{input_file}'
    },
    {
      name: :output_id,
      type: :echo,
      message: 'The job id is: {__id}'
    },
    {
      name: :output_value,
      type: :echo,
      message: 'The current value is: {__value}'
    },
    {
      name: :parse,
      type: :parse_csv
    },
    {
      name: :convert,
      type: 'serialize/yaml'
    },
    {
      name: :write,
      type: 'io/write',
      path: '{output_file}'
    }
  ],
  steps: %i[
    read
    output_id
    output_value
    parse
    convert
    output_value
    write
  ]
}

params = {
  input_file: File.join('spec', 'fixtures', 'cars.csv'),
  output_file: File.join(TEMP_DIR, "#{SecureRandom.uuid}.yaml")
}

payload = Burner::Payload.new(params: params)

Burner::Pipeline.make(pipeline).execute(output: output, payload: payload)
````
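Any registered job follows the same `perform(output, payload)` contract shown above. As one more minimal, hypothetical sketch (the `UpcaseAll` class and `upcase_all` name are ours, not part of the library), here is a job that upcases every string in an array payload value:

````ruby
# Hypothetical custom job: upcase every string in the payload's array value.
class UpcaseAll < Burner::Job
  def perform(output, payload)
    payload.value = Array(payload.value).map { |v| v.to_s.upcase }

    nil
  end
end

# Once registered, it can be referenced in pipeline hashes as type: :upcase_all.
Burner::Jobs.register('upcase_all', UpcaseAll)
````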
## Contributing

### Development Environment Configuration

Basic steps to take to get this repository compiling:

1. Install [Ruby](https://www.ruby-lang.org/en/documentation/installation/) (check burner.gemspec for versions supported)
2. Install bundler (`gem install bundler`)
3. Clone the repository (`git clone git@github.com:bluemarblepayroll/burner.git`)
4. Navigate to the root folder (`cd burner`)
5. Install dependencies (`bundle`)

### Running Tests

To execute the test suite run:

````bash
bundle exec rspec spec --format documentation
````

Alternatively, you can have Guard watch for changes:

````bash
bundle exec guard
````

Also, do not forget to run Rubocop:

````bash
bundle exec rubocop
````

### Publishing

Note: ensure you have proper authorization before trying to publish new versions.

After code changes have successfully gone through the Pull Request review process, follow these steps to publish a new version:

1. Merge Pull Request into master
2. Update `lib/burner/version.rb` using [semantic versioning](https://semver.org/)
3. Install dependencies: `bundle`
4. Update `CHANGELOG.md` with release notes
5. Commit & push master to remote and ensure CI builds master successfully
6. Run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org)

## Code of Conduct

Everyone interacting in this codebase, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/bluemarblepayroll/burner/blob/master/CODE_OF_CONDUCT.md).

## License

This project is MIT Licensed.