# BjnInventory

This gem is designed to help materialize a standardized inventory of devices according to:

* A model you specify
* Multiple inventory sources you configure
* A mapping between each source type and your standard model

The materialized inventory can easily be converted to different formats:
* Raw JSON or YAML
* Ansible dynamic inventory

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'bjn_inventory'
```

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install bjn_inventory

## Usage

An inventory is a list of devices created from a specification (a manifest) that defines:
* A device model
* One or more sources, each of which has:
  - A source-specific filename, URL or other location information
  - Source-specific rules, which map source data to the standard model
* Context data
  - These are just JSON files that you can refer to in your map and model

Example usage (for an Ansible dynamic inventory script, but see also `ansible-from`):
```ruby
require 'json'
require 'bjn_inventory'

# Load the inventory specification (the manifest)
manifest = JSON.parse(File.read('inventory.json'))
inventory = BjnInventory::Inventory.new(manifest)
ansible = {
  # Device fields to create groups from
  group_by: [
    'environment',
    'region',
    'roles'
  ],
  # Groups of groups
  groups: {
    webservers: ['www', 'stage-www']
  }
}
# Adds a method to Array to convert to Ansible dynamic inventory output
puts inventory.to_ansible(ansible)
```

`inventory.json`:
```javascript
{ "model": "/etc/inventory/device.json",
  "context": "/etc/inventory/data/context",
  "sources": [
    { "file": "/etc/inventory/data/device42.json",
      "rules": "/etc/inventory/maps/device42.rb" },
    { "file": "/etc/inventory/data/aws.json",
      "rules": "/etc/inventory/maps/aws.rb" }
  ]
}
```

## Model

When **bjn_inventory** produces an inventory, each device conforms
to the device model: it has only the fields specified in the
model, and defaults are filled in according to the model.

Your inventory sources may not conform to the model. That's where
the [rules](#rules) come in: they map the source entries into proper
devices, according to the model you provide.

The model generally takes the form of a JSON file, but can be embedded
directly in the inventory specification.
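
As a rough illustration of what "conforming to the model" means, here is a minimal sketch in plain Ruby. It assumes the model is a JSON object mapping field names to default values; that layout, the `MODEL` constant and the `conform` helper are assumptions made for this example, not part of the gem's API.

```ruby
require 'json'

# Hypothetical model: field name => default value (an assumed layout).
MODEL = JSON.parse(<<~MODEL_JSON)
  { "name": null, "roles": [], "region": null, "tags": {} }
MODEL_JSON

# Keep only the model's fields and fill in defaults for missing values.
def conform(entry, model)
  model.each_with_object({}) do |(field, default), device|
    value = entry[field]
    device[field] = value.nil? ? default.dup : value
  end
end

entry = { 'name' => 'web-01', 'region' => 'us-west-2', 'launch_time' => '2017-01-01' }
puts JSON.pretty_generate(conform(entry, MODEL))
```

The `launch_time` field is dropped because the model does not mention it, while `roles` and `tags` receive their defaults.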

### Merge Rules

When two devices need to be merged (for example, you are invoking
**BjnInventory::Inventory#by(key)** and they have the same value
for the *key* field), a new Device object is created with field
values taken from the second, merged with the first (similar to
a Hash merge). This merge is done according to the device model
and the following rules:

* Non-**nil** values take precedence over **nil**
* Hashes are merged shallowly according to a standard
  Ruby hash merge
* Arrays are concatenated, except duplicate values from
  the second device are not added
* The second device's other values take precedence

The resulting merged device's **#origin** method returns an
array of the different origins used to merge it together.
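
The merge rules above can be sketched in plain Ruby as follows. This is only an illustration operating on Hashes: `merge_values` and `merge_devices` are hypothetical helpers, and the sketch looks at the types of the values themselves rather than at the device model.

```ruby
# Merge a single field value from two devices.
def merge_values(first, second)
  return first if second.nil?          # non-nil takes precedence over nil
  return second if first.nil?

  if first.is_a?(Hash) && second.is_a?(Hash)
    first.merge(second)                # shallow, standard Ruby hash merge
  elsif first.is_a?(Array) && second.is_a?(Array)
    first + (second - first)           # concatenate, skipping values already present
  else
    second                             # otherwise the second device wins
  end
end

# Merge two devices (represented here as Hashes) field by field.
def merge_devices(first, second)
  (first.keys | second.keys).each_with_object({}) do |field, merged|
    merged[field] = merge_values(first[field], second[field])
  end
end

a = { 'name' => 'web-01', 'ip' => nil, 'roles' => ['web'],
      'tags' => { 'env' => 'prod', 'team' => 'ops' } }
b = { 'name' => 'web-01', 'ip' => '10.0.1.1', 'roles' => ['web', 'www'],
      'tags' => { 'env' => 'stage' } }
p merge_devices(a, b)
```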

## Sources

Each source you specify is read, in order. When you invoke
**BjnInventory::Inventory#by(key)**, all sources are used. If two
entries have the same *key*, they are merged together using the
[merge rules](#merge-rules) (basically, it merges intelligently based on
the default values in the model). Order is strictly preserved here, so
sources listed later in the list have data precedence over those
listed earlier (like Ruby merge logic).
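
To make the ordering concrete, here is a small sketch of keying entries from several sources, with later sources taking precedence. The `by_key` helper is hypothetical and uses a simplified merge (only the "non-nil wins" rule); it is not the gem's implementation of `#by`.

```ruby
# sources: an array of entry lists, in the order they appear in the manifest.
def by_key(sources, key)
  sources.each_with_object({}) do |entries, index|
    entries.each do |entry|
      id = entry[key]
      index[id] =
        if index.key?(id)
          # Later values win unless they are nil.
          index[id].merge(entry) { |_field, old, new| new.nil? ? old : new }
        else
          entry
        end
    end
  end
end

device42 = [{ 'name' => 'web-01', 'ip' => '10.0.1.1', 'owner' => 'ops' }]
aws      = [{ 'name' => 'web-01', 'ip' => '10.0.1.9', 'owner' => nil },
            { 'name' => 'db-01',  'ip' => '10.0.1.2' }]
p by_key([device42, aws], 'name')
```

Here `web-01` ends up with the IP address from the later source (`aws`) but keeps the `owner` from the earlier one, because the later value is **nil**.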

Right now, only two kinds of sources are supported: inline, with the
`entries` key, where the inventory entries are specified directly in
the source; or with the `file` key, where the file must contain a JSON
array of objects.
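
The two kinds of sources could be read along these lines. This is a sketch only: `load_entries` is a hypothetical helper written for this example, and the error handling is an assumption rather than the gem's behavior.

```ruby
require 'json'

# Return the raw entries for one source from the manifest.
def load_entries(source)
  if source.key?('entries')
    source['entries']                      # inline entries
  elsif source.key?('file')
    entries = JSON.parse(File.read(source['file']))
    raise ArgumentError, "#{source['file']} must contain a JSON array" unless entries.is_a?(Array)

    entries
  else
    raise ArgumentError, 'a source needs an "entries" or a "file" key'
  end
end

inline = { 'entries' => [{ 'name' => 'web-01' }, { 'name' => 'db-01' }] }
puts load_entries(inline).length
```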

In other words, it's assumed that you're separately downloading your
source of inventory into JSON files for **bjn_inventory** to operate on.
In the future, a download command or plugin may be allowed here.

This package also provides a downloader command for AWS EC2. Use
`aws-ec2-source` to download and minimally process an EC2 instance
list to provide an inventory source.

### Rules

The mapping rules specify, for each field in the model, how to
calculate its value from a source entry. They are written in the
DSL (domain-specific language) described below, and can be given
either as text or as a filename.

The `origin` command takes a string, which specifies (for your
convenience) the origin type of the resulting devices. For example,
your AWS mapping rules might use `origin 'aws'` to identify resulting
devices as having come from the AWS source. You can reuse rules
files for different sources, or not, so it's up to you whether this
origin represents a particular source of data or a particular kind
of device.

The `map` command takes a hash with a field name as the key and a mapping
rule (of a type such as `ruby` or `jsonpath`) as the value. For example,
the following rule specifies that the `name` field in your device model
comes from the `fqdn` field of the source entry or, if `fqdn` is **nil**,
from its `name` field:

```ruby
map name: ruby { |data| data['fqdn'] || data['name'] }
```

As an alternative to this syntax, you can mention the field name
directly and omit the `ruby` keyword:

```ruby
name { |data| data['fqdn'] || data['name'] }
```

These are examples of **ruby** rules, which take the form of a Ruby
block. This block accepts up to two arguments: the first is the data
from the source entry (the "raw" data) in the form of a Hash, with
the default values from the device model added; the second is the
current **BjnInventory::Device** object. For example, if your model
contains the `name` and `domain` fields, the following rule specifies
a `fqdn` field that uses them:

```ruby
fqdn { |_data, device| device.name + '.' + device.domain }
```

There are other rule types available:

A `jsonpath` rule uses a JSONPath expression in a string to set
the device field. For example, the following rule sets the
`name` field using the value of the `Name` tag (assuming AWS
tagging, e.g. when using `aws-ec2-source`):

```ruby
name jsonpath '$.tags[?(@["key"]=="Name")].value'
```

A `synonym` rule simply makes the field a synonym for another
device model field. For example, the following rule makes the
field `management_ip` exactly synonymous with the `ip_address`
field:

```ruby
management_ip synonym :ip_address
```

An `always` rule simply uses a constant value for that field.
For example, the following rule sets the value of `system_type`
to `ec2_instance` unconditionally:

```ruby
system_type always 'ec2_instance'
```
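
To make the rule semantics concrete, here is a toy evaluator for a subset of the DSL (`origin`, `map`, `ruby`, `synonym` and `always`; `jsonpath` is left out). It is an illustration only: the `ToyRules` class is hypothetical, it represents devices as plain Hashes (so a block receives a Hash rather than a **BjnInventory::Device**), and it is not how the gem is implemented.

```ruby
# Toy evaluator for the rules DSL (illustration only, not the gem's code).
class ToyRules
  attr_reader :origin_name, :field_rules

  def initialize(text)
    @origin_name = nil
    @field_rules = {} # field name => callable taking (data, device)
    instance_eval(text)
  end

  def origin(name)
    @origin_name = name
  end

  def map(rules)
    @field_rules.merge!(rules)
  end

  def ruby(&block)
    block
  end

  def always(value)
    proc { value }
  end

  def synonym(field)
    proc { |_data, device| device[field] }
  end

  # Handles the short form: `field { ... }` or `field always 'x'`.
  def method_missing(field, *args, &block)
    rule = block || args.first
    return super unless rule.respond_to?(:call)

    @field_rules[field] = rule
  end

  # Evaluate the rules in definition order against one source entry.
  def apply(data)
    @field_rules.each_with_object({}) do |(field, rule), device|
      device[field] = rule.call(data, device)
    end
  end
end

rules = ToyRules.new(<<~'RULES')
  origin 'aws'
  map name: ruby { |data| data['fqdn'] || data['name'] }
  ip_address { |data| data['private_ip'] }
  management_ip synonym :ip_address
  system_type always 'ec2_instance'
RULES
p rules.apply('name' => 'web-01', 'private_ip' => '10.0.1.1')
```

In this sketch the rules run in the order they are written, so a `synonym` or a two-argument block can only see fields defined above it.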

## Commands

### Formatters

The inventory can be output in different formats. Two formatters
are provided which can display the inventory as model-conformant
devices: `inventory-model` outputs either a JSON array or an object
keyed by an identifying field; `ansible-from` outputs the
[Ansible Dynamic Inventory](http://docs.ansible.com/ansible/latest/intro_dynamic_inventory.html)
format.
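
For reference, the general shape of Ansible's dynamic inventory JSON (groups with `hosts` and `children`, plus `_meta.hostvars`) can be sketched as below. The `ansible_inventory` helper and its arguments are hypothetical; the sketch shows the target format, not the gem's own conversion code.

```ruby
require 'json'

# devices:        list of device Hashes
# host_groups:    group name => list of host names
# group_children: group name => list of child group names (groups of groups)
def ansible_inventory(devices, host_groups, group_children = {}, key: 'name')
  inventory = { '_meta' => { 'hostvars' => {} } }
  devices.each { |device| inventory['_meta']['hostvars'][device[key]] = device }
  host_groups.each { |group, hosts| inventory[group] = { 'hosts' => hosts } }
  group_children.each do |parent, children|
    (inventory[parent] ||= {})['children'] = children
  end
  inventory
end

devices = [{ 'name' => 'www-01', 'region' => 'us-west-2' }]
puts JSON.pretty_generate(
  ansible_inventory(devices, { 'www' => ['www-01'] }, { 'webservers' => ['www', 'stage-www'] })
)
```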

The `refresh_inventory_data` formatter formats the inventory into
groups and devices in a file tree. A `devices.json` index is produced,
with all devices, and each device also gets a file in `devices/<key>.json`.
In addition, the groups are listed in a `groups.json` index, and each
group has a list of its devices in `groups/<group>.json`. This layout
also makes it easy to share the inventory over the web (though
**bjn_inventory** is not itself a network service).
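
The file tree described above could be produced by something like the following sketch. The `write_inventory_tree` helper is hypothetical, and the exact contents of each file (for example, that `groups.json` is a list of group names) are assumptions made for illustration.

```ruby
require 'fileutils'
require 'json'

# Write devices.json, devices/<key>.json, groups.json and groups/<group>.json.
def write_inventory_tree(dir, devices, groups, key: 'name')
  FileUtils.mkdir_p(File.join(dir, 'devices'))
  FileUtils.mkdir_p(File.join(dir, 'groups'))

  index = devices.to_h { |device| [device[key], device] }
  File.write(File.join(dir, 'devices.json'), JSON.pretty_generate(index))
  index.each do |id, device|
    File.write(File.join(dir, 'devices', "#{id}.json"), JSON.pretty_generate(device))
  end

  File.write(File.join(dir, 'groups.json'), JSON.pretty_generate(groups.keys))
  groups.each do |group, members|
    File.write(File.join(dir, 'groups', "#{group}.json"), JSON.pretty_generate(members))
  end
end
```

Calling `write_inventory_tree('/tmp/inventory', devices, groups)` would then leave a directory that any static web server can publish.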

### Downloaders

The overall design of the software encourages you to download entries
from inventory sources in a "close to raw" manner, and let
the source merging and mapping rules transform the entries into
devices. Because the core inventory generation offers no filtering,
the downloader is also the point at which to filter out entries
that should not be devices.

For convenience, AWS downloaders are provided for instances
(`aws-ec2-source`), classic ELB (`aws-elb-source`) and RDS
(`aws-rds-source`) resources. Run each with valid AWS credentials in
your `~/.aws/credentials` file to see what the output looks like.
The EC2 and ELB downloaders each accept a `--filter` argument: in the
EC2 downloader's case, this is passed to the AWS API to filter
instances according to the attributes and tags it offers; in the ELB
downloader's case, the syntax is the same, but it is enforced by an
internal rules-matching library. The RDS downloader filters based
only on tags, using the `--tag-filters` option.

### Service Maps

A common use case for device inventories is to be able to present
service endpoints calculated dynamically from inventory. These could
be used as-is, imported into a service discovery system, etc. To
facilitate this, a special formatter is provided which maps devices
and groups into "service endpoints", which are defined by a service
map.

A service map is a JSON object stored in a file (and provided to the
`service-map` command via the `--map` option), where the keys are
service prefixes and the values are objects. It can be arbitrarily
deeply nested, with the deepest object in the tree being a service
specifier.

A service specifier contains the special field `hosts`, the value
of which is a list of groups. Each device in any of those groups is
added to the service under the preceding prefix. The other fields in
the service specifier are arbitrary and are passed through unchanged.
In the output, the "leaf" of the service map is an object whose keys
are the service specifier's key joined with each of the specifier's
field names by a dot (`.`); the `hosts` field lists each device's
endpoint, joined with the `join_with` character (a comma by default),
or as a JSON array.

The groups are determined by a group specification: similar to the
`ansible-from` command, a JSON object (in the file given by the
`--groups` option) with a `"group_by"` key, the value of which is a
list of fields to create groups by. Note that if you use
the `ansible-from` command, you can use the same groups file; but
Ansible's `"groups"` key, which specifies groups of groups, is ignored.
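
As an illustration of grouping by fields, the sketch below builds groups from a `"group_by"` list. It assumes that each distinct field value becomes a group name and that array-valued fields (such as `roles`) put the device into one group per element; `build_groups` is a hypothetical helper, not the gem's API.

```ruby
# Returns group name => list of device keys.
def build_groups(devices, group_by, key: 'name')
  groups = Hash.new { |hash, name| hash[name] = [] }
  devices.each do |device|
    group_by.each do |field|
      # Array() turns nil into [] and wraps scalar values.
      Array(device[field]).each { |value| groups[value.to_s] << device[key] }
    end
  end
  groups
end

devices = [
  { 'name' => 'web-01', 'roles' => ['web'], 'region' => 'us-west-2' },
  { 'name' => 'db-01',  'roles' => ['db'],  'region' => 'us-west-2' }
]
p build_groups(devices, %w[roles region])
```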

In the service prefix (the nested keys and objects which precede the
"leaf" of the service map), device fields can be given with a dollar sign
(`$`) prepended; these will be interpolated with the actual devices'
field values. This also means that trees can be copied into the map multiple
times, once for every unique value of the field in the relevant groups.

The above description sounds complicated, but the result is a fairly
simple way to map service endpoints into an arbitrarily complex tree, and
a couple of examples are probably best to illustrate the point. Let's say
you have three devices in your inventory: two in the `us-west-2` region
and one in the `eu-west-1` region. In your `us-west-2` region, one instance
has the `web` role and one has the `db` role. In the `eu-west-1` region,
you just have a webserver. Your whole inventory looks like this (for
example, if you run `inventory-model --manifest manifest.json`):

```javascript
[
  { "name": "web-01",
    "roles": ["web"],
    "region": "us-west-2" },
  { "name": "db-01",
    "roles": ["db"],
    "region": "us-west-2" },
  { "name": "web-02",
    "roles": ["web"],
    "region": "eu-west-1" }
]
```
 
You create the following service map:


```javascript
{
  "services": {
    "$region": {
      "www": {
        "hosts": ["web"],
        "port": 80
      }
    }
  },
  "monitor": {
    "$region": {
      "nagios": {
        "hosts": ["web", "db"],
        "key": "monitoring-key.rsa"
      }
    }
  }
}
```

When you run `service-map --map map.json --manifest manifest.json --hosts-field name`, you
get the following output:

```javascript
{
  "services": {
    "us-west-2": {
      "www.hosts": "web-01",
      "www.port": 80
    },
    "eu-west-1": {
      "www.hosts": "web-02",
      "www.port": 80
    }
  },
  "monitor": {
    "us-west-2": {
      "nagios.hosts": "web-01,db-01",
      "nagios.key": "monitoring-key.rsa"
    },
    "eu-west-1": {
      "nagios.hosts": "web-02",
      "nagios.key": "monitoring-key.rsa"
    }
  }
}
```

If you need to have a key in your service map that starts with a
dollar sign, use two dollar signs instead.
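
The expansion described in this section can be approximated with a short recursive function. This is a simplified sketch under stated assumptions: `expand_service_map` is a hypothetical helper, groups are given as lists of device names, `$field` interpolation only handles scalar fields, and every value of the field found among the devices in scope is expanded (rather than only values found in the relevant groups).

```ruby
require 'json'

def expand_service_map(node, devices, groups, hosts_field: 'name', join_with: ',')
  node.each_with_object({}) do |(key, value), out|
    if value.key?('hosts')
      # Leaf: a service specifier. Flatten it into "<key>.<field>" entries.
      members = value['hosts'].flat_map { |group| groups.fetch(group, []) }
      endpoints = devices.select { |d| members.include?(d['name']) }
                         .map { |d| d[hosts_field] }
      out["#{key}.hosts"] = endpoints.join(join_with)
      value.each { |field, v| out["#{key}.#{field}"] = v unless field == 'hosts' }
    elsif key.start_with?('$') && !key.start_with?('$$')
      # Interpolated prefix: copy the subtree once per unique field value.
      field = key.delete_prefix('$')
      devices.map { |d| d[field] }.uniq.each do |found|
        subset = devices.select { |d| d[field] == found }
        out[found] = expand_service_map(value, subset, groups,
                                        hosts_field: hosts_field, join_with: join_with)
      end
    else
      # Plain prefix; "$$" is the escape for a literal leading "$".
      out[key.sub(/\A\$\$/, '$')] =
        expand_service_map(value, devices, groups,
                           hosts_field: hosts_field, join_with: join_with)
    end
  end
end

devices = [
  { 'name' => 'web-01', 'roles' => ['web'], 'region' => 'us-west-2' },
  { 'name' => 'db-01',  'roles' => ['db'],  'region' => 'us-west-2' },
  { 'name' => 'web-02', 'roles' => ['web'], 'region' => 'eu-west-1' }
]
groups = { 'web' => ['web-01', 'web-02'], 'db' => ['db-01'] }
service_map = {
  'services' => { '$region' => { 'www' => { 'hosts' => ['web'], 'port' => 80 } } }
}
puts JSON.pretty_generate(expand_service_map(service_map, devices, groups))
```

With `hosts_field: 'name'`, the `hosts` entries contain device names; pointing it at another field of your model would list that field's values instead.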

## TODO

* The `refresh_inventory_data` formatter needs to be changed to be
  more parallel with the other formatters (`--ansible` should
  be `--groups`, and it should probably be named `inventory-files`
  or something similar).
* The `aws-rds-source` should be refactored to take the `--filters`
  argument and use **BjnInventory::Util::Filter::JsonAws** like
  `aws-elb-source`.

## Design Decisions

* No calculated fields in the model.
* No filtering: you can either filter the sources, or you can filter
  the inventory after producing it
* No strict validation of model fields (though merging is sensitive to
  the model)
* For now, no fancy file finding (filenames in manifests must be
  absolute, or correct with respect to wherever you're running the
  software). Possibly this might change with the addition of the
  ability to pass a filename to **BjnInventory::Inventory.new()**.

## Development

After checking out the repo, run `bin/setup` to install
dependencies. Then, run `rake spec` to run the tests. You can also run
`bin/console` for an interactive prompt that will allow you to
experiment.

To install this gem onto your local machine, run `bundle exec rake
install`. To release a new version, update the version number in
`version.rb`, and then run `bundle exec rake release`, which will
create a git tag for the version, push git commits and tags, and push
the `.gem` file to [rubygems.org](https://rubygems.org).

## Contributing

Bug reports can be sent to Ops Tools <tools+bjn_inventory@bluejeans.com>.