# ActiveReporter

[![Actions Status](https://github.com/chaunce/active_reporter/workflows/Ruby/badge.svg)](https://github.com/chaunce/active_reporter/actions)

`ActiveReporter` is a framework for aggregating data about [Rails](http://rubyonrails.org) models backed by [PostgreSQL](http://www.postgresql.org), [MySQL](https://www.mysql.com), or [SQLite](https://www.sqlite.org) databases. It's designed to be flexible enough to accommodate many use cases, but opinionated enough to avoid the need for boilerplate.

`ActiveReporter` is based on the [`repor`](https://github.com/asross/repor) gem by Andrew Ross.

- [Basic usage](#basic-usage)
- [Building reports](#building-reports)
- [Defining reports](#defining-reports)
    - [Base relation](#base-relation)
    - [Dimensions (x-axes)](#dimensions-x-axes)
        - [Filtering by dimensions](#filtering-by-dimensions)
        - [Grouping by dimensions](#grouping-by-dimensions)
        - [Customizing dimensions](#customizing-dimensions)
    - [Aggregators (y-axes)](#aggregators-y-axes)
        - [Customizing aggregators](#customizing-aggregators)
- [Serializing reports](#serializing-reports)
- [Contributing](#contributing)
- [License](#license)

## Basic usage

Here are some examples of how to define, run, and serialize an `ActiveReporter::Report`:

```ruby
class PostReport < ActiveReporter::Report
  report_on :Post

  category_dimension :author, relation: ->(r) { r.joins(:author) },
    expression: 'users.name'
  number_dimension :likes
  time_dimension :created_at

  count_aggregator :number_of_posts
  sum_aggregator :total_likes, expression: 'posts.likes'
  array_aggregator :post_ids, expression: 'posts.id'
end

# show me the number of published posts from 2014-2015 with at least 4 likes, by author

report = PostReport.new(
  relation: Post.published,
  groupers: [:author],
  aggregators: [:number_of_posts],
  dimensions: {
    likes: {
      only: { min: 4 }
    },
    created_at: {
      only: { min: '2014', max: '2015' }
    }
  }
)

puts report.data

# => [
#   { key: 'James Joyce', value: 10 },
#   { key: 'Margaret Atwood', value: 4 },
#   { key: 'Toni Morrison', value: 5 }
# ]

# show me likes on specific authors' posts by author and year, from 1985-1987

report = PostReport.new(
  groupers: [:author, :created_at],
  aggregators: [:total_likes],
  dimensions: {
    created_at: {
      only: { min: '1985', max: '1987' },
      bin_width: 'year'
    },
    author: {
      only: ['Edith Wharton', 'James Baldwin']
    }
  }
)

puts report.data

# => [{
#   key: { min: Tue, 01 Jan 1985 00:00:00 UTC +00:00,
#          max: Wed, 01 Jan 1986 00:00:00 UTC +00:00 },
#   values: [
#     { key: 'Edith Wharton', value: 35 },
#     { key: 'James Baldwin', value: 13 }
#   ]
# }, {
#   key: { min: Wed, 01 Jan 1986 00:00:00 UTC +00:00,
#          max: Thu, 01 Jan 1987 00:00:00 UTC +00:00 },
#   values: [
#     { key: 'Edith Wharton', value: 0 },
#     { key: 'James Baldwin', value: 0 }
#   ]
# }, {
#   key: { min: Thu, 01 Jan 1987 00:00:00 UTC +00:00,
#          max: Fri, 01 Jan 1988 00:00:00 UTC +00:00 },
#   values: [
#     { key: 'Edith Wharton', value: 0 },
#     { key: 'James Baldwin', value: 19 }
#   ]
# }]

csv_serializer = ActiveReporter::Serializer::Csv.new(report)
puts csv_serializer.csv_text
# => csv text string

chart_serializer = ActiveReporter::Serializer::Highcharts.new(report)
puts chart_serializer.highcharts_options
# => highcharts options hash
```

To define a report, you declare dimensions (which represent attributes of your data) and aggregators (which represent quantities you want to measure). To run a report, you instantiate it with at least one aggregator and at least one dimension to group by, then inspect its `data`. You can also wrap it in a serializer to get results in useful formats.

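If you want flat rows rather than the nested structure, the results can be walked with a few lines of plain Ruby. The following is only a sketch: it assumes `report.data` has the `key`/`value`/`values` shape shown in the examples above, the `flatten_report` helper is hypothetical (it is not part of `ActiveReporter`), and the date strings stand in for the time objects a real report would return.

```ruby
# Sample data mirroring the shape of the two-grouper output shown above.
# The date strings are placeholders for real time values.
data = [
  { key: { min: '1985-01-01', max: '1986-01-01' },
    values: [
      { key: 'Edith Wharton', value: 35 },
      { key: 'James Baldwin', value: 13 }
    ] },
  { key: { min: '1986-01-01', max: '1987-01-01' },
    values: [
      { key: 'Edith Wharton', value: 0 },
      { key: 'James Baldwin', value: 0 }
    ] }
]

# Hypothetical helper: recursively walks nested groups and returns one
# array per leaf, containing every key on the path followed by the value.
def flatten_report(nodes, path = [])
  nodes.flat_map do |node|
    keys = path + [node[:key]]
    if node.key?(:values)
      flatten_report(node[:values], keys)
    else
      [keys + [node[:value]]]
    end
  end
end

rows = flatten_report(data)
rows.each { |row| puts row.inspect }
```

The same helper also handles the single-grouper case, where each element already has a `value` and no nested `values`. For CSV or chart output, the built-in serializers shown above are likely the better choice.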
## Building reports

Just call `ReportClass.new(params)`, where `params` is a hash with these keys:

- `aggregators` (required) is a list of the names of the aggregator(s) to aggregate by
- `groupers` (required) is a list of the names of the dimension(s) to group by
- `relation` (optional) provides an initial scope for the data
- `dimensions` (optional) holds dimension-specific filter or grouping options

See below for more details about dimension-specific parameters.

## Defining reports

### Base relation

An `ActiveReporter::Report` either needs to know what `ActiveRecord` class it is reporting on, or it needs to know a `table_name` and a `base_relation`.

You can specify an `ActiveRecord` class by calling the `report_on` class method with a class or class name, or if you prefer, you can override the other two as instance methods.

By default, it will try to infer an `ActiveRecord` class from the report class name by dropping `/Report$/` and constantizing.

```ruby
class PostReport < ActiveReporter::Report
end

PostReport.new.table_name
# => 'posts'

PostReport.new.base_relation
# => Post.all

class PostStructuralReport < ActiveReporter::Report
  report_on :Post

  def base_relation
    super.where(author: 'Foucault')
  end
end

PostStructuralReport.new.table_name
# => 'posts'

PostStructuralReport.new.base_relation
# => Post.where(author: 'Foucault')
```

Finally, you can also use `autoreport_on` if you'd like to automatically infer dimensions from your columns and associations.
`autoreport_on` will try to map most columns to dimensions, and if the column in question is for a `belongs_to` association, will even try to join and report on the association's name:

```ruby
class PostReport < ActiveReporter::Report
  autoreport_on Post
end

PostReport.new.dimensions.keys
# => [:created_at, :updated_at, :likes, :title, :author]

PostReport.new.dimensions[:author].expression
# => 'users.name'
```

Autoreport behavior can be customized by overriding certain methods; see the `ActiveReporter::Report` code for more information.

### Dimensions (x-axes)

You define dimensions on your `ActiveReporter::Report` to represent attributes of your data you're interested in. Dimension objects can filter or group your relation by a SQL expression, and accept/return simple Ruby values of various types.

There are several built-in types of dimensions:

- `Category`
    - Groups/filters the relation by the discrete values of the `expression`
- `Number`
    - Groups/filters the relation by binning a continuous numeric `expression`
- `Time`
    - Like number dimensions, but the bins are increments of time

You define dimensions in your report class like this:

```ruby
class PostReport < ActiveReporter::Report
  category_dimension :status
  number_dimension :author_rating, expression: 'users.rating',
    relation: ->(r) { r.joins(:author) }
  time_dimension :publication_date, expression: 'posts.published_at'
end
```

The SQL expression a dimension uses defaults to:

```ruby
"#{report.table_name}.#{dimension.name}"
```

but this can be overridden by passing an `expression` option. Additionally, if the filtering or grouping requires joins or other SQL operations, a custom `relation` proc can be passed, which will be called beforehand.

#### Filtering by dimensions

All dimensions can be filtered to one or more values by passing in `params[:dimensions][<dimension name>][:only]`.

`Category#only` should be passed the exact values you'd like to filter to (or what will map to them after connection adapter quoting).

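
As a rough illustration of the exact-match semantics of a category `only` filter, the plain-Ruby sketch below mimics it over in-memory rows. It does not use `ActiveReporter`: the `rows` data and the `filter_only` helper are hypothetical, and the gem itself applies the filter in SQL rather than in Ruby. Only the nesting of the `params` hash is taken from the text above.

```ruby
# Hypothetical in-memory rows standing in for database records.
rows = [
  { author: 'Edith Wharton', likes: 35 },
  { author: 'James Baldwin', likes: 13 },
  { author: 'James Joyce',   likes: 10 }
]

# Hypothetical helper: keeps rows whose dimension value exactly matches
# one of the values in `only` (a single value or an array of values).
def filter_only(rows, dimension, only)
  allowed = Array(only)
  rows.select { |row| allowed.include?(row[dimension]) }
end

# Same nesting as params[:dimensions][<dimension name>][:only].
params = {
  dimensions: {
    author: { only: ['Edith Wharton', 'James Baldwin'] }
  }
}

filtered = filter_only(rows, :author, params.dig(:dimensions, :author, :only))
puts filtered.map { |row| row[:author] }.inspect
```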
`Number` and `Time` are "bin" dimensions, and their `only`s should be passed one or more bin ranges. Bin ranges should be hashes of at least one of `min` and `max`, or they should just be `nil` to explicitly select rows for which `expression` is null. Bin range filtering is `min`-inclusive but `max`-exclusive. For `Number`, the bin values should be numbers or strings of digits. For `Time`, the bin values should be dates/times or `Time.zone.parse`-able strings.

#### Grouping by dimensions

To group by a dimension, pass its `name` to `params[:groupers]`.

For bin dimensions (`Number` and `Time`), where the values being grouped by are ranges of numbers or times, you can specify additional options to control the width and distribution of those bins. In particular, you can pass values to:

- `params[:dimensions][<dimension name>][:bins]`,
- `params[:dimensions][<dimension name>][:bin_count]`, or
- `params[:dimensions][<dimension name>][:bin_width]`

`bins` is the most general option; you can use it to divide the full domain of the data into non-uniform, overlapping, and even null bin ranges. It should be passed an array of the same min/max hashes or `nil` used in filtering.

`bin_count` will divide the domain of the data into a fixed number of bins. It should be passed a positive integer.

`bin_width` will tile the domain with bins of a fixed width. It should be passed a positive number for `Number`s and a "duration" for `Time`s. Durations can either be strings of a number followed by a time increment (minutes, hours, days, weeks, months, years), or they can be hashes suitable for use with [`ActiveSupport::TimeWithZone#advance`](http://apidock.com/rails/ActiveSupport/TimeWithZone/advance). E.g.:

```
params[:dimensions][