# Plumbing

Actors, Observers and Data Pipelines.

## Configuration

The most important configuration setting is the `mode`, which governs how background tasks are handled.

By default it is `:inline`, so every command or query is handled synchronously.  This is the ruby behaviour you know and love (although see the section on `await` below).

`:async` mode handles tasks using fibers (via the [Async gem](https://socketry.github.io/async/index.html)).  Your code should include the "async" gem in its bundle, as Plumbing does not load it by default.

`:threaded` mode handles tasks using a thread pool via [Concurrent Ruby](https://ruby-concurrency.github.io/concurrent-ruby/master/Concurrent/Promises.html)).  Your code should include the "concurrent-ruby" gem in its bundle, as Plumbing does not load it by default.

However, `:threaded` mode is not safe for Ruby on Rails applications.  In this case, use `:threaded_rails` mode, which is identical to `:threaded`, except it wraps the tasks in the Rails executor.  This ensures your actors do not interfere with the Rails framework.  Note that the Concurrent Ruby's default `:io` scheduler will create extra threads at times of high demand, which may put pressure on the ActiveRecord database connection pool.  A future version of plumbing will allow the thread pool to be adjusted with a maximum number of threads, preventing contention with the connection pool.

The `timeout` setting is used when performing queries - it defaults to 30s.

```ruby
  require "plumbing"
  puts Plumbing.config.mode
  # => :inline

  Plumbing.configure mode: :async, timeout: 10

  puts Plumbing.config.mode
  # => :async
```

If you are running a test suite, you can temporarily update the configuration by passing a block.

```ruby
  require "plumbing"
  puts Plumbing.config.mode
  # => :inline

  Plumbing.configure mode: :async do
    puts Plumbing.config.mode
    # => :async
    first_test
    second_test
  end

  puts Plumbing.config.mode
  # => :inline
```

## Plumbing::Pipeline - transform data through a pipeline

Define a sequence of operations that proceed in order, passing their output from one operation as the input to another.  [Unix pipes](https://en.wikipedia.org/wiki/Pipeline_(Unix)) in Ruby.

Use `perform` to define a step that takes some input and returns a different output.
  Specify `using` to re-use an existing `Plumbing::Pipeline` as a step within this pipeline.
Use `execute` to define a step that takes some input, performs an action but passes the input, unchanged, to the next step.

If you have [dry-validation](https://dry-rb.org/gems/dry-validation/1.10/) installed, you can validate your input using a `Dry::Validation::Contract`.  Alternatively, you can define a `pre_condition` to test that the inputs are valid.

You can also verify that the output generated is as expected by defining a `post_condition`.

### Usage:

[Building an array using multiple steps with a pre-condition and post-condition](/spec/examples/pipeline_spec.rb)

```ruby
  require "plumbing"
  class BuildArray < Plumbing::Pipeline
    perform :add_first
    perform :add_second
    perform :add_third

    pre_condition :must_be_an_array do |input|
      input.is_a? Array
    end

    post_condition :must_have_three_elements do |output|
      output.length == 3
    end

    private

    def add_first(input) = input << "first"

    def add_second(input) = input << "second"

    def add_third(input) = input << "third"
  end

  BuildArray.new.call []
  # => ["first", "second", "third"]

  BuildArray.new.call 1
  # => Plumbing::PreconditionError("must_be_an_array")

  BuildArray.new.call ["extra element"]
  # => Plumbing::PostconditionError("must_have_three_elements")
```

[Validating input parameters with a contract](/spec/examples/pipeline_spec.rb)
```ruby
  require "plumbing"
  require "dry/validation"

  class SayHello < Plumbing::Pipeline
    validate_with "SayHello::Input"
    perform :say_hello

    private

    def say_hello input
      "Hello #{input[:name]} - I will now send a load of annoying marketing messages to #{input[:email]}"
    end

    class Input < Dry::Validation::Contract
      params do
        required(:name).filled(:string)
        required(:email).filled(:string)
      end
      rule :email do
        key.failure("must be a valid email") unless /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i.match? value
      end
    end
  end

  SayHello.new.call(name: "Alice", email: "alice@example.com")
  # => Hello Alice - I will now send a load of annoying marketing messages to alice@example.com

  SayHello.new.call(some: "other data")
  # => Plumbing::PreConditionError
```

[Building a pipeline through composition](/spec/examples/pipeline_spec.rb)

```ruby
  require "plumbing"
  class ExternalStep < Plumbing::Pipeline
    perform :add_item_to_array

    private

    def add_item_to_array(input) = input << "external"
  end

  class BuildSequenceWithExternalStep < Plumbing::Pipeline
    perform :add_first
    perform :add_second, using: "ExternalStep"
    perform :add_third

    private

    def add_first(input) = input << "first"

    def add_third(input) = input << "third"
  end

  BuildSequenceWithExternalStep.new.call([])
  # => ["first", "external", "third"]
```

## Plumbing::Actor - safe asynchronous objects

An [actor](https://en.wikipedia.org/wiki/Actor_model) defines the messages an object can receive, similar to a regular object.
However, in traditional object-orientated programming, a thread of execution moves from one object to another.  If there are multiple threads, then each object may be accessed concurrently, leading to race conditions or data-integrity problems - and very hard to track bugs.

Actors are different.  Conceptually, each actor has it's own thread of execution, isolated from every other actor in the system.  When one actor sends a message to another actor, the receiver does not execute its method in the caller's thread.  Instead, it places the message on a queue and waits until its own thread is free to process the work.  If the caller would like to access the return value from the method, then it must wait until the receiver has finished processing.

This means each actor is only ever accessed by a single thread and the vast majority of concurrency issues are eliminated.

[Plumbing::Actor](/lib/plumbing/actor.rb) allows you to define the `async` public interface to your objects.  Calling `.start` builds a proxy to the actual instance of your object and ensures that any messages sent are handled in a manner appropriate to the current mode - immediately for inline mode, using fibers for async mode and using threads for threaded and threaded_rails mode.

When sending messages to an actor, this just works.

However, as the caller, you do not have direct access to the return values of the messages that you send.  Instead, you must call `#value` - or alternatively, wrap your call in `await { ... }`.  The block form of `await` is added in to ruby's `Kernel` so it is available everywhere.  It is also safe to use with non-actors (in which case it just returns the original value from the block).

```ruby
@actor = MyActor.start name: "Alice"

@actor.name.value
# => "Alice"

await { @actor.name }
# => "Alice"

await { "Bob" }
# => "Bob"
```

This then makes the caller's thread block until the receiver's thread has finished its work and returned a value.  Or if the receiver raises an exception, that exception is then re-raised in the calling thread.

The actor model does not eliminate every possible concurrency issue.  If you use `value` or `await`, it is possible to deadlock yourself.

Actor A, running in Thread 1, sends a message to Actor B and then awaits the result, meaning Thread 1 is blocked.  Actor B, running in Thread 2, starts to work, but needs to ask Actor A a question.  So it sends a message to Actor A and awaits the result.  Thread 2 is now blocked, waiting for Actor A to respond.  But Actor A, running in Thread 1, is blocked, waiting for Actor B to respond.

This potential deadlock only occurs if you use `value` or `await` and have actors that call back in to each other.  If your objects are strictly layered, or you never use `value` or `await` (perhaps, instead using a Pipe to observe events), then this particular deadlock should not occur.  However, just in case, every call to `value` has a timeout defaulting to 30s.

### Inline actors

Even though inline mode is not asynchronous, you must still use `value` or `await` to access the results from another actor.  However, as deadlocks are impossible in a single thread, there is no timeout.

### Async actors

Using async mode is probably the easiest way to add concurrency to your application.  It uses fibers to allow for "concurrency but not parallelism" - that is execution will happen in the background but your objects or data will never be accessed by two things at the exact same time.

### Threaded actors

Using threaded (or threaded_rails) mode gives you concurrency and parallelism.  If all your public objects are actors and you are careful about callbacks then the actor model will keep your code safe.  But there are a couple of extra things to consider.

Firstly, when you pass parameters or return results between threads, those objects are "transported" across the boundaries.
Most objects are cloned. Hashes, keyword arguments and arrays have their contents recursively transported.  And any object that uses `GlobalID::Identification` (for example, ActiveRecord models) are marshalled into a GlobalID, then unmarshalled back in to their original object.  This is to prevent the same object from being amended in both the caller and receiver's threads.

Secondly, when you pass a block (or Proc parameter) to another actor, the block/proc will be executed in the receiver's thread.  This means you must not access any variables that would normally be in scope for your block (whether local variables or instance variables of other objects - see note below)  This is because you will be accessing them from a different thread to where they were defined, leading to potential race conditions.  And, if you access any actors, you must not use `value` or `await` or you risk a deadlock.  If you are within an actor and need to pass a block or proc parameter, you should use the `safely` method to ensure that your block is run within the context of the calling actor, not the receiving actor.

For example, when defining a custom filter,  the filter adds itself as an observer to its source.  The source triggers the `received` method on the filter, which will run in the context of the source.  So the custom filter uses `safely` to move back into its own context and access its instance variables.

```ruby
class EveryThirdEvent < Plumbing::CustomFilter
  def initialize source:
    super
    @events = []
  end

  def received event
    safely do
      @events << event
      if @events.count >= 3
        @events.clear
        self << event
      end
    end
  end
end
```

(Note: we break that rule in the specs for Pipe objects - we use a block observer that sets the value on a local variable.  That's because it is a controlled situation where we know there are only two threads involved and we are explicitly waiting for the second thread to complete.  For almost every app that uses actors, there will be multiple threads and it will be impossible to predict the access patterns).

### Constructing actors

Instead of constructing your object with `.new`, use `.start`.  This builds a proxy object that wraps the target instance and dispatches messages through a safe mechanism.  Only messages that have been defined as part of the actor are available in this proxy - so you don't have to worry about callers bypassing the actor's internal context.

### Referencing actors

If you're within a method inside your actor and you want to pass a reference to yourself, instead of using `self`, you should use `proxy` (which is also aliased as `as_actor` or `async`).

Also be aware that if you use actors in one place, you need to use them everywhere - especially if you're using threads.  This is because as the actor sends messages to its collaborators, those calls will be made from within the actor's internal context.  If the collaborators are also actors, the subsequent messages will be handled correctly, if not, data consistency bugs could occur.  This does not mean that every class needs to be an actor, just your "public API" classes which may be accessed from multiple actors or other threads.

### Usage

[Defining an actor](/spec/examples/actor_spec.rb)

```ruby
  require "plumbing"

  class Employee
    include Plumbing::Actor
    async :name, :job_title, :greet_slowly, :promote
    attr_reader :name, :job_title

    def initialize(name)
      @name = name
      @job_title = "Sales assistant"
    end

    private

    def promote
      sleep 0.5
      @job_title = "Sales manager"
    end

    def greet_slowly
      sleep 0.2
      "H E L L O"
    end
  end

  @person = Employee.start "Alice"

  await { @person.name }
  # =>  "Alice"
  await { @person.job_title }
  # => "Sales assistant"

  # by using `await`, we will block until `greet_slowly` has returned a value
  await { @person.greet_slowly }
  # =>  "H E L L O"

  # this time, we're not awaiting the result, so this will run in the background (unless we're using inline mode)
  @person.greet_slowly

  # this will run in the background
  @person.promote
  # this will block - it will not return until the previous calls, #greet_slowly, #promote, and this call to #job_title have completed
  await { @person.job_title }
  # => "Sales manager"
```

## Plumbing::Pipe - a composable observer

[Observers](https://ruby-doc.org/3.3.0/stdlibs/observer/Observable.html) in Ruby are a pattern where objects (observers) register their interest in another object (the observable).  This pattern is common throughout programming languages (event listeners in Javascript, the dependency protocol in [Smalltalk](https://en.wikipedia.org/wiki/Smalltalk)).

[Plumbing::Pipe](lib/plumbing/pipe.rb) makes observers "composable".  Instead of simply just registering for notifications from a single observable, we can build sequences of pipes.  These sequences can filter notifications and route them to different listeners, or merge multiple sources into a single stream of notifications.

Pipes are implemented as actors, meaning that event notifications can be dispatched asynchronously.  The observer's callback will be triggered from within the pipe's internal context so you should immediately trigger a command on another actor to maintain safety.

### Usage

[A simple observer](/spec/examples/pipe_spec.rb):
```ruby
  require "plumbing"

  @source = Plumbing::Pipe.start
  @observer = @source.add_observer do |event|
    puts event.type
  end

  @source.notify "something_happened", message: "But what was it?"
  # => "something_happened"
```

[Simple filtering](/spec/examples/pipe_spec.rb):
```ruby
  require "plumbing"

  @source = Plumbing::Pipe.start
  @filter = Plumbing::Filter.start source: @source do |event|
    %w[important urgent].include? event.type
  end
  @observer = @filter.add_observer do |event|
    puts event.type
  end

  @source.notify "important", message: "ALERT! ALERT!"
  # => "important"
  @source.notify "unimportant", message: "Nothing to see here"
  # => <no output>
```

[Custom filtering](/spec/examples/pipe_spec.rb):
```ruby
  require "plumbing"
  class EveryThirdEvent < Plumbing::CustomFilter
    def initialize source:
      super source: source
      @events = []
    end

    def received event
      # #received is called in the context of the `source` actor
      # in order to safely access the `EveryThirdEvent` instance variables
      # we need to move into the context of our own actor
      safely do
        # store this event into our buffer
        @events << event
        # if this is the third event we've received then clear the buffer and broadcast the latest event
        if @events.count >= 3
          @events.clear
          self << event
        end
      end
    end
  end

  @source = Plumbing::Pipe.start
  @filter = EveryThirdEvent.start(source: @source)
  @observer = @filter.add_observer do |event|
    puts event.type
  end

  1.upto 10 do |i|
    @source.notify i.to_s
  end
  # => "3"
  # => "6"
  # => "9"
```

[Joining multiple sources](/spec/examples/pipe_spec.rb):
```ruby
  require "plumbing"

  @first_source = Plumbing::Pipe.start
  @second_source = Plumbing::Pipe.start

  @junction = Plumbing::Junction.start @first_source, @second_source

  @observer = @junction.add_observer do |event|
    puts event.type
  end

  @first_source.notify "one"
  # => "one"
  @second_source.notify "two"
  # => "two"
```
## Plumbing::RubberDuck - duck types and type-casts

Define an [interface or protocol](https://en.wikipedia.org/wiki/Interface_(object-oriented_programming)) specifying which messages you expect to be able to send.

Then cast an object into that type.  This first tests that the object can respond to those messages and then builds a proxy that responds to those messages (and no others).  However, if you take one of these proxies, you can safely re-cast it as another type (as long as the original target object responds to the correct messages).

### Usage

Define your interface (Person in this example), then cast your objects (instances of PersonData and CarData).

[Casting objects as duck-types](/spec/examples/rubber_duck_spec.rb):
```ruby
  require "plumbing"

  Person = Plumbing::RubberDuck.define :first_name, :last_name, :email
  LikesFood = Plumbing::RubberDuck.define :favourite_food

  PersonData = Struct.new(:first_name, :last_name, :email, :favourite_food)
  CarData = Struct.new(:make, :model, :colour)

  @porsche_911 = CarData.new "Porsche", "911", "black"
  @person = @porsche_911.as Person
  # => Raises a TypeError as CarData does not respond_to #first_name, #last_name, #email

  @alice = PersonData.new "Alice", "Aardvark", "alice@example.com", "Ice cream"
  @person = @alice.as Person
  @person.first_name
  # => "Alice"
  @person.email
  # => "alice@example.com"
  @person.favourite_food
  # => NoMethodError - #favourite_food is not part of the Person rubber duck (even though it is part of the underlying PersonData struct)

  # Cast our Person into a LikesFood rubber duck
  @hungry = @person.as LikesFood
  @hungry.favourite_food
  # => "Ice cream"
```

You can also use the same `@object.as type` to type-check instances against modules or classes.  This creates a RubberDuck proxy based on the module or class you're casting into.  So the cast will pass if the object responds to the correct messages, even if a strict `.is_a?` test would fail.

```ruby
  require "plumbing"

  module Person
    def first_name = @first_name

    def last_name = @last_name

    def email = @email
  end

  module LikesFood
    def favourite_food = @favourite_food
  end

  PersonData = Struct.new(:first_name, :last_name, :email, :favourite_food)
  CarData = Struct.new(:make, :model, :colour)

  @porsche_911 = CarData.new "Porsche", "911", "black"
  expect { @porsche_911.as Person }.to raise_error(TypeError)

  @alice = PersonData.new "Alice", "Aardvark", "alice@example.com", "Ice cream"

  @alics.is_a? Person
  # => false - PersonData does not `include Person`
  @person = @alice.as Person
  # This cast is OK because PersonData responds to :first_name, :last_name and :email
  expect(@person.first_name).to eq "Alice"
  expect(@person.email).to eq "alice@example.com"
  expect { @person.favourite_food }.to raise_error(NoMethodError)

  @hungry = @person.as LikesFood
  expect(@hungry.favourite_food).to eq "Ice cream"
```

## Installation

Note: this gem is licensed under the [LGPL](/LICENCE).  This may or may not make it unsuitable for use by you or your company.

Install the gem and add to the application's Gemfile by executing:

```sh
bundle add standard-procedure-plumbing
```

Then:

```ruby
require 'plumbing'

# Set the mode for your Actors and Pipes
Plumbing.config mode: :async
```

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/standard_procedure/plumbing. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/standard_procedure/plumbing/blob/main/CODE_OF_CONDUCT.md).

## Code of Conduct

Everyone interacting in the Plumbing project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/standard_procedure/plumbing/blob/main/CODE_OF_CONDUCT.md).