--- layout: page title: Documentation countheads: true toc: true comments: true --- {% danger_block %} Work in progress! It contains a lot of tpyos, please let us know. There are comments at the bottom or you can submit a PR against [pages branch](https://github.com/dynflow/dynflow/tree/pages). Please help with the documentation if you know Dynflow. Thanks! {% enddanger_block %} ## High level overview TODO *TODO to be refined* Dynflow (**DYN**amic work**FLOW**) is a workflow engine written in Ruby that allows to: - Keep track of the progress of running processes - Run the code asynchronously - Resume the process when something goes wrong, skip some steps when needed - Detect independent parts and run them concurrently - Compose simple actions into more complex scenarios - Extend the workflows from third-party libraries - Keep consistency between local transactional database and external services - Suspend the long-running steps, not blocking the thread pool - Cancel steps when possible - Extend the actions behavior with middlewares - Pick different adapters to provide: storage backend, transactions, or executor implementation Dynflow has been developed to be able to support orchestration of services in the [Katello](http://katello.org) and [Foreman](http://theforeman.org/) projects. *TODO* - what problems does Dynflow solve? - maybe a little history ## Glossary - **Action** - building block of execution plans, a Ruby class inherited from `Dynflow::Action`, defines code to be run in each phase. - **Phase** - Each action has three phases: `plan`, `run`, `finalize`. - **Input** - A `Hash` of data coming to the action. - **Output** - A `Hash` of data that the action produces. It's persisted and can be used as input of other actions. - **Execution plan** - definition of the workflow: product of the plan phase, - **Triggering an action** - entering the plan phase, starting with the plan method of the action. The execution follows immediately. - **Flow** - definition of the `run`/`finalize` phase, holding the information about steps that can run concurrently/in sequence. Part of execution plan. - **Executor** - service that executes the run and finalize flows based on the execution plan. It can run in the same process as the plan phase or in different process (using the remote executor). - **World** - the universe where the Dynflow runs the code: it holds all needed configuration. Usually there's only one world per Dynflow process, besides configuration it also holds `Persistence`, `Logger`, `Executor` and all the other objects necessary for action executing. This concept allows us to avoid globally shared state. Also, the worlds can talk to each other, which is helpful for production and high-availability setups, having multiple worlds on different hosts handle the execution of the execution plans. ## Examples See the [examples directory](https://github.com/Dynflow/dynflow/tree/master/examples) for the code in action. Running those files (except the `example_helper.rb` file) leads to the Dynflow runtime being initialized (including the web console where one can explore the features and experiment). *TODO* - for async operations - for orchestrating system/ssh calls - for keeping consistency between local database and external systems - sub-tasks ## How to use ### World creation TODO - *include executor description* ### Development vs production - *In development execution runs in the same process, in production there is an executor process.* *TODO* ### Action anatomy {% digraph %} rankdir=LR Trigger -> Plan -> Run -> Finalize {% enddigraph %} When action is triggered, Dynflow executes plan method on this action, which is responsible for building the execution plan. It builds the execution plan by calling `plan_action` and `plan_self` methods, effectively listing actions that should be run as a part of this execution plan. In other words, it compiles a list of actions on which method `run` will be called. Also it's responsible for giving these actions an order. A simple example of such plan action might look like this ```ruby # this would plan deletion of files passed as an input array def plan(files) files.each do |filename| plan_action MyActions::File::Destroy, filename end end ``` Note that it does not have to be only other actions that are planned to run. In fact it's very common that the action plans itself, which means it will put its own `run` method call in the execution plan. In order to do that you can use `plan_self`. This could be used in MyActions::File::Destroy used in previous example ```ruby class MyActions::File::Destroy < Dynflow::Action def plan(filename) plan_self path: filename end def run File.rm(input.fetch(:path)) end end ``` In example above, it seems that `plan_self` is just shortcut to `plan_action MyActions::File::Destroy, filename` but it's not entirely true. Note that `plan_action` always triggers `plan` of a given action while `plan_self` plans only the `run` and `finalize` of Action, so by using `plan_action` we'd end up in endless loop. Also note, that run method does not take any input. In fact, it can use `input` method that refers to arguments, that were used in `plan_self`. Similar to the input mentioned above, the run produces output. After that some finalizing steps can be taken. Actions can use outputs of other actions as parts of their inputs establishing dependency. Action's state is serialized between each phase and survives machine/executor restarts. As lightly touched in the previous paragraphs there are 3 phases: `plan`, `run` and `finalize`. Plan phase starts by triggering an action. `run` and `finalize` are only ran if you use `plan_self` in `plan` phase. #### Input and Output Both input and output are `Hash`es accessible by `Action#input` and `Action#output` methods. They need to be serializable to JSON so it should contain only combination of primitive Ruby types like: `Hash`, `Array`, `String`, `Integer`, etc. {% warning_block %} One should avoid using directly ```ruby self.output = { key: data } ``` It might delete other data stored in the output (potentially by middleware and other parts of the action). Therefore it's preferred to use ```ruby output.update(key: data, another_key: another_data) # or for one key output[:key] = data ``` {% endwarning_block %} {% info_block %} You may sometime find these input/output format definitions: ```ruby class AnAction < Dynflow::Action input_format do param :id, Integer param :name, String end output_format do param :uuid, String end end ``` This might me quite handy especially in combination with [subscriptions](#subscriptions) functionality. The format follows [apipie-params](https://github.com/iNecas/apipie-params) for more details. Validations of input/output could be performed against this description but it's not turned on by default. (It needs to be revisited and updated to be fully functional.) {% endinfo_block %} #### Triggering Triggering the action means starting the plan phase, followed by immediate execution. Any action is triggered by calling: ``` ruby world_instance.trigger(AnAction, *args) ``` {% info_block %} In Foreman and Katello actions are usually triggered by `ForemanTask.sync_task` and `ForemanTasks.async_task` so following part is not that important if you are using `ForemanTasks`. {% endinfo_block %} `World#trigger` method returns object of `TriggerResult` type. Which is [Algebrick](http://blog.pitr.ch/projects/algebrick/) variant type where definition follows: ```ruby TriggerResult = Algebrick.type do # Returned by #trigger when planning fails. PlaningFailed = type { fields! execution_plan_id: String, error: Exception } # Returned by #trigger when planning is successful but execution fails to start. ExecutionFailed = type { fields! execution_plan_id: String, error: Exception } # Returned by #delay when scheduling succeeded. Scheduled = type { fields! execution_plan_id: String } # Returned by #trigger when planning is successful, #future will resolve after # ExecutionPlan is executed. Triggered = type { fields! execution_plan_id: String, future: Future } variants PlaningFailed, ExecutionFailed, Triggered end ``` If you do not know `Algebrick` you can think about these as `Struct`s with types. You can see how it's used to distinguish all the possible results [in ForemanTasks module](https://github.com/theforeman/foreman-tasks/blob/master/lib/foreman_tasks.rb#L20-L32). ```ruby def self.trigger_task(async, action, *args, &block) Match! async, true, false match trigger(action, *args, &block), # Raise if there is any error caused either by failed planning or # by faild start of execution. (on ::Dynflow::World::PlaningFailed.(error: ~any) | ::Dynflow::World::ExecutionFailed.(error: ~any) do |error| raise error end), # Succesfully triggered. (on ::Dynflow::World::Triggered.( execution_plan_id: ~any, future: ~any) do |id, finished| # block on the finished Future if this is called synchronously finished.wait if async == false return ForemanTasks::Task::DynflowTask.find_by_external_id!(id) end) end ``` #### Scheduling Scheduling an action means setting it up to be triggered at set time in future. Any action can be delayed by calling: ```ruby world_instance.delay(AnAction, { start_at: Time.now + 360, start_before: Time.now + 400 }, *args) ``` This snippet of code would delay `AnAction` with arguments `args` to be executed in the time interval between `start_at` and `start_before`. Setting `start_before` to `nil` would delay execution of this action without the timeout limit. When an action is delayed, an execution plan object is created with state set to `scheduled`, but it doesn't run the the plan phase yet, the planning happens when the `start_at` time comes. If the planning doesn't happen in time (e.g. after `start_before`), the execution plan is marked as failed (its state is set to `stopped` and result to `error`). Since the `args` have to be saved, there must be a mechanism to safely serialize and deserialize them in order to make them survive being saved in a database. This is handled by a serializer. Different serializers can be set per action by overriding its `delay` method. Planning of the delayed plans is handled by `DelayedExecutor`, an object which periodically checks for delayed execution plans and plans them. Scheduled execution plans don't do anything by themselves, they just wait to be picked up and planned by a DelayedExecutor. It means that if no DelayedExecutor is present, their planning will be delayed until a scheduler is spawned. #### Plan phase Planning always uses the thread triggering the action. Plan phase configures action's input for run phase. It starts by executing `plan` method of the action instance passing in arguments from `World#trigger method` ```ruby world_instance.trigger(AnAction, *args) # executes following an_action.plan(*args) # an_action is AnAction ``` `plan` method is inherited from Dynflow::Action and by default it plans itself if `run` method is present using first argument as input. It also calls `finalize` method if it is present. ```ruby class AnAction < Dynflow::Action def run output.update self.input end end world_instance.trigger AnAction, data: 'nothing' ``` The above will just plan itself copying input to output in run phase. In most cases the `plan` method is overridden to plan self with transformed arguments and/or to plan other actions. In the Rails application, the arguments of the plan phase are often the ActiveRecord objects, that are then used to produce the inputs for the actions. Let's look at the argument transformation first: ```ruby class AnAction < Dynflow::Action def plan(any_array) # pick just numbers plan_self numbers: any_array.select { |v| v.is_a? Number } end def run # compute sum - simulating a time consuming operation output.update sum: input[:numbers].reduce(&:+) end end ``` {% info_block %} It's considered a good practice to use just enough data for the input for the action to perform the job. That means not too much (such as using ActiveRecord's attributes), as it might have performance impact as well as causes issues when changing the attributes later. On the other hand, the input should contain enough data to perform the job without the need for reaching to external sources. Therefore, instead of passing just the ActiveRecord id and loading the whole record again in run phase, just to use some attributes, it's better to use these attributes directly as input of the action. Following theses rules should lead to the best results, both from readability and performance point of view. {% endinfo_block %} Now let's see an example with action planning: ```ruby class SumNumbers < Dynflow::Action def plan(numbers) plan_self numbers: numbers end def run output.update sum: input[:numbers].reduce(&:+) end end class SumManyNumbers < Dynflow::Action def plan(numbers) # references to planned actions planned_sub_sum_actions = numbers.each_slice(10).map do |numbers| plan_action SumNumbers, numbers end # prepare array of output references where each points to sum in the # output of particular action sub_sums = planned_sub_sum_actions.map do |action| action.output[:sum] end # plan one last action which will sum the sub_sums # it depends on all planned_sub_sum_actions because it uses theirs outputs plan_action SumNumbers, sub_sums end end world_instance.trigger SumManyNumbers, (1..100).to_a ``` Above example will in parallel sum numbers by slices of 10 values: first action sums `1..10`, second action sums `11..20`, ..., tenth action sums `91..100`. After all sub sums are computed one final action sums the sub sums into final sum. {% warning_block %} This example is here to demonstrate the planning abilities. In reality this parallelization of compute intensive tasks does not have a positive effect on Dynflow running on MRI. The pool of workers may starve. It is not a big issue since Dynflow is mainly used to orchestrate external services. *TODO add link to detail explanation in How it works when available.* {% endwarning_block %} Action may access local DB in plan phase, see [Database and Transactions](#database-and-transactions). #### Run phase Actions has a run phase if there is `run` method implemented. (There may be actions just planning other actions.) The run method implements the main piece of work done by this action converting input into output. Input is immutable in this phase. It's the right place for all the steps which are likely to fail. Action `run` phase are allowed to have side effects like: file operations, calls to other systems, etc. Local DB should not be accessed in this phase, see [Database and Transactions](#database-and-transactions) #### Finale phase Main purpose of `finalize` phase is to be able access local DB after action finishes successfully, like: indexing based on new data, updating records as fully created, etc. Finalize phase does not modify input or output of the action. Action may access local DB in `finalize` phase and must be **idempotent**, see [Database and Transactions](#database-and-transactions). ### Dependencies As already mentioned, actions can use output of different actions as their input (or just parts). When they do it creates dependency between actions, which is automatically detected by Dynflow and the execution plan is built accordingly. ```ruby def plan first_action = plan_action AnAction second_action = plan_action AnAction, first_action.output[:a_key_in_output] end ``` `second_action` uses part of the `first_action`'s output therefore it depends on the `first_action`. If actions are planned without this dependency as follows ```ruby def plan first_action = plan_action AnAction second_action = plan_action AnAction end ``` then they are independent and they are executed concurrently. There is also other mechanism how to describe dependencies between actions than just the one based on output usage. Dynflow user can specify the order between planned actions with DSL methods `sequence` and `concurrence`. Both methods are taking blocks and they specify how actions planned inside the block (or inner `sequence` and `concurrence` blocks) should be executed. By default `plan` considers its space as inside `concurrence`. Which means ```ruby def plan first_action = plan_action AnAction second_action = plan_action AnAction end ``` equals ```ruby def plan concurrence do first_action = plan_action AnAction second_action = plan_action AnAction end end ``` You can establish same dependency between `first_action` and `second_action` without using output by using `sequence` ```ruby def plan sequence do first_action = plan_action AnAction second_action = plan_action AnAction end end ``` As mentioned the `sequence` and `concurrence` methods can be nested and mixed with output usage to create more complex dependencies. Let see commented example: ```ruby def plan # Plans 3 actions of type AnAction to be executed in sequence # argument is the index in the sequence. actions_executed_sequentially = sequence do 3.times.map { |i| plan_action AnAction, i } end # Depends on output of the last action in `actions_executed_sequentially` # so it's added to the above sequence to be executed as 4th. action1 = plan_action AnAction, actions_executed_sequentially.last.output # It's planned in default plan's concurrency scope so it's executed concurrently # with the other four actions. action2 = plan_action AnAction end ``` The order than will be: - concurrently: - sequentially: 1. `*actions_executed_sequentially` 1. `action1` - `action2` Let's see one more example: ```ruby def plan actions = sequence do 2.times.map do |i| concurrency do 2.times.map { plan_action AnAction, i } end end end end ``` Which results in order of execution: - sequentially: 1. concurrently: - `actions[0][0]` argument: 0 - `actions[0][1]` argument: 0 1. concurrently: - `actions[1][0]` argument: 1 - `actions[1][1]` argument: 1 {% info_block %} It's on our todo-list to change that to be able to define acyclic-graph of dependencies between the actions. `sequence` and `concurrence` methods will then be deprecated and kept just for backward compatibility. {% endinfo_block %} {% warning_block %} Internally dependencies are also modeled with objects representing Sequences and Concurrences, which makes it weaker than acyclic-graph so in some cases during the dependency resolution it might not lead into the most effective execution plan. Some actions will run in sequence even though they could be run concurrently. This limitation is likely to be removed in some of the further releases. {% endwarning_block %} ### Database and Transactions Dynflow was designed to help with orchestration of other services. The usual execution looks as follows, we use an ActiveRecord User as example of a resource. 1. Trigger user creation, argument is an unsaved ActiveRecord user object 1. Planning: The user is stored in local DB (in the Dynflow hosting application) within the `plan` phase. The record is marked as incomplete. 1. Running: Operations needed for the user in external services with (e.g.) REST call. The phase finishes when the all the external calls succeeded successfully. 1. Finalizing: The record in local DB is marked as done: ready to be used. Potentially, saving some data that were retrieved in the `run` phase back to the local database. For that reason there are transactions around whole `plan` and `finalize` phase (all action's plan methods are in one transaction). If anything goes wrong in the `plan` phase any change made during planning to local DB is reverted. Same holds for finalizing, if anything goes wrong, all changes are reverted. Therefore all finalization methods has to be **idempotent**. Internally Dynflow uses Sequel as its ORM, but users may choose what they need to access they data. There is an interface `TransactionAdapters::Abstract` where its implementations may provide transactions using different ORMs. The most common one probably being `TransactionAdapters::ActiveRecord`. So in the above example 2. and 4. step would be wrapped in `ActiveRecord` transaction if `TransactionAdapters::ActiveRecord` is used. Second outcome of the design is convention when actions should be accessing local Database: - **allowed** - in `plan` and `finalize` phases - **disallowed** - (or at least discouraged) in the `run` phase {% warning_block %} *TODO warning about AR pool configuration, needs to have sufficient size* {% endwarning_block %} ### Composition Dynflow is designed to allow easy composition of small building blocks called `Action`s. Typically there are actions composing smaller pieces together and other actions doing actual steps of work as in following example: ```ruby class CreateInfrastructure < Dynflow::Action def plan sequence do concurrence do plan_action(CreateMachine, 'host1', 'db') plan_action(CreateMachine, 'host2', 'storage') end plan_action(CreateMachine, 'host3', 'web_server', :db_machine => 'host1', :storage_machine => 'host2') end end end ``` Action `CreateInfrastructure` does not have a `run` method defined, it only defines `plan` action where other actions composed together. ### Subscriptions Even though composing actions is quite easy and allows to decompose business logic to small pieces it does not directly support extensions by plugins. For that there are subscriptions. `Actions` can subscribe from a plugin, gem, any other library to already loaded `Actions`, doing so they extend the planning process with self. Lets look at an example starting by definition of a core action ```ruby # This action can be extended without doing any # other steps to support it. class ACoreAppAction < Dynflow::Action def plan(arguments) plan_self(args: arguments) plan_action(AnotherCoreAppAction, arguments.first) end def run puts "Running core action: #{input[:args]}" self.output.update success: true end end ``` followed by an action definition defined in a plugin/gem/etc. ```ruby class APluginAction < Dynflow::Action # plan this action whenever ACoreAppAction action is planned def self.subscribe ACoreAppAction end def plan(arguments) # arguments are same as in ACoreAppAction#plan plan_self(args: arguments) end def run puts "Running plugin action: #{input[:args]}" end end ``` Subscribed actions are planned with same arguments as action they are subscribing to which is called `trigger`. Their plan method is called right after planning of the triggering action finishes. It's also possible to access target action and use its output which makes it dependent (running in sequence) on triggering action. ```ruby def plan(arguments) plan_self trigger_success: trigger.output[:success] end def run self.output.update 'trigger succeeded' if self.input[:trigger_success] end ``` Subscription is designed for extension by plugins, it should **not** be used inside a single library/app-module. It would make the process definition hard to follow (all subscribed actions would need to be looked up). ### Suspending Sometimes action represents tasks taken in different services, (e.g. repository synchronization in [Pulp](http://www.pulpproject.org/)). Dynflow tries not to waste computer resources so it offers tools to free threads to work on other actions while waiting on external tasks or events. Dynflow allows actions to suspend and be woken up on external events. Lets create a simulation of an external service before showing the example of suspending action. ```ruby class AnExternalService def start_synchronization(report_to) Thread.new do sleep 1 report_to << :done end end end ``` The `AnExternalService` can be invoked to `start_synchronization` and it will report back a second later to action passed in argument `report_to`. It sends event `:done` back by `<<` method. Lets look at an action example. ```ruby class AnAction < Dynflow::Action EXTERNAL_SERVICE = AnExternalService.new def plan plan_self end def run(event) case event when nil # first run suspend do |suspended_action| EXTERNAL_SERVICE.start_synchronization suspended_action end when :done # external task is done output.update success: true # let the run phase finish normally else raise 'unknown event' end end end ``` Which is then executed as follows: 1. `AnAction` is triggered 1. It's planned. 1. Its `run` phase begins. 1. `run` method is invoked with no event (`nil`). 1. Matches with case branch initiating the external synchronization. 1. Action initializes the synchronization and pass in reference to suspended_action. 1. Action is suspended, execution of the run method finishes immediately after `suspend` is called, its block parameter is evaluated right after suspending. 1. Action is kept on memory to be woken up when events are received but it does not block any threads. 1. Action receives `:done` event through suspend action reference. 1. `run` method is executed again with `:done` event. 1. Output is updated with `success: true` and actions finishes `run` phase. 1. There is no `finalize` phase, action is done. This event mechanism is quite flexible, it can be used for example to build a [polling action abstraction](https://github.com/Dynflow/dynflow/blob/master/lib/dynflow/action/polling.rb) which is a topic for next chapter. ### Polling Not all services support callbacks to be registered which would allow to wake up suspended actions only once at the end when the external task is finished. In that case we often need to poll the service to see if the task is still running or finished. For that purpose there is `Polling` module in Dynflow. Any action can be turned into a polling one just by including the module. ```ruby class AnAction < Dynflow::Action include Dynflow::Action::Polling ``` There are 3 methods need to be always implemented: - `done?` - determines when the task is complete based on external task's data. - `invoke_external_task` - starts the external task. - `poll_external_task` - polls the external task status data and returns a status (JSON serializable data like: `Hash`, `Array`, `String`, etc.) which are stored in action's output. ```ruby def done? external_task[:progress] == 1 end def invoke_external_task triger_the_task_with_rest_call end def poll_external_task data = poll_data_with_rest_call progress = calculate_progress data # => a float in 0..1 { progress: progress data: data } end end ``` This action will do following in run phase: 1. `invoke_external_task` on first run of the action 1. suspends and then periodically: 1. wakes up 1. `poll_external_task` 1. checks if `done?`: - `true` -> it concludes the run phase - `false` -> it schedules next polling There are 2 other methods handling external task data which can optionally overridden: - `external_task` - reads the external task's stored data, by default it reads `self.output[:task]` - `external_task=` - writes the the external task's stored data, by default it writes to `self.output[:task] = value` There are also other features implemented like: - Gradual prolongation of the polling interval. - Retries on a poll failing. Please see the [`Polling` module](https://github.com/Dynflow/dynflow/blob/master/lib/dynflow/action/polling.rb) for more details. ### States Each **Action phase** can be in one of the following states: - **Pending** - Not yet executed. - **Running** - An action phase id being executed right now. - **Success** - Execution of an action phase finished successfully. - **Error** - There was an error during execution. - **Suspended** - Only `run` phase, when action sleeps waiting for events to be woken up. - **Skipped** - Failed actions can be marked as skipped allowing rest of the execution plan to finish successfully. - **Skipping** - Action is marked for skipping but execution plan was not yet resumed to mark it as Skipped. **Execution plan** has following states: - **Pending** - Planning did not start yet. - **Scheduled** - Scheduled for later execution, not yet planned. - **Planning** - It's being planned. - **Planned** - It've been planned, running phase did not start yet. - **Running** - It's running, `run` and `finalize` phases of actions are executed. - **Paused** - It was paused when running. Happens on error or executor restart. - **Stopped** - Execution plan is completed. **Execution plan** also has following results: - **Success** - Everything finished without error or skips. - **Warning** - When there are skipped steps. - **Error** - When one or more actions failed. - **Pending** - Execution plan still runs. *TODO how do I access such states as a programmer?* *TODO which Action phase states are "finish" and which requires user interaction?* ### Error handling If an error is raised in **`plan` phase**, it is persisted in the Action object for later inspection and it bubbles up in `World#trigger` method which was used to trigger the action leading to this error. If you compare it to errors raised during `run` and `finalize` phase, there's one major difference: Those never bubble up in `trigger` because they are running in executor not in triggering Thread, they are persisted just in the Action object. If there is an error in **`run` phase**, the execution pauses. You can inspect the error in [console](#console). The error may be intermittent or you may fix the problem manually. After that the execution plan can be resumed and it'll continue by rerunning the failed action and continuing with the rest of the actions. During fixing the problem you may also do the steps in the actions manually, in that case the failed action can be also marked as skipped. After resuming the skipped action is not executed and the execution plan continues with the rest. If there is an error in **`finalize` phase**, whole `finalize` phase for all the actions is rollbacked and can be rerun when the problem is fixed by resuming. If you encounter an error during run phase `error!` or usual `raise` can be used. #### Rescue strategy TODO ### Console TODO - *where to access* - *screenshots* ### Testing TODO - *testing helper methods* - *examples* - *see [testing of testing](https://github.com/Dynflow/dynflow/blob/master/test/testing_test.rb)* ### Long-running actions Dynflow was designed as an Orchestration tool, parallelization of heavy CPU computation tasks was not directly considered. Even with multiple executors single execution plan always runs on one executor, so without JRuby it wont scale well (MRI's GIL). However JRuby support should be added soon (TODO update when merged). Another problem with long-running actions are blocked worker. Executor has only a limited pool of workers, if more of them become busy it may result in worsen performance. Blocking actions for long time are also problematic. Solutions are: - **Using action suspending** - suspending the action until a condition is met, freeing the worker. - **Offloading computation** - CPU heavy parts can be offloaded to different services notifying the suspended actions when the computation is done. ### Middleware Each action class has chain of middlewares which wrap phases of the action execution. It's very similar to rack middlewares. To create new middleware inherit from `Dynflow::Middleware` class. It has 5 methods which can be overridden: `plan`, `run`, `finalize`, `plan_phase`, `finalize_phase`. Where the default implementation for all the methods looks as following ```ruby def plan(*args) pass *args end ``` When overriding user can insert code before and/or after the `pass` method which executes next middleware in the chain or the action itself which is at the end of the chain. Most usually the `pass` is always called somewhere in the overridden method. There may be some cases when it can be omitted, then it'll prevent all following middlewares and action from running. Some implementation examples: [KeepCurrentUser](https://github.com/theforeman/foreman-tasks/blob/master/app/lib/actions/middleware/keep_current_user.rb), [Action::Progress::Calculate](https://github.com/Dynflow/dynflow/blob/master/lib/dynflow/action/progress.rb#L13-L42). Each Action has a chain of middlewares defined. Middleware can be added by calling `use` in the action class. ```ruby class AnAction < Dynflow::Action use AMiddleware, after: AnotherMiddleware end ``` Method `use` understands 3 option keys: - `:before` - makes this middleware to be ordered before a given middleware - `:after` - makes this middleware to be ordered after a given middleware - `:replace` - this middleware will replace given middleware The `:before` and `:after` keys are used to build a graph from the middlewares which is then sorted down with [topological sort](http://ruby-doc.org//stdlib-2.0/libdoc/tsort/rdoc/TSort.html) to the chain of middleware execution. ### Sub-plans - *when to use?* - *how to use?* ## How it works TODO {% info_block %} This part is based on the current work-in-progress on [multi-executors support](https://github.com/Dynflow/dynflow/pull/139) {% endinfo_block %} ### Action states TODO - *normal phases and Present phase* - *how to walk the execution plan* ### The world anatomy The world represents the Dynflow's run-time and it acts as an external interface. It holds all the configuration and sub-components needed for the Dynflow to perform its job. The Dynflow worlds is composed of the following sub-components: 1. **persistence** - provides the durability functionality: see [persistence](#persistence) 1. **coordinator** - provides the coordination between worlds: see [coordinator](#coordinator) 1. **connector** - provides messages passing between worlds: see [connector](#connector) 1. **executor** - the run-time itself executing the execution plan. Not every worlds has to have the executor present (there might be pure client worlds: useful in production, see [develpment vs. production](#development-vs-production). 1. **client dispatcher** - responsible for communication between client requests and other worlds 1. **executor dispatcher** - responsible for getting requests from other worlds and sending the responses 1. **delayed executor** - responsible for planning and exectuion of scheduled tasks {% plantuml %} actor client frame "world" { interface adapter as PersistenceAdapter interface adapter as CoordinatorAdapter interface adapter as ConnectorAdapter [coordinator] -up-> CoordinatorAdapter [connector] -up-> ConnectorAdapter [persistence] -up-> PersistenceAdapter [connector] <-- [client dispatcher] [connector] <-- [executor dispatcher] [executor] <-- [executor dispatcher] [coordinator] <-- [client dispatcher] [coordinator] <-- [executor dispatcher] [persistence] <-- [executor] client -left--> [client dispatcher] client -left--> [persistence] } {% endplantuml %} The underlying technologies are hidden behind adapters abstraction, which allows choosing the right technology for the job, while keeping the rest of the Dynflow intact. ### Client world vs. executor world In the simplest case, the world handles both the client requests and the execution itself. This is useful for small all-in-one deployments and development. In production, however, one might want to separate the client worlds (often running in the web-server process or client library) and the executor worlds (running as part of standalone service). This setup makes the execution more stable and high-available (in active-active mode). There might be multiple client as well as executor worlds in the Dynflow infrastructure. The executor world has still its own client dispatcher, so that it can act as a client for triggering other execution plans (useful in [sub-plans](#sub-plans) feature). {% plantuml %} actor client frame "client world" { interface adapter as ClientPersistenceAdapter interface adapter as ClientCoordinatorAdapter interface adapter as ClientConnectorAdapter component [coordinator] as ClientCoordinator component [persistence] as ClientPersistence component [connector] as ClientConnector component [client dispatcher] as ClientClientDispatcher [ClientCoordinator] -down-> ClientCoordinatorAdapter [ClientConnector] -down-> ClientConnectorAdapter [ClientPersistence] -down--> ClientPersistenceAdapter [ClientClientDispatcher] --> [ClientConnector] [ClientClientDispatcher] --> [ClientCoordinator] client -down--> [ClientClientDispatcher] client -down--> [ClientPersistence] } frame "executor world" { interface adapter as PersistenceAdapter interface adapter as CoordinatorAdapter interface adapter as ConnectorAdapter [coordinator] -up--> CoordinatorAdapter [connector] -up--> ConnectorAdapter [persistence] -up-> PersistenceAdapter [connector] <-- [client dispatcher] [connector] <-- [executor dispatcher] [executor] <-- [executor dispatcher] [coordinator] <-- [client dispatcher] [coordinator] <-- [executor dispatcher] [persistence] <-- [executor] } database "Database" as Database ClientPersistenceAdapter -down- Database Database -down- PersistenceAdapter node "Synchronization\nservice" as SyncService ClientCoordinatorAdapter -down- SyncService SyncService -down- CoordinatorAdapter node "Message\nbus" as MessageBus ClientConnectorAdapter -down- MessageBus MessageBus -down- ConnectorAdapter {% endplantuml %} ### Single-database model Dynflow recognizes different expectations from the underlying technologies from the persistence (durability), coordinator (real-time synchronization) and connector (transport). However, we also realize that for many use-cases, a single shared database is just enough infrastructure the user needs for the job. Forcing using different technologies would mean just useless overhead. Therefore, it's possible to use a single shared SQL database to do all the work. {% plantuml %} actor client frame "client world" { interface adapter as ClientPersistenceAdapter interface adapter as ClientCoordinatorAdapter interface adapter as ClientConnectorAdapter component [coordinator] as ClientCoordinator component [persistence] as ClientPersistence component [connector] as ClientConnector component [client dispatcher] as ClientClientDispatcher [ClientCoordinator] -down-> ClientCoordinatorAdapter [ClientConnector] -down-> ClientConnectorAdapter [ClientPersistence] -down--> ClientPersistenceAdapter [ClientClientDispatcher] --> [ClientConnector] [ClientClientDispatcher] --> [ClientCoordinator] client -down--> [ClientClientDispatcher] client -down--> [ClientPersistence] } frame "executor world" { interface adapter as PersistenceAdapter interface adapter as CoordinatorAdapter interface adapter as ConnectorAdapter [coordinator] -up--> CoordinatorAdapter [connector] -up--> ConnectorAdapter [persistence] -up-> PersistenceAdapter [connector] <-- [client dispatcher] [connector] <-- [executor dispatcher] [executor] <-- [executor dispatcher] [coordinator] <-- [client dispatcher] [coordinator] <-- [executor dispatcher] [persistence] <-- [executor] } database "Database" as Database ClientPersistenceAdapter -down- Database Database -down- PersistenceAdapter ClientCoordinatorAdapter -down- Database Database -down- CoordinatorAdapter ClientConnectorAdapter -down- Database Database -down- ConnectorAdapter {% endplantuml %} {% info_block %} Something as simple as sqlite is enough for getting the Dynlfow up-and-running and getting something done. For the best connector results, it's recommended to use PostgreSQL, as Dynflow can utilize the [listen/notify](http://www.postgresql.org/docs/9.0/static/sql-notify.html) feature for better response times. {% endinfo_block %} ### Inner-world communication {% plantuml %} participant "Connector" as ClientConnector participant "Connector" as ExecutorConnector box "Client World" participant Client participant "Client Dispatcher" participant "ClientConnector" end box box "Executor World" participant "ExecutorConnector" participant "Executor Dispatcher" participant Executor end box autonumber Client -> "Client Dispatcher" : publish_request\nexecution_plan_id "Client Dispatcher" --> Client : IVar "Client Dispatcher" -> "ClientConnector" : send\nEnvelope[Execution] "ClientConnector" -> "ExecutorConnector" : handle_envelope\nEnvelope[Execution] "ExecutorConnector" -> "Executor Dispatcher" : handle_request\nEnvelope[Execution] "Executor Dispatcher" -> Executor : execute\nexecution_plan_id note over Executor: executing… activate Executor "Executor Dispatcher" -> ExecutorConnector : send\nEnvelope[Accepted] "ExecutorConnector" -> "ClientConnector" : handle_envelope\nEnvelope[Accepted] "ClientConnector" -> "Client Dispatcher" : dispatch_response\nEnvelope[Accepted] Executor -> "Executor Dispatcher" : finished deactivate Executor "Executor Dispatcher" -> ExecutorConnector : send\nEnvelope[Finished] "ExecutorConnector" -> "ClientConnector" : handle_envelope\nEnvelope[Finished] "ClientConnector" -> "Client Dispatcher" : dispatch_response\nEnvelope[Finished] note over "Client Dispatcher": IVar fullfilled {% endplantuml %} 1) the client prepares an execution plan, saves it into persistence and passes its id to the client dispatcher 2) the client dispatcher creates an [IVar](http://www.rubydoc.info/github/ruby-concurrency/concurrent-ruby/Concurrent/IVar) that represents the future value of the execution plan after it finishes execution. The client can use this value to wait for the execution plan to finish or hook other procedures on the finish-time. 3) the client dispatcher creates an envelope (see [connector](#connector) for more details), containing the request for execution, with ``receiver id`` set to ``AnyExecutor`` 4) the connector uses some scheduling algorithm to choose what executor to send the request to and replaces the ``AnyExecutor`` with it. It sends the envelope the the chosen executor. 5) the connector on the executor side receives the envelope and asks the executor dispatcher to handle it 6) the executor dispatcher acquires the lock on the execution plan and initiates the execution. In the mean-time, it lets the client know that the work was accepted (there might be additional logic on the client to handle a timeout after the work was not accepted) 7 - 9) the connector layer propagates the ``Accepted`` response back to client 10) the executor finishes execution 11 - 12) the connector layer propagates the ``Finished`` response 13) the client dispatcher resolves the original ``IVar``, so that the client is able to find out about the finished execution. The behavior is the same even in the "one world" scenario: in that case this world is participating both on the client and the executor side, connected through a direct in-memory connector. ### Persistence The persistence making sure that the serialized states of the execution plans are persisted for recovery and status tracking. The execution plan data are stored in it, with the actual state. Unlike coordinator, all the persisted data don't have to be available for all the worlds at the same time: every world needs just the data that it is actively working on. Also, all the data don't have to be fully synchronized between worlds (as long as the up-to-date data about relevant execution plans are available for the world). ### Connector Provides messages passing between worlds. The message has a form of envelope of the following structure: * **sender id** - the id of the world that produced the envelope * **receiver id** - the id of the world that should receive the message, or ``AnyExecutor``, when load-balancing * **request id** - the client-unique id used for pairing the request - response at the client * **message** - the body of the message: the connector doesn't care about that as long as it's serializable #### Load-balancing The connector is responsible for spreading the load across the executor worlds. It's determined by the ``AnyExecutor`` value at the ``request id`` field. The implementation available in the current version uses a simple round-robin algorithm for this purpose. ### Coordinator This component (especially important in a multi-executor setup): Makes sure no two executors are executing the same execution plan at the same time (a.k.a locking). It also provides the information about the worlds available in the system (the worlds register). Unlike the persistence, it's not required to persist the data (could be recalculated), but it needs to provide a globally shared state. The main type of objects the coordinator works with is a record which consists of: * type - the type of the record * id - the id of the record (unique in scope of a type) * data - data in arbitrary format, based on the type There is a special type of record called ``lock``, that keeps the owner information as well. It's used for keeping information about what executor is actively working on what execution plan: the executor is not allowed to start executing the unless it has successfully acquired a lock for it. ### Thread-pools TODO - *how it works now* - *how it'll work* - *gotchas* - *worker pool sizing* ### Suspending -> events TODO ## Use cases TODO - *Embedded without a DB, like inside CLI tool for a complex installation* - *reserve resources in planning do not try to do `if`s in run phase* - *Projects: katello, foreman, staypuft, fusor* ## Comments **Comments are temporally turned on here for faster feedback.**