README.md in rflow-1.0.0a4 vs README.md in rflow-1.0.0a5

- old
+ new

@@ -62,14 +62,15 @@
   messages out/in a single port, as well as allow a single port to be
   accessed by an array.
 
 * __Connection__ - a directed link between an output port and an input
   port. RFlow supports generalized connection types; however, only
-  ZeroMQ links are currently used.
+  ZeroMQ links are currently used. Round-robin and broadcast message
+  delivery are supported on a per-link basis.
 
 * __Message__ - a bit of serialized data that is sent out an output
-  port and recieved on an input port. Due to the serialization,
+  port and received on an input port. Due to the serialization,
   message types and schemas are explicitly defined. In a departure
   from "pure" FBP, RFlow supports sending multiple message types via a
   single connection.
 
 * __Workflow__ - the common name for the digraph created when the
@@ -226,20 +227,19 @@
 
 The `provenance` is a way for a component to annotate a message with a
 bit of data that should (by convention) be carried through the
 workflow with the message, as well as being copied to derived
 messages. For example, a TCP server component would spin up a TCP
-server and, upon recieving a connection and packets on a session, it
+server and, upon receiving a connection and packets on a session, it
 would marshal the packets into `RFlow::Messsage`s and send them out
 its output ports. Messages received on its input port, however, need
 to have a way to be matched to the corresponding underlying TCP
 connection. `provenance` provides a method for the TCP server
 component to add a bit of metadata (namely an identifier for the TCP
 connection) such that later messages that contain the same provenance
 can be matched to the correct underlying TCP connection.
 
-
 The other parts of the message envelope are related to the embedded
 data object. In addition to the data object itself (which is encoded
 with a specific Avro schema), there are a few fields that describe the
 embedded data, namely the `data_type_name`, the
 `data_serialization_type`, and the `data_schema`. By including all
@@ -317,11 +317,10 @@
 message.data.default? # => true
 message.data.int = 1024
 messaga.data.default? # => false
 ```
 
-
 ## RFlow Workflow Configuration
 
 RFlow currently stores its configuration in a SQLite database which
 are internally accessed via ActiveRecord. Given that SQLite is a
 rather simple and standard interface, non-RFlow components could
@@ -349,16 +348,17 @@
 * ports - belonging to a component (via `component_uuid` foreign key),
   also has a `type` column for ActiveRecord STI, which gets set to
   either a `RFlow::Configuration::InputPort` or
   `RFlow::Configuration::OutputPort`.
 
-* connections - a connection between two ports via foriegn keys
+* connections - a connection between two ports via foreign keys
   `input_port_uuid` and `output_port_uuid`. Like ports, connections
   are typed via AR STI (`RFlow::Configuration::ZMQConnection` and
-  'RFlow::Configuration::BrokeredZMGConnection` are the only
+  `RFlow::Configuration::BrokeredZMQConnection` are the only
   supported values for now) and have a YAML serialized `options`
-  hash. A connection also (potentially) defines the port keys.
+  hash and a `delivery` type (`round-robin` or `broadcast`).
+  A connection also (potentially) defines the port keys.
 
 RFlow also provides a RubyDSL for configuration-like file to be used
 to load the database:
 
 ```ruby
@@ -409,13 +409,14 @@
 ZeroMQ communication between components in the same shard uses ZeroMQ's
 `inproc` socket type for maximum performance. ZeroMQ communication between
 components in different shards is accomplished with a ZeroMQ `ipc` socket.
 In the case of a many-to-many connection (many workers in a producing
 shard and many workers in a consuming shard), a ZeroMQ message broker
-process is created to route the messages appropriately. Senders round-robin
-to receivers and receivers fair-queue the messages from the senders.
-Load balancing based on receiver responsiveness is not currently implemented.
+process is created to route the messages appropriately. By default,
+senders round-robin to receivers, though broadcast delivery can be chosen
+instead. Receivers fair-queue the messages from senders. Load balancing
+based on receiver responsiveness is not currently implemented.
 
 To define a custom shard in the Ruby DSL, use the `shard` method. For
 example:
 
 ```ruby
@@ -460,10 +461,12 @@
   # Wire components together
   config.connect 'generate_ints1#out' => 'filter#in'
   config.connect 'generate_ints2#out' => 'filter#in'
   config.connect 'filter#filtered' => 'replicate#in'
-  config.connect 'filter#out' => 'output1#in'
+  # choosing broadcast delivery delivers a copy to each worker for
+  # the shard
+  config.connect 'filter#out' => 'output1#in', :delivery => 'broadcast'
   config.connect 'filter#filtered' => 'output2#in'
 end
 ```
 
 At runtime, shards with no components defined will have no workers and