= Class Reference
Working up from the ground is useful to get a sense for how Tap does what it does. This reference goes through the modules and classes that build up a task application: Tasks, Apps, and Envs.
== Tasks
==== Methods
http://tap.rubyforge.org/images/Method.png
Tasks begin with methods, which are simply blocks of code.
==== Tap::Support::Executable
http://tap.rubyforge.org/images/Executable.png
Executable extends objects, allowing them to be used in workflows, enqueued, and run by an App. Executable objects specify a method that gets called upon execution; in essence Executable wraps this method and adds workflow support for dependencies, joins, batches, and auditing. Any method may be made executable, and so any method can participate in a workflow (see Object#_method).
Tasks are constructed such that process is the executable method (actually execute_with_callbacks is used to wrap process with before/after execute callbacks, but effectively this is the case). Hence process is the standard method overridden in Task subclasses.
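The wrapping idea can be sketched in plain Ruby. This is not Tap's implementation: SketchExecutable and its calls list are hypothetical names, and the only behavior shown is that a Method object is called through a wrapper that records the inputs it receives.

```ruby
# Hypothetical stand-in for the idea behind Tap::Support::Executable;
# not Tap's implementation.
class SketchExecutable
  attr_reader :calls

  def initialize(method_object)
    @method_object = method_object
    @calls = []   # a crude record of the inputs seen so far
  end

  # Record the inputs, then call the wrapped method with them.
  def execute(*inputs)
    @calls << inputs
    @method_object.call(*inputs)
  end
end

list = []
exe = SketchExecutable.new(list.method(:push))
exe.execute(1)
exe.execute(2, 3)

list       # => [1, 2, 3]
exe.calls  # => [[1], [2, 3]]
```

Because the wrapper only needs a Method object, any method on any object can be wrapped this way.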
==== {Configurable}[http://tap.rubyforge.org/configurable/]
http://tap.rubyforge.org/images/Configurable.png
Tap uses the {Configurable}[http://tap.rubyforge.org/configurable/] module to declare class configurations and make them available for use in contexts like the command line. Configurations essentially consist of a reader, writer, and default value, but they may additionally be accessed through a hash-like config object. For instance:
  class ConfigClass
    include Configurable

    config :key, 'value' do |input|
      input.upcase
    end

    def initialize
      initialize_config
    end
  end
Is basically the same as:
  class RegularClass
    attr_reader :key

    def key=(input)
      @key = input.upcase
    end

    def initialize
      self.key = 'value'
    end
  end
And as you can see here:
c = ConfigClass.new
c.key # => 'VALUE'
c.config[:key] = 'new value'
c.key # => 'NEW VALUE'
c.key = 'another value'
c.config[:key] # => 'ANOTHER VALUE'
This setup is both fast and convenient.
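To make the reader/writer/default relationship concrete, here is a minimal sketch of how a config declaration could be built in plain Ruby. It is an assumption for illustration only: SketchConfigurable, ConfigView, and SketchConfigClass are hypothetical names, and the real Configurable module is implemented differently and does much more.

```ruby
# Hypothetical sketch of a config-declaring class method; the real
# Configurable module is more capable and is implemented differently.
module SketchConfigurable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def defaults
      @defaults ||= {}
    end

    # Declares a reader and a writer for key; the optional block
    # transforms each value before it is stored.
    def config(key, default = nil, &block)
      defaults[key] = default
      define_method(key) { @config_values[key] }
      define_method("#{key}=") do |input|
        @config_values[key] = block ? block.call(input) : input
      end
    end
  end

  # Hash-like view that routes reads and writes through the accessors.
  class ConfigView
    def initialize(receiver)
      @receiver = receiver
    end

    def [](key)
      @receiver.send(key)
    end

    def []=(key, value)
      @receiver.send("#{key}=", value)
    end
  end

  def config
    @config ||= ConfigView.new(self)
  end

  # Sets every config to its default by calling the writer.
  def initialize_config
    @config_values = {}
    self.class.defaults.each { |key, value| send("#{key}=", value) }
  end
end

class SketchConfigClass
  include SketchConfigurable

  config(:key, 'value') { |input| input.upcase }

  def initialize
    initialize_config
  end
end

c = SketchConfigClass.new
c.key                         # => "VALUE"
c.config[:key] = 'new value'
c.key                         # => "NEW VALUE"
```

In this sketch the hash-like object holds no data of its own; it forwards to the accessors, so the transformation block runs no matter which route is used to set a value.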
==== {Configurable::Validation}[http://tap.rubyforge.org/configurable/classes/Configurable/Validation.html]
When configurations are set from the command line, the writer method inevitably receives a string, even though a non-string input may be desired. The {Validation}[http://tap.rubyforge.org/configurable/classes/Configurable/Validation.html] module provides standard blocks for validating and transforming inputs, accessible through the c method (e.g. c.integer or c.regexp). These blocks (generally) load string inputs as YAML and validate that the result is the correct class; non-string inputs are simply validated.
  class ValidatingClass
    include Configurable

    config :int, 1, &c.integer                # assures the input is an integer
    config :int_or_nil, 1, &c.integer_or_nil  # integer or nil only
    config :array, [], &c.array               # you get the idea
  end

  vc = ValidatingClass.new

  vc.array = [:a, :b, :c]
  vc.array                 # => [:a, :b, :c]

  vc.array = "[1, 2, 3]"
  vc.array                 # => [1, 2, 3]

  vc.array = "string"      # !> ValidationError
Validation blocks sometimes imply metadata. For instance c.flag makes a config into a flag on the command line.
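The load-as-YAML-then-validate behavior described above can be sketched with the standard library. The helper name sketch_validator and the use of ArgumentError are assumptions for illustration; the real Validation module raises its own ValidationError and covers many more cases.

```ruby
require 'yaml'

# Hypothetical helper, not part of Configurable::Validation.
# Returns a block that loads string inputs as YAML and then checks
# that the result is an instance of the expected class.
def sketch_validator(klass)
  lambda do |input|
    value = input.is_a?(String) ? YAML.load(input) : input
    unless value.is_a?(klass)
      raise ArgumentError, "expected #{klass}, got #{value.inspect}"
    end
    value
  end
end

to_array = sketch_validator(Array)
to_array.call([:a, :b, :c])   # => [:a, :b, :c]
to_array.call("[1, 2, 3]")    # => [1, 2, 3]

begin
  to_array.call("string")
rescue ArgumentError => e
  e.message                   # => "expected Array, got \"string\""
end
```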
==== {Lazydoc}[http://tap.rubyforge.org/lazydoc]
Ah lazydoc. {Lazydoc}[http://tap.rubyforge.org/lazydoc] fits into the space between live code and code documentation. Lazydoc can scan a file (code or not) and pull documentation into the object space where it can be utilized. Lazydoc uses a key-value syntax like this:
# ::key value
Lazydoc parses a constant name, the key, the value, and any comment following the value until a non-comment line or an end key. For example:
  [lazydoc_file.rb]
  # Name::Space::key value
  #
  # This documentation
  # gets parsed.
  #
  # Name::Space::another another value
  # This gets parsed.
  # Name::Space::another-
  #
  # This does not.

  require 'tap'

  lazydoc = Lazydoc[__FILE__]
  lazydoc.resolve

  lazydoc['Name::Space']['key'].comment    # => "This documentation gets parsed."
  lazydoc['Name::Space']['another'].value  # => "another value"
Furthermore, Lazydoc can register specific lines for documentation. These lines are parsed to echo what happens in RDoc.
  [another_lazydoc_file.rb]
  # documentation
  # for the method
  def method
  end

  require 'tap'

  lazydoc = Lazydoc[__FILE__]
  code_comment = lazydoc.register(2)
  lazydoc.resolve

  code_comment.subject  # => "def method"
  code_comment.to_s     # => "documentation for the method"
Tap uses Lazydoc to indicate when a file contains a Task (::manifest) or a generator (::generator), and for config documentation. Tap::Env uses this information to facilitate lookup and instantiation of task classes.
When no constant name is specified for a Lazydoc key, Env uses a constant based on the file name.
[lib/sample/task.rb]
# ::manifest sample task description
#
# This manifest is expected to apply to the Sample::Task class.
# If more than one task is defined in this file, or if Sample::Task
# is not defined by loading this file, Tap will run into trouble.
However, the best practice is to include the namespace explicitly. See the {Lazydoc}[http://tap.rubyforge.org/lazydoc] documentation for more information.
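As a rough illustration of the key-value syntax, the following sketch parses constant names, keys, values, and trailing comments from a string. The method sketch_parse and its regular expression are assumptions written for this example; the real Lazydoc parser is more thorough and exposes a different API.

```ruby
# Hypothetical parser for lines like "# Name::Space::key value";
# the real Lazydoc parser handles more cases than this.
def sketch_parse(source)
  key_line = /\A#\s*((?:[A-Z]\w*(?:::[A-Z]\w*)*)?)::([a-z_]+)(-?)\s*(.*)\z/
  entries = Hash.new { |hash, const| hash[const] = {} }
  current = nil

  source.each_line do |line|
    line = line.strip
    if (match = key_line.match(line))
      const, key, end_flag, value = match.captures
      if end_flag == '-'
        current = nil     # an end key such as "# Name::Space::another-"
      else
        current = entries[const][key] = { value: value, comment: [] }
      end
    elsif current && line.start_with?('#')
      text = line.sub(/\A#\s*/, '')
      current[:comment] << text unless text.empty?
    else
      current = nil       # a non-comment line ends the comment
    end
  end

  entries
end

source = <<~DOC
  # Name::Space::key value
  #
  # This documentation
  # gets parsed.
  #
  # Name::Space::another another value
  # This gets parsed.
  # Name::Space::another-
  #
  # This does not.
DOC

docs = sketch_parse(source)
docs['Name::Space']['key'][:comment].join(' ')  # => "This documentation gets parsed."
docs['Name::Space']['another'][:value]          # => "another value"
```

A key with no constant name, such as "# ::manifest sample task description", is stored under the empty string in this sketch; Env would instead derive the constant from the file name, as described above.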
=== Tap::Task
http://tap.rubyforge.org/images/Task.png
Running a task from the command line using tap (or rap) instantiates a task, configures it, enqueues it, and runs an App to pass inputs to the process method. Tasks do not have to be used this way; they are perfectly capable as objects in free-standing scripts.
Task instances may be interned with a block that acts as a stand-in for process:
t = Tap::Task.intern {|task| 1 + 2 }
t.process # => 3
t = Tap::Task.intern {|task, x, y| x + y }
t.process(1, 2) # => 3
Tasks can be configured,
  runlist = []
  t1 = Tap::Task.intern(:key => 'one') do |task, input|
    runlist << task
    "#{input}:#{task.config[:key]}"
  end
joined into dependency-based workflows,
t0 = Tap::Task.intern {|task| runlist << task }
t1.depends_on(t0)
imperative workflows,
  t2 = Tap::Task.intern do |task, input|
    runlist << task
    "#{input}:two"
  end
  t1.sequence(t2)
and batched.
t3 = t1.initialize_batch_obj(:key => 'three')
t1.batch # => [t1, t3]
Batched tasks are enqueued together, and therefore execute sequentially with the same inputs. Results are aggregated into the underlying Tap::App.
t1.enq('input')
app = Tap::App.instance
app.run
runlist # => [t0, t1, t2, t3, t2]
app.results(t2) # => ["input:one:two", "input:three:two"]
Tracking the evolution of a result through a workflow can get complex; Tap audits workflows to help. In the audit trail, the tasks are identified by name. Let's set the names of the tasks and take a look at the audit trails of the t2 results:
  t1.name = 'un'
  t2.name = 'deux'
  t3.name = 'trois'

  app._results(t2).collect do |_result|
    _result.dump
  end.join("---\n")
  # =>
  # o-[] "input"
  # o-[un] "input:one"
  # o-[deux] "input:one:two"
  # ---
  # o-[] "input"
  # o-[trois] "input:three"
  # o-[deux] "input:three:two"
== Apps
==== Tap::Root
http://tap.rubyforge.org/images/Root.png
A Root represents the base of a directory structure. Roots allow you to alias relative paths, basically allowing you to develop code for a conceptual directory structure that can be defined later.
root = Tap::Root.new '/path/to/root'
root.root # => '/path/to/root'
root['config'] # => '/path/to/root/config'
root.filepath('config', 'sample.yml') # => '/path/to/root/config/sample.yml'
While simple, this ability to alias paths is useful, powerful, and forms the basis of the Tap execution environment.
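The aliasing idea can be sketched in a few lines. SketchRoot is a hypothetical stand-in, and passing the aliases to the constructor is an assumption made for this example; it is not a description of how Tap::Root is configured.

```ruby
# Hypothetical stand-in for the aliasing idea behind Tap::Root.
class SketchRoot
  attr_reader :root

  def initialize(root, aliases = {})
    @root = root
    @aliases = aliases   # alias => path relative to root
  end

  # Returns the path for an alias; unknown aliases map to themselves.
  def [](dir)
    File.join(@root, @aliases.fetch(dir, dir))
  end

  def filepath(dir, *paths)
    File.join(self[dir], *paths)
  end
end

root = SketchRoot.new('/path/to/root', 'config' => 'etc/settings')
root['lib']                            # => "/path/to/root/lib"
root.filepath('config', 'sample.yml')  # => "/path/to/root/etc/settings/sample.yml"
```

Code written against 'config' keeps working if the alias is later pointed at a different directory, which is the sense in which the directory structure is conceptual.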
==== Tap::Support::ExecutableQueue
http://tap.rubyforge.org/images/ExecutableQueue.png
Apps coordinate the execution of tasks through a queue. The queue is just a list of Executable objects (basically methods) and the inputs to those methods; during a run, the enqueued methods are executed sequentially with their inputs.
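A queue of methods and inputs can be sketched with an Array of pairs. This is only an illustration of the idea; the variable names are hypothetical and Tap::Support::ExecutableQueue is not implemented this way.

```ruby
# Hypothetical sketch: each entry pairs a Method object with its inputs.
queue = []
queue << [1.method(:+), [2]]
queue << [[3, 1, 2].method(:sort), []]
queue << ['tap'.method(:upcase), []]

# Run: take entries from the front and call each method with its inputs.
outputs = []
until queue.empty?
  method_object, inputs = queue.shift
  outputs << method_object.call(*inputs)
end

outputs   # => [3, [1, 2, 3], "TAP"]
```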
==== Tap::Support::Dependencies
Dependencies coordinate the registration and resolution of dependencies, which may be shared across multiple tasks.
==== Tap::Support::Audit
Tap tracks inputs as they are modified by various tasks, again through Executable. At the end of a run, any individual result can be tracked back to its original value with references to the source of each change (i.e. the task). This auditing can be very useful when workflows diverge, as they often do.
Auditing is largely invisible except in on_complete blocks. on_complete blocks receive the audited results so that this information can be used, as needed, to make decisions.
  Task.new.on_complete do |_result|  # _result is an Audit instance
    _result.value                    # the current value
  end
To help indicate when a result is actually a result and when it is an audit, Tap uses a convention whereby a leading underscore signals auditing is involved.
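One way to picture an audited result is as a value that carries its own history. The SketchAudit class below is a hypothetical illustration, not Tap::Support::Audit; only the value and dump method names echo the examples in this reference, and record and original_value are invented for the sketch.

```ruby
# Hypothetical audit record, not Tap::Support::Audit.
class SketchAudit
  attr_reader :trail

  def initialize(value, source = nil)
    @trail = [[source, value]]
  end

  # Records a new value together with the source that produced it.
  def record(source, value)
    @trail << [source, value]
    self
  end

  # The current value is the most recently recorded one.
  def value
    @trail.last[1]
  end

  def original_value
    @trail.first[1]
  end

  def dump
    @trail.map { |source, value| "o-[#{source}] #{value.inspect}" }.join("\n")
  end
end

_result = SketchAudit.new('input')
_result.record('un', "#{_result.value}:one")
_result.record('deux', "#{_result.value}:two")

_result.value            # => "input:one:two"
_result.original_value   # => "input"
puts _result.dump
# o-[] "input"
# o-[un] "input:one"
# o-[deux] "input:one:two"
```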
==== Tap::Support::Aggregator
When a task completes, it executes its on_complete block to handle the results, perhaps passing them on to other tasks. Aggregators collect results when no on_complete block is specified. Results are collected per-task into an array; a single task executed many times will have its results aggregated into this single array.
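The choice between an on_complete block and aggregation can be sketched as a simple dispatch. Everything here is hypothetical (tasks are represented by symbols and the lambda named complete is invented for the example); it only mirrors the rule stated above.

```ruby
# Hypothetical sketch of the dispatch described above.
results = Hash.new { |hash, task| hash[task] = [] }
on_complete = {}   # task => block to call with the result

complete = lambda do |task, result|
  if (block = on_complete[task])
    block.call(result)        # the block handles the result
  else
    results[task] << result   # no block, so aggregate per task
  end
end

passed_on = []
on_complete[:t1] = lambda { |result| passed_on << result }

complete.call(:t1, 'a')
complete.call(:t2, 'b')
complete.call(:t2, 'c')

passed_on            # => ["a"]
results[:t2]         # => ["b", "c"]
results.key?(:t1)    # => false
```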
=== Tap::App
http://tap.rubyforge.org/images/App.png
Instances of Tap::App coordinate the execution of tasks. Apps are basically a subclass of Root with an ExecutableQueue, Dependencies, and an Aggregator. Task initialization requires an App, which is by default Tap::App.instance. Tasks use their app for logging, dependency-resolution, checks, and to enqueue themselves. Normally a script will only need and use a single instance (often Tap::App.instance), but there is no reason why multiple instances could not be used.
  require 'stringio'
  require 'logger'

  # Capture log output in a string so it can be inspected
  log = StringIO.new
  app = Tap::App.instance
  app.logger = Logger.new(log)
  app.logger.formatter = lambda do |severity, time, progname, msg|
    " %s %s: %s\n" % [severity[0,1], progname, msg]
  end

  t = Tap::Task.intern {|task, *inputs| inputs }
  t.log 'action', 'to app'
  log.string               # => " I action: to app\n"

  t.enq(1)
  t.enq(2,3)
  app.queue.to_a           # => [[t, [1]], [t, [2,3]]]

  app.run
  app.results(t)           # => [[1], [2,3]]
As shown, apps also aggregate results for tasks, which is important for workflows.
== Envs
==== Tap::Env
http://tap.rubyforge.org/images/Env.png
Basically a wrapper for a Root, Envs define methods to generate manifests for a type of file-based resource (tasks, generators, etc). Furthermore they provide methods to uniquely identify the resource by path or, more specifically, minimized base paths. In this directory structure:
  path
  `- to
      |- another
      |   `- file.rb
      |- file-0.1.0.rb
      |- file-0.2.0.rb
      `- file.rb
The minimal paths that uniquely identify these files are (respectively):
'another/file'
'file-0.1.0'
'file-0.2.0'
'file.rb'
Envs facilitate mapping the minimal path, which might be provided by the command line, to the actual path, and hence to the resource. Envs can be nested so that manifests span multiple directories. Indeed, this is how tap accesses tasks and generators within gems; the gem directories are initialized as Envs and nested within the Env for the working directory.
http://tap.rubyforge.org/images/Nested-Env.png
To prevent conflicts between similarly-named resources under two Envs, Env allows selection of Envs, also by minimized paths. Say you installed the 'sample_tasks' gem.
  % tap manifest
  --------------------------------------------------------------------------------
  Desktop: (/Users/username/Desktop)
  --------------------------------------------------------------------------------
  sample_tasks: (/Library/Ruby/Gems/1.8/gems/sample_tasks-0.10.0)
    tasks
      concat (lib/tap/tasks/concat.rb)
      copy (lib/tap/tasks/copy.rb)
      grep (lib/tap/tasks/grep.rb)
      print_tree (lib/tap/tasks/print_tree.rb)
  --------------------------------------------------------------------------------
  tap: (/Library/Ruby/Gems/1.8/gems/tap-0.10.8)
    generators
      command (lib/tap/generator/generators/command/command_generator.rb)
      config (lib/tap/generator/generators/config/config_generator.rb)
      file_task (lib/tap/generator/generators/file_task/file_task_generator.rb)
      generator (lib/tap/generator/generators/generator/generator_generator.rb)
      root (lib/tap/generator/generators/root/root_generator.rb)
      task (lib/tap/generator/generators/task/task_generator.rb)
    commands
      console (cmd/console.rb)
      destroy (cmd/destroy.rb)
      generate (cmd/generate.rb)
      manifest (cmd/manifest.rb)
      run (cmd/run.rb)
      server (cmd/server.rb)
    tasks
      dump (lib/tap/tasks/dump.rb)
      load (lib/tap/tasks/load.rb)
      rake (lib/tap/tasks/rake.rb)
  --------------------------------------------------------------------------------

  Desktop
  |- sample_tasks
  `- tap
In this printout of the manifest, you can see the resources available to tap on the Desktop (none), in the sample_tasks gem, and in tap itself. Since there aren't any conflicts among tasks, the minipath of any of the tasks is sufficient for identification:
% tap run -- print_tree
% tap run -- dump
If there were a conflict, you'd have to specify the environment minipath like:
% tap run -- sample_tasks:print_tree
% tap run -- tap:dump
Note the same rules apply for rap:
% rap print_tree
% rap tap:dump
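The env:minipath selection shown above can be sketched as a lookup over per-env manifests. The method sketch_lookup and the hash layout are assumptions for illustration; the task names are taken from the manifest printout, but Tap::Env resolves minipaths with its own, more involved logic.

```ruby
# Hypothetical lookup over manifests keyed by env name; not Tap::Env's API.
def sketch_lookup(manifests, id)
  env, name = id.include?(':') ? id.split(':', 2) : [nil, id]
  candidates = env ? manifests.select { |key, _| key == env } : manifests
  matches = candidates.select { |_, tasks| tasks.include?(name) }.keys

  case matches.length
  when 0 then raise ArgumentError, "unknown task: #{id}"
  when 1 then "#{matches.first}:#{name}"
  else raise ArgumentError, "ambiguous task: #{id} (#{matches.join(', ')})"
  end
end

manifests = {
  'sample_tasks' => %w[concat copy grep print_tree],
  'tap'          => %w[dump load rake]
}

sketch_lookup(manifests, 'print_tree')   # => "sample_tasks:print_tree"
sketch_lookup(manifests, 'tap:dump')     # => "tap:dump"
```

If two envs both provided a task named dump, the unqualified lookup would raise in this sketch, and the caller would need to qualify the name with the env.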
==== Tap::Exe
http://tap.rubyforge.org/images/Run-Env.png
Tap::Exe is the executable environment for tap (and rap). It adds several configurations (e.g. before/after), which only get loaded for the present directory, and methods for building and executing workflows from command-line inputs. Tap::Exe is a singleton, and is special because it wraps Tap::App.