= GEARMAN "Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates." - http://www.gearman.org/ == Setting up a basic environment A very basic Gearman environment will look like this: ---------- | Client | ---------- | -------------- | Job Server | -------------- | ---------------------------------------------- | | | | ---------- ---------- ---------- ---------- | Worker | | Worker | | Worker | | Worker | ---------- ---------- ---------- ---------- And the behavior will be the following: * JobServer: Acts as a message passing point. * Client: Sends tasks to the JobServer. Will be connected to only one JobServer in case more than one exits for failover purposes. * Worker: Anounce his 'abilities' to the JobServer and waits for tasks. For the JobServer we recommend to use the offical Perl version, there's also a more performant C implementation of the server with support for persistent queues, bells and whistles but is not stable enough for production use at the time of this document was wrote. The Client and the Worker can be implemented in any language. This way you can send tasks from a Ruby client server, to a Perl or C worker in order to get better performance. == Installing the required software For the JobServer we recommend to use the offical Perl version, to install it: * Mac OS X: sudo port install p5-gearman-server * Debian/Ubuntu: sudo apt-get install gearman-server To get the Ruby libraries by Xing: git clone git://github.com/xing/gearman-ruby.git == Gearman demo Now you're ready for you first experience with Gearman. In the cloned repository you'll find an 'examples' directory. Run the 'gearman_environment.sh' to build an environment like the one showed in the diagram above. * Client: Will ask you for an arithmetic operation, like: 2+3 The code of the client is in: 'examples/calculus_client.rb' * JobServer: The Perl server. * Workers: You'll have 4 worker, one for each of the basic arithmetic operations. The code of the worker is in: 'examples/calculus_worker.rb' There are other demos in the examples folder you can give a look at. Each demo usually consist in the client and server scripts. === Creating clients and tasks In order to get a job scheduled by a Gearman server using the gearman ruby library, there are three main objects you must interact with: Gearman::Client, Gearman::Task and Gearman::TaskSet. Let's review all of them briefly: - Gearman::Client -> the portion of the library storing the data about the connection to the Gearman server. - Gearman::Task -> a job execution request that will be dispatched by the Gearman server to worker and whose result data will be returned to the client. - Gearman::TaskSet -> a collection of tasks to be executed. The Taskset object will track the execution of the tasks with the info returned from the Gearman server and notify the client with the results or errors when all the tasks have completed their execution. To send a new task to the Gearman server, the client must build a new Gearman::Task object, add it to a Gearman::TaskSet that must hold a reference to a Gearman::Client and send the wait message to the TaskSet. The following code taken from examples/client.rb shows the process: ---------------------------------------------------- servers = ['localhost:4730', 'localhost:4731'] client = Gearman::Client.new(servers) taskset = Gearman::TaskSet.new(client) task = Gearman::Task.new('sleep', 20) task.on_complete {|d| puts d } taskset.add_task(task) taskset.wait(100) ---------------------------------------------------- The name of the function to be executed is the first parameter to the constructor of the Task object. Take into account that the string you pass as a parameter will be used 'as it' by the Gearman server to locate a suitable worker for that function. The second parameter is the argument to be sent to the worker that will execute, if the arguments for the task are complex, a serialization format like YAML or XML must be agreeded with the workers. The last and optional parameter is a hash of options. The following options are currently available: - :priority -> (:high | :low) the priority of the job, a high priority job is executed before a low on - :background -> (true | false) a background task will return no further information to the client. The execution of a task in a Gearman remote worker can fail, the worker can throw an exception, etc. All these events can be handled by the client of the ruby library registering callback blocks. The following events are currently available: - on_complete -> the task was executed succesfully - on_fail -> the task fail for some unknown reason - on_retry -> after failure, the task is gonna be retried, the number of retries is passed - on_exception -> the remote worker send an exception notification, the exception text is passed - on_stauts -> a status update is sent by the remote worker In order to receive exception notifications in the client, this option must be sent in the server, the method option_request can be used this task. The following example, extracted from the examples/client_exception.rb demo script shows the process: ---------------------------------------------------- client = Gearman::Client.new(servers) #try this out client.option_request("exceptions") ---------------------------------------------------- This feature will only works if the server and workers have implemented support for the OPT_REQ and WORK_EXCEPTION messages of the Gearman protocol. Enjoy.