Sha256: 752652294fcc37864a35d0c9f01fe6c90fdb6c9bdb5eb5762bf0353d2890e600
Contents?: true
Size: 1.5 KB
Versions: 1
Compression:
Stored size: 1.5 KB
Contents
Kinds of Oozie Jobs: Workers versus Drivers --------------------------------------------------------------------------------- The are two distinct and toplevel concerns when dealing with oozie workflows: 1. Defining the data flows and how each transforms and produces data 2. Giving these independent data flows a context and schedule to run on A "worker" is an Oozie workflow that does the former. It isn't concerned about what context it runs within, how many times it has retried, whether SLA alerts are being reported on its results, when it runs etc. It does one straight-forward job: translate input data to output data. A "driver" is an Oozie workflow and coordinator pair, that runs one or more workers as sub-workflows, passing them the aforementioned context as parameters. Drivers concern themselves with everthing workers do not, like what order multiple workers should run in, reporting SLA violations when its child workers fail, deciding between an hourly or daily run schedule etc. Hodor expects your Hadoop project get repo to be structured as follows: <git_repo> - workers/ - w1/ workflow.xml hive_scripts etc - w2 workflow.xml hive_scripts etc. - drivers/ - d1/ workflow.xml - d2/ workflow.xml The workers directory contains all workers, organized as you see fit. While the drivers directory contains all drivers, also organized according to preference.
Version data entries
1 entries across 1 versions & 1 rubygems
Version | Path |
---|---|
hodor-1.0.2 | topics/oozie/workers_and_drivers.txt |