README.rdoc in jruby-on-hadoop-0.0.3 vs README.rdoc in jruby-on-hadoop-0.0.4

- old
+ new

@@ -1,36 +1,48 @@
 = JRuby on Hadoop
 
 JRuby on Hadoop is a thin wrapper for Hadoop Mapper / Reducer by JRuby.
+We recommend to use this with hadoop-rubydsl on the github / gemcutter.
 
+== Description
+
 == Install
 
 Required gems are all on GemCutter.
 
 1. Upgrade your rubygem to 1.3.5
 2. Install gems
   $ gem install jruby-on-hadoop
 
-== Description
+== Usage
 
 1. Run Hadoop cluster on your machines and set HADOOP_HOME env variable.
 2. put files into your hdfs. ex) test/inputs/file1
 3. Now you can run 'joh' like below:
   $ joh examples/wordcount.rb test/inputs test/outputs
 You can get Hadoop job results in your hdfs test/outputs/part-*
 
-Script example. (see also examples/wordcount.rb)
+== Example
+see also examples/wordcount.rb
 
  def setup(conf)
    # setup jobconf
  end
 
- def map(script, key, value, output, reporter)
+ def map(key, value, output, reporter)
    # mapper process
+   # (wordcount example)
+   value.split.each do |word|
+     output.collect(word, 1)
+   end
  end
 
- def reduce(script, key, values, output, reporter)
+ def reduce(key, values, output, reporter)
    # reducer process
+   # (wordcount example)
+   sum = 0
+   values.each {|v| sum += v }
+   output.collect(key, sum)
  end
 
 == Build
 You can build hadoop-ruby.jar by "ant".
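The 0.0.4 signatures shown in the diff drop the leading script argument, so the wordcount logic can be tried in plain Ruby without a cluster. The following is only a rough local sketch under stated assumptions: FakeOutput and run_wordcount are hypothetical helpers that are not part of jruby-on-hadoop, the reporter is passed as nil, and a real job would receive whatever key and value objects the wrapper hands in, which may differ from plain Ruby strings and integers.

```ruby
# Hypothetical stand-in for the output collector; NOT part of jruby-on-hadoop.
# It only records the pairs passed to collect(key, value).
class FakeOutput
  attr_reader :pairs

  def initialize
    @pairs = []
  end

  def collect(key, value)
    @pairs << [key, value]
  end
end

# 0.0.4-style signatures from the README example (no `script` argument).
def map(key, value, output, reporter)
  value.split.each do |word|
    output.collect(word, 1)
  end
end

def reduce(key, values, output, reporter)
  sum = 0
  values.each { |v| sum += v }
  output.collect(key, sum)
end

# Hypothetical local dry run: map one line, group the emitted pairs by word,
# then reduce each group. This imitates the shuffle step in a very simplified way.
def run_wordcount(line)
  mapped = FakeOutput.new
  map(0, line, mapped, nil)

  reduced = FakeOutput.new
  mapped.pairs.group_by(&:first).each do |word, pairs|
    reduce(word, pairs.map(&:last), reduced, nil)
  end
  reduced.pairs.to_h
end

counts = run_wordcount("to be or not to be")
puts counts.inspect
```

Running the file prints a hash of word counts for the sample line. A script written against 0.0.3 would still declare the extra first parameter, so its map and reduce definitions would need that argument removed to match the signatures above.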