Sha256: f3a1b3922d74d45d5be14cfe5cce00241ec2c5a388677cddf579c491545aa033

Contents?: true

Size: 623 Bytes

Versions: 3

Compression:

Stored size: 623 Bytes

Contents

#! /usr/bin/env jruby
$: << File.join(File.dirname(__FILE__), '..', 'lib')

require 'cascading'
require 'samples/cascading'

cascade 'logwordcount' do
  flow 'logwordcount' do
    source 'input', tap('http://www.gutenberg.org/files/20417/20417-8.txt')

    assembly 'input' do
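      # TODO: create a helper for RegexSplitGenerator
      # Split each line into words, emitting one 'word' tuple per token.
      each 'line', :function => regex_split_generator('word', :pattern => /[.,]*\s+/)
      # Count the occurrences of each distinct word.
      group_by 'word' do
        count
      end
      # Group on 'count' in reverse order so the most frequent words come first.
      group_by 'count', :reverse => true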
    end

    sink 'input', tap('output/logwordcount', :sink_mode => :replace)
  end
end.complete(sample_properties)
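
For readers without a JRuby/Cascading setup, the computation the flow above describes (split lines into words, count each word, order by count descending) can be sketched in plain Ruby. This is only an illustration of the logic, not how cascading.jruby executes it: `word_counts` is a hypothetical helper name, skipping empty tokens and the alphabetical tie-break are assumptions of this sketch, and the input is a small in-memory array rather than the Gutenberg text.

```ruby
# Plain-Ruby sketch of the word-count logic (no Cascading or Hadoop involved).
# `word_counts` is a hypothetical name, not part of cascading.jruby.
def word_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    # Same pattern as the sample: optional '.' or ',' followed by whitespace.
    line.split(/[.,]*\s+/).each do |word|
      next if word.empty? # assumption: ignore empty tokens from leading whitespace
      counts[word] += 1
    end
  end
  # Highest count first; ties broken alphabetically (an assumption of this sketch).
  counts.sort_by { |word, count| [-count, word] }
end

sample = ["the cat sat, the dog sat", "the end"]
word_counts(sample).each { |word, count| puts "#{word}\t#{count}" }
```

Note that the pattern only strips '.' and ',' when they are followed by whitespace, so a token at the very end of a line keeps its trailing punctuation.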

Version data entries

3 entries across 3 versions & 1 rubygem

Version Path
cascading.jruby-0.0.6 samples/logwordcount.rb
cascading.jruby-0.0.5 samples/logwordcount.rb
cascading.jruby-0.0.4 samples/logwordcount.rb