Sha256: b96a230be9916e9de3abdd4e7de7be9555ceca9145a0de9b42efde01d7f983fa

Contents?: true

Size: 961 Bytes

Versions: 3

Compression:

Stored size: 961 Bytes

Contents

#!/usr/bin/env ruby
require 'rubygems'
require 'wukong/script'
require 'wukong/streamer/list_reducer'

module PageRank
  class Script < Wukong::Script
    #
    # Input lines are tab-separated:
    #
    #   rsrc    src_id  dest_id  [... junk ...]
    #
    # The map step keeps only fields 2 and 3: the src and dest IDs.
    #
    def map_command
      %Q{/usr/bin/cut -d"\t" -f2,3}
    end
  end

  #
  # Accumulate each src's dests list in memory and emit it as a single record.
  # Multiple edges between any two nodes are permitted; the dest is listed once
  # per edge, so it accumulates pagerank according to the edge's multiplicity.
  #
  class Reducer < Wukong::Streamer::ListReducer
    def accumulate src, dest
      @values << dest
    end

    # Emit src, initial pagerank (1.0), and the comma-joined dests list.
    def finalize
      # Fall back to a placeholder so the dests field is never empty.
      @values = ['dummy'] if @values.blank?
      yield [key, 1.0, @values.to_a.join(",")]
    end
  end

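  # Execute the script. No mapper class is passed (nil) because the map step
  # is the shell command returned by Script#map_command.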
  Script.new(nil, PageRank::Reducer, :io_sort_record_percent => 0.25).run
end
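
To make the data flow concrete, here is a minimal standalone sketch of the same initialization step in plain Ruby, with no Wukong or Hadoop involved. It is not the gem's implementation: `initialize_pagerank` and the sample lines are hypothetical names and data chosen for illustration, and the sketch assumes tab-separated input as in the comment above. It keeps fields 2 and 3 of each line, groups dests by src (duplicates preserved), and emits src, an initial pagerank of 1.0, and the comma-joined dests list.

```ruby
# Standalone sketch of the initialization step (assumption: tab-separated
# input). `initialize_pagerank` is a hypothetical helper, not part of Wukong.
def initialize_pagerank(lines)
  # Map step: keep only fields 2 and 3 (src_id, dest_id), like `cut -f2,3`.
  edges = lines.map { |line| line.chomp.split("\t")[1, 2] }

  # Reduce step: group dests by src, keeping duplicates so that repeated
  # edges count with their multiplicity.
  adjacency = Hash.new { |hash, src| hash[src] = [] }
  edges.each do |src, dest|
    dests = adjacency[src]
    dests << dest unless dest.nil? || dest.empty?
  end

  # Emit src, initial pagerank of 1.0, and the comma-joined dests list.
  # An empty dests list falls back to the 'dummy' placeholder.
  adjacency.sort_by { |src, _| src }.map do |src, dests|
    dests = ['dummy'] if dests.empty?
    [src, 1.0, dests.join(",")].join("\t")
  end
end

# Hypothetical sample input: node 1 links to node 2 twice and to node 3 once.
sample = [
  "a_follows_b\t1\t2\tjunk",
  "a_follows_b\t1\t3\tjunk",
  "a_follows_b\t1\t2\tjunk",
  "a_follows_b\t2\t3\tjunk",
]
puts initialize_pagerank(sample)
```

Because the edge from 1 to 2 appears twice in the sample, the dests list for node 1 contains 2 twice, which is how edge multiplicity carries into later pagerank iterations.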



Version data entries

3 entries across 3 versions and 1 rubygem

Version Path
wukong-3.0.0.pre old/examples/pagerank/pagerank_initialize.rb
wukong-2.0.2 examples/pagerank/pagerank_initialize.rb
wukong-2.0.1 examples/pagerank/pagerank_initialize.rb