Sha256: ae008ab638aabf7ab141ce6f60c8904404c142f948d5a01238d0afd3d70cdfcb

Contents?: true

Size: 611 Bytes

Versions: 1

Compression:

Stored size: 611 Bytes

Contents

require_relative 'simple'
require 'lingua/stemmer'
module Preprocessor
  #
  # Preprocessor that stems the cleaned description words
  # using Lingua::Stemmer
  #
  # @author Andreas Eger
  #
  class Stemming < Simple

    def initialize(args={})
      super
      @stemmer = Lingua::Stemmer.new(language: @language)
    end
    def label
      "with_stemming"
    end

    # Stems every element returned by the parent's clean_description.
    def clean_description(desc)
      super.map { |w| @stemmer.stem(w) }
    end

    private

    def process_job(job)
      PreprocessedData.new(
        data: [clean_title(job[:title]), clean_description(job[:description])],
        id: job[:id],
        label: job[:label]
      )
    end
  end
end
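
The override pattern used by `clean_description` (let the parent class clean the text, then stem each resulting word) can be sketched without the gem's dependencies. Everything below is an assumption made for illustration: `ToyStemmer` is a hypothetical stand-in for `Lingua::Stemmer`, and `SimpleSketch` is a hypothetical stand-in for `Preprocessor::Simple`, whose real implementation is not shown in this listing.

```ruby
# Hypothetical stand-in for Lingua::Stemmer: strips a few common English suffixes.
class ToyStemmer
  SUFFIXES = %w[ing ed s].freeze

  def stem(word)
    # Only strip a suffix when at least three characters of stem remain.
    suffix = SUFFIXES.find { |s| word.end_with?(s) && word.length > s.length + 2 }
    suffix ? word[0...-suffix.length] : word
  end
end

# Hypothetical stand-in for Preprocessor::Simple (the real class is not shown).
class SimpleSketch
  def initialize(args = {})
    @language = args.fetch(:language, 'en')
  end

  def label
    'simple'
  end

  # Assumed behaviour: lowercase the text and split it into word tokens.
  def clean_description(desc)
    desc.downcase.scan(/[a-z]+/)
  end
end

# Mirrors the structure of Stemming: reuse the parent's tokens, then stem each one.
class StemmingSketch < SimpleSketch
  def initialize(args = {})
    super # bare super forwards args to the parent initializer
    @stemmer = ToyStemmer.new
  end

  def label
    'with_stemming'
  end

  def clean_description(desc)
    super.map { |w| @stemmer.stem(w) }
  end
end

tokens = StemmingSketch.new(language: 'en').clean_description('Testing deployed servers')
```

A bare `super` forwards the method's own arguments, which is why neither `initialize` nor `clean_description` has to repeat them. The real class relies on the same behaviour, so `@language` must already be set by the parent initializer before the stemmer is built.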

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
svm_helper-0.2.1 lib/svm_helper/preprocessors/stemming.rb