Sha256: 944451be5b2da0f2fb765d27b34fccbae97901645e03b0194efd4aeaefe754f5
Contents?: true
Size: 801 Bytes
Versions: 1
Compression:
Stored size: 801 Bytes
Contents
module Company
  module Mapping
    class TermFrequency
      def initialize(tokenizer)
        @tokenizer = tokenizer
      end

      # Calculates the raw term frequency given the contents of the document.
      def calculate(text)
        return rawFrequency(text)
      end

      def info
        return "Raw term frequency (number of times a token appears in a given string - document)"
      end

      protected

      def rawFrequency(contents)
        _tokens = @tokenizer.tokenize(contents)
        _tf = Hash.new
        _tokens.each { |_token|
          if (!_tf.has_key?(_token))
            _tf[_token] = 1
          else
            _tf[_token] = _tf[_token] + 1
          end
        }
        return _tf
      end
    end
  end
end
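
The counting done by `rawFrequency` can be illustrated with a small standalone sketch. `WhitespaceTokenizer` and `raw_term_frequency` are hypothetical names introduced here for illustration only; the gem's own tokenizer classes are not shown in this file, and the sketch assumes only that a tokenizer responds to `tokenize` and returns an array of tokens.

```ruby
# Hypothetical stand-in tokenizer (an assumption, not part of the gem).
class WhitespaceTokenizer
  # Lowercases the text and splits it on runs of whitespace.
  def tokenize(text)
    text.downcase.split
  end
end

# Counts how often each token occurs, mirroring the logic of rawFrequency.
def raw_term_frequency(tokenizer, text)
  counts = Hash.new(0) # missing keys default to 0, so no has_key? check is needed
  tokenizer.tokenize(text).each { |token| counts[token] += 1 }
  counts
end

tf = raw_term_frequency(WhitespaceTokenizer.new, "Acme Corp acme holdings")
puts tf.inspect
```

With this stand-in tokenizer, `tf["acme"]` is 2, while `tf["corp"]` and `tf["holdings"]` are each 1. Using `Hash.new(0)` is one idiomatic way to express the same count-or-initialize branch that the original writes out explicitly.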
Version data entries
1 entry across 1 version & 1 rubygem
Version | Path |
---|---|
company-mapping-0.1.0 | lib/company/mapping/tfidf/tf/term_frequency.rb |