Sha256: d228fc010a2b8399c1d627d640be4254a256887c2bbca3550ac67626c59ac5f3
Contents?: true
Size: 598 Bytes
Versions: 4
Compression:
Stored size: 598 Bytes
Contents
module Company
  module Mapping
    # Raw term frequency (number of times a token appears in a given string - document)
    class TermFrequency
      def initialize(tokenizer)
        @tokenizer = tokenizer
      end

      # Calculates the raw term frequency given the contents of the document.
      def calculate(text)
        rawFrequency(text)
      end

      protected

      def rawFrequency(contents)
        @tokenizer.tokenize(contents).each_with_object({}) do |token, tf|
          tf[token] ||= 0
          tf[token] += 1
        end
      end
    end
  end
end
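Usage sketch: the listing does not show which tokenizer is passed to TermFrequency, so the example below assumes a hypothetical WhitespaceTokenizer that lowercases the text and splits it on whitespace. Any object that responds to tokenize(text) and returns an enumerable of tokens should work the same way. The class is repeated without its comments so the sketch runs on its own.

```ruby
# Class reproduced from the listing so this sketch is self-contained.
module Company
  module Mapping
    class TermFrequency
      def initialize(tokenizer)
        @tokenizer = tokenizer
      end

      def calculate(text)
        rawFrequency(text)
      end

      protected

      def rawFrequency(contents)
        @tokenizer.tokenize(contents).each_with_object({}) do |token, tf|
          tf[token] ||= 0
          tf[token] += 1
        end
      end
    end
  end
end

# Hypothetical tokenizer, not part of the gem listing (an assumption for
# illustration): lowercases the text and splits it on runs of whitespace.
class WhitespaceTokenizer
  def tokenize(text)
    text.downcase.split
  end
end

tf = Company::Mapping::TermFrequency.new(WhitespaceTokenizer.new)
counts = tf.calculate("The cat saw the other cat")

# counts maps each token to its raw count:
# "the" and "cat" appear twice, "saw" and "other" once.
puts counts.inspect
```

Because the tokenizer is injected through the constructor, how tokens are normalized (case folding, punctuation, stemming) is decided entirely by the tokenizer; TermFrequency only counts whatever tokens it receives.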
Version data entries
4 entries across 4 versions & 1 rubygems