Sha256: 53e947189fe738a8e259f96311a12627911c51f51ea9cfde1d00d9fec8b48402

Contents?: true

Size: 791 Bytes

Versions: 4

Compression:

Stored size: 791 Bytes

Contents

require 'securerandom' # needed for the SecureRandom.uuid default id

module Company
  module Mapping
    # A simple text document
    class TextDocument
      attr_accessor :id, :contents, :tokenizer

      def initialize(id = SecureRandom.uuid, contents = "", tokenizer = BasicTokenizer.new)
        @id, @contents, @tokenizer = id, contents, tokenizer
      end

      # Returns the term frequencies of the document contents,
      # computed with the document's tokenizer.
      def bag_of_words
        @tf = TermFrequency.new(@tokenizer)
        @bag_of_words = @tf.calculate(@contents)
      end

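      # Compares by class and id only (see #state). Ruby convention
      # reserves equal? for object identity; this override relaxes that.
      def equal?(o)
        o.class == self.class && o.state == self.state
      end

      # Two documents are == when they share a class and an id;
      # contents and tokenizer are not compared.
      def ==(o)
        o.class == self.class && o.state == self.state
      end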

      def to_s
        "TextDocument:{#{id},#{contents}}"
      end

      protected

      # Fields that define identity for the equality checks: only the id.
      def state
        [@id]
      end
    end
  end
end
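
The listing refers to BasicTokenizer and TermFrequency, which are defined elsewhere in the gem and are not shown here. The following is a minimal usage sketch, assuming hypothetical stand-ins for both: a tokenizer that lowercases and splits on non-word characters, and a term-frequency class that returns raw counts. The real classes may behave differently. It illustrates that equality depends only on the id, not on the contents.

```ruby
require 'securerandom'

module Company
  module Mapping
    # Hypothetical stand-in (not the gem's class): lowercases and splits on non-word characters.
    class BasicTokenizer
      def tokenize(text)
        text.downcase.scan(/\w+/)
      end
    end

    # Hypothetical stand-in (not the gem's class): raw occurrence count per token.
    class TermFrequency
      def initialize(tokenizer)
        @tokenizer = tokenizer
      end

      def calculate(text)
        @tokenizer.tokenize(text).each_with_object(Hash.new(0)) { |t, h| h[t] += 1 }
      end
    end

    # Condensed copy of the class from the listing, kept to the parts used below.
    class TextDocument
      attr_accessor :id, :contents, :tokenizer

      def initialize(id = SecureRandom.uuid, contents = "", tokenizer = BasicTokenizer.new)
        @id, @contents, @tokenizer = id, contents, tokenizer
      end

      def bag_of_words
        TermFrequency.new(@tokenizer).calculate(@contents)
      end

      def ==(o)
        o.class == self.class && o.state == self.state
      end

      protected

      def state
        [@id]
      end
    end
  end
end

a = Company::Mapping::TextDocument.new("doc-1", "the cat and the hat")
b = Company::Mapping::TextDocument.new("doc-1", "different text")
c = Company::Mapping::TextDocument.new("doc-2", "the cat and the hat")

puts a.bag_of_words["the"] # 2 with the stand-in tokenizer
puts a == b                # true: same id, different contents
puts a == c                # false: same contents, different id
```

Because state returns only the id, two documents created without an explicit id each receive a fresh UUID and never compare equal, even when their contents match.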

Version data entries

4 entries across 4 versions & 1 rubygem

Version Path
company-mapping-0.2.3 lib/company/mapping/document_utils/text_document.rb
company-mapping-0.2.2 lib/company/mapping/document_utils/text_document.rb
company-mapping-0.2.1 lib/company/mapping/document_utils/text_document.rb
company-mapping-0.2.0 lib/company/mapping/document_utils/text_document.rb