Sha256: 867e0e8c76973463e20021a13b80f7f02d8bbce5424cfd2f3880a6eff09a2768

Contents?: true

Size: 835 Bytes

Versions: 1

Compression:

Stored size: 835 Bytes

Contents

require 'securerandom'

module Company
  module Mapping

    class TextDocument
      attr_accessor :id, :contents, :tokenizer

      def initialize(id = SecureRandom.uuid, contents = "", tokenizer = BasicTokenizer.new)
        @id, @contents, @tokenizer = id, contents, tokenizer
      end

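      # Computes term frequencies for the document contents using the
      # configured tokenizer. The result is recomputed on every call.
      def bag_of_words
        @tf = TermFrequency.new(@tokenizer)
        @bag_of_words = @tf.calculate(@contents)
      end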

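      # Two documents are equal when they share a class and an id.
      # Object#equal? tests identity and should not be overridden,
      # so value equality is exposed through == and eql? instead.
      def ==(o)
        o.class == self.class && o.state == state
      end
      alias_method :eql?, :==

      # Keep hash consistent with eql? so documents behave as Hash keys.
      def hash
        state.hash
      end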

      def info
        "A simple text document"
      end

      def to_s
        "TextDocument:{#{id},#{contents}}"
      end

      protected
      def state
        [@id]
      end
    end

  end
end
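
Usage sketch

The following is a minimal, self-contained sketch of how the class above might be exercised. The gem's real `BasicTokenizer` and `TermFrequency` are not shown on this page, so the two classes below are hypothetical stand-ins: the `tokenize` method name and the raw-count return value of `calculate` are assumptions, not the gem's confirmed behavior. `TextDocument` is a trimmed copy, defined at top level only to keep the example runnable on its own.

```ruby
require 'securerandom'

# Hypothetical stand-in: lowercases the text and splits it into word tokens.
class BasicTokenizer
  def tokenize(text)
    text.downcase.scan(/\w+/)
  end
end

# Hypothetical stand-in: returns raw token counts (an assumption; the
# real class may normalize or weight the counts differently).
class TermFrequency
  def initialize(tokenizer)
    @tokenizer = tokenizer
  end

  def calculate(text)
    @tokenizer.tokenize(text).each_with_object(Hash.new(0)) do |token, counts|
      counts[token] += 1
    end
  end
end

# Trimmed copy of the class from the listing above.
class TextDocument
  attr_accessor :id, :contents, :tokenizer

  def initialize(id = SecureRandom.uuid, contents = "", tokenizer = BasicTokenizer.new)
    @id, @contents, @tokenizer = id, contents, tokenizer
  end

  def bag_of_words
    TermFrequency.new(@tokenizer).calculate(@contents)
  end

  def ==(o)
    o.class == self.class && o.state == state
  end

  protected

  def state
    [@id]
  end
end

a = TextDocument.new("doc-1", "The cat and the hat")
b = TextDocument.new("doc-1", "entirely different text")
c = TextDocument.new # no id given, so a random UUID is generated

a.bag_of_words # => {"the"=>2, "cat"=>1, "and"=>1, "hat"=>1}
a == b         # => true, because equality compares only the id
a == c         # => false, the ids differ
```

Because `state` contains only `@id`, two documents with the same id compare equal even when their contents differ. Callers that need content-sensitive comparison would have to compare `contents` themselves.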

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
company-mapping-0.1.0 lib/company/mapping/document_utils/text_document.rb