Sha256: 512f135236e535a14e432f0a8f3451e858cf684d7cf8c32bc5c77dd084cd4643

Contents?: true

Size: 757 Bytes

Versions: 1

Compression:

Stored size: 757 Bytes

Contents

require "csv"

module Company
  module Mapping
    class CompanyCorpus < Corpus
      def initialize(path=nil)
        super()
        import_csv path if path
      end

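      # Build the corpus from a CSV file. Each line is expected to look like
      # "id;name;alias1;alias2;...". The name is stored under the id itself and
      # each alias under "<id>_<index>". Only the first comma-separated field of
      # a row is read, so the fields must be separated by semicolons.
      def import_csv(path)
        CSV.foreach(path) do |row|
          array = row.first.split(";")

          push doc(array[1], array.first)
          array[2..-1].each_with_index do |company_alias, i|
            push doc(company_alias, "#{array.first}_#{i}")
          end
        end
        @corpus
      end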

      private

      # Build a TextDocument with the given id, stripping commas and periods
      # from its contents.
      def doc(content, id)
        alias_doc = TextDocument.new
        alias_doc.contents = content.delete(",.")
        alias_doc.id = id
        alias_doc
      end
    end
  end
end
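
The row handling in import_csv can be tried in isolation with the minimal sketch below. It is not part of the gem: the `TextDocument` struct is a hypothetical stand-in for the gem's class, `docs_for_row` and `sample_docs` are illustrative helpers, and the sample data is made up. It assumes input lines of the form `id;name;alias1;alias2;...`.

```ruby
require "csv"
require "tempfile"

# Hypothetical stand-in for the gem's TextDocument (assumption, not the real class).
TextDocument = Struct.new(:id, :contents)

# Illustrative helper mirroring the per-row logic of import_csv:
# one document for the company name, plus one per alias.
def docs_for_row(row)
  fields = row.first.split(";")
  company_id = fields.first
  clean = ->(text) { text.delete(",.") } # strip commas and periods

  docs = [TextDocument.new(company_id, clean.call(fields[1]))]
  fields[2..-1].each_with_index do |company_alias, i|
    docs << TextDocument.new("#{company_id}_#{i}", clean.call(company_alias))
  end
  docs
end

# Write made-up sample data to a temporary file and read it back with CSV.foreach.
def sample_docs
  Tempfile.create(["companies", ".csv"]) do |file|
    file.write("1;Acme Corp.;Acme;ACME Inc.\n2;Globex\n")
    file.flush

    docs = []
    CSV.foreach(file.path) { |row| docs.concat(docs_for_row(row)) }
    docs
  end
end

sample_docs.each { |d| puts "#{d.id}: #{d.contents}" }
# 1: Acme Corp
# 1_0: Acme
# 1_1: ACME Inc
# 2: Globex
```

Alias ids are numbered from zero and prefixed with the company id, so every alias document can be traced back to its company.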

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
company-mapping-0.2.0 lib/company/mapping/document_utils/company_corpus.rb