Sha256: 465ba3f26e1c51956ae041392cb64e3a66f62d9e0f41b9a1a1f81cbbd9e02ee5

Contents?: true

Size: 1.2 KB

Versions: 2

Compression:

Stored size: 1.2 KB

Contents

require 'json'
require 'net/http'

module Sufia
  module GenericFile
    module FullTextIndexing
      extend ActiveSupport::Concern

      included do
        has_file_datastream 'full_text', versionable: false
      end

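      # Posts the file's content to Solr's extract handler (extractOnly=true)
      # and stores the returned plain text in the 'full_text' datastream.
      # Failures are logged rather than raised.
      def extract_content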
        # Lookup order: :url, "url", then the :fulltext section, else :default.
        config = Blacklight.solr_config
        url = config[:url] || config["url"] ||
              (config[:fulltext] ? config[:fulltext]["url"] : config[:default]["url"])
        uri = URI("#{url}/update/extract?extractOnly=true&wt=json&extractFormat=text")
        http = Net::HTTP.new(uri.host, uri.port)
        # Net::HTTP#post expects a request path, so send path + query rather than the full URL.
        resp = http.post(uri.request_uri, self.content.content, {
            'Content-type' => "#{self.mime_type};charset=utf-8",
            'Content-Length' => self.content.content.size.to_s
          })
        raise "URL '#{uri}' returned code #{resp.code}" unless resp.code == "200"
        self.content.content.rewind if self.content.content.respond_to?(:rewind)
        # No resource name is sent, so the extracted text is read from the empty-string key.
        extracted_text = JSON.parse(resp.body)[''].rstrip
        full_text.content = extracted_text if extracted_text.present?
      rescue => e
        logger.error("Error extracting content from #{self.pid}: #{e.inspect}")
      end
    end
  end
end
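
The URL lookup and request target used by extract_content can be tried in isolation. The following is a minimal sketch, assuming a plain Hash in place of Blacklight.solr_config; the helper names solr_url_from and extract_uri and the localhost URLs are hypothetical and not part of Sufia.

```ruby
require 'uri'

# Hypothetical helper (not part of Sufia): applies the same lookup order as
# extract_content, but on a plain Hash instead of Blacklight.solr_config.
def solr_url_from(config)
  return config[:url] if config[:url]
  return config["url"] if config["url"]
  # Fall back to the :fulltext section when present, otherwise :default.
  section = config[:fulltext] || config[:default]
  section["url"]
end

# Hypothetical helper: builds the extract-only URI with the same query
# parameters that extract_content sends.
def extract_uri(base_url)
  URI("#{base_url}/update/extract?extractOnly=true&wt=json&extractFormat=text")
end

# Example configuration; the URLs are placeholders.
example_config = {
  fulltext: { "url" => "http://localhost:8983/solr/fulltext" },
  default:  { "url" => "http://localhost:8983/solr/default" }
}

example_uri = extract_uri(solr_url_from(example_config))
puts example_uri.host
puts example_uri.request_uri
```

Splitting the lookup from the HTTP call keeps the fallback order checkable without a running Solr instance.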

Version data entries

2 entries across 2 versions & 2 rubygems

Version Path
sufia-4.0.0.rc2 sufia-models/app/models/concerns/sufia/generic_file/full_text_indexing.rb
sufia-models-4.0.0.rc2 app/models/concerns/sufia/generic_file/full_text_indexing.rb