Sha256: 183838413c41526b241ab3ca9a4c378116cccfa9f44b310e2ece490c48d2fd1e

Contents?: true

Size: 1.05 KB

Versions: 15

Stored size: 1.05 KB

Contents

require 'traject/nokogiri_reader'
require 'traject/macros/nokogiri_macros'
require 'traject/oai_pmh_nokogiri_reader'

module Traject
  class Indexer
    # An indexer sub-class for XML, where the source records in the pipeline are
    # Nokogiri::XML::Document objects. It sets a default reader of NokogiriReader, and
    # includes Traject::Macros::NokogiriMacros (with `extract_xpath`).
    #
    # See docs on XML use. (TODO)
    class NokogiriIndexer < ::Traject::Indexer
      include Traject::Macros::NokogiriMacros

      def self.default_settings
        @default_settings ||= super.merge("reader_class_name" => "Traject::NokogiriReader")
      end

      # Overridden from base Indexer, try an `id` attribute or element on record.
      def source_record_id_proc
        @source_record_id_proc ||= lambda do |source_xml_record|
          if ( source_xml_record &&
               source_xml_record.kind_of?(Nokogiri::XML::Node) )
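            # Parenthesize the assignment: `&&` binds tighter than `=`, so without
            # the inner parentheses `el` is still nil when `el.text` is evaluated.
            source_xml_record['id'] || ((el = source_xml_record.at_xpath('./id')) && el.text)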
          end
        end
      end
    end
  end
end
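
Note on the `id` lookup in `source_record_id_proc`: it assigns a variable inside a boolean expression, and in Ruby `&&` binds more tightly than `=`, so the placement of parentheses decides whether `el` is set before `el.text` runs. The following is a minimal sketch of that behavior only, not traject code. `FakeElement`, `find_id`, `id_with_parens`, and `id_without_parens` are hypothetical names standing in for a Nokogiri element and an `at_xpath` call.

```ruby
# Hypothetical stand-in for a Nokogiri element; only #text is modelled.
FakeElement = Struct.new(:text)

# Hypothetical lookup that mimics at_xpath: returns an element or nil.
def find_id(present)
  present ? FakeElement.new("rec-1") : nil
end

# Parenthesized assignment: el is bound first, then el.text is evaluated.
def id_with_parens(present)
  (el = find_id(present)) && el.text
end

# Without the inner parentheses Ruby parses the body as
#   el = (find_id(present) && el.text)
# so el is still nil when el.text runs, and a NoMethodError is raised
# whenever find_id returns an element.
def id_without_parens(present)
  el = find_id(present) && el.text
end
```

When the lookup finds nothing, both forms short-circuit and return nil, so the difference only shows up for records that do contain an `id` element.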

Version data entries

15 entries across 15 versions & 1 rubygems

Version Path
traject-3.8.3 lib/traject/indexer/nokogiri_indexer.rb
traject-3.8.2 lib/traject/indexer/nokogiri_indexer.rb
traject-3.8.1 lib/traject/indexer/nokogiri_indexer.rb
traject-3.8.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.7.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.6.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.5.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.4.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.3.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.2.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.1.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.1.0.rc1 lib/traject/indexer/nokogiri_indexer.rb
traject-3.0.0 lib/traject/indexer/nokogiri_indexer.rb
traject-3.0.0.alpha.2 lib/traject/indexer/nokogiri_indexer.rb
traject-3.0.0.alpha.1 lib/traject/indexer/nokogiri_indexer.rb