Sha256: ee2d59cd642f72150d28a92506e30159d69ad9b6ca6f50fa7a97190bcf603d98

Contents?: true

Size: 901 Bytes

Versions: 7

Compression:

Stored size: 901 Bytes

Contents

require "xmlsimple"

module Solrizer::XML::Extractor

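  #
  # Extracts solr fields from simple XML. Each top-level element becomes
  # a "<name>_t" entry in solr_doc.
  # If you want to do anything more nuanced with the XML, use TerminologyBasedSolrizer instead.
  #
  # @param [String] text XML content to index
  # @param [Hash] solr_doc hash that the extracted fields are merged into
  # @return [Hash] solr_doc with the extracted fields added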
  def xml_to_solr( text, solr_doc=Hash.new )
    doc = XmlSimple.xml_in( text )
    
    doc.each_pair do |name, value|
      if value.kind_of?(Array) 
        if value.first.kind_of?(Hash)
          # This deals with the way xml-simple handles nodes with attributes
          solr_doc.merge!({:"#{name}_t" => "#{value.first["content"]}"})
        elsif value.length > 1
          solr_doc.merge!({:"#{name}_t" => value})
        else
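          # Single-element array: take the element itself. Interpolating the
          # array yields its inspect form (e.g. '["Foo"]') on Ruby 1.9+.
          solr_doc.merge!({:"#{name}_t" => value.first.to_s})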
        end
      else
        solr_doc.merge!({:"#{name}_t" => "#{value}"})
      end
    end

    return solr_doc
  end
  
end
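
Usage sketch

The branching in xml_to_solr is easier to follow against concrete input. The sketch below is not part of the gem: fields_from_parsed is a hypothetical helper that repeats the same four cases on an already-parsed hash, so it runs without xml-simple installed. The sample hash is an assumption about what XmlSimple.xml_in returns with its default options (root element dropped, child elements forced into arrays, element text stored under "content" when attributes are present). For a single-element array it takes the element itself instead of interpolating the array.

```ruby
# Hypothetical helper (not in solrizer): mirrors the cases handled by
# xml_to_solr, but takes the parsed hash directly.
def fields_from_parsed(parsed, solr_doc = {})
  parsed.each_pair do |name, value|
    solr_doc[:"#{name}_t"] =
      if value.is_a?(Array) && value.first.is_a?(Hash)
        # Element with attributes: the text lives under "content"
        value.first["content"].to_s
      elsif value.is_a?(Array) && value.length > 1
        # Repeated element: keep every value
        value
      elsif value.is_a?(Array)
        # Single element: unwrap it
        value.first.to_s
      else
        # Plain value, e.g. an attribute on the root element
        value.to_s
      end
  end
  solr_doc
end

# Assumed shape of the parsed form of:
#   <doc id="42"><title>Foo</title><author>A</author><author>B</author>
#   <note lang="en">hi</note></doc>
parsed = {
  "id"     => "42",
  "title"  => ["Foo"],
  "author" => ["A", "B"],
  "note"   => [{ "lang" => "en", "content" => "hi" }]
}

result = fields_from_parsed(parsed)
```

Every key gets the "_t" suffix regardless of content, so repeated elements end up as an array under a single field while everything else is stored as a string.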

Version data entries

7 entries across 7 versions & 1 rubygem

Version Path
solrizer-1.1.2 lib/solrizer/xml/extractor.rb
solrizer-1.1.1 lib/solrizer/xml/extractor.rb
solrizer-1.1.0 lib/solrizer/xml/extractor.rb
solrizer-1.0.4 lib/solrizer/xml/extractor.rb
solrizer-1.0.3 lib/solrizer/xml/extractor.rb
solrizer-1.0.2 lib/solrizer/xml/extractor.rb
solrizer-1.0.1 lib/solrizer/xml/extractor.rb