Sha256: 2d0520953ab4b47e42a501bb6cfff1ea5d22d8a910e8e2db0c87737729aa3f4d

Size: 907 Bytes

Versions: 14

Stored size: 907 Bytes

Contents

require "xmlsimple"

module Solrizer::XML::Extractor

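  #
  # Extracts Solr fields from simple XML. Each top-level entry in the parsed
  # document is stored in solr_doc under the Symbol key :"<name>_t".
  # If you want to do anything more nuanced with the XML, use TerminologyBasedSolrizer instead.
  #
  # @param [String] text XML content to index
  # @param [Hash] solr_doc Hash that the extracted fields are merged into
  # @return [Hash] solr_doc with the extracted fields added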
  def xml_to_solr( text, solr_doc=Hash.new )
    doc = XmlSimple.xml_in( text )
    
    doc.each_pair do |name, value|
      if value.kind_of?(Array) 
        if value.first.kind_of?(Hash)
          # xml-simple represents a node with attributes as a Hash and puts its
          # text under the "content" key; only the first such node is indexed
          solr_doc.merge!({:"#{name}_t" => "#{value.first["content"]}"})
        elsif value.length > 1
          solr_doc.merge!({:"#{name}_t" => value})
        else
          solr_doc.merge!({:"#{name}_t" => "#{value.first}"})
        end
      else
        solr_doc.merge!({:"#{name}_t" => "#{value}"})
      end
    end

    return solr_doc
  end
  
end
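
The branching in xml_to_solr is easier to follow against concrete input. The sketch below applies the same four cases to an already-parsed Hash so that it runs without the xml-simple gem. The helper name `fields_from_parsed` and the sample data are hypothetical, and the Hash shape is an assumption about what `XmlSimple.xml_in` returns with its default options, not output captured from the library.

```ruby
# Hypothetical helper (not part of solrizer): the same branching as
# xml_to_solr, applied to an already-parsed Hash.
def fields_from_parsed(parsed, solr_doc = {})
  parsed.each_pair do |name, value|
    key = :"#{name}_t"
    solr_doc[key] =
      if !value.is_a?(Array)
        value.to_s                     # plain value, e.g. a root attribute
      elsif value.first.is_a?(Hash)
        value.first["content"].to_s    # node with attributes: keep its text only
      elsif value.length > 1
        value                          # repeated element: keep the whole Array
      else
        value.first.to_s               # single element: unwrap to a String
      end
  end
  solr_doc
end

# Assumed parse result for:
#   <doc id="42"><title>Hello</title><author role="pi">Jane</author>
#     <tag>a</tag><tag>b</tag></doc>
PARSED = {
  "id"     => "42",
  "title"  => ["Hello"],
  "author" => [{ "role" => "pi", "content" => "Jane" }],
  "tag"    => ["a", "b"]
}

RESULT = fields_from_parsed(PARSED)
```

Under these assumptions, repeated elements such as `tag` stay as an Array while every other field becomes a String, so the same key can hold either type depending on the input. When the first value is a Hash, only that first node's text is kept and its attributes are dropped.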

Version data entries

14 entries across 14 versions & 1 rubygems

Version Path
solrizer-2.2.0 lib/solrizer/xml/extractor.rb
solrizer-2.1.0 lib/solrizer/xml/extractor.rb
solrizer-2.1.0.rc1 lib/solrizer/xml/extractor.rb
solrizer-2.0.0 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc7 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc6 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc5 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc4 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc3 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc2 lib/solrizer/xml/extractor.rb
solrizer-2.0.0.rc1 lib/solrizer/xml/extractor.rb
solrizer-1.2.2 lib/solrizer/xml/extractor.rb
solrizer-1.2.1 lib/solrizer/xml/extractor.rb
solrizer-1.2.0 lib/solrizer/xml/extractor.rb