Sha256: ac727d7157bc69ee63be9d3360d04a8171f4b8d931031c7567ac675c3d394f68

Contents?: true

Size: 1.07 KB

Versions: 6

Compression:

Stored size: 1.07 KB

Contents

# frozen_string_literal: true

module Html2rss
  module ItemExtractors
    ##
    # Returns the text content of the selected element. This is the default
    # extractor, used when no extractor is explicitly given.
    #
    # Example HTML structure:
    #
    #     <p>Lorem <b>ipsum</b> dolor ...</p>
    #
    # YAML usage example:
    #
    #     selectors:
    #       description:
    #         selector: p
    #         extractor: text
    #
    # Would return:
    #     'Lorem ipsum dolor ...'
    class Text
      # The available options for the text extractor.
      Options = Struct.new('TextOptions', :selector, keyword_init: true)

      ##
      # Initializes the Text extractor.
      #
      # @param xml [Nokogiri::XML::Element]
      # @param options [Options]
      def initialize(xml, options)
        @element = ItemExtractors.element(xml, options.selector)
      end

      ##
      # Returns the text content of the element, with leading and trailing
      # whitespace stripped and each run of whitespace collapsed to one space.
      #
      # @return [String] The normalized text content.
      def get
        @element.text.to_s.strip.gsub(/\s+/, ' ')
      end
    end
  end
end
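
The whitespace handling in `get` can be tried in isolation. What follows is a minimal sketch, not part of the gem: `normalize_text` is a hypothetical helper that applies the same call chain to a plain string, so it runs without Nokogiri or html2rss. The sample input is also made up for illustration.

```ruby
# frozen_string_literal: true

# Hypothetical helper (not part of html2rss): applies the same call chain as
# Text#get, but to a plain string instead of a Nokogiri element's text.
def normalize_text(raw)
  # to_s guards against nil, strip trims both ends, and gsub collapses each
  # run of whitespace (spaces, tabs, newlines) into a single space.
  raw.to_s.strip.gsub(/\s+/, ' ')
end

puts normalize_text("  Lorem \n  ipsum\tdolor ...  ")
puts normalize_text(nil).inspect
```

Because `strip` runs before `gsub`, the result never starts or ends with a space, and a `nil` input yields an empty string rather than raising.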

Version data entries

6 entries across 6 versions & 1 rubygems

Version Path
html2rss-0.15.0 lib/html2rss/item_extractors/text.rb
html2rss-0.14.0 lib/html2rss/item_extractors/text.rb
html2rss-0.13.0 lib/html2rss/item_extractors/text.rb
html2rss-0.12.0 lib/html2rss/item_extractors/text.rb
html2rss-0.11.0 lib/html2rss/item_extractors/text.rb
html2rss-0.10.0 lib/html2rss/item_extractors/text.rb