Sha256: 0a8eb127ab6a23f3d211f32f574b2f77184cf77b599c09faccf762f698bb3e71

Contents?: true

Size: 1.01 KB

Versions: 2

Compression:

Stored size: 1.01 KB

Contents

module Html2rss
  module ItemExtractors
    ##
    # Returns the value of the +href+ attribute.
    # It always returns absolute URLs. If the extracted +href+ value is a
    # relative URL, it prepends the channel's URL.
    #
    # Imagine this +a+ HTML element with an +href+ attribute:
    #
    #     <a href="/posts/latest-findings">...</a>
    #
    # YAML usage example:
    #    channel:
    #      url: http://blog-without-a-feed.example.com
    #      ...
    #    selectors:
    #      link:
    #        selector: a
    #        extractor: href
    #
    # Would return:
    #    'http://blog-without-a-feed.example.com/posts/latest-findings'
    class Href
      # Selects the element from +xml+ according to +options+ and stores
      # its sanitized +href+ value.
      def initialize(xml, options)
        @options = options
        element = ItemExtractors.element(xml, options)
        @href = Html2rss::Utils.sanitize_url(element.attr('href'))
      end

      # @return [URI::HTTPS, URI::HTTP] the absolute URL, built from the channel's URL when +href+ is relative
      def get
        Html2rss::Utils.build_absolute_url_from_relative(@href, @options[:channel][:url])
      end
    end
  end
end
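
The relative-to-absolute resolution described in the class comment can be sketched with Ruby's standard URI library. This is only an illustration: the internals of Html2rss::Utils.build_absolute_url_from_relative are not shown in this listing, and the absolute_url helper below is a hypothetical name, not part of html2rss.

```ruby
require 'uri'

# Hypothetical helper (not part of html2rss): returns +href+ as an absolute
# URI, resolving it against +channel_url+ when it is relative.
def absolute_url(href, channel_url)
  uri = URI(href.to_s.strip)
  # An href that already carries a scheme is returned unchanged.
  return uri if uri.absolute?

  # URI.join resolves the relative reference against the base URL.
  URI.join(channel_url, uri.to_s)
end

channel_url = 'http://blog-without-a-feed.example.com'
puts absolute_url('/posts/latest-findings', channel_url)
# prints "http://blog-without-a-feed.example.com/posts/latest-findings"
```

An href that is already absolute, such as 'https://other.example.com/a', passes through without being joined to the channel URL.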

Version data entries

2 entries across 2 versions & 1 rubygems

Version Path
html2rss-0.9.0 lib/html2rss/item_extractors/href.rb
html2rss-0.8.2 lib/html2rss/item_extractors/href.rb