Sha256: 3f9f9a0c909df636663897663532c182fea4317825f43e44cf07858836d41023

Contents?: true

Size: 1.11 KB

Versions: 1

Compression:

Stored size: 1.11 KB

Contents

require 'uri'

module Spidr
  class Agent

    # Specifies whether the Agent will strip URI fragments
    attr_accessor :strip_fragments

    # Specifies whether the Agent will strip URI queries
    attr_accessor :strip_query

    #
    # Sanitizes a URL based on filtering options.
    #
    # @param [URI::HTTP, URI::HTTPS, String] url
    #   The URL to be sanitized
    #
    # @return [URI::HTTP, URI::HTTPS]
    #   The sanitized URL. A URI object passed in is modified in place
    #   and returned; a String is parsed into a new URI object.
    #
    # @since 0.2.2
    #
    def sanitize_url(url)
      url = URI(url)

      url.fragment = nil if @strip_fragments
      url.query    = nil if @strip_query

      url
    end

    protected

    #
    # Initializes the Sanitizer rules.
    #
    # @param [Boolean] strip_fragments
    #   Specifies whether or not to strip the fragment component from URLs.
    #
    # @param [Boolean] strip_query
    #   Specifies whether or not to strip the query component from URLs.
    #
    # @since 0.2.2
    #
    def initialize_sanitizers(strip_fragments: true, strip_query: false)
      @strip_fragments = strip_fragments
      @strip_query     = strip_query
    end

  end
end
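
The sanitizer above can be tried without loading Spidr itself. The following is a minimal sketch, not part of the gem: `sanitize_url_sketch` is a hypothetical standalone helper that mirrors the logic of `Agent#sanitize_url`, taking the two options as keyword arguments with the same defaults as `initialize_sanitizers`. The example.com URLs are placeholders.

```ruby
require 'uri'

# Hypothetical helper (not part of Spidr) mirroring Agent#sanitize_url.
# Defaults match initialize_sanitizers: fragments stripped, queries kept.
def sanitize_url_sketch(url, strip_fragments: true, strip_query: false)
  # Kernel#URI parses a String, but returns an existing URI object unchanged.
  url = URI(url)

  url.fragment = nil if strip_fragments
  url.query    = nil if strip_query

  url
end

# Default options: only the fragment is removed.
default = sanitize_url_sketch('http://example.com/page?id=1#top')
puts default

# Stripping the query as well leaves only scheme, host and path.
both = sanitize_url_sketch('http://example.com/page?id=1#top', strip_query: true)
puts both

# Passing a URI object: the same object is modified and returned.
original = URI('http://example.com/a#frag')
result   = sanitize_url_sketch(original)
puts result.equal?(original)
```

Because `URI()` hands back an existing URI object as-is, callers who need to keep the unsanitized URL should pass a String or a `dup` of the URI object.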

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
spidr-0.7.0 lib/spidr/agent/sanitizers.rb