### 0.7.1 / 2024-01-25

* Switched to `require_relative` to improve load times.
* Added `# frozen_string_literal: true` to all files.
* Use keyword arguments for {Spidr.domain}.
* Rescue `URI::Error` instead of `Exception` when calling `URI::HTTP#merge` in
  {Spidr::Page#to_absolute}.
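
The narrower rescue described in the last item can be illustrated with Ruby's standard `uri` library alone. This is a minimal sketch, not Spidr's implementation; `safe_merge` is a hypothetical helper name introduced only for this example.

```ruby
require 'uri'

# Hypothetical helper (not part of Spidr): merge a link into a base URI,
# returning nil when the URI library rejects the link.
def safe_merge(base, link)
  URI(base).merge(link)
rescue URI::Error
  # URI::InvalidURIError and URI::InvalidComponentError both inherit from
  # URI::Error, so a single rescue clause covers them. Unrelated errors
  # (NoMemoryError, Interrupt, ...) are no longer swallowed.
  nil
end

merged = safe_merge('http://example.com/a/', 'b.html') # http://example.com/a/b.html
failed = safe_merge('http://example.com/', 'bad uri')  # nil, the space is rejected
```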

### 0.7.0 / 2022-12-31

* Added {Spidr.domain} and {Spidr::Agent.domain}.
* Added {Spidr::Page#gif?}.
* Added {Spidr::Page#jpeg?}.
* Added {Spidr::Page#icon?} and {Spidr::Page#ico?}.
* Added {Spidr::Page#png?}.
* {Spidr.proxy=} and {Spidr::Agent#proxy=} can now accept a `String` or a
  `URI::HTTP` object.

### 0.6.1 / 2019-10-24

* Check for the opaque component of URIs before attempting to set the path
  component (@kyaroch). This fixes `URI::InvalidURIError: path conflicts with
  opaque` exceptions.
* Fix `@robots` instance variable warning (@spk).
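
The opaque-component check can be reproduced with the standard `uri` library. The sketch below assumes a hypothetical helper, `ensure_path`, and is not Spidr's actual code; it only shows why a URI such as `mailto:` must be skipped before a path is assigned.

```ruby
require 'uri'

# Hypothetical helper (not Spidr's code): give a URI a default path of "/"
# unless it has an opaque component, in which case it is left untouched.
def ensure_path(uri)
  uri.path = '/' if uri.opaque.nil? && uri.path.empty?
  uri
end

web  = ensure_path(URI('http://example.com'))      # path becomes "/"
mail = ensure_path(URI('mailto:user@example.com')) # opaque, left as-is

# Without the check, assigning a path to an opaque URI raises.
begin
  URI('mailto:user@example.com').path = '/'
rescue URI::InvalidURIError => error
  conflict = error.message # mentions "path conflicts with opaque"
end
```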

### 0.6.0 / 2016-08-04

* Added {Spidr::Proxy}.
* Added more options to {Spidr::Agent#initialize}:
  * `:default_headers`: specifies the default headers to set in all requests
    (@maccman).
  * `:limit`: specify the maximum number of links to visit.
  * `:open_timeout`, `:read_timeout`, `:ssl_timeout`, `:continue_timeout`,
    and `:keep_alive_timeout`: sets `Net::HTTP` timeouts.
* Allow {Spidr::Settings::Proxy#proxy= Spidr.proxy=} to accept `nil`.
* Use `Net::HTTPResponse#get_fields` in {Spidr::Page} to correctly return
  multiple values for repeated headers.
* Fixed a bug in {Spidr::Page#method_missing} where method names were not being
  correctly converted to header names.
* Fixed a bug in {Spidr::Page#cookie_params} where `Set-Cookie` flags were not
  being filtered out.
* Rewrote the specs to use webmock and increased spec coverage.
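
The difference that `Net::HTTPResponse#get_fields` makes for repeated headers can be seen with `net/http` alone, without performing a request. This sketch builds a response object by hand purely for illustration; it does not show how {Spidr::Page} wraps the response.

```ruby
require 'net/http'

# Build a bare response and add the same header twice, as a server
# setting two cookies would.
response = Net::HTTPOK.new('1.1', '200', 'OK')
response.add_field('Set-Cookie', 'a=1')
response.add_field('Set-Cookie', 'b=2')

joined = response['Set-Cookie']            # => "a=1, b=2" (values joined)
fields = response.get_fields('Set-Cookie') # => ["a=1", "b=2"] (one per header)
```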

### 0.5.0 / 2016-01-03

* Added support for respecting `robots.txt` files.

      Spidr.site('http://reddit.com/', robots: true)

* Added {Spidr.robots=} and {Spidr.robots?}.
* Added {Spidr::Page#each_mailto} and {Spidr::Page#mailtos}.
* Fixed a bug in {Spidr::Agent.host} that limited spidering to only `http://`
  URIs.
* Rescue `Zlib::Error` to catch `Zlib::DataError` and `Zlib::BufError`
  exceptions caused by web servers that use incompatible gzip compression.
* Fixed a bug in {URI.expand_path} where `/../foo` was being expanded to `foo`
  instead of `/foo`.
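
The `/../foo` case from the last item can be illustrated with a small stand-alone sketch. `expand_path_sketch` is a hypothetical function written for this example, not Spidr's {URI.expand_path}; it ignores trailing slashes and only shows why the leading `/` has to be remembered before the segments are collapsed.

```ruby
# Hypothetical re-implementation for illustration only.
def expand_path_sketch(path)
  absolute = path.start_with?('/')
  dirs     = []

  path.split('/').each do |dir|
    case dir
    when '', '.' then next     # skip empty and current-directory segments
    when '..'    then dirs.pop # go up one level; a no-op at the root
    else              dirs << dir
    end
  end

  expanded = dirs.join('/')
  # Re-attach the leading slash that split('/') discarded.
  absolute ? "/#{expanded}" : expanded
end

expand_path_sketch('/../foo')    # => "/foo"
expand_path_sketch('a/./b/../c') # => "a/c"
```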

### 0.4.1 / 2011-12-08

* Catch `OpenSSL::SSL::SSLError` exceptions when initiating HTTPS sessions.

### 0.4.0 / 2011-08-07

* Added `Spidr::Headers#content_charset`.
* Pass the Page `url` and `content_charset` to Nokogiri in `Spidr::Body#doc`.
  This ensures that Nokogiri will preserve the body encoding.
* Made `Spidr::Headers#is_content_type?` public.
* Allow `Spidr::Headers#is_content_type?` to match the full Content-Type
  or the sub-type.

### 0.3.2 / 2011-06-20

* Added separate initialize methods for `Spidr::Actions`, `Spidr::Events`,
  `Spidr::Filters` and `Spidr::Sanitizers`.
* Aliased `Spidr::Events#urls_like` to `Spidr::Events#every_url_like`.
* Reduce usage of `self.included` and `module_eval`.
* Reduce usage of nested-blocks.
* Reduce usage of `return`.

### 0.3.1 / 2011-04-22

* Require `set` in `spidr/headers.rb`.

### 0.3.0 / 2011-04-14

* Switched from Jeweler to [Ore](http://github.com/ruby-ore/ore).
* Split all header related methods out of {Spidr::Page} and into
  `Spidr::Headers`.
* Split all body related methods out of {Spidr::Page} and into
  `Spidr::Body`.
* Split all link related methods out of {Spidr::Page} and into
  `Spidr::Links`.
* Added `Spidr::Headers#directory?`.
* Added `Spidr::Headers#json?`.
* Added `Spidr::Links#each_url`.
* Added `Spidr::Links#each_link`.
* Added `Spidr::Links#each_redirect`.
* Added `Spidr::Links#each_meta_redirect`.
* Aliased `Spidr::Headers#raw_cookie` to `Spidr::Headers#cookie`.
* Aliased `Spidr::Body#to_s` to `Spidr::Body#body`.
* Also check for `application/xml` in `Spidr::Headers#xml?`.
* Catch all exceptions when merging URIs in `Spidr::Links#to_absolute`.
* Always prepend a `/` to all FTP URI paths. Fixes a Ruby 1.8 specific
  bug, where it expects an absolute path for all FTP URIs.
* Refactored {URI.expand_path}.
* Start the session in {Spidr::SessionCache#[]} to prevent multiple
  `CONNECT` commands being sent to HTTP Proxies (thanks falaise).

### 0.2.7 / 2010-08-17

* Added {Spidr::CookieJar#cookies_for_host} (thanks zapnap).
* Renamed `Spidr::Page#cookie` to `Spidr::Page#raw_cookie`.
* Rescue `URI::InvalidComponentError` exceptions in
  `Spidr::Page#to_absolute` (thanks zapnap).

### 0.2.6 / 2010-07-05

* Fixed a bug in `Spidr::Page#meta_redirect`, by calling
  `Nokogiri::XML::Element#get_attribute` instead of `attr`.

### 0.2.5 / 2010-07-02

* Added `Spidr::Page#meta_redirect`.
* Added `Spidr::Page#meta_redirect?`.
* Manage development dependencies with Bundler.
* Support following "old-school" meta-refresh redirects (thanks zapnap).
* Allow {Spidr::CookieJar} to inherit cookies set by a parent domain.
* Fixed a constant lookup issue in {Spidr::Agent}.
* Use `yield` instead of `block.call` when necessary.

### 0.2.4 / 2010-05-05

* Added `Spidr::Filters#visit_urls`.
* Added `Spidr::Filters#visit_urls_like`.
* Added `Spidr::Filters#ignore_urls`.
* Added `Spidr::Filters#ignore_urls_like`.
* Added `Spidr::Page#is_content_type?`.
* Default `Spidr::Page#body` to an empty String.
* Default `Spidr::Page#content_type` to an empty String.
* Default `Spidr::Page#content_types` to an empty Array.
* Improved reliability of {Spidr::Page#is_redirect?}.
* Improved content type detection in {Spidr::Page} to handle `Content-Type`
  headers containing charsets (thanks Josh Lindsey).
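
Handling a `Content-Type` header that carries a charset amounts to separating the media type from its parameters. The following is a minimal sketch under that assumption; `media_type` is a hypothetical helper and not the method {Spidr::Page} uses.

```ruby
# Hypothetical helper: strip parameters such as "; charset=UTF-8" and
# normalize case so the media type can be compared directly.
def media_type(header)
  header.to_s.split(';', 2).first.to_s.strip.downcase
end

media_type('text/html; charset=UTF-8') # => "text/html"
media_type('Application/XML')          # => "application/xml"
media_type(nil)                        # => ""
```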

### 0.2.3 / 2010-02-27

* Migrated to Jeweler for packaging and releasing RubyGems.
* Switched to Markdown formatted YARD documentation.
* Added `Spidr::Events#every_link`.
* Added {Spidr::SessionCache#active?}.
* Added specs for {Spidr::SessionCache}.

### 0.2.2 / 2010-01-06

* Require Web Spider Obstacle Course (WSOC) >= 0.1.1.
* Integrated the new WSOC into the specs.
* Removed the built-in Web Spider Obstacle Course.
* Added `Spidr::Page#content_types`.
* Added `Spidr::Page#cookie`.
* Added `Spidr::Page#cookies`.
* Added `Spidr::Page#cookie_params`.
* Added `Spidr::Sanitizers`.
* Added {Spidr::SessionCache}.
* Added {Spidr::CookieJar} (thanks Nick Plante).
* Added {Spidr::AuthStore} (thanks Nick Plante).
* Added {Spidr::Agent#post_page} (thanks Nick Plante).
* Renamed `Spidr::Agent#get_session` to {Spidr::SessionCache#[]}.
* Renamed `Spidr::Agent#kill_session` to {Spidr::SessionCache#kill!}.

### 0.2.1 / 2009-11-25

* Added `Spidr::Events#every_ok_page`.
* Added `Spidr::Events#every_redirect_page`.
* Added `Spidr::Events#every_timedout_page`.
* Added `Spidr::Events#every_bad_request_page`.
* Added `Spidr::Events#every_unauthorized_page`.
* Added `Spidr::Events#every_forbidden_page`.
* Added `Spidr::Events#every_missing_page`.
* Added `Spidr::Events#every_internal_server_error_page`.
* Added `Spidr::Events#every_txt_page`.
* Added `Spidr::Events#every_html_page`.
* Added `Spidr::Events#every_xml_page`.
* Added `Spidr::Events#every_xsl_page`.
* Added `Spidr::Events#every_doc`.
* Added `Spidr::Events#every_html_doc`.
* Added `Spidr::Events#every_xml_doc`.
* Added `Spidr::Events#every_xsl_doc`.
* Added `Spidr::Events#every_rss_doc`.
* Added `Spidr::Events#every_atom_doc`.
* Added `Spidr::Events#every_javascript_page`.
* Added `Spidr::Events#every_css_page`.
* Added `Spidr::Events#every_rss_page`.
* Added `Spidr::Events#every_atom_page`.
* Added `Spidr::Events#every_ms_word_page`.
* Added `Spidr::Events#every_pdf_page`.
* Added `Spidr::Events#every_zip_page`.
* Fixed a bug where {Spidr::Agent#delay} was not being used to delay
  requesting pages.
* Spider `link` and `script` tags in HTML pages (thanks Nick Plante).

### 0.2.0 / 2009-10-10

* Added {URI.expand_path}.
* Added `Spidr::Page#search`.
* Added `Spidr::Page#at`.
* Added `Spidr::Page#title`.
* Added {Spidr::Agent#failures=}.
* Added an HTTP session cache to {Spidr::Agent}, per suggestion of falter.
  * Added `Spidr::Agent#get_session`.
  * Added `Spidr::Agent#kill_session`.
* Added {Spidr::Settings::Proxy#proxy= Spidr.proxy=}.
* Added {Spidr::Settings::Proxy#disable_proxy! Spidr.disable_proxy!}.
* Aliased `Spidr::Page#txt?` to `Spidr::Page#plain_text?`.
* Aliased `Spidr::Page#ok?` to `Spidr::Page#is_ok?`.
* Aliased `Spidr::Page#redirect?` to `Spidr::Page#is_redirect?`.
* Aliased `Spidr::Page#unauthorized?` to `Spidr::Page#is_unauthorized?`.
* Aliased `Spidr::Page#forbidden?` to `Spidr::Page#is_forbidden?`.
* Aliased `Spidr::Page#missing?` to `Spidr::Page#is_missing?`.
* Split URL filtering code out of {Spidr::Agent} and into
  `Spidr::Filters`.
* Split URL / Page event code out of {Spidr::Agent} and into
  `Spidr::Events`.
* Split pause! / continue! / skip_link! / skip_page! methods out of
  {Spidr::Agent} and into `Spidr::Actions`.
* Fixed a bug in `Spidr::Page#code`, where it was not returning an Integer.
* Make sure `Spidr::Page#doc` returns `Nokogiri::XML::Document` objects for
  RSS/RDF/Atom pages as well.
* Fixed the handling of the Location header in `Spidr::Page#links`
  (thanks falter).
* Fixed a bug in `Spidr::Page#to_absolute` where trailing `/` characters on
  URI paths were not being preserved (thanks falter).
* Fixed a bug where the URI query was not being sent with the request
  in {Spidr::Agent#get_page} (thanks Damian Steer).
* Fixed a bug where SSL sessions were not being properly set up
  (thanks falter).
* Switched {Spidr::Agent#history} to be a Set, to improve search-time
  of the history (thanks falter).
* Switched {Spidr::Agent#failures} to a Set.
* Allow a block to be passed to {Spidr::Agent#run}, which will receive all
  pages visited.
* Allow `Spidr::Agent#start_at` and `Spidr::Agent#continue!` to pass blocks
  to {Spidr::Agent#run}.
* Made {Spidr::Agent#visit_page} public.
* Moved to YARD based documentation.
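
The switch of the history to a `Set` can be illustrated with the standard library. This sketch only shows the data-structure behaviour (hash-based membership tests and automatic de-duplication); the variable names are illustrative and it does not use {Spidr::Agent} itself.

```ruby
require 'set'
require 'uri'

history = Set.new
history << URI('http://example.com/')
history << URI('http://example.com/about')
history << URI('http://example.com/') # duplicate, silently ignored

visited_count = history.size                                   # => 2
seen_about    = history.include?(URI('http://example.com/about')) # => true
```

Membership checks on a `Set` are hash lookups, whereas `Array#include?` scans every element, which matters once a crawl has visited many URLs.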

### 0.1.9 / 2009-06-13

* Upgraded to Hoe 2.0.0.
  * Use Hoe.spec instead of Hoe.new.
  * Use the Hoe signing task for signed gems.
* Added the `Spidr::Agent#schemes` and `Spidr::Agent#schemes=` methods.
* Added a warning message if 'net/https' cannot be loaded.
* Allow the list of acceptable URL schemes to be passed into
  {Spidr::Agent#initialize}.
* Allow history and queue information to be passed into
  {Spidr::Agent#initialize}.
* {Spidr::Agent#start_at} no longer clears the history or the queue.
* Fixed a bug in the sanitization of semi-escaped URLs.
* Fixed a bug where https URLs would be followed even if 'net/https'
  could not be loaded.
* Removed Spidr::Agent::SCHEMES.

### 0.1.8 / 2009-05-27

* Added the `Spidr::Agent#pause!` and `Spidr::Agent#continue!` methods.
* Added the `Spidr::Agent#running?` and `Spidr::Agent#paused?` methods.
* Added an alias for pending_urls to the queue methods.
* Added {Spidr::Agent#queue} to provide read access to the queue.
* Added {Spidr::Agent#queue=} and {Spidr::Agent#history=} for setting the
  queue and history.
* Added {Spidr::Agent#to_hash} which returns a Hash of the agent's queue and
  history.
* Made {Spidr::Agent#enqueue} and {Spidr::Agent#queued?} public.
* Added more specs.

### 0.1.7 / 2009-04-24

* Added `Spidr::Agent#all_headers`.
* Fixed a bug where {Spidr::Page#headers} was always `nil`.
* {Spidr::Agent} will now follow the Location header in HTTP 300,
  301, 302, 303 and 307 Redirects.
* {Spidr::Agent} will now follow iframe and frame tags.

### 0.1.6 / 2009-04-14

* Added {Spidr::Agent#failures}, a list of URLs which could not be visited.
* Added {Spidr::Agent#failed?}.
* Added `Spidr::Agent#every_failed_url`.
* Added {Spidr::Agent#clear}, which clears the history and failures URL
  lists.
* Improved fault tolerance in {Spidr::Agent#get_page}.
  * If a Network or HTTP error is encountered, the URL will be added to
    the failures list and the next URL will be visited.
* Fixed a typo in `Spidr::Agent#ignore_exts_like`.
* Updated the Web Spider Obstacle Course with links that always fail to be
  visited.

### 0.1.5 / 2009-03-22

* Catch malformed URIs in `Spidr::Page#to_absolute` and return `nil`.
* Filter out `nil` URIs in `Spidr::Page#urls`.

### 0.1.4 / 2009-01-15

* Use Nokogiri for HTML and XML parsing.

### 0.1.3 / 2009-01-10

* Added the `:host` option to {Spidr::Agent#initialize}.
* Added the Web Spider Obstacle Course files to the Manifest.
* Aliased {Spidr::Agent#visited_urls} to {Spidr::Agent#history}.

### 0.1.2 / 2008-11-06

* Fixed a bug in `Spidr::Page#to_absolute` where URLs with no path were not
  receiving a default path of `/`.
* Fixed a bug in `Spidr::Page#to_absolute` where URL paths were not being
  expanded, in order to remove `..` and `.` directories.
* Fixed a bug where absolute URLs could have a blank path, thus causing
  {Spidr::Agent#get_page} to crash when it performed the HTTP request.
* Added RSpec spec tests.
* Created a Web-Spider Obstacle Course
  (http://spidr.rubyforge.org/course/start.html) which is used in the spec
  tests.

### 0.1.1 / 2008-10-04

* Added a reader method for the response instance variable in Page.
* Fixed a bug in {Spidr::Page#method_missing}.

### 0.1.0 / 2008-05-23

* Initial release.
  * Black-list or white-list URLs based upon:
    * Host name
    * Port number
    * Full link
    * URL extension
  * Provides call-backs for:
    * Every visited Page.
    * Every visited URL.
    * Every visited URL that matches a specified pattern.