= Anemone

Anemone is a web spider framework that can crawl a domain and collect useful
information about the pages it visits. It is versatile, letting you
write your own specialized spider tasks quickly and easily.

See http://anemone.rubyforge.org for more information.

== Features
* Multi-threaded design for high performance
* Tracks 301 HTTP redirects to understand a page's aliases
* Built-in breadth-first search (BFS) algorithm for determining page depth
* Allows exclusion of URLs based on regular expressions
* Choose the links to follow on each page with <tt>focus_crawl()</tt>
* HTTPS support
* Records response time for each page
* CLI program can list all pages in a domain, calculate page depths, and more
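
To make the page-depth and URL-exclusion features more concrete, here is a minimal sketch of the underlying idea in plain Ruby. It is not Anemone's own code and does not use its API: the <tt>LINKS</tt> hash, the <tt>page_depths</tt> method, and the <tt>skip_patterns</tt> parameter are hypothetical names invented for illustration, and the link graph is an in-memory stand-in for pages a real spider would fetch.

```ruby
# Hypothetical in-memory link graph standing in for fetched pages.
# Each key is a page; each value lists the links found on that page.
LINKS = {
  '/'            => ['/about', '/blog', '/admin/login'],
  '/about'       => ['/', '/team'],
  '/blog'        => ['/blog/post-1', '/about'],
  '/blog/post-1' => ['/team'],
  '/team'        => [],
  '/admin/login' => ['/admin/secret']
}.freeze

# Breadth-first traversal from +start+. The first time a URL is reached
# it has been reached by a shortest path, so that distance is its depth.
# Links matching any regular expression in +skip_patterns+ are ignored.
def page_depths(start, links, skip_patterns = [])
  depths = { start => 0 }
  queue = [start]

  until queue.empty?
    url = queue.shift
    links.fetch(url, []).each do |link|
      next if depths.key?(link)                               # already visited
      next if skip_patterns.any? { |pattern| link =~ pattern } # excluded URL

      depths[link] = depths[url] + 1
      queue << link
    end
  end

  depths
end

# Assumption for this example: everything under /admin is excluded.
DEPTHS = page_depths('/', LINKS, [%r{\A/admin}])
DEPTHS.each { |url, depth| puts "#{depth} #{url}" }
```

Because the queue is processed in first-in, first-out order, <tt>/team</tt> gets depth 2 (via <tt>/about</tt>) even though a longer path through <tt>/blog/post-1</tt> also reaches it, and nothing under <tt>/admin</tt> is ever visited.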

== Examples
See the scripts under the <tt>lib/anemone/cli</tt> directory for examples of several useful Anemone tasks.

== Requirements
* nokogiri
* robots
