### Site Checker
[![Code Climate](https://codeclimate.com/badge.png)](https://codeclimate.com/github/ZsoltFabok/site_checker)
[![Build Status](https://travis-ci.org/ZsoltFabok/site_checker.png)](https://travis-ci.org/ZsoltFabok/site_checker)
[![Dependency Status](https://gemnasium.com/ZsoltFabok/site_checker.png)](https://gemnasium.com/ZsoltFabok/site_checker)

Site Checker is a simple Ruby gem that helps you check the integrity of your website by recursively visiting the referenced pages and images. I use it in my test environments to make sure that my websites don't have any dead links.

### Install

    gem install site_checker

### Usage

#### In Test Code

First, load `site_checker` by adding this line to the file where you would like to use it:

    require 'site_checker'

If you want to use it for testing, the line should go in `test_helper.rb`.

The usage is quite simple:

    check_site("http://localhost:3000/app", "http://localhost:3000")
    puts collected_remote_pages.inspect
    puts collected_local_pages.inspect
    puts collected_remote_images.inspect
    puts collected_local_images.inspect
    puts collected_problems.inspect

The snippet above opens the `http://localhost:3000/app` link and looks for links and images. If it finds a link to a local page, it recursively checks that page, too. The second argument - `http://localhost:3000` - defines the starting reference of your website.

If you don't want to use a DSL-like API, you can still do the following:

    SiteChecker.check("http://localhost:3000/app", "http://localhost:3000")
    puts SiteChecker.remote_pages.inspect
    puts SiteChecker.local_pages.inspect
    puts SiteChecker.remote_images.inspect
    puts SiteChecker.local_images.inspect
    puts SiteChecker.problems.inspect

##### Using on Generated Content

If you have a static website (e.g. generated by [octopress](https://github.com/imathis/octopress)) you can tell `site_checker` to use folders from the file system.
With this approach, you don't need a webserver for verifying your website:

    check_site("./public", "./public")
    puts collected_problems.inspect

##### Configuration

You can instruct `site_checker` to ignore certain links:

    SiteChecker.configure do |config|
      config.ignore_list = ["/", "/atom.xml"]
    end

By default it won't check the status of the remote links and images (e.g. 404 or 500), but you can change that like this:

    SiteChecker.configure do |config|
      config.visit_references = true
    end

Deep recursion can be expensive, so you can configure the maximum depth of the recursion with the following attribute:

    SiteChecker.configure do |config|
      config.max_recursion_depth = 3
    end

##### Examples

Make sure that there are no local dead links on the website (I'm using [rspec](https://github.com/rspec/rspec) syntax):

    before(:each) do
      SiteChecker.configure do |config|
        config.ignore_list = ["/atom.xml", "/rss"]
      end
    end

    it "should not have dead local links" do
      check_site("http://localhost:3000", "http://localhost:3000")
      # this will print out the difference and I don't have to re-run with print
      collected_problems.should be_empty
    end

Check that all the local pages can be reached in at most two steps:

    before(:each) do
      SiteChecker.configure do |config|
        config.ignore_list = ["/atom.xml", "/rss"]
        config.max_recursion_depth = 2
      end
      @number_of_local_pages = 100
    end

    it "all the local pages have to be visited" do
      check_site("http://localhost:3000", "http://localhost:3000")
      collected_local_pages.size.should eq @number_of_local_pages
    end

#### Command line

From version 0.3.0 the site checker can be used from the command line as well.
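
If you want to call the executable from a Rake task or a script, a small helper can assemble the arguments for you. The following is only a minimal sketch: `site_checker_args` and its keyword names are hypothetical and not part of the gem, the option names are taken from the `site_checker -h` output, and passing the site URL as the last positional argument is an assumption.

```ruby
# Illustrative helper, not part of the site_checker gem: it only assembles an
# argument list from the option names printed by `site_checker -h`.
def site_checker_args(site_url, root: nil, max_depth: nil, ignore: [], visit_external: false)
  args = ['site_checker']
  args << '--visit-external-references' if visit_external
  args.push('--max-recursion-depth', max_depth.to_s) if max_depth
  args.push('--root', root) if root
  # --ignore can be applied several times, once per URL.
  ignore.each { |url| args.push('--ignore', url) }
  # Assumption: the site to check is passed as the final positional argument.
  args << site_url
end

args = site_checker_args('http://localhost:3000/app',
                         root: 'http://localhost:3000',
                         max_depth: 3,
                         ignore: ['/atom.xml', '/rss'])
puts args.join(' ')
# To actually run the check (requires the gem's executable on the PATH):
#   system(*args)
```

Passing the arguments to `system` as a list rather than as one string avoids shell quoting problems with URLs.
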
Here is the list of the available options:

    ~ % site_checker -h
    Visits the and prints out the list of those URLs which cannot be found

    Usage: site_checker [options]
        -e, --visit-external-references  Visit external references (may take a bit longer)
        -m, --max-recursion-depth N      Set the depth of the recursion
        -r, --root URL                   The root URL of the path
        -i, --ignore URL                 Ignore the provided URL (can be applied several times)
        -p, --print-local-pages          Prints the list of the URLs of the collected local pages
        -x, --print-remote-pages         Prints the list of the URLs of the collected remote pages
        -y, --print-local-images         Prints the list of the URLs of the collected local images
        -z, --print-remote-images        Prints the list of the URLs of the collected remote images
        -h, --help                       Show a short description and this message
        -v, --version                    Show version

### Troubleshooting

#### undefined method 'new' for SiteChecker:Module

This error occurs when the test code calls v0.1.1 methods, but a newer version of the gem has already been installed. Update your test code following the examples above.

### Copyright

Copyright (c) 2012 Zsolt Fabok and Contributors. See LICENSE for details.