# Tarantula DSL-based API examples I'm planning to change the Tarantula configuration API to be more Ruby-ish, declarative and DSL-like. This document contains examples of what I'm aiming for, as a record of ideas and for discussion. **This is not a promise to implement all of these features.** I think all of these feature ideas would be useful, but I already know some of them will be very costly and difficult to implement. I'm including them all here because I want to come up with an API syntax and style that will accommodate a lot of different things. I don't yet know whether I'll maintain the current association with test cases; it seems that it might be better to have a standalone `tarantula_config.rb` file or something like that, with a custom Tarantula runner that doesn't depend on test/unit or RSpec. ## Basic crawl, starting from '/' Tarantula.crawl ## Basic crawl, starting from '/' and '/admin' Tarantula.crawl('both') do |t| t.root_page '/' t.root_page '/admin' end ## Crawl with the Tidy handler # the operand to the crawl method, if supplied, will be used # as the tab label in the report. Tarantula.crawl("tidy") do |t| t.add_handler :tidy end ## Reorder requests on the queue This is necessary to fix [this bug](http://github.com/relevance/tarantula/issues#issue/3) Tarantula.crawl do |t| # Treat the following controllers as "resourceful", # reordering appropriately (see my comment on # ) t.resources 'post', 'comment' # For the 'news' controller, order the actions this way: t.reorder_for 'news', :actions => %w{show read unread mark_read} # For the 'history' controller, supply a comparison function: t.reorder_for 'history', :compare => lambda{|x, y| ... } end (Unlike most of the declarations in this example document, these will need to be reusable across multiple crawl blocks somehow.) ## Selectively allowing errors Tarantula.crawl("ignoring not-found users") do |t| t.allow_errors :not_found, %r{/users/\d+/} # or t.allow_errors :not_found, :controller => 'users', :action => 'show' end ## Attacks Tarantula.crawl("attacks") do |t| t.attack :xss, :input => "", :output => :input t.attack :sql_injection, :input => "a'; DROP TABLE posts;" t.times_to_crawl 2 end We should have prepackaged attack suites that understand various techniques. Tarantula.crawl("xss suite") do |t| t.attack :xss, :suite => 'standard' end Tarantula.crawl("sql injection suite") do |t| t.attack :sql_injection, :suite => 'standard' end ## Timeout Tarantula.crawl do |t| t.times_to_crawl 2 t.stop_after 2.minutes end ## Fuzzing Tarantula.crawl do |t| # :valid input uses SQL types and knowledge of model validations # to attempt to generate valid input. You can override the defaults. t.fuzz_with :valid_input do |f| f.fuzz(Post, :title) { random_string(1..40) } f.fuzz(Person, :phone) { random_string("%03d-%03d-%04d") } end # The point of fuzzing is to keep trying a lot of things to # see if you can find breakage. t.crawl_for 45.minutes end Tarantula.crawl do |t| # :typed_input uses SQL types to generate "reasonable" but probably # invalid input (e.g., numeric fields will get strings of digits, # but they'll be too large or negative; date fields will get dates, # but very far in the past or future; string fields will get very # large strings.) t.fuzz_with :typed_input t.crawl_for 30.minutes end Tarantula.crawl do |t| # :random_input just plugs in random strings everywhere. t.fuzz_with :random_input t.crawl_for 2.hours end