Sha256: a7c82c8375a7ddc5a8c59c13b5a4fb90a354d4ea7361b1ec8b8672c030a77f6d
Contents?: true
Size: 1.03 KB
Versions: 1
Compression:
Stored size: 1.03 KB
Contents
require 'wordlist/builder'

require 'spidr'

module Wordlist
  module Builders
    class Website < Builder

      # Host to spider
      attr_accessor :host

      #
      # Creates a new Website builder object with the specified _path_
      # and _host_. If a _block_ is given, it will be passed the newly
      # created Website builder object.
      #
      def initialize(path,host,&block)
        @host = host

        super(path,&block)
      end

      #
      # Builds the word-list file by spidering the +host+ and parsing the
      # inner-text from all HTML pages. If a _block_ is given, it will be
      # called before all HTML pages on the +host+ have been parsed.
      #
      def build!(&block)
        super(&block)

        Spidr.host(@host) do |spidr|
          spidr.every_page do |page|
            if page.html?
              page.doc.search('//h1|//h2|//h3|//h4|//h5|//p|//span').each do |element|
                parse(element.inner_text)
              end
            end
          end
        end
      end

    end
  end
end
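The control flow of `build!` can be tried without the gem or network access. The following is a minimal sketch, not the gem's code: `SketchWebsiteBuilder` is a hypothetical stand-in, the in-memory page hashes replace what Spidr would fetch, a crude regular expression replaces the XPath search on `page.doc`, and the `parse` method here (lowercased alphabetic words collected into a `Set`) is an assumption, since the real `Builder#parse` is not shown in this file.

```ruby
require 'set'

# Hypothetical stand-in for Wordlist::Builders::Website (illustration only).
class SketchWebsiteBuilder
  # Same element names as the XPath expression in the original build! method.
  TAGS = %w[h1 h2 h3 h4 h5 p span].freeze

  # Crude regex substitute for an XPath search; it does not handle
  # same-name nested tags the way a real HTML parser would.
  TAG_PATTERN = %r{<(#{TAGS.join('|')})\b[^>]*>(.*?)</\1>}mi

  attr_reader :words

  # pages: array of hashes with :content_type and :body keys
  # (an assumed shape, standing in for pages fetched by a spider).
  def initialize(pages)
    @pages = pages
    @words = Set.new
  end

  # Visits every page, skips non-HTML ones, and parses the inner text
  # of each matching element, mirroring the loop in the original.
  def build!
    @pages.each do |page|
      next unless page[:content_type].to_s.include?('html')

      page[:body].scan(TAG_PATTERN) do |_tag, inner|
        # Drop any nested markup so only the text remains.
        parse(inner.gsub(/<[^>]+>/, ' '))
      end
    end
    self
  end

  private

  # Assumed behaviour: split text into lowercase alphabetic words.
  def parse(text)
    text.scan(/[[:alpha:]]+/) { |word| @words << word.downcase }
  end
end

pages = [
  { content_type: 'text/html',
    body: '<h1>Admin Portal</h1><p>Welcome <span>back</span></p><div>ignored</div>' },
  { content_type: 'image/png', body: '<p>binary</p>' }
]

builder = SketchWebsiteBuilder.new(pages).build!
puts builder.words.to_a.sort.join(', ')
```

With the real gem, the equivalent call would presumably be `Wordlist::Builders::Website.new(path, host).build!`, following the `initialize(path,host,&block)` signature above; that call spiders a live host, so it is not used in the sketch.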
Version data entries
1 entry across 1 version & 1 rubygem
Version | Path |
---|---|
wordlist-0.1.0 | lib/wordlist/builders/website.rb |