Sha256: ef715ff884e56ac05ab470c7004eef5953f0a18f00ec57b291029dc2e930a4de
Contents?: true
Size: 713 Bytes
Versions: 1
Compression:
Stored size: 713 Bytes
Contents
require 'list_spider'

DOWNLOAD_DIR = 'coolshell/'.freeze

@next_list = []

def parse_index_item(e)
  content = File.read(e.local_path)
  doc = Nokogiri::HTML(content)
  list_group = doc.css('h2.entry-title')
  link_list = list_group.css('a')

  link_list.each do |link|
    href = link['href']
    local_path = DOWNLOAD_DIR + link.content + '.html'
    # or you can save them to database for later use
    @next_list << TaskStruct.new(href, local_path)
  end
end

task_list = []
task_list << TaskStruct.new(
  'https://coolshell.cn/',
  DOWNLOAD_DIR + 'index.html',
  parse_method: method(:parse_index_item)
)

ListSpider.get_list(task_list)
ListSpider.get_list(@next_list, max: 60)
Version data entries
1 entry across 1 version and 1 rubygem
Version | Path |
---|---|
list_spider-2.3.0 | spider_example_2.rb |