Sha256: ef715ff884e56ac05ab470c7004eef5953f0a18f00ec57b291029dc2e930a4de

Contents?: true

Size: 713 Bytes

Versions: 1

Compression:

Stored size: 713 Bytes

Contents

require 'list_spider'

DOWNLOAD_DIR = 'coolshell/'.freeze

@next_list = []

def parse_index_item(e)
  content = File.read(e.local_path)
  doc = Nokogiri::HTML(content)
  list_group = doc.css('h2.entry-title')
  link_list = list_group.css('a')

  link_list.each do |link|
    href = link['href']
    local_path = DOWNLOAD_DIR + link.content + '.html'
    # or you can save them to database for later use
    @next_list << TaskStruct.new(href, local_path)
  end
end

task_list = []
task_list << TaskStruct.new(
  'https://coolshell.cn/',
  DOWNLOAD_DIR + 'index.html',
  parse_method: method(:parse_index_item)
)

ListSpider.get_list(task_list)
ListSpider.get_list(@next_list, max: 60)

Version data entries

1 entries across 1 versions & 1 rubygems

Version Path
list_spider-2.3.0 spider_example_2.rb