Husc
=======

A simple crawling utility for Ruby.


## Description
This project enables site crawling and data extraction with XPath and CSS selectors. You can also fill in and submit forms, including text fields, file uploads, and checkboxes.


## Requirements

- Ruby 2.3 or above


## Usage
### Simple Example
```ruby
require 'husc'

url = 'http://www.example.com/'
doc = Husc.new(url)

# access another url
doc.get('another url')

# get current url
doc.url

# get current site's html
doc.html

# get <table> tags as a hash
doc.tables
# e.g. doc.tables['予約・お問い合わせ'] # => '050-5596-6465'
# (the key is the table's header text, here "Reservations / Inquiries")
```
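
To make the shape of `doc.tables` more concrete, here is a standalone sketch that builds a comparable hash from a small table using REXML, which ships with standard Ruby distributions. It is not husc's implementation: the `table_to_hash` helper and the sample markup are assumptions for illustration, based only on the header-to-value lookup shown in the comment above.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of husc): map each row's <th> text to its <td> text.
def table_to_hash(html)
  doc = REXML::Document.new(html)
  result = {}
  doc.elements.each('//tr') do |row|
    header = row.elements['th']
    value  = row.elements['td']
    next unless header && value   # skip rows that lack a header/value pair
    result[header.text.to_s.strip] = value.text.to_s.strip
  end
  result
end

# Sample markup invented for this sketch.
SAMPLE_TABLE = <<~HTML
  <table>
    <tr><th>Phone</th><td>050-5596-6465</td></tr>
    <tr><th>Hours</th><td>10:00-19:00</td></tr>
  </table>
HTML

tables = table_to_hash(SAMPLE_TABLE)
puts tables['Phone']
```

Real pages are rarely well-formed XML, so a tolerant HTML parser would be needed in practice; REXML is used here only to keep the sketch dependency-free.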

### Scraping Example
```ruby
# search for nodes by css selector
# tag   : css('name')
# class : css('.name')
# id    : css('#name')
doc.css('div')
doc.css('.main-text')
doc.css('#tadjs')

# search for nodes by xpath
doc.xpath('//*[@id="top"]/div[1]')

# other example
doc.css('div').css('a')[2].attr('href') # => string object
doc.css('p').innerText() # => string object
# An index such as [2] is optional; without it, the first matching node is used
```
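
One detail worth spelling out: XPath positions such as `div[1]` are 1-based, whereas Ruby array indexes such as `[2]` are 0-based. The sketch below demonstrates the difference with REXML rather than husc itself; the sample markup and the `SAMPLE_PAGE` name are assumptions for illustration.

```ruby
require 'rexml/document'

# Sample markup invented for this sketch.
SAMPLE_PAGE = <<~HTML
  <html><body>
    <div id="top">
      <div><a href="/first">first</a></div>
      <div><a href="/second">second</a><a href="/third">third</a></div>
    </div>
  </body></html>
HTML

page = REXML::Document.new(SAMPLE_PAGE)

# XPath: div[1] selects the FIRST child <div> of the element with id="top".
first_div = REXML::XPath.first(page, '//*[@id="top"]/div[1]')
puts first_div.elements['a'].attributes['href']

# Ruby: index 2 selects the THIRD element of the matched array.
links = REXML::XPath.match(page, '//a')
puts links[2].attributes['href']
```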

### Submitting Form Example
1. Specify the target node by one of its attributes (id, name, or class)
2. Specify a value (integer or string), check (boolean), or file_name (string)
3. Call submit() with an attribute that identifies the form
```ruby
# login: fill in each field (e.g. user name, then password), then submit the form
doc.send(id:'id attribute', value:'value to send')
doc.send(id:'id attribute', value:'value to send')
doc.submit(id:'id attribute') # submit

# post file
doc.send(id:'id attribute', file_name:'target file name')

# checkbox
doc.send(id:'id attribute', check:true)  # check
doc.send(id:'id attribute', check:false) # uncheck

# examples of specifying other attributes
doc.send(name:'name attribute', value:'hello')
doc.send(class:'class attribute', value:100)
```
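
Conceptually, the three steps above collect named field values and then encode them into a request. The sketch below imitates that flow with a hypothetical `FormSketch` class and Ruby's standard `URI.encode_www_form`. It does not use husc, the class and method names are assumptions, and a real submission would also involve an HTTP request and, for files, multipart encoding.

```ruby
require 'uri'

# Hypothetical stand-in for a form; not part of husc.
class FormSketch
  def initialize
    @fields = {}
  end

  # Record a text value, or check/uncheck a checkbox, for a named field.
  def fill(name:, value: nil, check: nil)
    if check.nil?
      @fields[name] = value.to_s
    elsif check
      @fields[name] = 'on'   # browsers send "on" for a checked box with no value attribute
    else
      @fields.delete(name)   # unchecked boxes are omitted from the submission
    end
    self
  end

  # Encode the collected fields as an application/x-www-form-urlencoded body.
  def body
    URI.encode_www_form(@fields)
  end
end

form = FormSketch.new
form.fill(name: 'user', value: 'alice')
form.fill(name: 'count', value: 100)
form.fill(name: 'remember', check: true)
puts form.body
```

The sketch names its method `fill` rather than `send` so that it does not shadow Ruby's built-in `Object#send`.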




## Installation
```sh
$ gem install husc
```


## Contributing
Bug reports and pull requests are welcome on GitHub at [https://github.com/AjxLab/PyCrawl](https://github.com/AjxLab/PyCrawl).
