README.md in PageRankr-2.0.4 vs README.md in PageRankr-3.0.0
- old
+ new
@@ -7,15 +7,19 @@
[1]: http://isitpopular.heroku.com
[2]: https://github.com/blatyo/is_it_popular
## Get it!
+``` bash
gem install PageRankr
+```
## Use it!
+``` ruby
require 'page_rankr'
+```
### Backlinks
Backlinks are the result of doing a search with a query like "link:www.google.com". The number of returned results indicates how many sites point to that url. If a site is not tracked then `nil` is returned.
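As a rough sketch (the count here is made up, and `:bing` is just a placeholder for whichever backlink tracker you want to use):
``` ruby
require 'page_rankr'

# Ask one or more trackers for backlink counts; the result is a hash keyed by tracker name.
PageRankr.backlinks("www.google.com", :bing) #=> {:bing => 73600000}
```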
@@ -100,10 +104,34 @@
PageRankr.rank_trackers #=> [:alexa_global, :alexa_us, :compete, :google]
```
Alexa and Compete ranks are descending, where 1 is the most popular. Google page ranks are in the range 0-10, where 10 is the most popular. If a site is unindexed then the rank will be `nil`.
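For example, using trackers from the list above (the values here are illustrative, not live numbers):
``` ruby
require 'page_rankr'

# Ranks come back as a hash keyed by tracker; an unindexed site gives nil.
PageRankr.ranks("www.google.com", :alexa_global, :google) #=> {:alexa_global => 1, :google => 10}
```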
+## Use it a la carte!
+
+From versions >= 3, everything should be usable in a much more a la carte manner. If all you care about is Google page rank (which I speculate is common), you can get that all by itself:
+
+``` ruby
+ require 'page_rankr/ranks/google'
+
+ tracker = PageRankr::Ranks::Google.new("myawesomesite.com")
+ tracker.run #=> 2
+```
+
+Also, once a tracker has run, three values will be accessible from it:
+
+``` ruby
+ # The value extracted. Tracked is aliased to rank for PageRankr::Ranks, backlink for PageRankr::Backlinks, and index for PageRankr::Indexes.
+ tracker.tracked #=> 2
+
+ # The value extracted with the jsonpath, xpath, or regex before being cleaned.
+ tracker.raw #=> "2"
+
+ # The body of the response
+ tracker.body #=> "<html><head>..."
+```
+
## Fix it!
If you ever find that something is broken, it should now be much easier to fix with version >= 1.3.0. For example, if the xpath used to look up a backlink is broken, just override the method for that class to provide the correct xpath.
``` ruby
@@ -121,31 +149,43 @@
## Extend it!
If you ever come across a site that provides a rank or backlinks, you can hook your own class up to automatically be used with PageRankr. PageRankr does this by looking up all the classes namespaced under Backlinks, Indexes, and Ranks.
``` ruby
+ require 'page_rankr/backlink'
+
module PageRankr
class Backlinks
class Foo
include Backlink
- def request
- @request ||= Typhoeus::Request.new("http://example.com/",
- :params => {:q => @site.to_s})
+ # This method is required
+ def url
+ "http://example.com/"
end
+ # This method specifies the parameters for the url. It is optional, but your class will probably need it to be useful.
+ def params
+ {:q => @site.to_s}
+ end
+
+ # You can define a method named xpath, jsonpath, or regex, whichever matches the type of query you want to use
def xpath
"//backlinks/text()"
end
- def clean(backlink_count)
- #do some of my own cleaning
- super(backlink_count) # strips letters, commas, and a few other nasty things and converts it to an integer
- end
+ # Optionally, you could override the clean method if the current implementation isn't sufficient
+ # def clean(backlink_count)
+ # #do some of my own cleaning
+ # super(backlink_count) # strips non-digits and converts it to an integer or nil
+ # end
end
end
end
+
+ PageRankr::Backlinks::Foo.new("myawesomesite.com").run #=> 3
+ PageRankr.backlinks("myawesomesite.com", :foo)[:foo] #=> 3
```
Then, just make sure you require both your class and PageRankr, and whenever you call `PageRankr.backlinks` it'll be able to use your class.
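In other words, something like this (the second file name is just a hypothetical place where the Foo class above might live):
``` ruby
require 'page_rankr'
require 'my_trackers/foo' # hypothetical file defining PageRankr::Backlinks::Foo from above

# The custom tracker is picked up by name, just like the built-in ones.
PageRankr.backlinks("myawesomesite.com", :foo) #=> {:foo => 3}
```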
## Note on Patches/Pull Requests
@@ -156,15 +196,11 @@
future version unintentionally.
* Commit, do not mess with rakefile, version, or history.
(if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
* Send me a pull request. Bonus points for topic branches.
-## TODO Version 3
+## TODO Version 3-4
* Use API's where possible
-* Configuration
- * Optionally use API keys
- * Maybe allow API key cycling to get around query limits
-* Google search API is deprecated
* New Compete API
* Some search engines throttle the number of queries. It would be nice to know when this happens, probably by throwing an exception.
## Contributors
* [Dru Ibarra](https://github.com/Druwerd) - Use Google Search API instead of scraping.
\ No newline at end of file