README.rdoc in jsl-feedzirra-0.0.12.7 vs README.rdoc in jsl-feedzirra-0.0.12.8
- old
+ new
@@ -1,55 +1,30 @@
== Feedzirra
=== Description
Feedzirra is a feed library that is designed to get and update many feeds as quickly as possible. This includes using libcurl-multi through the
-taf2-curb[link:http://github.com/taf2/curb/tree/master] gem for faster http gets, and libxml through nokogiri[link:http://github.com/tenderlove/nokogiri/tree/master]
-and sax-machine[link:http://github.com/pauldix/sax-machine/tree/master] for faster parsing.
+taf2-curb[link:http://github.com/taf2/curb/tree/master] gem for faster http gets, and libxml through
+nokogiri[link:http://github.com/tenderlove/nokogiri/tree/master] and sax-machine[link:http://github.com/pauldix/sax-machine/tree/master] for
+faster parsing.
-It allows for easy customization of feed parsing options through the definition of custom parsing classes, and allows you to take as little or as much control as you want in updating feeds. Feedzirra
-makes it easy to figure out which content in feeds is new by providing simple 'backends' so that Feedzirra can track the last contents fetched from a particular feed. Out of the box, Feedzirra can
-store this information in the filesystem, Memcached or Tokyo Cabinet. If you want to keep track of new or updated feeds on your own, just use the default backend which will will let you set options
-for conditional fetching of feeds without the help of Feedzirra.
+It allows for easy customization of feed parsing options through the definition of custom parsing classes, and allows you to take as little or as
+much control as you want in updating feeds. Feedzirra makes it easy to figure out which content in feeds is new by storing the previous retrieval
+of a feed in a key-value store. Feedzirra uses the "moneta" gem, which is a unified interface to key-value storage systems, in order to provide
+access to many different types of stores depending on your requirements.
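The idea of finding new content by comparing against a stored previous retrieval can be sketched as follows. This is only an illustration, not Feedzirra's implementation: +new_entries+ is a hypothetical helper, the entries are plain hashes with a made-up +:id+ key, and a plain Ruby Hash stands in for the key-value store (assuming a store that supports Hash-style reads and writes).

```ruby
# Hypothetical helper, not part of Feedzirra: returns the entries whose ids
# were not present in the previous retrieval, then records the current ids.
def new_entries(store, feed_url, entries)
  seen = store[feed_url] || []                          # ids from the last retrieval, if any
  fresh = entries.reject { |entry| seen.include?(entry[:id]) }
  store[feed_url] = entries.map { |entry| entry[:id] }  # remember this retrieval
  fresh
end

store = {} # a plain Hash standing in for a key-value store (assumption)
url = 'http://www.pauldix.net/atom.xml'

# First run: nothing has been stored yet, so every entry counts as new.
first = new_entries(store, url, [{ :id => 'a' }, { :id => 'b' }])

# Second run: only entries that were absent from the previous retrieval are returned.
second = new_entries(store, url, [{ :id => 'b' }, { :id => 'c' }])
```

Because the helper only reads and writes by key, swapping the Hash for a persistent store should not change the calling code, which is the appeal of a unified key-value interface.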
=== Installation
-For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have libcurl[link:http://curl.haxx.se/] and
-libxml[link:http://xmlsoft.org/] installed. If you're on Leopard you have both. Otherwise, you'll need to grab them. Once you've got those libraries, these are the gems that get
-used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all
-the dependencies so you should be able to get up and running with the standard github gem install routine:
+For now Feedzirra exists only on github. It also has a few gem requirements that are only on github. Before you start you need to have
+libcurl[link:http://curl.haxx.se/] and libxml[link:http://xmlsoft.org/] installed. If you're on Leopard you have both. Otherwise, you'll need to
+grab them. Once you've got those libraries, these are the gems that get used: nokogiri, pauldix-sax-machine, taf2-curb (note that this is a fork
+that lives on github and not the Ruby Forge version of curb), and pauldix-feedzirra. The feedzirra gemspec has all the dependencies so you should
+be able to get up and running with the standard github gem install routine:
- gem sources -a http://gems.github.com # if you haven't already
- gem install pauldix-feedzirra
+ gem sources -a http://gems.github.com # if you haven't already
+ gem install pauldix-feedzirra
-==== Troubleshooting Installation
-
-*NOTE:* Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on
-Ruby Forge. You have to get the taf2-curb[link:http://github.com/taf2/curb/tree/master] fork installed.
-
-If you see this error when doing a require:
-
- /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
-
-It means that the taf2-curb gem didn't build correctly. To resolve this you can do a git clone git://github.com/taf2/curb.git then run rake gem in the curb directory, then sudo gem
-install pkg/curb-0.2.4.0.gem. After that you should be good.
-
-If you see something like this when trying to run it:
-
- NoMethodError: undefined method `on_success' for #<Curl::Easy:0x1182724>
- from ./lib/feedzirra/feed.rb:88:in `add_url_to_multi'
-
-This means that you are requiring curl-multi or the Ruby Forge version of Curb somewhere. You can't use those and need to get the taf2 version up and running.
-
-If you're on Debian or Ubuntu and getting errors while trying to install the taf2-curb gem, it could be because you don't have the latest version of libcurl installed. Do this to fix:
-
- sudo apt-get install libcurl4-gnutls-dev
-
-Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for curb to work! The version in Mac Ports is old and doesn't play nice with curb. If you're running Leopard, you can just uninstall and you should be golden. If you're on an older version of OS X, you'll then need to {download curl}[http://curl.haxx.se/download.html] and build from source. Then you'll have to install the taf2-curb gem again. You might have to perform the step above.
-
-If you're still having issues, please let me know on the mailing list. Also, {Todd Fisher (taf2)}[link:http://github.com/taf2] is working on fixing the gem install. Please send him a full error report.
-
=== Usage
This experimental branch offers a new interface to feed fetching with persistent back-end stores. This allows you to
easily run a script retrieving the feeds once per hour or once per day, and it will remember which feeds have been seen
before and which are new. This feature uses the Feedzirra::Reader interface.
@@ -62,11 +37,11 @@
The Reader object can take a single URL or a list of URLs followed by a Hash of options. The options hash
allows configuration of the backend store, as well as fetching options for the list of urls. Following is
an example of configuration with the Memcache store connected to Tokyo Tyrant (the front-end for Tokyo Cabinet):
reader = Feedzirra::Reader.new('http://www.pauldix.net/atom.xml', :backend =>
- { :class => Feedzirra::Backend::Memcache, :port => 1978, :server => 'localhost' })
+ { :moneta_klass => Moneta::Memcache, :port => 1978, :server => 'localhost' })
Other options that may be put in the options hash follow the original API described below.
Running reader.fetch will first check the back-end store to see if this feed was fetched previously. If it was previously fetched,
Feedzirra uses this information to avoid fetching the whole body if it has already been downloaded based on etag. If the feed
@@ -148,16 +123,46 @@
feedzirra fetch and parse 4.010000 0.710000 4.720000 ( 15.110101)
feedzirra update 0.660000 0.280000 0.940000 ( 5.152709)
=== Discussion
-I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a {google group here}[http://groups.google.com/group/feedzirra].
+I'd like feedback on the api and any bugs encountered on feeds in the wild. I've set up a
+{google group here}[http://groups.google.com/group/feedzirra].
+==== Troubleshooting Installation
+
+*NOTE:* Some people have been reporting a few issues related to installation. First, the Ruby Forge version of curb is not what you want. It will not work. Nor will the curl-multi gem that lives on
+Ruby Forge. You have to get the taf2-curb[link:http://github.com/taf2/curb/tree/master] fork installed.
+
+If you see this error when doing a require:
+
+ /Library/Ruby/Site/1.8/rubygems/custom_require.rb:31:in `gem_original_require': no such file to load -- curb_core (LoadError)
+
+It means that the taf2-curb gem didn't build correctly. To resolve this you can do a git clone git://github.com/taf2/curb.git then run rake gem in the curb directory, then sudo gem
+install pkg/curb-0.2.4.0.gem. After that you should be good.
+
+If you see something like this when trying to run it:
+
+ NoMethodError: undefined method `on_success' for #<Curl::Easy:0x1182724>
+ from ./lib/feedzirra/feed.rb:88:in `add_url_to_multi'
+
+This means that you are requiring curl-multi or the Ruby Forge version of Curb somewhere. You can't use those and need to get the taf2 version up and running.
+
+If you're on Debian or Ubuntu and getting errors while trying to install the taf2-curb gem, it could be because you don't have the latest version of libcurl installed. Do this to fix:
+
+ sudo apt-get install libcurl4-gnutls-dev
+
+Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for curb to work! The version in Mac Ports is old and doesn't play nice with curb. If you're running Leopard, you can just uninstall and you should be golden. If you're on an older version of OS X, you'll then need to {download curl}[http://curl.haxx.se/download.html] and build from source. Then you'll have to install the taf2-curb gem again. You might have to perform the step above.
+
+If you're still having issues, please let me know on the mailing list. Also, {Todd Fisher (taf2)}[link:http://github.com/taf2] is working on fixing the gem install. Please send him a full error report.
+
=== TODO
-This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother using the test suite for feedparser. i wanted to start fresh.
+This thing needs to hammer on many different feeds in the wild. I'm sure there will be bugs. I want to find them and crush them. I didn't bother
+using the test suite for feedparser. I wanted to start fresh.
Here are some more specific TODOs.
+
* Make a feedzirra-rails gem to integrate feedzirra seamlessly with Rails and ActiveRecord.
* Add support for authenticated feeds.
* Create a super sweet DSL for defining new parsers.
* Test against Ruby 1.9.1 and fix any bugs.
* I'm not keeping track of modified on entries. Should I add this?
\ No newline at end of file