= PostgreSQLCursor PostgreSQL Cursor is an extension to the ActiveRecord PostgreSQLAdapter for very large result sets. It provides a cursor open/fetch/close interface to access data without loading all rows into memory, and instead loads the result rows in "chunks" (default of 10_000 rows), buffers them, and returns the rows one at a time. For web pages, an application would not want to process a large amount of data, usually employing a Pagination scheme to present it to the users. Background processes sometimes need to generate a large amount of data, and the ActiveRecord approach to load all data into memory is not the best fit here. Previous solutions employ pagination to fetch each block, then re-running the query for the next "page". This gem avoids re-executing the query by using PostgreSQL's cursor feature. The ActiveRecord methods for the cursor return instances of the model class by default, however you can pass in a block to override this strategy. For instance, returning Hashes instead of AR instances is faster. Julian's benchmarks showed returning instances was a factor of 4 slower than returning a hash. NOTE: This gem is intended to replace the 'postgresql-cursor' (with hyphen, not underscore) library. I also plan on supporing ActiveRecord 3.x in the near future. ==Installation [sudo] gem install postgresql_cursor This does not require Rails to work, just ActiveRecord < 3.0.0 and the 'pg' gem. Rails 3 support is forthcoming. You can then configure your Rails application by adding to the config/environment.rb file config.gem 'postgresql_cursor' or require the gem on your non-rails application require 'rubygems' require 'postgresql_cursor' ==Usage This library is intended to be used via ActiveRecord. It provides two methods on ActiveRecord::Base like the find() method, that will return cursor objects instead of an array of model instances. Calling each() the returned cursor will yield a record to the block for each row. It handles a transaction block, open, buffering, and closing the cursor. In this way, it operates like an Array object. Note that it does not implement Enumerable, so no other methods are available. * *find_with_cursor*( find options, :cursor=>{cursor options}, &block) returns a cursor for the data matching the find options. It takes an optional block accepting a column=>value hash which returns the object desired via each or next, instead of an instance of the model. * *find_by_sql_with_cursor*(select statement, cursor options, &block) takes a custom SQL statement and returns each row as above. ==Examples Account.find_with_cursor(:conditions=>["status = ?", 'good']).each do |row| puts row.to_json end Account.find_by_sql_with_cursor("select ...", :buffer_size=>1000).each do |row| row.process end Account.transaction do cursor = Account.find_with_cursor(...) { |record| record.symbolize_keys } while record = cursor.next do record.process # => {:column=>value, ...} cursor.close if cursor.count > 1000 # Halts loop after 1000 records end end Account.find_with_cursor(...) { |record| record.symbolize_keys }.each do |row| row.process end ==Authors Allen Fair, allen.fair@gmail.com, http://github.com/afair Thank you to: * Iulian Dogariu, http://github.com/iulianu (Fixes) * Julian Mehnle, http://www.mehnle.net (Suggestions) == Note on Patches/Pull Requests * Fork the project. * Make your feature addition or bug fix. * Add tests for it. This is important so I don't break it in a future version unintentionally. * Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull) * Send me a pull request. Bonus points for topic branches. == Copyright Copyright (c) 2010 Allen Fair. See LICENSE for details.