README.md in upsert-0.2.0 vs README.md in upsert-0.2.1

- old
+ new

@@ -1,46 +1,56 @@
 # Upsert
 
-Finally, all those SQL MERGE tricks codified so that you can do "upsert" on MySQL, PostgreSQL, and Sqlite.
+Finally, all those SQL MERGE tricks codified so that you can do "upsert" on MySQL, PostgreSQL, and SQLite.
 
-## Usage
+You pass a selector that uniquely identifies a row, whether it exists or not. You pass a set of attributes that should be set on that row. Based on what database is being used, one of a number of SQL MERGE-like tricks are used.
 
-Let's say you have...
+The second argument is currently (mis)named a "document" because this was inspired by [mongo-ruby-driver's update method](http://api.mongodb.org/ruby/1.6.4/Mongo/Collection.html#update-instance_method).
 
-    class Pet < ActiveRecord::Base
-      # col :name
-      # col :breed
-    end
+## Usage
 
-### One at a time
+### One by one
 
+Faster than just doing `Pet.create`... 85% faster on PostgreSQL, for example. But no validations or anything.
+
     upsert = Upsert.new Pet.connection, Pet.table_name
-    selector = {:name => 'Jerry'}
-    document = {:breed => 'beagle'}
-    upsert.row selector, document
+    upsert.row({:name => 'Jerry'}, :breed => 'beagle')
+    upsert.row({:name => 'Pierre'}, :breed => 'tabby')
 
-### Streaming upserts (fastest)
+### Streaming
 
-Rows are buffered in memory until it's efficient to send them to the database.
+Rows are buffered in memory until it's efficient to send them to the database. Currently this only provides an advantage on MySQL because it uses `ON DUPLICATE KEY UPDATE`... but if a similar method appears in PostgreSQL, the same code will still work.
 
     Upsert.stream(Pet.connection, Pet.table_name) do |upsert|
-      # [...]
       upsert.row({:name => 'Jerry'}, :breed => 'beagle')
-      # [...]
       upsert.row({:name => 'Pierre'}, :breed => 'tabby')
-      # [...]
     end
 
-### With a helper method
+### `ActiveRecord::Base.upsert` (optional)
 
 For bulk upserts, you probably still want to use `Upsert.stream`.
 
-    # be sure to require 'upsert/active_record_upsert' - it's not required by default
-    selector = {:name => 'Jerry'}
-    document = {:breed => 'beagle'}
-    Pet.upsert selector, document
+    require 'upsert/active_record_upsert'
+    Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')
+    Pet.upsert({:name => 'Pierre'}, :breed => 'tabby')
 
+### Gotchas
+
+Currently, the first row you pass in determines the columns that will be used. That's useful for mass importing of many rows with the same columns, but is surprising if you're trying to use a single `Upsert` object to add arbitrary data. For example, this won't work:
+
+    Upsert.stream(Pet.connection, Pet.table_name) do |upsert|
+      upsert.row({:name => 'Jerry'}, :breed => 'beagle')
+      upsert.row({:tag_number => 456}, :spiel => 'great cat') # won't work - doesn't use same columns
+    end
+
+You would need to use a new `Upsert` object. On the other hand, this is totally fine:
+
+    Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')
+    Pet.upsert({:tag_number => 456}, :spiel => 'great cat')
+
+Please send in a pull request if you think there's a better way!
+
 ## Real-world usage
 
 <p><a href="http://brighterplanet.com"><img src="https://s3.amazonaws.com/static.brighterplanet.com/assets/logos/flush-left/inline/green/rasterized/brighter_planet-160-transparent.png" alt="Brighter Planet logo"/></a></p>
 
 We use `upsert` for [big data processing at Brighter Planet](http://brighterplanet.com/research) and in production at
@@ -201,13 +211,9 @@
 You could also use [activerecord-import](https://github.com/zdennis/activerecord-import) to upsert:
 
     Pet.import columns, all_values, :timestamps => false, :on_duplicate_key_update => columns
 
 This, however, only works on MySQL and requires ActiveRecord&mdash;and if all you are doing is upserts, `upsert` is tested to be 40% faster. And you don't have to put all of the rows to be upserted into a single huge array - you can stream them using `Upsert.stream`.
-
-### Loosely based on mongo-ruby-driver's upsert functionality
-
-The `selector` and `document` arguments are inspired by the upsert functionality of the [mongo-ruby-driver's update method](http://api.mongodb.org/ruby/1.6.4/Mongo/Collection.html#update-instance_method).
 
 ## Copyright
 
 Copyright 2012 Brighter Planet, Inc.