README.md in multisert-0.0.2 vs README.md in multisert-0.0.3

- old
+ new

@@ -25,11 +25,11 @@
 CREATE TABLE IF NOT EXISTS some_database.some_table (
   field_1 int default null,
   field_2 int default null,
   field_3 int default null,
   field_4 int default null
-);
+) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 ```

 Now let's say we want to insert 1,000,000 records after running the current
 iterator through `some_magical_calculation` into our table from above. Let's
 assume that `some_magical_calculation` takes a single integer input and
@@ -55,26 +55,28 @@
 (0..1_000_000).each do |i|
   res = some_magical_calculation(i)
   buffer << res
 end

-buffer.flush!
+buffer.write!
 ```

 We start by creating a new Multisert instance, providing the database
 connection, database and table, and fields as attributes. Next, as we get the
 results from `some_magical_calculation`, we shovel each into the Multisert
 instance. As we iterate through, the Multisert instance will build up the
-records and then flush itself to the specified database table when it hits an
+records and then write itself to the specified database table when it hits an
 internal count (default is 10_000, but can be set via the `max_buffer_count`
-attribute). One last thing to note is the `buffer.flush!` at the end of the
+attribute). One last thing to note is the `buffer.write!` at the end of the
 script. This ensures that any pending entries are written to the database table
-that were not automatically taken care of by the auto-flush that will kick in
+that were not automatically taken care of by the auto-write that will kick in
 during the iteration.

 ## Performance

+### Individual vs Buffer
+
 The gem has a quick performance test built in that can be run via:
 ```bash
 $ ruby ./performance/multisert_performance_test
 ```
 We ran the performance test (with some modification to iterate the test 5
@@ -114,9 +116,39 @@
 Number of Processors: 1
 Total Number of Cores: 2
 L2 Cache (per Core): 256 KB
 L3 Cache: 3 MB
 Memory: 4 GB
+
+All data was written to a mysql instance on localhost.
+
+### Buffer Sizes
+
+Let's take a look at how buffer size comes into play.
+
+We ran 3 separate and independent tests on the same computer as above.
+Additionally, also note that a buffer size of 0 and 1 are basically identical.
+
+If we look at using a buffer size ranging from 0 - 10, we see the following
+performance:
+
+<img src="https://raw.github.com/jeffreyiacono/images/master/multisert/multisert-performance-test-0-10.png" width="900" alt="Buffer size: 0 - 10" />
+
+If we take a step back and look at buffer sizes ranging from 0 - 100, we see the
+following performance:
+
+<img src="https://raw.github.com/jeffreyiacono/images/master/multisert/multisert-performance-test-0-100.png" width="900" alt="Buffer size: 0 - 100" />
+
+Finally, if we look at buffer sizes ranging from 0 - 1,000 and 0 - 10,000 we see
+the following performance (spoiler alert: not much difference, just more data
+points!):
+
+<img src="https://raw.github.com/jeffreyiacono/images/master/multisert/multisert-performance-test-0-1000.png" width="900" alt="Buffer size: 0 - 100" />
+
+<img src="https://raw.github.com/jeffreyiacono/images/master/multisert/multisert-performance-test-0-10000.png" width="900" alt="Buffer size: 0 - 100" />
+
+As can be seen, we see vastly improved performance as we increment our buffer
+from 0 - 100, but then level off thereafter.

 ## FAQ

 ### Packet Too Large / Connection Lost Errors
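
The behaviour described in the README hunks above (rows shovelled into a buffer, an automatic write once `max_buffer_count` is reached, and a final `write!` for whatever is left) can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not Multisert's implementation: `SketchBuffer` is a hypothetical class, it collects SQL strings in memory instead of using a database connection, and whether the real gem writes on reaching or on exceeding the count is assumed here.

```ruby
# Hypothetical stand-in for illustration only; this is not Multisert's source.
class SketchBuffer
  attr_reader :statements

  def initialize(table:, fields:, max_buffer_count: 10_000)
    @table = table
    @fields = fields
    @max_buffer_count = max_buffer_count
    @entries = []
    @statements = [] # SQL strings are collected here instead of being executed
  end

  # Buffer one row and write automatically once the buffer is full
  # (assumption: the write triggers when the count is reached).
  def <<(row)
    @entries << row
    write! if @entries.size >= @max_buffer_count
    self
  end

  # Build one multi-row INSERT for the pending rows, then clear the buffer.
  # Rows are assumed to hold integers only, so no escaping is done here.
  def write!
    return if @entries.empty?

    values = @entries.map { |row| "(#{row.join(', ')})" }.join(', ')
    @statements << "INSERT INTO #{@table} (#{@fields.join(', ')}) VALUES #{values}"
    @entries.clear
  end
end

buffer = SketchBuffer.new(table: 'some_database.some_table',
                          fields: %w[field_1 field_2 field_3 field_4],
                          max_buffer_count: 2)

(0...5).each { |i| buffer << [i, i + 1, i + 2, i + 3] }
buffer.write! # pick up the final, partially filled buffer

puts buffer.statements.size
```

With five rows and a buffer of two, the sketch produces two automatic writes plus one final write for the leftover row, which is why the explicit `write!` at the end of the script matters.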