h1. Scrooge

A framework- and ORM-agnostic Model / record attribute tracker that ensures production Ruby applications fetch only the database content they need, minimizing wire traffic and reducing the overhead of converting to native Ruby types.

This is mostly an experiment in unobtrusive tracking, respecting development workflows and understanding Rack internals better.

h2. Why bother ?

* Object conversion and moving unnecessary data are both expensive and tax existing infrastructure in high load setups.
* Manually extracting and scoping SELECT clauses is not sustainable in a clean and painless manner with iterative development, even less so in large projects.

h2. Suggested Use

There are 3 basic modes of operation :

* Track : Track attribute access to dump a representative scope profile.
* Scope : Scope the process and related resources to a previously persisted scope profile.
* Track then scope : A multi-stage strategy that tracks attribute access for a given warmup period, then scopes database access to what was tracked.

h2. Resources

A resource is :

* A controller and action endpoint ( inferred through framework specific routing )
* A content type / format - a PDF representation may have different Model attribute requirements than a vanilla ERB view.
* Request method - typically popular public facing GET requests

All Model to attribute mappings are tracked on a per Resource basis. Multiple Models per Resource are supported.

h2. Strategies

h4. Tracking

In tracking mode Scrooge installs filters ( either through Rack middleware or framework specific hooks ) that track attribute access on a per Resource basis. A Kernel#at_exit callback dumps and timestamps this profile ( or scope ), e.g. to *framework_configuration_directory/config/scopes/1234147851/scope.yml*

This typically works well with functional or integration testing and can yield a substantial bird's-eye view of attribute use. The accuracy is directly proportional to test coverage and the quality of the test suite.
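The idea of tracking attribute reads per Resource can be illustrated with a small standalone sketch. This is not Scrooge's implementation: the @AttributeTracker@ class, its methods and the key format are hypothetical names chosen for illustration, and the key here leaves out the content type for brevity.

```ruby
require "set"
require "yaml"

# Hypothetical tracker for illustration only; not Scrooge's actual classes.
class AttributeTracker
  def initialize
    # resource key => model name => set of attribute names read
    @reads = Hash.new { |h, k| h[k] = Hash.new { |m, n| m[n] = Set.new } }
  end

  # Build a key such as "hotels_index_get" from the parts that identify a Resource.
  def self.resource_key(controller, action, method)
    [controller, action, method].join("_")
  end

  # Record one attribute read; repeated reads of the same attribute collapse.
  def record(resource, model, attribute)
    @reads[resource][model] << attribute.to_s
  end

  # Convert the tracked reads to plain Hashes and sorted Arrays, ready for YAML.
  def profile
    @reads.each_with_object({}) do |(resource, models), out|
      out[resource] = models.each_with_object({}) do |(model, attrs), m|
        m[model] = attrs.sort
      end
    end
  end

  # Write the profile as YAML to any IO-like object.
  def dump(io)
    io.write(profile.to_yaml)
  end
end

tracker = AttributeTracker.new
key = AttributeTracker.resource_key("hotels", "index", :get)
tracker.record(key, "Hotel", :hotel_name)
tracker.record(key, "Hotel", :star_rating)
tracker.record(key, "Hotel", :hotel_name) # duplicate read, stored once
tracker.record(key, "Image", :url)
tracker.dump($stdout)
```

In a real process the final @dump@ call could be registered with @at_exit@ and pointed at a timestamped file rather than standard output.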
Example log output :

<pre>
<code>
Processing HotelsController#index (for 0.0.0.0 at 2009-02-09 02:55:55) [GET]
Parameters: {"action"=>"index", "controller"=>"hotels"}
Hotel Load (0.3ms) SELECT * FROM `hotels` LIMIT 0, 15
Rendering template within layouts/application
Rendering hotels/index
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 491) LIMIT 1
[Scrooge] read attribute updated_at
Rendered hotels/_hotel (2.7ms)
Rendered shared/_header (0.1ms)
Rendered shared/_navigation (0.3ms)
Missing template hotels/_index_sidebar.erb in view path app/views
Rendered shared/_sidebar (0.1ms)
Rendered shared/_footer (0.1ms)
Completed in 91ms (View: 90, DB: 1) | 200 OK [http://test.host/hotels]
SQL (0.3ms) ROLLBACK
SQL (0.1ms) BEGIN
</code>
</pre>

An example scope / profile, saved to disk :

<pre>
<code>
---
- hotels_show_get:
    :action: show
    :controller: hotels
    :method: :get
    :format: "*/*"
    :models:
    - Address:
      - line1
      - line2
      - created_at
      - postcode
      - updated_at
      - country_id
      - county
      - location_id
      - town
      - hotel_id
    - Hotel:
      - important_notes
      - location_id
- locations_index_get:
    :action: index
    :controller: locations
    :method: :get
    :format: "*/*"
    :models:
    - Location:
      - name
      - created_at
      - code
      - updated_at
      - level
      - id
- countries_index_get:
    :action: index
    :controller: countries
    :method: :get
    :format: "*/*"
    :models:
    - Country:
      - name
      - created_at
      - code
      - updated_at
      - id
      - location_id
      - continent_id
- hotels_index_get:
    :action: index
    :controller: hotels
    :method: :get
    :format: "*/*"
    :models:
    - Hotel:
      - from_price
      - narrative
      - star_rating
      - latitude
      - created_at
      - hotel_name
      - updated_at
      - important_notes
      - id
      - apt
      - location_id
      - nearest_tube
      - longitude
      - telephone
      - nearest_rail
      - location_name
      - distance
    - Image:
      - thumbnail_width
      - created_at
      - title
      - updated_at
      - url
      - thumbnail_height
      - height
      - thumbnail_url
      - has_thumbnail
      - hotel_id
      - width
</code>
</pre>

h4. Scope

A previously persisted scope / profile can be restored from disk and injected into the applicable Resources. Database content retrieved will match that of the given scope timestamp.

This is typically pushed to production environments where a hybrid ( track then scope ) mode of operation is frowned upon, and is adjusted for each major release or deployment.

Example log output :

<pre>
<code>
Processing HotelsController#index (for 0.0.0.0 at 2009-02-09 02:59:41) [GET]
Parameters: {"action"=>"index", "controller"=>"hotels"}
Hotel Load (0.4ms) SELECT hotels.narrative, hotels.from_price, hotels.created_at, hotels.latitude, hotels.star_rating, hotels.hotel_name, hotels.updated_at, hotels.important_notes, hotels.apt, hotels.id, hotels.nearest_tube, hotels.location_id, hotels.nearest_rail, hotels.telephone, hotels.longitude, hotels.distance, hotels.location_name FROM `hotels` LIMIT 0, 15
Rendering template within layouts/application
Rendering hotels/index
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 491) LIMIT 1
Rendered hotels/_hotel (2.8ms)
Rendered shared/_header (0.1ms)
Rendered shared/_navigation (0.3ms)
Missing template hotels/_index_sidebar.erb in view path app/views
Rendered shared/_sidebar (0.1ms)
Rendered shared/_footer (0.1ms)
Completed in 90ms (View: 5, DB: 1) | 200 OK [http://test.host/hotels]
SQL (0.1ms) ROLLBACK
SQL (0.1ms) BEGIN
</code>
</pre>

h4. Track then scope

A multi-stage, self-configuring strategy that tracks attribute access for a given warmup period, synchronizes the results across n-1 processes, aggregates the results to be representative of the whole cluster ( or seamlessly falls back to a single process ), removes the tracking filters and installs functionality that scopes database access to that of the tracking phase.
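The aggregate and scope stages can be sketched in plain Ruby as follows. Both helper names ( @aggregate@ and @select_columns@ ) are hypothetical, and always including the primary key is an assumption based on the scoped queries in the example logs, which select @id@ even where the profile does not list it. The sketch only shows the idea of taking the union of what each process tracked and turning it into a column list.

```ruby
require "set"

# Hypothetical helpers for illustration; these names are not Scrooge's API.

# Union the attribute lists reported by each process, model by model.
def aggregate(profiles)
  profiles.each_with_object(Hash.new { |h, k| h[k] = Set.new }) do |profile, merged|
    profile.each { |model, attrs| merged[model].merge(attrs) }
  end
end

# Build a table-qualified column list for a SELECT clause.
# Assumption: the primary key is always selected so records stay identifiable.
def select_columns(table, attributes, primary_key = "id")
  (Set.new([primary_key]) | attributes).map { |a| "#{table}.#{a}" }.join(", ")
end

# Profiles as two separate processes might have tracked them during warmup.
process_a = { "Hotel" => ["hotel_name", "star_rating"] }
process_b = { "Hotel" => ["star_rating", "latitude"], "Image" => ["url"] }

merged = aggregate([process_a, process_b])
puts "SELECT #{select_columns('hotels', merged['Hotel'])} FROM `hotels`"
```

Taking the union errs on the side of fetching a column that only some processes saw in use, which avoids missing attributes at the cost of a slightly wider SELECT.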
Recommended for production use.

Example log output :

<pre>
<code>
Processing HotelsController#index (for 127.0.0.1 at 2009-02-16 00:00:58) [GET]
Parameters: {"action"=>"index", "controller"=>"hotels"}
Hotel Load (0.5ms) SELECT * FROM `hotels` LIMIT 0, 15
Hotel Columns (7.7ms) SHOW FIELDS FROM `hotels`
SQL (3.9ms) SELECT count(*) AS count_all FROM `hotels`
Rendering template within layouts/application
Rendering hotels/index
Image Load (0.5ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 11381) LIMIT 1
Image Columns (3.6ms) SHOW FIELDS FROM `images`
Rendered hotels/_hotel (200.2ms)
Image Load (0.4ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 11382) LIMIT 1
Rendered hotels/_hotel (2.4ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 11697) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 12693) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 12738) LIMIT 1
Rendered hotels/_hotel (1.6ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 12886) LIMIT 1
Rendered hotels/_hotel (1.9ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13007) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13074) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13077) LIMIT 1
Rendered hotels/_hotel (1.6ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13078) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.3ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13079) LIMIT 1
Rendered hotels/_hotel (2.4ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13080) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13082) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13085) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13105) LIMIT 1
Rendered hotels/_hotel (1.6ms)
Rendered shared/_header (0.4ms)
Rendered shared/_navigation (0.8ms)
Missing template hotels/_index_sidebar.erb in view path app/views
Rendered shared/_sidebar (0.4ms)
Rendered shared/_footer (0.3ms)
Completed in 270ms (View: 243, DB: 20) | 200 OK [http://localhost/hotels]
[Scrooge] Execute stage :synchronize ...
[Scrooge] Uninstalling tracking middleware ...
[Scrooge] Stop tracking ...
[Scrooge] Synchronize results with other processes ...
Cache write: 17619400_63223_756033
Cache read: scrooge_tracker_aggregation
Cache write: scrooge_tracker_aggregation
[Scrooge] Execute stage :aggregate ...
[Scrooge] Aggregate results from other processes ...

Processing HotelsController#index (for 127.0.0.1 at 2009-02-16 00:01:37) [GET]
Parameters: {"action"=>"index", "controller"=>"hotels"}
Hotel Load (0.5ms) SELECT * FROM `hotels` LIMIT 0, 15
SQL (0.2ms) SELECT count(*) AS count_all FROM `hotels`
Rendering template within layouts/application
Rendering hotels/index
Image Load (0.3ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 11381) LIMIT 1
Rendered hotels/_hotel (2.0ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 11382) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 11697) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 12693) LIMIT 1
Rendered hotels/_hotel (1.6ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 12738) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 12886) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13007) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13074) LIMIT 1
Rendered hotels/_hotel (1.4ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13077) LIMIT 1
Rendered hotels/_hotel (1.6ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13078) LIMIT 1
Rendered hotels/_hotel (1.6ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13079) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13080) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13082) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13085) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT * FROM `images` WHERE (`images`.hotel_id = 13105) LIMIT 1
Rendered hotels/_hotel (81.1ms)
Rendered shared/_header (0.1ms)
Rendered shared/_navigation (0.3ms)
Missing template hotels/_index_sidebar.erb in view path app/views
Rendered shared/_sidebar (0.1ms)
Rendered shared/_footer (0.1ms)
Completed in 113ms (View: 107, DB: 4) | 200 OK [http://localhost/hotels]
Cache read: scrooge_tracker_aggregation
Cache read: 17619400_63223_756033
[Scrooge] Execute stage :scope ...
[Scrooge] Scope ...

Processing HotelsController#index (for 127.0.0.1 at 2009-02-16 00:01:53) [GET]
Parameters: {"action"=>"index", "controller"=>"hotels"}
Hotel Load (0.7ms) SELECT hotels.narrative, hotels.from_price, hotels.created_at, hotels.latitude, hotels.star_rating, hotels.hotel_name, hotels.updated_at, hotels.important_notes, hotels.apt, hotels.id, hotels.nearest_tube, hotels.location_id, hotels.nearest_rail, hotels.telephone, hotels.longitude, hotels.distance, hotels.location_name FROM `hotels` LIMIT 0, 15
SQL (0.2ms) SELECT count(*) AS count_all FROM `hotels`
Rendering template within layouts/application
Rendering hotels/index
Image Load (0.3ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 11381) LIMIT 1
Rendered hotels/_hotel (2.0ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 11382) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 11697) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 12693) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 12738) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 12886) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13007) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13074) LIMIT 1
Rendered hotels/_hotel (1.4ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13077) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13078) LIMIT 1
Rendered hotels/_hotel (1.8ms)
Image Load (0.3ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13079) LIMIT 1
Rendered hotels/_hotel (1.9ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13080) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13082) LIMIT 1
Rendered hotels/_hotel (1.5ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13085) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Image Load (0.2ms) SELECT images.created_at, images.thumbnail_width, images.title, images.updated_at, images.url, images.id, images.thumbnail_height, images.height, images.thumbnail_url, images.has_thumbnail, images.width, images.hotel_id FROM `images` WHERE (`images`.hotel_id = 13105) LIMIT 1
Rendered hotels/_hotel (1.7ms)
Rendered shared/_header (0.1ms)
Rendered shared/_navigation (0.2ms)
Missing template hotels/_index_sidebar.erb in view path app/views
Rendered shared/_sidebar (0.0ms)
Rendered shared/_footer (0.0ms)
Completed in 34ms (View: 27, DB: 4) | 200 OK [http://localhost/hotels]
</code>
</pre>

h2. Installation

h4. As a Rails plugin ( Recommended )

./script/plugin install git://github.com/methodmissing/scrooge.git

h4. From Git

git clone git://github.com/methodmissing/scrooge.git

h4. As a Gem

sudo gem install methodmissing-scrooge -s http://gems.github.com

h2. Configuration

Scrooge installs ( see recommended installation above ) a configuration file with the following format at *framework_configuration_directory/scrooge.yml* ( RAILS_ROOT/config/scrooge.yml for a Rails setup ) :

<pre>
<code>
production:
  orm: :active_record
  storage: :memory
  strategy: :track_then_scope
  warmup: 600 # warmup / track for 10 minutes
  scope:
  on_missing_attribute: :reload # or :raise
  enabled: true
development:
  orm: :active_record
  storage: :memory
  strategy: :track
  warmup: 600 # warmup / track for 10 minutes
  scope:
  on_missing_attribute: :reload # or :raise
  enabled: true
test:
  orm: :active_record
  storage: :memory
  strategy: :track
  warmup: 600 # warmup / track for 10 minutes
  scope:
  on_missing_attribute: :reload # or :raise
  enabled: true
</code>
</pre>

h4. ORM

Scrooge is ORM agnostic and ships with an ActiveRecord layer.

orm: :active_record

h4. Storage backend

Tracking results can be persisted to a given backend or storage option. Scrooge ships with a memory store, but can be extended to the file system, memcached etc., as all Tracker components are designed to be Marshal friendly.

storage: :memory

h4. Strategy

One of :track, :scope or :track_then_scope. Only the :track_then_scope strategy respects the :warmup configuration option.

h4. Warmup

The designated warmup period for the :track_then_scope strategy, given in seconds. Typically 600 to 3600.

h4. Scope

A scope is a reference to a timestamped Scrooge run where access to Model attributes is tracked on a per Resource basis.

scope: 1234567891

If no scope is given in the configuration, ENV['scope'] is also considered, to facilitate configuration through Capistrano etc.

h4. Handling Missing Attributes

When the contents of a given Model attribute have not been retrieved from the database, most ORM frameworks raise an error by default. This is configurable: either reload the model with all its columns, or raise instead.

on_missing_attribute: :reload # or :raise

h4. Status

Scrooge can be disabled with :

enabled: false

h2. Rails specific rake tasks

Scrooge ships with tasks to assist in inspecting results.

<pre>
<code>
methodmissing:superbreak_app lourens$ rake scrooge:list
(in /Users/lourens/projects/superbreak_app)
- 1234735663
- 1234735722
- 1234735744
- 1234735790
- 1234738880
methodmissing:superbreak_app lourens$ rake scope=1234735790 scrooge:inspect
(in /Users/lourens/projects/superbreak_app)
#<GET :hotels/show (*/*)
 - #<Hotel :important_notes, :location_id>
 - #<Address :line1, :created_at, :line2, :postcode, :updated_at, :country_id, :county, :location_id, :town, :hotel_id>
#<GET :countries/index (*/*)
 - #<Country :name, :created_at, :code, :updated_at, :id, :location_id, :continent_id>
#<GET :locations/index (*/*)
 - #<Location :name, :created_at, :code, :updated_at, :level, :id>
#<GET :hotels/index (*/*)
 - #<Image :created_at, :thumbnail_width, :title, :updated_at, :url, :thumbnail_height, :height, :thumbnail_url, :has_thumbnail, :width, :hotel_id>
 - #<Hotel :narrative, :from_price, :created_at, :latitude, :star_rating, :hotel_name, :updated_at, :important_notes, :apt, :id, :nearest_tube, :location_id, :nearest_rail, :telephone, :longitude, :distance, :location_name>
</code>
</pre>

h2. Notes

This is an initial release. It has not yet been battle tested in production and is pending Ruby 1.9.1 compatibility.