= Advanced Associations
Sequel::Model has the most powerful and flexible associations of any ruby ORM.
== Background: Sequel::Model association options
There are a bunch of advanced association options that are available to
handle more complex cases. First we'll go over some of
the simpler ones:
All associations take a block that can be used to further filter/modify the
default dataset. There's also an :eager_block option if you want to use
a different block when eager loading via Dataset#eager. Association blocks are
useful for things like:
Artist.one_to_many :gold_albums, :class=>:Album do |ds|
ds.where{copies_sold > 500000}
end
There are a whole bunch of options for changing how the association is eagerly
loaded via Dataset#eager_graph: :graph_block, :graph_conditions,
:graph_only_conditions, :graph_join_type (and :graph_join_table_* ones for
JOINing to the join table in a many_to_many association).
:graph_join_type :: The type of join to do (:inner, :left, :right)
:graph_conditions :: Additional conditions to put on join (needs to be a
hash or array of all two pairs). Automatically assumes unqualified symbols
or first element of the pair to be columns of the associated model, and
unqualified symbols of the second element of the pair to be columns of the
current model.
:graph_block :: A block passed to +join_table+, allowing you to specify
conditions other than equality, or to use OR, or set up any arbitrary
condition. The block is passed the associated table alias, current table
alias, and an array of previous joins clause objects.
:graph_only_conditions :: Use these conditions instead of the standard
association conditions. This is necessary when you don't want to have an
equal condition between the foreign key and primary key of the tables.
You can also use this to have a JOIN USING (array of symbols), or a NATURAL
or CROSS JOIN (nil, with the appropriate :graph_join_type).
These can be used like this:
# Makes Artist.eager_graph(:required_albums).all not return artists that
# don't have any albums
Artist.one_to_many :required_albums, :class=>:Album, :graph_join_type=>:inner
# Makes sure all returned albums have the active flag set
Artist.one_to_many :active_albums, :class=>:Album, \
:graph_conditions=>{:active=>true}
# Only returns albums that have sold more than 500,000 copies
Artist.one_to_many :gold_albums, :class=>:Album, \
:graph_block=>proc{|j,lj,js| Sequel.qualify(j, :copies_sold) > 500000}
# Handles the case where the tables are associated by a case insensitive name string
Artist.one_to_many :albums, :key=>:artist_name, \
:graph_only_conditions=>nil, \
:graph_block=>proc{|j,lj,js| {Sequel.function(:lower, Sequel.qualify(j, :artist_name))=>Sequel.function(:lower, Sequel.qualify(lj, :name))}}
# Handles the case where both key columns have the name artist_name, and you want to use
# a JOIN USING
Artist.one_to_many :albums, :key=>:artist_name, :graph_only_conditions=>[:artist_name]
Remember, using +eager_graph+ is generally only necessary when you need to
filter/order based on columns in an associated table, it is recommended to
use +eager+ for eager loading if possible.
One advantage of using +eager_graph+ is that you can easily filter/order
on columns in an associated table on a per-query basis, using regular
Sequel dataset methods. For example, if you only want to retrieve artists
who have albums that start with A, and eager load just those albums,
ordered by the albums name, you can do:
albums = Artist.
eager_graph(:albums).
where(Sequel.like(:albums__name, 'A%')).
order(:albums__name).
all
For lazy loading (e.g. Model[1].association), the :dataset option can be used
to specify an arbitrary dataset (one that uses different keys, multiple keys,
joins to other tables, etc.).
For eager loading via +eager+, the :eager_loader option can be used to specify
how to eagerly load a complex association. This is an extremely powerful
option. Though it can often be verbose (compared to other things in Sequel),
it allows you complete control over how to eagerly load associations for a
group of objects.
:eager_loader should be a proc that takes a single hash argument, which will
have at least the following keys:
:id_map :: A mapping of key values to arrays of current model instances,
usage described below
:rows :: An array of model objects
:associations :: A hash of dependent associations to eagerly load
:self :: The dataset that is doing the eager loading
:eager_block :: A dynamic callback for this eager load.
Since you are given all of the records, you can do things like filter on
associations that are specified by multiple keys, or do multiple
queries depending on the content of the records (which would be
necessary for polymorphic associations). Inside the :eager_loader
proc, you should get the related objects and populate the
associations cache for all objects in the array of records. The hash
of dependent associations is available for you to cascade the eager
loading down multiple levels, but it is up to you to use it.
The id_map is a performance enhancement that is used by the default
association loaders and is also available to you. It is a hash with keys
foreign/primary key values, and values being arrays of current model
objects having the foreign/primary key value associated with the key.
This may be hard to visualize, so I'll give an example. Let's say you
have the following associations
Album.many_to_one :artist
Album.one_to_many :tracks
and the following three albums in the database:
album1 = Album.create(:artist_id=>3) # id: 1
album2 = Album.create(:artist_id=>3) # id: 2
album3 = Album.create(:artist_id=>2) # id: 3
If you try to eager load this dataset:
Album.eager(:artist, :tracks).all
Then the id_map provided to the artist :eager_loader proc would be:
{3=>[album1, album2], 2=>[album3]}
The artist id_map contains a mapping of artist_id values to arrays of
album objects. Since both album1 and album2 have the same artist_id,
the are both in the array related to that key. album3 has a different
artist_id, so it is in a different array. Eager loading of artists is
done by looking for any artist having one of the keys in the hash:
artists = Artist.where(:id=>id_map.keys).all
When the artists are retrieved, you can iterate over them, find entries
with matching keys, and manually associate them to the albums:
artists.each do |artist|
# Find related albums using the artist_id_map
if albums = id_map[artist.id]
# Iterate over the albums
albums.each do |album|
# Manually set the artist association for each album
album.associations[:artist] = artist
end
end
end
The id_map provided to the tracks :eager_loader proc would be:
{1=>[album1], 2=>[album2], 3=>[album3]}
Now the id_map contains a mapping of id values to arrays of album objects (in this
case each array only has a single object, because id is the primary key). So when
looking for tracks to eagerly load, you only need to look for ones that have an
album_id with one of the keys in the hash:
tracks = Track.where(:album_id=>id_map.keys).all
When the tracks are retrieved, you can iterate over them, find entries with matching
keys, and manually associate them to the albums:
tracks.each do |track|
if albums = id_map[track.album_id]
albums.each do |album|
album.associations[:tracks] << track
end
end
end
=== Two basic example eager loaders
Putting the code in the above examples together, you almost have enough for a basic
working eager loader. The main important thing that is missing is you need to set
initial values for the eagerly loaded associations. For the artist association, you
need to initial the values to nil:
# rows here is the :rows entry in the hash passed to the eager loader
rows.each{|album| album.associations[:artist] = nil}
For the tracks association, you set the initial value to an empty array:
rows.each{|album| album.associations[:track] = []}
These are done so that if an album currently being loaded doesn't have an associated
artist or any associated tracks, the lack of them will be cached, so calling the
artist or tracks method on the album will not do another database lookup.
So putting everything together, the artist eager loader looks like:
:eager_loader=>(proc do |eo_opts|
eo_opts[:rows].each{|album| album.associations[:artist] = nil}
id_map = eo_opts[:id_map]
Artist.where(:id=>id_map.keys).all do |artist|
if albums = id_map[artist.id]
albums.each do |album|
album.associations[:artist] = artist
end
end
end
end)
and the tracks eager loader looks like:
:eager_loader=>(proc do |eo_opts|
eo_opts[:rows].each{|album| album.associations[:tracks] = []}
id_map = eo_opts[:id_map]
Track.where(:id=>id_map.keys).all do |tracks|
if albums = id_map[track.album_id]
albums.each do |album|
album.associations[:tracks] << track
end
end
end
end)
Now, these are both overly simplistic eager loaders that don't respect cascaded
associations or any of the association options. But hopefully they both
provide simple examples that you can more easily build and learn from, as
the custom eager loaders described later in this page are more complex.
Basically, the eager loading steps can be broken down into:
1. Set default association values (nil/[]) for each of the current objects
2. Return just related associated objects by filtering the associated class
to include only rows with keys present in the id_map.
3. Iterating over the returned associated objects, indexing into the id_map
using the foreign/primary key value in the associated object to get
current values associated to that specific object.
4. For each of those current values, updating the cached association value to
include that specific object.
Using the :eager_loader proc, you should be able to eagerly load all associations
that can be eagerly loaded, even if Sequel doesn't natively support such eager
loading.
== ActiveRecord associations
Sequel supports all of associations that ActiveRecord supports, though some
require different approaches or custom :eager_loader options.
=== Association callbacks
Sequel supports the same callbacks that ActiveRecord does for +one_to_many+ and
+many_to_many+ associations: :before_add, :before_remove, :after_add, and
:after_remove. For +many_to_one+ associations and +one_to_one+ associations, Sequel
supports the :before_set and :after_set callbacks. On all associations,
Sequel supports :after_load, which is called after the association has been
loaded.
Each of these options can be a symbol specifying an instance method
that takes one argument (the associated object), or a proc that takes
two arguments (the current object and the associated object), or an
array of symbols and procs. For :after_load with a *_to_many association,
the associated object argument is an array of associated objects.
If any of the before callbacks return +false+, the adding/removing
does not happen and it either raises a Sequel::BeforeHookFailed (the default), or
returns false (if +raise_on_save_failure+ is false).
=== Association extensions
All associations come with an association_dataset method that can be further filtered or
otherwise modified:
class Author < Sequel::Model
one_to_many :authorships
end
Author.first.authorships_dataset.where{number < 10}.first
You can extend a dataset with a module using the :extend association option. You can reference
the model object that created the association dataset via the dataset's
+model_object+ method, and the related association reflection via the dataset's
+association_reflection+ method:
module FindOrCreate
def find_or_create(vals)
first(vals) || model.create(vals.merge(association_reflection[:key]=>model_object.id))
end
end
class Author < Sequel::Model
one_to_many :authorships, :extend=>FindOrCreate
end
Author.first.authorships_dataset.find_or_create(:name=>'Blah', :number=>10)
=== has_many :through associations
+many_to_many+ handles the usual case of a has_many :through with a +belongs_to+ in
the associated model. It doesn't break on the case where the join table is a
model table, unlike ActiveRecord's +has_and_belongs_to_many+.
ActiveRecord:
class Author < ActiveRecord::Base
has_many :authorships
has_many :books, :through => :authorships
end
class Authorship < ActiveRecord::Base
belongs_to :author
belongs_to :book
end
@author = Author.find :first
@author.books
Sequel::Model:
class Author < Sequel::Model
one_to_many :authorships
many_to_many :books, :join_table=>:authorships
end
class Authorship < Sequel::Model
many_to_one :author
many_to_one :book
end
@author = Author.first
@author.books
If you use an association other than +belongs_to+ in the associated model, such as a +has_many+,
you still use a +many_to_many+ association, but you need to use some options:
ActiveRecord:
class Firm < ActiveRecord::Base
has_many :clients
has_many :invoices, :through => :clients
end
class Client < ActiveRecord::Base
belongs_to :firm
has_many :invoices
end
class Invoice < ActiveRecord::Base
belongs_to :client
has_one :firm, :through => :client
end
Firm.find(:first).invoices
Sequel::Model:
class Firm < Sequel::Model
one_to_many :clients
many_to_many :invoices, :join_table=>:clients, :right_key=>:id, :right_primary_key=>:client_id
end
class Client < Sequel::Model
many_to_one :firm
one_to_many :invoices
end
class Invoice < Sequel::Model
many_to_one :client
# has_one :through equivalent 1
# eager load with :eager=>:firm option on :client association, and eager loading :client
def firm
client.firm if client
end
# has_one :through equivalent 2
# eager load the usual way
many_to_many :firms, :join_table=>:clients, :left_key=>:id, :left_primary_key=>:client_id, :right_key=>:firm_id
def firm
firms.first
end
# has_one :through equivalent 3
# eager loading requires custom :eager_loader proc
many_to_one :firm, :dataset=>proc{Firm.join(:clients, :firm_id=>:id, :id=>client_id).select_all(:firms)}
end
Firm.first.invoices
=== Polymorphic Associations
Sequel discourages the use of polymorphic associations, which is the reason they
are not supported by default. All polymorphic associations can be made non-polymorphic
by using additional tables and/or columns instead of having a column
containing the associated class name as a string.
Polymorphic associations break referential integrity and are significantly more
complex than non-polymorphic associations, so their use is not recommended unless
you are stuck with an existing design that uses them.
If you must use them, look for the sequel_polymorphic external plugin, as it makes using
polymorphic associations in Sequel about as easy as it is in ActiveRecord. However,
here's how they can be done using Sequel's custom associations (the sequel_polymorphic
plugin is just a generic version of this code):
ActiveRecord:
class Asset < ActiveRecord::Base
belongs_to :attachable, :polymorphic => true
end
class Post < ActiveRecord::Base
has_many :assets, :as => :attachable
end
class Note < ActiveRecord::Base
has_many :assets, :as => :attachable
end
@asset.attachable = @post
@asset.attachable = @note
Sequel::Model:
class Asset < Sequel::Model
many_to_one :attachable, :reciprocal=>:assets,
:setter=>(proc do |attachable|
self[:attachable_id] = (attachable.pk if attachable)
self[:attachable_type] = (attachable.class.name if attachable)
end),
:dataset=>(proc do
klass = attachable_type.constantize
klass.where(klass.primary_key=>attachable_id)
end),
:eager_loader=>(proc do |eo|
id_map = {}
eo[:rows].each do |asset|
asset.associations[:attachable] = nil
((id_map[asset.attachable_type] ||= {})[asset.attachable_id] ||= []) << asset
end
id_map.each do |klass_name, id_map|
klass = klass_name.constantize
klass.where(klass.primary_key=>id_map.keys).all do |attach|
id_map[attach.pk].each do |asset|
asset.associations[:attachable] = attach
end
end
end
end)
end
class Post < Sequel::Model
one_to_many :assets, :key=>:attachable_id, :reciprocal=>:attachable, :conditions=>{:attachable_type=>'Post'},
:adder=>proc{|asset| asset.update(:attachable_id=>pk, :attachable_type=>'Post')},
:remover=>proc{|asset| asset.update(:attachable_id=>nil, :attachable_type=>nil)},
:clearer=>proc{assets_dataset.update(:attachable_id=>nil, :attachable_type=>nil)}
end
class Note < Sequel::Model
one_to_many :assets, :key=>:attachable_id, :reciprocal=>:attachable, :conditions=>{:attachable_type=>'Note'},
:adder=>proc{|asset| asset.update(:attachable_id=>pk, :attachable_type=>'Note')},
:remover=>proc{|asset| asset.update(:attachable_id=>nil, :attachable_type=>nil)},
:clearer=>proc{assets_dataset.update(:attachable_id=>nil, :attachable_type=>nil)}
end
@asset.attachable = @post
@asset.attachable = @note
== Other advanced associations
=== Joining on multiple keys
Let's say you have two tables that are associated with each other with multiple
keys. This can be handled using Sequel's built in composite key support for
associations:
# Both of these models have an album_id, number, and disc_number fields.
# All FavoriteTracks have an associated track, but not all tracks have an
# associated favorite track
class Track < Sequel::Model
many_to_one :favorite_track, :key=>[:disc_number, :number, :album_id], :primary_key=>[:disc_number, :number, :album_id]
end
class FavoriteTrack < Sequel::Model
one_to_one :tracks, :key=>[:disc_number, :number, :album_id], :primary_key=>[:disc_number, :number, :album_id]
end
=== Tree - All Ancestors and Descendents
Let's say you want to store a tree relationship in your database, it's pretty
simple:
class Node < Sequel::Model
many_to_one :parent, :class=>self
one_to_many :children, :key=>:parent_id, :class=>self
end
You can easily get a node's parent with node.parent, and a node's children with
node.children. You can even eager load the relationship up to a certain depth:
# Eager load three generations of generations of children for a given node
Node.filter(:id=>1).eager(:children=>{:children=>:children}).all.first
# Load parents and grandparents for a group of nodes
Node.filter{id < 10}.eager(:parent=>:parent).all
What if you want to get all ancestors up to the root node, or all descendents,
without knowing the depth of the tree?
class Node < Sequel::Model
many_to_one :ancestors, :class=>self,
:eager_loader=>(proc do |eo|
# Handle cases where the root node has the same parent_id as primary_key
# and also when it is NULL
non_root_nodes = eo[:rows].reject do |n|
if [nil, n.pk].include?(n.parent_id)
# Make sure root nodes have their parent association set to nil
n.associations[:parent] = nil
true
else
false
end
end
unless non_root_nodes.empty?
id_map = {}
# Create an map of parent_ids to nodes that have that parent id
non_root_nodes.each{|n| (id_map[n.parent_id] ||= []) << n}
# Doesn't cause an infinte loop, because when only the root node
# is left, this is not called.
Node.where(Node.primary_key=>id_map.keys).eager(:ancestors).all do |node|
# Populate the parent association for each node
id_map[node.pk].each{|n| n.associations[:parent] = node}
end
end
end)
many_to_one :descendants, :eager_loader=>(proc do |eo|
id_map = {}
eo[:rows].each do |n|
# Initialize an empty array of child associations for each parent node
n.associations[:children] = []
# Populate identity map of nodes
id_map[n.pk] = n
end
# Doesn't cause an infinite loop, because the :eager_loader is not called
# if no records are returned. Exclude id = parent_id to avoid infinite loop
# if the root note is one of the returned records and it has parent_id = id
# instead of parent_id = NULL.
Node.where(:parent_id=>id_map.keys).exclude(:id=>:parent_id).eager(:descendants).all do |node|
# Get the parent from the identity map
parent = id_map[node.parent_id]
# Set the child's parent association to the parent
node.associations[:parent] = parent
# Add the child association to the array of children in the parent
parent.associations[:children] << node
end
end)
end
Note that unlike ActiveRecord, Sequel supports common table expressions, which allows you to use recursive queries.
The results are not the same as in the above case, as all descendents are stored in a single association,
but all descendants can be both lazy loaded or eager loaded in a single query (assuming your database
supports recursive common table expressions). Sequel ships with an +rcte_tree+ plugin that makes
this easy:
class Node < Sequel::Model
plugin :rcte_tree
end
=== Joining multiple keys to a single key, through a third table
Let's say you have a database of songs, lyrics, and artists. Each song
may or may not have a lyric (most songs are instrumental). The lyric can be
associated to an artist in each of four ways: composer, arranger, vocalist,
or lyricist. These may all be the same, or they could all be different, and
none of them are required. The songs table has a lyric_id field to associate
it to the lyric, and the lyric table has four fields to associate it to the
artist (composer_id, arranger_id, vocalist_id, and lyricist_id).
What you want to do is get all songs for a given artist, ordered by the song's
name, with no duplicates?
class Artist < Sequel::Model
one_to_many :songs, :order=>:songs__name, \
:dataset=>proc{Song.select_all(:songs).join(Lyric, :id=>:lyric_id, id=>[:composer_id, :arranger_id, :vocalist_id, :lyricist_id])}, \
:eager_loader=>(proc do |eo|
h = eo[:id_map]
ids = h.keys
eo[:rows].each{|r| r.associations[:songs] = []}
Song.select_all(:songs).
select_append(:lyrics__composer_id, :lyrics__arranger_id, :lyrics__vocalist_id, :lyrics__lyricist_id).
join(Lyric, :id=>:lyric_id){Sequel.or(:composer_id=>ids, :arranger_id=>ids, :vocalist_id=>ids, :lyricist_id=>ids)}.
order(:songs__name).all do |song|
[:composer_id, :arranger_id, :vocalist_id, :lyricist_id].each do |x|
recs = h[song.values.delete(x)]
recs.each{|r| r.associations[:songs] << song} if recs
end
end
eo[:rows].each{|r| r.associations[:songs].uniq!}
end)
end
=== Statistics Associations (Sum of Associated Table Column)
In addition to getting associated records, you can use Sequel's association support
to get aggregate information for columns in associated tables (sums, averages, etc.).
Let's say you have a database with projects and tickets. A project can have many
tickets, and each ticket has a number of hours associated with it. You can use the
association support to create a Project association that gives the sum of hours for all
associated tickets.
class Project < Sequel::Model
one_to_many :tickets
many_to_one :ticket_hours, :read_only=>true, :key=>:id,
:dataset=>proc{Ticket.where(:project_id=>id).select{sum(hours).as(hours)}},
:eager_loader=>(proc do |eo|
eo[:rows].each{|p| p.associations[:ticket_hours] = nil}
Ticket.where(:project_id=>eo[:id_map].keys).
select_group(:project_id).
select_append{sum(hours).as(hours)}.
all do |t|
p = eo[:id_map][t.values.delete(:project_id)].first
p.associations[:ticket_hours] = t
end
end)
# The association method returns a Ticket object with a single aggregate
# sum-of-hours value, but you want it to return an Integer/Float of just the
# sum of hours, so you call super and return just the sum-of-hours value.
# This works for both lazy loading and eager loading.
def ticket_hours
if s = super
s[:hours]
end
end
end
class Ticket < Sequel::Model
many_to_one :project
end
Note that it is often better to use a sum cache instead of this approach. You can implement
a sum cache using +after_create+ and +after_delete+ hooks, or preferrably using a database trigger.