= Security Considerations with Sequel

When using Sequel, there are some security areas you should be aware of:

* Code Execution
* SQL Injection
* Denial of Service
* Mass Assignment
* General Parameter Handling

== Code Execution

The most serious security vulnerability you can have in any library is
a code execution vulnerability.  Sequel should not be vulnerable to this,
as it never calls eval on a string that is derived from user input.
However, some Sequel methods used for creating methods via metaprogramming
could conceivably be abused to do so:

* Sequel::Schema::CreateTableGenerator.add_type_method
* Sequel::Dataset.def_mutation_method
* Sequel::Dataset.def_sql_method
* Sequel::Model::Plugins.def_dataset_methods
* Sequel.def_adapter_method (private)
* Sequel::SQL::Expression.to_s_method (private)
* Sequel::Plugins::HookClassMethods::ClassMethods#add_hook_type

As long as you don't call those with user input, you should not be
vulnerable to code execution.

== SQL Injection

The primary security concern in SQL database libraries is SQL injection.
Because Sequel promotes using ruby objects for SQL concepts instead
of raw SQL, it is less likely to be vulnerable to SQL injection.
However, because Sequel still makes it easy to use raw SQL, misuse of the
library can result in SQL injection in your application.

There are basically two kinds of possible SQL injections in Sequel:

* SQL code injections
* SQL identifier injections

=== SQL Code Injections

==== Full SQL Strings

Some Sequel methods are designed to execute raw SQL strings, including:

* Sequel::Database#execute
* Sequel::Database#run
* Sequel::Database#<<
* Sequel::Dataset#fetch_rows
* Sequel::Dataset#with_sql_all
* Sequel::Dataset#with_sql_delete
* Sequel::Dataset#with_sql_each
* Sequel::Dataset#with_sql_first
* Sequel::Dataset#with_sql_insert
* Sequel::Dataset#with_sql_single_value
* Sequel::Dataset#with_sql_update

Here are some examples of use:

  DB.run 'SQL'
  DB << 'SQL'
  DB.execute 'SQL'
  DB.fetch_rows('SQL'){|row| }
  DB.dataset.with_sql_all('SQL')
  DB.dataset.with_sql_delete('SQL')
  DB.dataset.with_sql_each('SQL'){|row| }
  DB.dataset.with_sql_first('SQL')
  DB.dataset.with_sql_insert('SQL')
  DB.dataset.with_sql_single_value('SQL')
  DB.dataset.with_sql_update('SQL')

If you pass a string to these methods that is derived from user input, you open
yourself up to SQL injection.  These methods are not designed to work at all
with user input.  If you must call them with user input, you should escape the
user input manually via Sequel::Database#literal. Example:

  DB.run "SOME SQL #{DB.literal(params[:user].to_s)}"

==== Full SQL Strings, With Possible Placeholders

Other Sequel methods are designed to support execution of raw SQL strings that may contain placeholders:

* Sequel::Database#[]
* Sequel::Database#fetch
* Sequel::Dataset#with_sql

Here are some examples of use:

  DB['SQL'].all
  DB.fetch('SQL').all
  DB.dataset.with_sql('SQL').all

With these methods you should use placeholders, in which case Sequel automatically escapes the input:

  DB['SELECT * FROM foo WHERE bar = ?', params[:user].to_s]

==== Manually Created Literal Strings

Sequel generally treats ruby strings as SQL strings (escaping them correctly), and
not as raw SQL.  However, you can convert a ruby string to a literal string, and
Sequel will then treat it as raw SQL.  This is typically done through String#lit
if the {core_extensions}[rdoc-ref:doc/core_extensions.rdoc] are in use,
or Sequel.lit[rdoc-ref:Sequel::SQL::Builders#lit] if they are not in use.

  'a'.lit
  Sequel.lit('a')

Using String#lit or Sequel.lit[rdoc-ref:Sequel::SQL::Builders#lit] to turn a ruby string into a literal string results
in SQL injection if the string is derived from user input.  With both of these
methods, the strings can contain placeholders, which you can use to safely include
user input inside a literal string:

  'a = ?'.lit(params[:user_id].to_s)
  Sequel.lit('a = ?', params[:user_id].to_s)

Even though they have similar names, note that Sequel::Database#literal operates very differently from
String#lit or Sequel.lit[rdoc-ref:Sequel::SQL::Builders#lit].
Sequel::Database#literal is for taking any supported object,
and getting an SQL representation of that object, while
String#lit or Sequel.lit[rdoc-ref:Sequel::SQL::Builders#lit] are for treating
a ruby string as raw SQL.  For example:

  DB.literal(Date.today) # "'2013-03-22'"
  DB.literal('a') # "'a'"
  DB.literal(Sequel.lit('a')) # "a"
  DB.literal(:a => 'a') # "(\"a\" = 'a')"
  DB.literal(:a => Sequel.lit('a')) # "(\"a\" = a)"

==== SQL Filter Fragments

The most common way to use raw SQL with Sequel is in filters:

  DB[:table].where("name > 'M'")

If a filter method is passed a string as the first argument, it treats the rest of
the arguments (if any) as placeholders to the string.  So you should never do:

  DB[:table].where("name > #{params[:id].to_s}") # SQL Injection!

Instead, you should use a placeholder:

  DB[:table].where("name > ?", params[:id].to_s) # Safe
 
Note that for that type of query, Sequel generally encourages the following form:

  DB[:table].where{|o| o.name > params[:id].to_s} # Safe

Sequel's DSL supports a wide variety of SQL concepts, so it's possible to
code most applications without ever using raw SQL.

A large number of dataset methods ultimately pass down their arguments to a filter
method, even some you may not expect, so you should be careful.  At least the
following methods pass their arguments to a filter method:

* Sequel::Dataset#where
* Sequel::Dataset#having
* Sequel::Dataset#filter
* Sequel::Dataset#exclude
* Sequel::Dataset#exclude_where
* Sequel::Dataset#exclude_having
* Sequel::Dataset#and
* Sequel::Dataset#or
* Sequel::Dataset#first
* Sequel::Dataset#last
* Sequel::Dataset#[]

The Model.find[rdoc-ref:Sequel::Model::ClassMethods#find] and Model.find_or_create[rdoc-ref:Sequel::Model::ClassMethods#find_or_create]
class methods also call down to the filter methods.

The no_auto_string_literals extension can be used to remove the default support 
for plain strings as literal strings in filter methods.

==== SQL Fragment passed to Dataset#update

Similar to the filter methods, Sequel::Dataset#update also treats a
string argument as raw SQL:

  DB[:table].update("column = 1")

So you should not do:

  DB[:table].update("column = #{params[:value].to_s}") # SQL Injection!

Instead, you should do:

  DB[:table].update(:column => params[:value].to_s) # Safe

The no_auto_string_literals extension can also be used to remove the default support
for plain strings as literal strings in update methods.

==== SQL Fragment passed to Dataset#lock_style

The Sequel::Dataset#lock_style method also treats an input string 
as SQL code. This method should not be called with user input.

==== SQL Type Names

In general, most places where Sequel needs to use an SQL type that should
be specified by the user, it allows you to use a ruby string, and that
string is used verbatim as the SQL type.  You should not use user input
for type strings.

==== SQL Function Names

In most cases, Sequel does not quote SQL function names.  You should not use
user input for function names.

=== SQL Identifier Injections

Usually, Sequel treats ruby symbols as SQL identifiers, and ruby
strings as SQL strings.  However, there are some parts of Sequel
that treat ruby strings as SQL identifiers if an SQL string would
not make sense in the same context.

For example, Sequel::Database#from and Sequel::Dataset#from will treat a string as
a table name:

  DB.from('t') # SELECT * FROM "t"

Another place where Sequel treats ruby strings as identifiers are
the Sequel::Dataset#insert and Sequel::Dataset#update methods:

  DB[:t].update('b'=>1) # UPDATE "t" SET "b" = 1
  DB[:t].insert('b'=>1) # INSERT INTO "t" ("b") VALUES (1)

Note how the identifier is still quoted in these cases.  Sequel quotes identifiers by default
on most databases.  However, it does not quote identifiers by default on DB2 and Informix.
On those databases using an identifier derived from user input can lead to SQL injection.
Similarly, if you turn off identifier quoting manually on other databases, you open yourself
up to SQL injection if you use identifiers derived from user input.

When Sequel quotes identifiers, using an identifier derived from user input does not lead to
SQL injection, since the identifiers are also escaped when quoting.
Exceptions to this are Oracle (can't escape <tt>"</tt>) and Microsoft Access
(can't escape <tt>]</tt>).

In general, even if doesn't lead to SQL Injection, you should avoid using identifiers
derived from user input unless absolutely necessary.

Sequel also allows you to create identifiers using
Sequel.identifier[rdoc-ref:Sequel::SQL::Builders#identifier] for plain identifiers,
Sequel.qualify[rdoc-ref:Sequel::SQL::Builders#qualify] for qualified identifiers, and
Sequel.as[rdoc-ref:Sequel::SQL::Builders#as] for aliased expressions.  So if you
pass any of those values derived from user input, you are dealing with the same scenario.

Note that the issues with SQL identifiers do not just apply to places where
strings are used as identifiers, they also apply to all places where Sequel
uses symbols as identifiers.  However, if you are creating symbols from user input,
you at least have a denial of service vulnerability in ruby <2.2, and possibly a
more serious vulnerability.

== Denial of Service

Sequel converts some strings to symbols.  Because symbols in ruby <2.2 are not
garbage collected, if the strings that are converted to symbols are
derived from user input, you have a denial of service vulnerability due to
memory exhaustion.

The strings that Sequel converts to symbols are generally not derived
from user input, so Sequel in general is not vulnerable to this.  However,
users should be aware of the cases in which Sequel creates symbols, so
they do not introduce a vulnerability into their application.

=== Column Names/Aliases

Sequel returns SQL result sets as an array of hashes with symbol keys.  The
keys are derived from the name that the database server gives the column. These
names are generally static.  For example:

  SELECT column FROM table

The database will generally use "column" as the name in the result set.

If you use an alias:

  SELECT column AS alias FROM table

The database will generally use "alias" as the name in the result set. So
if you allow the user to control the alias name:

  DB[:table].select(:column.as(params[:alias]))

Then you have a denial of service vulnerability.  In general, such a vulnerability
is unlikely, because you are probably indexing into the returned hash(es) by name,
and if an alias was used and you didn't expect it, your application wouldn't work.

=== Database Connection Options

All database connection options are converted to symbols.  For a
connection URL, the keys are generally fixed, but the scheme is turned
into a symbol and the query option keys are used as connection option
keys, so they are converted to symbols as well.  For example:

  postgres://host/database?option1=foo&option2=bar

Will result in :postgres, :option1, and :option2 symbols being created.

Certain option values are also converted to symbols.  In the general case,
the sql_log_level option value is, but some adapters treat additional
options similarly.

This is not generally a risk unless you are allowing the user to control
the connection URLs or are connecting to arbitrary databases at runtime.

== Mass Assignment

Mass assignment is the practice of passing a hash of columns and values
to a single method, and having multiple column values for a given object set
based on the content of the hash.
The security issue here is that mass assignment may allow the user to
set columns that you didn't intend to allow.

The Model#set[rdoc-ref:Sequel::Model::InstanceMethods#set] and Model#update[rdoc-ref:Sequel::Model::InstanceMethods#update] methods do mass
assignment.  The default configuration of Sequel::Model allows all model
columns except for the primary key column(s) to be set via mass assignment.

Example:

  album = Album.new
  album.set(params[:album]) # Mass Assignment

Both Model.new[rdoc-ref:Sequel::Model::InstanceMethods::new] and Model.create[rdoc-ref:Sequel::Model::ClassMethods#create]
call Model#set[rdoc-ref:Sequel::Model::InstanceMethods#set] internally, so
they also allow mass assignment:

  Album.new(params[:album]) # Mass Assignment
  Album.create(params[:album]) # Mass Assignment

When the argument is derived from user input, instead of these methods, it is encouraged to either use 
Model#set_fields[rdoc-ref:Sequel::Model::InstanceMethods#set_fields] or
Model#update_fields[rdoc-ref:Sequel::Model::InstanceMethods#update_fields],
which allow you to specify which fields to allow on a per-call basis.  This
pretty much eliminates the chance that the user will be able to set a column
you did not intend to allow:

  album.set_fields(params[:album], [:name, :copies_sold])
  album.update_fields(params[:album], [:name, :copies_sold])

These two methods iterate over the second argument (+:name+ and +:copies_sold+ in
this example) instead of iterating over the entries in the first argument
(<tt>params[:album]</tt> in this example).

In addition to these two methods, you can also use 
Model#set_only[rdoc-ref:Sequel::Model::InstanceMethods#set_only] or
Model#update_only[rdoc-ref:Sequel::Model::InstanceMethods#update_only],
which are similar but iterate over the entries in the first argument, checking
the second argument to see if setting the entries is allowed.

  album.set_only(params[:album], [:name, :copies_sold])
  album.update_only(params[:album], [:name, :copies_sold])

If you expect all entries in the second argument to be present in the first
argument, use +set_fields+ or +update_fields+.  If you are not sure if all
arguments in the second argument will be present in the first argument, but
do not want to allow setting any column other than the ones listed in the
second argument, use +set_only+ or +update_only+.

You can override the columns to allow by default during mass assignment via
the Model.set_allowed_columns[rdoc-ref:Sequel::Model::ClassMethods#set_allowed_columns] class method.  This is a good
practice, though being explicit on a per-call basis is still recommended:

  Album.set_allowed_columns(:name, :copies_sold)
  Album.create(params[:album]) # Only name and copies_sold set

For more details on the mass assignment methods, see the {Mass Assignment Guide}[rdoc-ref:doc/mass_assignment.rdoc].

== General Parameter Handling

This issue isn't necessarily specific to Sequel, but it is a good general practice.
If you are using values derived from user input, it is best to be explicit about
their type.  For example:

  Album.where(:id=>params[:id])

is probably a bad idea.  Assuming you are using a web framework, <tt>params[:id]</tt> could
be a string, an array, a hash, or nil.

Assuming that +id+ is an integer field, you probably want to do:

  Album.where(:id=>params[:id].to_i)

If you are looking something up by name, you should try to enforce the value to be
a string:

  Album.where(:name=>params[:name].to_s)

If you are trying to use an IN clause with a list of id values based on input provided
on a web form:

  Album.where(:id=>params[:ids].to_a.map{|i| i.to_i})

Basically, be as explicit as possible. While there aren't any known security issues
in Sequel when you do:

  Album.where(:id=>params[:id])

It allows the attacker to choose to do any of the following queries:

  id IS NULL # nil
  id = '1' # '1'
  id IN ('1', '2', '3') # ['1', '2', '3']
  id = ('a' = 'b') # {'a'=>'b'}
  id = ('a' IN ('a', 'b') AND 'c' = '') # {'a'=>['a', 'b'], 'c'=>''}

While none of those allow for SQL injection, it's possible that they
might have an issue in your application.  For example, a long array
or deeply nested hash might cause the database to have to do a lot of
work that could be avoided.

In general, it's best to let the attacker control as little as possible,
and explicitly specifying types helps a great deal there.