= rack-rewrite

A rack middleware for defining and applying rewrite rules. In many cases you 
can get away with rack-rewrite instead of writing Apache mod_rewrite rules.

== References

* Source[http://github.com/jtrupiano/rack-rewrite]
* {Rack::Rewrite for Site Maintenance and Downtime}[http://blog.smartlogicsolutions.com/2009/11/16/rack-rewrite-for-site-maintenance-and-downtime/]
* {Rack::Rewrite + Google Analytics Makes Site Transitions Seamless}[http://blog.smartlogicsolutions.com/2009/11/24/rack-rewrite-google-analytics-makes-site-transitions-seamless/]

== Usage

=== Sample rackup file
  
  gem 'rack-rewrite', '~> 1.2.1'
  require 'rack/rewrite'
  use Rack::Rewrite do
    rewrite '/wiki/John_Trupiano', '/john'
    r301 '/wiki/Yair_Flicker', '/yair'
    r302 '/wiki/Greg_Jastrab', '/greg'
    r301 %r{/wiki/(\w+)_\w+}, '/$1'
  end

=== Sample usage in a rails app

  config.middleware.insert_before(Rack::Lock, Rack::Rewrite) do
    rewrite '/wiki/John_Trupiano', '/john'
    r301 '/wiki/Yair_Flicker', '/yair'
    r302 '/wiki/Greg_Jastrab', '/greg'
    r301 %r{/wiki/(\w+)_\w+}, '/$1'
  end

== Use Cases

=== Rebuild of existing site in a new technology

It's very common for sites built in older technologies to be rebuilt with the 
latest and greatest.  Let's consider a site that has already established quite
a bit of "Google juice."  When we launch the new site, we don't want to lose
that hard-earned reputation.  By writing rewrite rules that issue 301s for
old URLs, we can "transfer" that Google ranking to the new site.  An example
rule might look like:

  r301 '/contact-us.php', '/contact-us'
  r301 '/wiki/John_Trupiano', '/john'

=== Retiring old routes

As a web application evolves you will undoubtedly reach a point where you need
to change the name of something (a model, e.g.).  This name change will
typically require a similar change to your routing.  The danger here is that
any URLs previously generated (in a transactional email for instance) will
be hard-coded.  In order for your Rails app to continue to serve
these URLs, you'll need to add an extra entry to your routes file.
Alternatively, you could use rack-rewrite to redirect or pass through requests
to these routes and keep your routes.rb clean.

  rewrite %r{/features(.*)}, '/facial_features$1'

=== CNAME alternative

In the event that you do not control your DNS, you can leverage Rack::Rewrite
to redirect to a canonical domain.  In the following rule we use the
$& substitution operator to insert the entire matched request URI into the
redirect target.

  r301 %r{.*}, 'http://mynewdomain.com$&', :if => Proc.new {|rack_env|
    rack_env['SERVER_NAME'] != 'mynewdomain.com'
  }
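
The guard and the $& substitution in the rule above can be tried in isolation.
The following is a minimal plain-Ruby sketch, not rack-rewrite's own code:
canonical_host_guard and redirect_target are hypothetical helpers, and the env
hashes are hand-built stand-ins for a real Rack environment.

```ruby
# Hypothetical helper: builds the guard from the rule above for any canonical host.
def canonical_host_guard(canonical_host)
  Proc.new { |rack_env| rack_env['SERVER_NAME'] != canonical_host }
end

# Hypothetical helper: $& stands for everything the pattern matched,
# which for %r{.*} is the whole request URI.
def redirect_target(request_uri)
  matched = %r{.*}.match(request_uri)[0]
  'http://mynewdomain.com$&'.sub('$&') { matched }
end

guard = canonical_host_guard('mynewdomain.com')

puts guard.call({ 'SERVER_NAME' => 'www.olddomain.com' })  # true: redirect
puts guard.call({ 'SERVER_NAME' => 'mynewdomain.com' })    # false: already canonical
puts redirect_target('/pricing?plan=pro')
```

Because the guard returns false on the canonical host, the redirect cannot
loop back onto itself.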

=== Site Maintenance

Most Capistrano users will be familiar with the following Apache rewrite rules:

  RewriteCond %{REQUEST_URI} !\.(css|jpg|png)$
  RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
  RewriteCond %{SCRIPT_FILENAME} !maintenance.html
  RewriteRule ^.*$ /system/maintenance.html [L]

These rewrite rules render a maintenance page for all non-asset requests
if the maintenance file exists.  In Capistrano, you can quickly upload a 
maintenance file using:

  cap deploy:web:disable REASON=upgrade UNTIL=12:30PM
  
We can replace the mod_rewrite rules with the following Rack::Rewrite rule:

  maintenance_file = File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  send_file /.*/, maintenance_file, :if => Proc.new { |rack_env|
    File.exist?(maintenance_file) && rack_env['PATH_INFO'] !~ /\.(css|jpg|png)$/
  }
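
The guard in this rule can be exercised without a running server.  The
following is a minimal sketch under stated assumptions: maintenance_guard is a
hypothetical helper, the maintenance file lives in a temporary directory, and
the asset pattern is anchored with $ to mirror the Apache RewriteCond.

```ruby
require 'tmpdir'

# Hypothetical helper: builds the :if guard for a given maintenance file path.
def maintenance_guard(maintenance_file)
  Proc.new { |rack_env|
    File.exist?(maintenance_file) && rack_env['PATH_INFO'] !~ /\.(css|jpg|png)$/
  }
end

Dir.mktmpdir do |dir|
  file  = File.join(dir, 'maintenance.html')
  guard = maintenance_guard(file)

  puts guard.call({ 'PATH_INFO' => '/orders' })        # false: no maintenance file yet
  File.open(file, 'w') { |f| f.write('<h1>Back soon</h1>') }
  puts guard.call({ 'PATH_INFO' => '/orders' })        # true: page requests are intercepted
  puts guard.call({ 'PATH_INFO' => '/css/site.css' })  # false: assets still pass through
end
```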

If you're running Ruby 1.9, this rule is simplified:

  maintenance_file = File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  send_file /(.*)$(?<!css|png|jpg)/, maintenance_file, :if => Proc.new { |rack_env|
    File.exist?(maintenance_file)
  }

For those using the oniguruma gem with their Ruby 1.8 installation, you can 
get away with:

  maintenance_file = File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  send_file Oniguruma::ORegexp.new("(.*)$(?<!css|png|jpg)"), maintenance_file, :if => Proc.new { |rack_env|
    File.exist?(maintenance_file)
  }

== Rewrite Rules

=== :rewrite

Calls to #rewrite will simply update the PATH_INFO, QUERY_STRING and 
REQUEST_URI values in the Rack environment and pass the request on to the
next link in the Rack stack.  The URL that a user's browser will show will
not be changed.  See these examples:

  rewrite '/wiki/John_Trupiano', '/john'   # [1]
  rewrite %r{/wiki/(\w+)_\w+}, '/$1'       # [2]

For [1], the user's browser will continue to display /wiki/John_Trupiano, but
the PATH_INFO and REQUEST_URI values seen by subsequent nodes in the Rack
stack will be changed to /john.  Rails reads these values to determine which
routes will match.

Rule [2] showcases the use of regular expressions and substitutions.  [2] is a 
generalized version of [1] that will match any /wiki/FirstName_LastName URL
and rewrite it to the first name only.  This is an actual catch-all rule we 
applied when we rebuilt our website in September 2009 
( http://www.smartlogicsolutions.com ).
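
To make the effect on the request concrete, here is a minimal sketch of the
environment update described above.  It is an illustration, not the gem's
implementation: apply_rewrite is a hypothetical helper and the env hash is a
hand-built stand-in for a real Rack environment.

```ruby
# Hypothetical helper: returns a copy of the env with the three values a
# rewrite touches set from the rewritten URI.
def apply_rewrite(rack_env, new_uri)
  path, query = new_uri.split('?', 2)
  rack_env.merge({
    'REQUEST_URI'  => new_uri,
    'PATH_INFO'    => path,
    'QUERY_STRING' => query.to_s   # empty string when there is no query
  })
end

env = {
  'REQUEST_METHOD' => 'GET',
  'REQUEST_URI'    => '/wiki/John_Trupiano',
  'PATH_INFO'      => '/wiki/John_Trupiano',
  'QUERY_STRING'   => ''
}

rewritten = apply_rewrite(env, '/john')
puts rewritten['PATH_INFO']       # /john
puts rewritten['REQUEST_METHOD']  # GET (untouched)
```

Nothing is sent back to the browser at this point, which is why the address
bar keeps showing the original URL.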

=== :r301, :r302

Calls to #r301 and #r302 have the same signature as #rewrite.  The difference,
however, is that these actually short-circuit the Rack stack and send back
301s and 302s, respectively.  See these examples:

  r301 '/wiki/John_Trupiano', '/john'                # [1]
  r301 %r{/wiki/(.*)}, 'http://www.google.com/?q=$1' # [2]
  
Recall that rules are interpreted from top to bottom.  So you can install 
"default" rewrite rules if you like.  [2] is a sample default rule that
will redirect all other requests to the wiki to a google search.
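
The top-to-bottom evaluation can be sketched in plain Ruby.  This is an
illustration only, not rack-rewrite's code: first_redirect and sample_rules are
hypothetical names, string patterns are assumed to match exactly, and only the
$1 substitution is handled.

```ruby
# Hypothetical stand-in for rule evaluation: returns the redirect target of
# the first rule whose pattern matches, or nil when no rule applies.
def first_redirect(rules, path)
  rules.each do |from, to|
    if from.is_a?(Regexp)
      match = from.match(path)
      return to.sub('$1') { match[1].to_s } if match
    elsif from == path
      return to
    end
  end
  nil
end

def sample_rules
  [
    ['/wiki/John_Trupiano', '/john'],                # specific rule first
    [%r{/wiki/(.*)}, 'http://www.google.com/?q=$1']  # default rule last
  ]
end

puts first_redirect(sample_rules, '/wiki/John_Trupiano')  # /john
puts first_redirect(sample_rules, '/wiki/Rack')           # http://www.google.com/?q=Rack
```

Reversing the two rules would send every wiki request, including
/wiki/John_Trupiano, to the search redirect, so specific rules belong above
defaults.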

=== :send_file, :x_send_file

Calls to #send_file and #x_send_file also have the same signature as #rewrite.
If the rule matches, the 'to' parameter is interpreted as a path to a file
to be rendered instead of passing the application call up the rack stack.

  send_file /.*/, 'public/spammers.htm', :if => Proc.new { |rack_env|
    rack_env['HTTP_REFERER'] =~ /spammers\.com/
  }
  x_send_file %r{^/blog/.*}, 'public/blog_offline.htm', :if => Proc.new { |rack_env|
    File.exist?('public/blog_offline.htm')
  }

== Options Parameter

Each rewrite rule takes an optional options parameter.  The following options
are supported.

=== :host

Using the :host option you can match requests to a specific hostname.

  r301 "/features", "/facial_features", :host => "facerecognizer.com"

This rule will only match when the hostname is "facerecognizer.com".

=== :headers

Using the :headers option you can set custom response headers e.g. for HTTP 
caching instructions.

  r301 "/features", "/facial_features", :headers => {'Cache-Control' => 'no-cache'}

=== :method

Using the :method option you can restrict the matching of a rule by the HTTP 
method of a given request.

  # redirect GETs one way
  r301 "/players", "/current_players", :method => :get

  # and redirect POSTs another way
  r302 "/players", "/no_longer_available.html?message=No+longer+supported", :method => :post
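
The method restriction can be pictured as a comparison against the
REQUEST_METHOD value that Rack supplies in upper case.  The sketch below is an
assumption about the behaviour described above, not the gem's own code, and
method_matches? is a hypothetical helper.

```ruby
# Hypothetical helper: a rule with no :method option matches any request;
# otherwise the option is compared with Rack's upper-case REQUEST_METHOD.
def method_matches?(method_option, rack_env)
  method_option.nil? || rack_env['REQUEST_METHOD'] == method_option.to_s.upcase
end

get_request  = { 'REQUEST_METHOD' => 'GET',  'PATH_INFO' => '/players' }
post_request = { 'REQUEST_METHOD' => 'POST', 'PATH_INFO' => '/players' }

puts method_matches?(:get, get_request)    # true
puts method_matches?(:get, post_request)   # false
puts method_matches?(nil, post_request)    # true: unrestricted rule
```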

=== :if

Using the :if option you can define arbitrary rule guards.  A guard is any
object that responds to #call and returns true or false to indicate whether
the rule matches.  The following example demonstrates how the presence of a 
maintenance page on the filesystem can be used to take your site(s) offline.

  maintenance_file = File.join(RAILS_ROOT, 'public', 'system', 'maintenance.html')
  x_send_file /.*/, maintenance_file, :if => Proc.new { |rack_env| 
    File.exist?(maintenance_file)
  }
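
Since a guard only needs to respond to #call, a Proc, a lambda and a Method
object are interchangeable.  The sketch below illustrates this with a
hypothetical office_network? check on REMOTE_ADDR; the check, the address
prefix and the sample_guards helper are assumptions made for the example.

```ruby
# Hypothetical check: true for requests coming from a 10.0.x.x address.
def office_network?(rack_env)
  rack_env['REMOTE_ADDR'].to_s.start_with?('10.0.')
end

# Three guards with identical behaviour; each responds to #call.
def sample_guards
  [
    Proc.new { |rack_env| office_network?(rack_env) },
    lambda   { |rack_env| office_network?(rack_env) },
    method(:office_network?)
  ]
end

sample_guards.each do |guard|
  puts guard.call({ 'REMOTE_ADDR' => '10.0.3.7' })     # true
  puts guard.call({ 'REMOTE_ADDR' => '192.168.1.20' }) # false
end
```

Any of these could be passed as the :if option, for example
:if => method(:office_network?).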

=== :not

Using the :not option you can negatively match against the path.  This can
be useful when writing a regular expression match is difficult.

  rewrite %r{^\/features}, '/facial_features', :not => '/features'

This will not match the relative URL /features but would match /features.xml.
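
The combined effect of the pattern and the :not option can be sketched as
follows.  This is an illustration of the behaviour described above rather than
the gem's implementation; matches_with_not? is a hypothetical helper, and a
string :not value is assumed to be compared for equality.

```ruby
# Hypothetical helper: the rule applies when the pattern matches the path
# and the :not matcher does not.
def matches_with_not?(pattern, not_matcher, path)
  return false unless pattern =~ path
  excluded = not_matcher.is_a?(Regexp) ? !(not_matcher =~ path).nil? : not_matcher == path
  !excluded
end

pattern = %r{^/features}

puts matches_with_not?(pattern, '/features', '/features')      # false
puts matches_with_not?(pattern, '/features', '/features.xml')  # true
puts matches_with_not?(pattern, '/features', '/about')         # false
```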

== Tips

=== Keeping your querystring

When rewriting a URL, you may want to keep your querystring intact (for 
example if you're tracking traffic sources).  You will need to include a
capture group and substitution pattern in your rewrite rule to achieve this.

  rewrite %r{/wiki/John_Trupiano(\?.*)?}, '/john$1'
  
This rule will store the querystring in an optional capture group (via
'(\?.*)?' ) and will substitute the querystring back into the rewritten URL
(via $1).
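
The optional capture group can be checked directly against sample URLs.  The
following is a minimal sketch: rewrite_keeping_query is a hypothetical helper
that mimics the $1 substitution, and the utm_source parameter is made up for
the example.

```ruby
# Hypothetical helper: applies the rule above to a request URI, or returns
# the URI unchanged when the pattern does not match.
def rewrite_keeping_query(request_uri)
  match = %r{/wiki/John_Trupiano(\?.*)?}.match(request_uri)
  return request_uri unless match
  # match[1] is nil when there is no querystring, so $1 becomes ''.
  '/john$1'.sub('$1') { match[1].to_s }
end

puts rewrite_keeping_query('/wiki/John_Trupiano?utm_source=newsletter')
puts rewrite_keeping_query('/wiki/John_Trupiano')
```

The trailing ? after the group is what lets the same rule handle requests with
and without a querystring.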

=== Arbitrary Rewriting

All rules support passing a Proc as the second argument allowing you to
perform arbitrary rewrites.  The following rule will rewrite all requests
received between 12AM and 8AM to an unavailable page.

  rewrite %r{(.*)}, lambda { |match, rack_env|
    Time.now.hour < 8 ? "/unavailable.html" : match[1]
  }
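
A time-dependent lambda like this is awkward to verify by waiting for the
clock.  One option, sketched here under stated assumptions, is to build the
lambda from an injectable clock; after_hours_rewriter and its clock parameter
are hypothetical names, not part of rack-rewrite.

```ruby
# Hypothetical factory: returns a two-argument lambda like the one in the
# rule above, reading the time from the supplied clock (Time.now by default).
def after_hours_rewriter(clock = lambda { Time.now })
  lambda { |match, rack_env|
    clock.call.hour < 8 ? '/unavailable.html' : match[1]
  }
end

match = %r{(.*)}.match('/orders')

night = after_hours_rewriter(lambda { Time.local(2011, 1, 1, 3, 0) })
day   = after_hours_rewriter(lambda { Time.local(2011, 1, 1, 14, 0) })

puts night.call(match, {})  # /unavailable.html
puts day.call(match, {})    # /orders
```

In a rule, the default clock would be used: rewrite %r{(.*)}, after_hours_rewriter.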
  
== Copyright

Copyright (c) 2009-2011 John Trupiano. See LICENSE for details.