Sanitize
========

Sanitize is a whitelist-based HTML and CSS sanitizer. Given a list of acceptable
elements, attributes, and CSS properties, Sanitize will remove all unacceptable
HTML and/or CSS from a string.

Using a simple configuration syntax, you can tell Sanitize to allow certain HTML
elements, certain attributes within those elements, and even certain URL
protocols within attributes that contain URLs. You can also whitelist CSS
properties, @ rules, and URL protocols you wish to allow in elements or
attributes containing CSS. Any HTML or CSS that you don't explicitly allow will
be removed.

Sanitize is based on [Google's Gumbo HTML5 parser][gumbo], which parses HTML
exactly the same way modern browsers do, and [Crass][crass], which parses CSS
exactly the same way modern browsers do. As long as your whitelist config only
allows safe markup and CSS, even the most malformed or malicious input will be
transformed into safe output.

[![Build Status](https://travis-ci.org/rgrove/sanitize.svg?branch=master)](https://travis-ci.org/rgrove/sanitize)
[![Gem Version](https://badge.fury.io/rb/sanitize.svg)](http://badge.fury.io/rb/sanitize)

[crass]:https://github.com/rgrove/crass
[gumbo]:https://github.com/google/gumbo-parser

Links
-----

* [Home](https://github.com/rgrove/sanitize/)
* [API Docs](http://rubydoc.info/github/rgrove/sanitize/master)
* [Issues](https://github.com/rgrove/sanitize/issues)
* [Release History](https://github.com/rgrove/sanitize/blob/master/HISTORY.md#sanitize-history)
* [Online Demo](https://sanitize.herokuapp.com/)
* [Biased comparison of Ruby HTML sanitization libraries](https://github.com/rgrove/sanitize/blob/master/COMPARISON.md)

Installation
-------------

```
gem install sanitize
```

Quick Start
-----------

```ruby
require 'sanitize'

# Clean up an HTML fragment using Sanitize's permissive but safe Relaxed config.
# This also sanitizes any CSS in `
hello!
]

Sanitize.fragment(html,
  :elements   => ['div', 'style'],
  :attributes => {'div' => ['style']},

  :css => {
    :properties => ['width']
  }
)
#=> %[
#
#
#
#     hello!
# ]
```

### Standalone CSS

Sanitize will happily clean up a standalone CSS stylesheet or property string
without needing to invoke the HTML parser.

```ruby
css = %[
  @import url(evil.css);

  a { text-decoration: none; }

  a:hover {
    left: expression(alert('xss!'));
    text-decoration: underline;
  }
]

Sanitize::CSS.stylesheet(css, Sanitize::Config::RELAXED)

# => %[
#
#
#
#   a { text-decoration: none; }
#
#   a:hover {
#
#     text-decoration: underline;
#   }
# ]

Sanitize::CSS.properties(%[
  left: expression(alert('xss!'));
  text-decoration: underline;
], Sanitize::Config::RELAXED)

# => %[
#
#   text-decoration: underline;
# ]
```

Configuration
-------------

In addition to the ultra-safe default settings, Sanitize comes with three other
built-in configurations that you can use out of the box or adapt to meet your
needs.

### Sanitize::Config::RESTRICTED

Allows only very simple inline markup. No links, images, or block elements.

```ruby
Sanitize.fragment(html, Sanitize::Config::RESTRICTED)
# => 'foo'
```

### Sanitize::Config::BASIC

Allows a variety of markup including formatting elements, links, and lists.

Images and tables are not allowed, links are limited to FTP, HTTP, HTTPS, and
mailto protocols, and a `rel="nofollow"` attribute is added to all links to
mitigate SEO spam.

```ruby
Sanitize.fragment(html, Sanitize::Config::BASIC)
# => 'foo'
```

### Sanitize::Config::RELAXED

Allows an even wider variety of markup, including images and tables, as well as
safe CSS. Links are still limited to FTP, HTTP, HTTPS, and mailto protocols,
while images are limited to HTTP and HTTPS. In this mode, `rel="nofollow"` is
not added to links.
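
The protocol limits described for this config can be pictured as a per-element, per-attribute allowlist. The following is a minimal sketch in plain Ruby, not Sanitize's implementation: the `ALLOWED_PROTOCOLS` constant and the `protocol_allowed?` helper are hypothetical names, and the hash only borrows the shape of the `:protocols` option. As a simplification, the sketch rejects URLs that have no scheme, which may differ from how Sanitize itself treats relative URLs.

```ruby
require 'uri'

# Hypothetical allowlist mirroring the protocol limits described above.
# Illustrative only; this constant is not part of Sanitize.
ALLOWED_PROTOCOLS = {
  'a'   => {'href' => ['ftp', 'http', 'https', 'mailto']},
  'img' => {'src'  => ['http', 'https']}
}.freeze

# Hypothetical helper: returns true when the URL's scheme is allowed for the
# given element and attribute. URLs without a scheme, and URLs that cannot be
# parsed, are rejected (a simplification made for this sketch).
def protocol_allowed?(element, attribute, url)
  allowed = ALLOWED_PROTOCOLS.dig(element, attribute) || []
  scheme  = URI.parse(url.strip).scheme
  !scheme.nil? && allowed.include?(scheme.downcase)
rescue URI::InvalidURIError
  false
end

protocol_allowed?('a', 'href', 'mailto:someone@example.com') # => true
protocol_allowed?('img', 'src', 'ftp://example.com/pic.png')  # => false
protocol_allowed?('a', 'href', 'javascript:alert(1)')         # => false
```

In practice Sanitize applies these rules for you when you pass the config: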

```ruby
Sanitize.fragment(html, Sanitize::Config::RELAXED)
# => 'foo'
```

### Custom Configuration

If the built-in modes don't meet your needs, you can easily specify a custom
configuration:

```ruby
Sanitize.fragment(html,
  :elements => ['a', 'span'],

  :attributes => {
    'a'    => ['href', 'title'],
    'span' => ['class']
  },

  :protocols => {
    'a' => {'href' => ['http', 'https', 'mailto']}
  }
)
```

You can also start with one of Sanitize's built-in configurations and then
customize it to meet your needs.

The built-in configs are deeply frozen to prevent people from modifying them
(either accidentally or maliciously). To customize a built-in config, create a
new copy using `Sanitize::Config.merge()`, like so:

```ruby
# Create a customized copy of the Basic config, adding