Dryopteris
==========
Dryopteris erythrosora is the Japanese Shield Fern. It also can be used to sanitize HTML to help prevent XSS attacks.
* [Dryopteris erythrosora](http://en.wikipedia.org/wiki/Dryopteris_erythrosora)
* [XSS Attacks](http://en.wikipedia.org/wiki/Cross-site_scripting)
Usage
-----
Let's say you run a web site, and you allow people to post HTML snippets.
Let's also say some script-kiddie from Norland posts this to your site, in an effort to swipe some credit cards:
Oooh, that could be bad. Here's how to fix it:

    safe_html_snippet = Dryopteris.sanitize(dangerous_html_snippet)

Yeah, it's that easy.
In this example, safe\_html\_snippet will have all of its __broken markup fixed__ by libxml2, and it will also be completely __sanitized of harmful tags and attributes__. That's twice as clean!
More Usage
-----
You're still here? OK, let me tell you a little something about the two different sanitizing methods that Dryopteris offers.
### Fragments
The first method is for _html fragments_, which are small snippets of markup such as those used in forum posts, emails and homework assignments.
Usage is the same as above:

    safe_html_snippet = Dryopteris.sanitize(dangerous_html_snippet)

Generally speaking, unless you expect to have `<html>` and `<body>` tags in your HTML, this is the sanitizing method to use.
The only real limitation on this method is that the snippet must be a string object. (Support for IO objects was sacrificed at the altar of fixer-uppery-ness. If you need to sanitize data that's coming from an IO object, either a socket or a file, check out the next section on __Documents__.)
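
If a fragment does arrive on an IO object, one workaround is to read it into a string first. This is a minimal sketch, assuming the input is small enough to hold in memory; `read_fragment` is a hypothetical helper, not part of Dryopteris.

```ruby
require 'stringio'

# Hypothetical helper (not part of Dryopteris): returns the markup as a String,
# whether it was handed over as a String or as an IO-like object.
def read_fragment(source)
  source.respond_to?(:read) ? source.read : source.to_s
end

# A StringIO stands in here for a socket or an open file.
io = StringIO.new("<b>hello</b>")
fragment = read_fragment(io)
```

The resulting string can then be passed to `Dryopteris.sanitize` as usual.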
### Documents
Sometimes you need to sanitize an entire HTML document. (Well, maybe not _you_, but other people, certainly.)

    safe_html_document = Dryopteris.sanitize_document(dangerous_html_document)

The returned string will contain exactly one (1) well-formed HTML document, with all broken HTML fixed and all harmful tags and attributes removed.
Coolness: dangerous\_html\_document can be a string OR an IO object (a file, or a socket, or ...), which makes it particularly easy to sanitize large numbers of docs.
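
Because an IO object is accepted, a batch job can hand open files straight to the sanitizer. The sketch below is one way that might look; `sanitize_files` is a hypothetical helper, and the sanitizer is passed in as a callable so the helper does not depend on how the gem is loaded.

```ruby
# Hypothetical helper (not part of Dryopteris): opens each file, hands the IO
# object to the given sanitizer, and returns a Hash of path => sanitized HTML.
def sanitize_files(paths, sanitizer)
  paths.each_with_object({}) do |path, results|
    File.open(path) do |io|
      results[path] = sanitizer.call(io)
    end
  end
end

# Usage, assuming the dryopteris gem is installed and loaded, and that the
# uploads/ directory (an invented example path) holds the documents:
#   clean = sanitize_files(Dir.glob("uploads/*.html"),
#                          Dryopteris.method(:sanitize_document))
```

Each file is closed as soon as its block finishes, so only one document is open at a time.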
Standing on the Shoulders of Giants
-----
Dryopteris uses [Nokogiri](http://nokogiri.rubyforge.org/) and [libxml2](http://xmlsoft.org/), so it's fast.
Dryopteris also takes its tag and tag attribute whitelists and its CSS sanitizer directly from [html5lib](http://code.google.com/p/html5lib/).
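
To give a feel for what whitelisting means, here is a toy sketch. It is not Dryopteris's actual code: the tiny default lists are invented for illustration, the real lists from html5lib are far longer, and `filter_element` is a hypothetical name.

```ruby
# Toy whitelist filter, for illustration only. Returns nil when the tag is not
# on the whitelist; otherwise returns just the attributes that are allowed.
def filter_element(tag, attrs, allowed_tags: %w[a b i p], allowed_attrs: %w[href title])
  return nil unless allowed_tags.include?(tag.downcase)
  attrs.select { |name, _value| allowed_attrs.include?(name.downcase) }
end

# The onclick handler is dropped and href survives.
filter_element("a", { "href" => "/home", "onclick" => "steal()" })

# The script tag is not on the whitelist, so the result is nil.
filter_element("script", {})
```

The point of a whitelist is that anything not explicitly allowed is rejected, so new or obscure attack vectors are blocked by default.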
Authors
-----
* [Bryan Helmkamp](http://www.brynary.com/)
* [Mike Dalessio](http://mike.daless.io/) ([twitter](http://twitter.com/flavorjones))
Quotes About Dryopteris
-----
> "dryopteris shields you from xss attacks using nokogiri and NY attitude"
> - [hasmanyjosh](http://blog.hasmanythrough.com/)
> "I just wanted to say thank you for your dryopteris plugin. It is by far the best sanitization I've found."
> - [catalystmediastudios](http://github.com/catalystmediastudios)