Dryopteris
==========

Dryopteris erythrosora is the Japanese Shield Fern. It can also be used to sanitize HTML to help prevent XSS attacks.

* [Dryopteris erythrosora](http://en.wikipedia.org/wiki/Dryopteris_erythrosora)
* [XSS Attacks](http://en.wikipedia.org/wiki/Cross-site_scripting)

Usage
-----

Let's say you run a web site, and you allow people to post HTML snippets. Let's also say some script-kiddie from Norland posts a malicious snippet to your site, in an effort to swipe some credit cards.

Oooh, that could be bad. Here's how to fix it:

    safe_html_snippet = Dryopteris.sanitize(dangerous_html_snippet)

Yeah, it's that easy.

In this example, safe\_html\_snippet will have all of its __broken markup fixed__ by libxml2, and it will also be completely __sanitized of harmful tags and attributes__. That's twice as clean!

More Usage
-----

You're still here? OK, let me tell you a little something about the two different methods of sanitizing that Dryopteris offers.

### Fragments

The first method is for _html fragments_, which are small snippets of markup such as those used in forum posts, emails and homework assignments.

Usage is the same as above:

    safe_html_snippet = Dryopteris.sanitize(dangerous_html_snippet)

Generally speaking, unless you expect to have `<html>` and `<body>` tags in your HTML, this is the sanitizing method to use.

The only real limitation on this method is that the snippet must be a string object. (Support for IO objects was sacrificed at the altar of fixer-uppery-ness. If you need to sanitize data that's coming from an IO object, either socket or file, check out the next section on __Documents__.)

### Documents

Sometimes you need to sanitize an entire HTML document. (Well, maybe not _you_, but other people, certainly.)

    safe_html_document = Dryopteris.sanitize_document(dangerous_html_document)

The returned string will contain exactly one (1) well-formed HTML document, with all broken HTML fixed and all harmful tags and attributes removed.

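
If you're not sure which of the two methods a given input needs, the rules above can be sketched as a small helper. This is only an illustration: `sanitize_html` and its `sanitizer` parameter are hypothetical names, not part of Dryopteris, and the "contains an `<html>` tag" check is an assumption about what counts as a whole document.

```ruby
# Hypothetical helper (not part of Dryopteris): routes input to the
# fragment or document sanitizer described in this README.
#
# `sanitizer` is any object that responds to `sanitize` and
# `sanitize_document`; in real use you would pass `Dryopteris` itself.
def sanitize_html(input, sanitizer)
  # IO objects (files, sockets) are only supported by the document method.
  return sanitizer.sanitize_document(input) if input.respond_to?(:read)

  # Assumption: a string containing an <html> tag is a whole document.
  if input.match?(/<html[\s>]/i)
    sanitizer.sanitize_document(input)
  else
    sanitizer.sanitize(input)
  end
end
```

Once Dryopteris is loaded, a call would look something like `sanitize_html(dangerous_html_snippet, Dryopteris)`. Passing the sanitizer in as an argument is just a convenience for trying the routing logic without the library installed.
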
Coolness: dangerous\_html\_document can be a string OR an IO object (a file, or a socket, or ...). Which makes it particularly easy to sanitize large numbers of docs.

Standing on the Shoulders of Giants
-----

Dryopteris uses [Nokogiri](http://nokogiri.rubyforge.org/) and [libxml2](http://xmlsoft.org/), so it's fast.

Dryopteris also takes its tag and tag attribute whitelists and its CSS sanitizer directly from [HTML5](http://code.google.com/p/html5lib/).

Authors
-----

* [Bryan Helmkamp](http://www.brynary.com/)
* [Mike Dalessio](http://mike.daless.io/) ([twitter](http://twitter.com/flavorjones))

Quotes About Dryopteris
-----

> "dryopteris shields you from xss attacks using nokogiri and NY attitude"
>
> - [hasmanyjosh](http://blog.hasmanythrough.com/)

> "I just wanted to say thank you for your dryopteris plugin. It is by far the best sanitization I've found."
>
> - [catalystmediastudios](http://github.com/catalystmediastudios)