This code is licensed under the Apache License 2.0.
http://www.apache.org/licenses/LICENSE-2.0

This is a Ruby port of arc90's Readability project.
http://lab.arc90.com/experiments/readability/

Given an HTML document, it pulls out the main body text and cleans it up.

Ruby port by starrhorne and iterationlabs.
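
To give a rough feel for what "pulling out the main body text" can mean, here is a minimal sketch using only the Ruby standard library. It is not this port's code and not the arc90 algorithm: it swaps in a much simpler heuristic (drop script and style blocks, keep paragraphs above a length threshold). The names `strip_tags`, `extract_main_text`, and the `min_length` parameter are hypothetical, chosen only for illustration.

```ruby
# Hypothetical illustration only: a naive length-based filter,
# not the scoring approach used by Readability.

# Remove all tags from a fragment and collapse runs of whitespace.
def strip_tags(html)
  html.gsub(/<[^>]+>/, " ").gsub(/\s+/, " ").strip
end

# Return the text of every <p> whose stripped text is at least
# min_length characters long, separated by blank lines.
def extract_main_text(html, min_length: 25)
  # Drop script and style blocks so their contents are never treated as text.
  cleaned = html.gsub(/<(script|style)\b[^>]*>.*?<\/\1>/mi, "")

  # Collect the inner HTML of each paragraph and reduce it to plain text.
  paragraphs = cleaned.scan(/<p\b[^>]*>(.*?)<\/p>/mi).map { |match| strip_tags(match[0]) }

  # Short paragraphs are assumed to be navigation or captions (an assumption
  # of this sketch, not a rule taken from the original project).
  paragraphs.select { |text| text.length >= min_length }.join("\n\n")
end
```

A regex-based filter like this breaks on nested or malformed markup; a real extractor would parse the document and score candidate elements rather than rely on a fixed length cutoff.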