# CompareXML [![Gem Version](https://badge.fury.io/rb/compare-xml.svg)](https://rubygems.org/gems/compare-xml)

CompareXML is a fast, lightweight and feature-rich tool that will solve your XML/HTML comparison or diffing needs. Its purpose is to compare two instances of `Nokogiri::XML::Node` or `Nokogiri::XML::NodeSet` for equality or equivalency.

**Features**

- Fast, lightweight and highly customizable
- Compares XML/HTML documents and document fragments
- Can either report detailed discrepancies or execute silently
- Can exclude specific nodes or attributes from all comparisons

## Installation

Add this line to your application's Gemfile:

    gem 'compare-xml'

And then execute:

    bundle

Or install it yourself as:

    gem install compare-xml

## Usage

Using CompareXML is as simple as

```ruby
CompareXML.equivalent?(doc1, doc2)
```

where `doc1` and `doc2` are instances of `Nokogiri::XML::Node` or `Nokogiri::XML::NodeSet`.

**Example**

Suppose you have two files, `1.html` and `2.html`, that you would like to compare. You could do it as follows:

```ruby
# Read each file into a string and parse it as HTML
doc1 = Nokogiri::HTML(File.read('1.html'))
doc2 = Nokogiri::HTML(File.read('2.html'))
puts CompareXML.equivalent?(doc1, doc2)
```

The above code will print `true` or `false` depending on the result of the comparison.

> If you are using CompareXML in a script, then you need to require it manually with:

```ruby
require 'compare-xml'
```

## Options at a Glance

CompareXML has a variety of options that can be passed as an optional third argument, e.g.:

```ruby
CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: false, verbose: true, ...})
```

- `collapse_whitespace: {true|false}` default: **`true`** [⇨ show examples ⇦](#collapse_whitespace)
    - when `true`, trims and collapses whitespace

- `ignore_attr_order: {true|false}` default: **`true`** [⇨ show examples ⇦](#ignore_attr_order)
    - when `true`, ignores attribute order within tags

- `ignore_attr_content: [string1, string2, ...]` default: **`[]`** [⇨ show examples ⇦](#ignore_attr_content)
    - when provided, ignores all attributes that contain the substrings `string1`, `string2`, etc.

- `ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`** [⇨ show examples ⇦](#ignore_attrs)
    - when provided, ignores specific *attributes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)

- `ignore_comments: {true|false}` default: **`true`** [⇨ show examples ⇦](#ignore_comments)
    - when `true`, ignores comments, such as `<!-- comment -->`

- `ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`** [⇨ show examples ⇦](#ignore_nodes)
    - when provided, ignores specific *nodes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)

- `ignore_text_nodes: {true|false}` default: **`false`** [⇨ show examples ⇦](#ignore_text_nodes)
    - when `true`, ignores all text content within a document

- `verbose: {true|false}` default: **`false`** [⇨ show examples ⇦](#verbose)
    - when `true`, `CompareXML.equivalent?` returns an array of discrepancies instead of a boolean

## Options in Depth

- `collapse_whitespace: {true|false}` default: **`true`**

    When `true`, all text content within the document is trimmed (i.e. whitespace is removed from the left and right) and whitespace is collapsed (i.e. tabs, new lines and runs of multiple whitespace characters are replaced by a single space).

    **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`

    **Example:** When `true` the following HTML strings are considered equal:

        SOME TEXT CONTENT

        SOME TEXT CONTENT

    **Example:** When `true` the following HTML strings are considered equal:
` = `
`, `
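
As a rough illustration of what "trimmed" and "collapsed" mean for `collapse_whitespace`, the sketch below uses plain Ruby string methods. The `collapse` helper and the sample strings are hypothetical and are not part of CompareXML's API; the gem's actual implementation may differ.

```ruby
# Hypothetical helper (not part of CompareXML): strips leading and trailing
# whitespace, then replaces every run of spaces, tabs and newlines with a
# single space.
def collapse(text)
  text.strip.gsub(/\s+/, ' ')
end

a = "   SOME \t TEXT\n   CONTENT   "
b = "SOME TEXT CONTENT"

puts a == b                      # prints false: the raw strings differ
puts collapse(a) == collapse(b)  # prints true: equal once whitespace is normalized
```

With `collapse_whitespace: false`, text that differs only in whitespace, like `a` and `b` above, would be expected to count as a discrepancy.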