README.md in compare-xml-0.6 vs README.md in compare-xml-0.61
- old
+ new
@@ -15,21 +15,19 @@
## Installation
Add this line to your application's Gemfile:
-```ruby
-gem 'compare-xml'
-```
+ gem 'compare-xml'
And then execute:
- $ bundle
+ bundle
Or install it yourself as:
- $ gem install compare-xml
+ gem install compare-xml
## Usage
@@ -63,283 +61,250 @@
## Options at a Glance
CompareXML supports a variety of options that can be passed as an optional hash argument, e.g.:
```ruby
-CompareXML.equivalent?(doc1, doc2, {ignore_comments: false, verbose: true, ...})
+CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: false, verbose: true, ...})
```
-- `collapse_whitespace: {true|false}` default: **`true`** [→ read more ←](#collapse_whitespace)
- - when `true`, trims and collapses whitespace
+- `collapse_whitespace: {true|false}` default: **`true`** [⇨ show examples ⇦](#collapse_whitespace)
+ - when `true`, trims and collapses whitespace
-- `ignore_attr_order: {true|false}` default: **`true`** [→ read more ←](#ignore_attr_order)
- - when `true`, ignores attribute order within tags
+- `ignore_attr_order: {true|false}` default: **`true`** [⇨ show examples ⇦](#ignore_attr_order)
+ - when `true`, ignores attribute order within tags
-- `ignore_attr_content: [string1, string2, ...]` default: **`[]`** [→ read more ←](#ignore_attr_content)
- - when provided, ignores all attributes that contain substrings `string1`, `string2`, etc.
+- `ignore_attr_content: [string1, string2, ...]` default: **`[]`** [⇨ show examples ⇦](#ignore_attr_content)
+ - when provided, ignores all attributes that contain substrings `string1`, `string2`, etc.
-- `ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`** [→ read more ←](#ignore_attrs)
- - when provided, ignores specific *attributes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
+- `ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`** [⇨ show examples ⇦](#ignore_attrs)
+ - when provided, ignores specific *attributes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
-- `ignore_comments: {true|false}` default: **`true`** [...](#ignore_comments)
- - when `true`, ignores comments, such as `<!-- comment -->`
+- `ignore_comments: {true|false}` default: **`true`** [⇨ show examples ⇦](#ignore_comments)
+ - when `true`, ignores comments, such as `<!-- comment -->`
-- `ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`** [→ read more ←](#ignore_nodes)
- - when provided, ignores specific *nodes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
+- `ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`** [⇨ show examples ⇦](#ignore_nodes)
+ - when provided, ignores specific *nodes* using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp)
-- `ignore_text_nodes: {true|false}` default: **`false`** [→ read more ←](#ignore_text_nodes)
- - when `true`, ignores all text content within a document
+- `ignore_text_nodes: {true|false}` default: **`false`** [⇨ show examples ⇦](#ignore_text_nodes)
+ - when `true`, ignores all text content within a document
-- `verbose: {true|false}` default: **`false`** [→ read more ←](#verbose)
- - when `true`, instead of a boolean, `CompareXML.equivalent?` returns an array of discrepancies.
+- `verbose: {true|false}` default: **`false`** [⇨ show examples ⇦](#verbose)
+ - when `true`, instead of a boolean, `CompareXML.equivalent?` returns an array of discrepancies.
## Options in Depth
- <a id="collapse_whitespace"></a>`collapse_whitespace: {true|false}` default: **`true`**
When `true`, all text content within the document is trimmed (i.e. leading and trailing whitespace is removed) and whitespace is collapsed (i.e. tabs, newlines, and runs of whitespace characters are each replaced by a single space).
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {collapse_whitespace: true})`
**Example:** When `true` the following HTML strings are considered equal:
- <a href="/admin"> SOME TEXT CONTENT </a>
- <a href="/index"> SOME TEXT CONTENT </a>
+ <a href="/admin"> SOME TEXT CONTENT </a>
+ <a href="/index"> SOME TEXT CONTENT </a>
- **Example:** When `true` the following HTML strings are considered equal:
+ **Example:** When `true` the following HTML strings are considered equal:
- <html>
- <title>
- This is my title
- </title>
- </html>
+ <html>
+ <title>
+ This is my title
+ </title>
+ </html>
- <html><title>This is my title</title></html>
+ <html><title>This is my title</title></html>
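To make "trimmed and collapsed" concrete, here is a minimal sketch in plain Ruby. The `collapse` helper is hypothetical and is not part of the CompareXML API; it only mimics the normalization described above, and the gem's actual implementation may differ.

```ruby
# Hypothetical helper (NOT part of CompareXML): trims the text and
# replaces every run of whitespace (spaces, tabs, newlines) with one space.
def collapse(text)
  text.strip.gsub(/\s+/, ' ')
end

multiline = "\n   This is\tmy    title\n"
compact   = 'This is my title'

# After normalization both strings compare as equal.
puts collapse(multiline) == collapse(compact)
```

With `collapse_whitespace: false`, text like the two strings above would presumably be compared without this normalization step.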
----------
- <a id="ignore_attr_order"></a>`ignore_attr_order: {true|false}` default: **`true`**
When `true`, all attributes are sorted before comparison and only attributes of the same type are compared.
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_order: true})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_order: true})`
**Example:** When `true` the following HTML strings are considered equal:
- <a href="/admin" class="button" target="_blank">Link</a>
- <a class="button" target="_blank" href="/admin">Link</a>
+ <a href="/admin" class="button" target="_blank">Link</a>
+ <a class="button" target="_blank" href="/admin">Link</a>
- **Example:** When `false` the above HTML strings are compared as follows:
+ **Example:** When `false` the above HTML strings are compared as follows:
- href="/admin" != class="button"
+ href="/admin" != class="button"
- The comparison of the `<a>` element will stop at this point, since a discrepancy is found.
+ The comparison of the `<a>` element will stop at this point, since a discrepancy is found.
- **Example:** When `true` the following HTML strings are compared as follows:
+ **Example:** When `true` the following HTML strings are compared as follows:
- <a href="/admin" class="button" target="_blank">Link</a>
- <a class="button" target="_blank" href="/admin" rel="nofollow">Link</a>
+ <a href="/admin" class="button" target="_blank">Link</a>
+ <a class="button" target="_blank" href="/admin" rel="nofollow">Link</a>
- class="button" == class="button"
- href="/admin" == href="/admin"
- != rel="nofollow"
- target="_blank" == target="_blank"
+ class="button" == class="button"
+ href="/admin" == href="/admin"
+ != rel="nofollow"
+ target="_blank" == target="_blank"
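The pairing shown above can be sketched in plain Ruby. This is only an illustration under the assumption that attributes are matched by name after sorting; `compare_attrs` is a hypothetical helper, not a CompareXML method, and plain hashes stand in for parsed attributes.

```ruby
# Hypothetical helper (NOT part of CompareXML): pairs attributes by name,
# in sorted order, and reports whether each pair is equal.
def compare_attrs(left, right)
  names = (left.keys | right.keys).sort
  names.map do |name|
    status = left[name] == right[name] ? '==' : '!='
    [name, left[name], status, right[name]]
  end
end

left  = { 'href' => '/admin', 'class' => 'button', 'target' => '_blank' }
right = { 'class' => 'button', 'target' => '_blank',
          'href' => '/admin', 'rel' => 'nofollow' }

rows = compare_attrs(left, right)
rows.each do |name, l, status, r|
  puts "#{name}: #{l.inspect} #{status} #{r.inspect}"
end
```

An attribute that exists on only one side shows up with `nil` on the other side, which corresponds to the `!= rel="nofollow"` row in the example.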
----------
- <a id="ignore_attr_content"></a>`ignore_attr_content: [string1, string2, ...]` default: **`[]`**
When provided, ignores all **attributes** that contain any of the given substrings. **Note:** types of attributes still have to match (i.e. `<p>` = `<p>`, `<div>` = `<div>`, etc).
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_content: ['button']})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attr_content: ['button']})`
**Example:** With `ignore_attr_content: ['button']` the following HTML strings are considered equal:
- <a href="/admin" id="button_1" class="blue button">Link</a>
- <a href="/admin" id="button_2" class="info button">Link</a>
+ <a href="/admin" id="button_1" class="blue button">Link</a>
+ <a href="/admin" id="button_2" class="info button">Link</a>
**Example:** With `ignore_attr_content: ['menu']` the following HTML strings are considered equal:
- <a class="menu left" data-scope="abrth$menu" role="side-menu">Link</a>
- <a class="main menu" data-scope="ergeh$menu" role="main-menu">Link</a>
+ <a class="menu left" data-scope="abrth$menu" role="side-menu">Link</a>
+ <a class="main menu" data-scope="ergeh$menu" role="main-menu">Link</a>
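As a rough illustration of the filtering described above, the sketch below drops every attribute whose value contains one of the given substrings before comparing. It assumes, as the examples suggest, that the substrings are matched against attribute values; `significant_attrs` is a hypothetical helper and not part of CompareXML.

```ruby
# Hypothetical helper (NOT part of CompareXML): removes attributes whose
# value contains any of the given substrings.
def significant_attrs(attrs, substrings)
  attrs.reject do |_name, value|
    substrings.any? { |substring| value.include?(substring) }
  end
end

left  = { 'href' => '/admin', 'id' => 'button_1', 'class' => 'blue button' }
right = { 'href' => '/admin', 'id' => 'button_2', 'class' => 'info button' }

# Only href survives the filter on both sides, so the remainders are equal.
puts significant_attrs(left, ['button']) == significant_attrs(right, ['button'])
```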
----------
- <a id="ignore_attrs"></a>`ignore_attrs: [css_selector1, css_selector2, ...]` default: **`[]`**
When provided, ignores all **attributes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attrs: ['a[rel="nofollow"]', 'input[type="hidden"]']})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_attrs: ['a[rel="nofollow"]', 'input[type="hidden"]']})`
**Example:** With `ignore_attrs: ['a[rel="nofollow"]', 'a[target]']` the following HTML strings are considered equal:
- <a href="/admin" class="button" target="_blank">Link</a>
- <a href="/admin" class="button" target="_self" rel="nofollow">Link</a>
+ <a href="/admin" class="button" target="_blank">Link</a>
+ <a href="/admin" class="button" target="_self" rel="nofollow">Link</a>
- **Example:** With `ignore_attrs: ['a[href^="http"]', 'a[class*="button"]']` the following HTML strings are considered equal:
+ **Example:** With `ignore_attrs: ['a[href^="http"]', 'a[class*="button"]']` the following HTML strings are considered equal:
- <a href="http://google.ca" class="primary button">Link</a>
- <a href="https://google.com" class="primary button rounded">Link</a>
+ <a href="http://google.ca" class="primary button">Link</a>
+ <a href="https://google.com" class="primary button rounded">Link</a>
----------
- <a id="ignore_comments"></a>`ignore_comments: {true|false}` default: **`true`**
When `true`, ignores comments, such as `<!-- This is a comment -->`.
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_comments: true})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_comments: true})`
**Example:** When `true` the following HTML strings are considered equal:
- <!-- This is a comment -->
- <!-- This is another comment -->
+ <!-- This is a comment -->
+ <!-- This is another comment -->
- **Example:** When `true` the following HTML strings are considered equal:
+ **Example:** When `true` the following HTML strings are considered equal:
- <a href="/admin"><!-- This is a comment -->Link</a>
- <a href="/admin">Link</a>
+ <a href="/admin"><!-- This is a comment -->Link</a>
+ <a href="/admin">Link</a>
----------
- <a id="ignore_nodes"></a>`ignore_nodes: [css_selector1, css_selector2, ...]` default: **`[]`**
When provided, ignores all **nodes** that satisfy a particular rule using [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_nodes: ['script', 'object']})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_nodes: ['script', 'object']})`
**Example:** With `ignore_nodes: ['a[rel="nofollow"]', 'a[target]']` the following HTML strings are considered equal:
- <a href="/admin" class="icon" target="_blank">Link 1</a>
- <a href="/index" class="button" target="_self" rel="nofollow">Link 2</a>
+ <a href="/admin" class="icon" target="_blank">Link 1</a>
+ <a href="/index" class="button" target="_self" rel="nofollow">Link 2</a>
- **Example:** With `ignore_nodes: ['b', 'i']` the following HTML strings are considered equal:
+ **Example:** With `ignore_nodes: ['b', 'i']` the following HTML strings are considered equal:
- <a href="/admin"><i class="icon bulb"></i><b>Warning:</b> Link</a>
- <a href="/admin"><i class="icon info"></i><b>Message:</b> Link</a>
+ <a href="/admin"><i class="icon bulb"></i><b>Warning:</b> Link</a>
+ <a href="/admin"><i class="icon info"></i><b>Message:</b> Link</a>
----------
- <a id="ignore_text_nodes"></a>`ignore_text_nodes: {true|false}` default: **`false`**
When `true`, ignores all text content. Text content is anything that is included between an opening and a closing tag, e.g. `<tag>THIS IS TEXT CONTENT</tag>`.
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_text_nodes: true})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {ignore_text_nodes: true})`
**Example:** When `true` the following HTML strings are considered equal:
- <a href="/admin">SOME TEXT CONTENT</a>
- <a href="/admin">DIFFERENT TEXT CONTENT</a>
+ <a href="/admin">SOME TEXT CONTENT</a>
+ <a href="/admin">DIFFERENT TEXT CONTENT</a>
- **Example:** When `true` the following HTML strings are considered equal:
+ **Example:** When `true` the following HTML strings are considered equal:
- <i class="icon"></i> <b>Warning:</b>
- <i class="icon"> </i> <b>Message:</b>
+ <i class="icon"></i> <b>Warning:</b>
+ <i class="icon"> </i> <b>Message:</b>
----------
- <a id="verbose"></a>`verbose: {true|false}` default: **`false`**
When `true`, instead of returning a boolean value `CompareXML.equivalent?` returns an array of all errors encountered when performing a comparison.
- > **Warning:** When `true`, the comparison takes longer! Not only because more processing is required to produce meaningful differences, but also because in this mode, comparison does **NOT** stop when a first difference is encountered, because the goal is to capture as many differences as possible.
+ > **Warning:** When `true`, the comparison takes longer! Not only because more processing is required to produce meaningful differences, but also because in this mode, comparison does **NOT** stop when a first difference is encountered, because the goal is to capture as many differences as possible.
- **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {verbose: true})`
+ **Usage Example:** `CompareXML.equivalent?(doc1, doc2, {verbose: true})`
**Example:** When `true` given the following HTML strings:
- ![diffing](https://dl.dropboxusercontent.com/u/1001101/input.png)
+ ![diffing](https://github.com/vkononov/compare-xml/raw/master/img/diffing.png)
- `CompareXML.equivalent?(doc1, doc2, {verbose: true})` will produce an array shown below.
+ `CompareXML.equivalent?(doc1, doc2, {verbose: true})` will produce an array shown below.
- ```ruby
- [
- {
- node1: '<title>TITLE</title>',
- node2: '<title>ANOTHER TITLE</title>',
- diff1: 'TITLE',
- diff2: 'ANOTHER TITLE',
- },
- {
- node1: '<h1>SOME HEADING</h1>',
- node2: '<h1 id="main">SOME HEADING</h1>',
- diff1: nil,
- diff2: 'id="main"',
- },
- {
- node1: '<a href="/admin" rel="icon">Link</a>',
- node2: '<a rel="button" href="/admin">Link</a>',
- diff1: 'rel="icon"',
- diff2: 'rel="button"',
- },
- {
- node1: '<cite>Author Name</cite>',
- node2: nil,
- diff1: '<cite>Author Name</cite>',
- diff2: nil,
- },
- {
- node1: '<p class="footer">FOOTER</p>',
- node1: '<div class="footer">FOOTER</div>',
- diff1: 'p',
- diff2: 'div',
- }
- ]
- ```
-
- The structure of each hash inside the array is:
-
- node1: [Nokogiri::XML::Node] left node that contains the difference
- node2: [Nokogiri::XML::Node] right node that contains the difference
- diff1: [Nokogiri::XML::Node|String] left difference
- diff1: [Nokogiri::XML::Node|String] right difference
-
- **Node location** of `html:body:p(4)` means that the element in question is `<p>`, its hierarchical ancestors are `html > body`, and it is the **4th** `<p>` tag. That is, it could be found in
-
- <html><body><p>one</p...p>two</p...p>three</p...p>TARGET</p></body></html>
-
- > **Note:** `p(4)` means that it is the fourth tag of type `<p>`, but there could be many other tags of other types between `p(3)` and `p(4)`.
-
- **Node content** displays the discrepancy in content (which could be the name of the tag, attributes, text content, comments, etc)
-
- **Error code** is a numeric value that indicates the type of a discrepancy. CompareXML implements the following error codes
-
- ```ruby
- EQUIVALENT = 1 # nodes are equal (for internal use only)
- MISSING_ATTRIBUTE = 2 # attribute is missing its counterpart
- MISSING_NODE = 3 # node is missing its counterpart
- UNEQUAL_ATTRIBUTES = 4 # attributes are not equal
- UNEQUAL_COMMENTS = 5 # comment contents are not equal
- UNEQUAL_DOCUMENTS = 6 # document types are not equal
- UNEQUAL_ELEMENTS = 7 # nodes have the same type but are not equal
- UNEQUAL_NODES_TYPES = 8 # nodes do not have the same type
- UNEQUAL_TEXT_CONTENTS = 9 # text contents are not equal
- ```
-
- Here is an example of how these could be used:
-
```ruby
- case error_code
- when CompareXML::UNEQUAL_ATTRIBUTES
- '!='
- when CompareXML::MISSING_ATTRIBUTE
- '?'
- end
+ [
+ {
+ node1: '<title>TITLE</title>',
+ node2: '<title>ANOTHER TITLE</title>',
+ diff1: 'TITLE',
+ diff2: 'ANOTHER TITLE',
+ },
+ {
+ node1: '<h1>SOME HEADING</h1>',
+ node2: '<h1 id="main">SOME HEADING</h1>',
+ diff1: nil,
+ diff2: 'id="main"',
+ },
+ {
+ node1: '<a href="/admin" rel="icon">Link</a>',
+ node2: '<a rel="button" href="/admin">Link</a>',
+ diff1: 'rel="icon"',
+ diff2: 'rel="button"',
+ },
+ {
+ node1: '<cite>Author Name</cite>',
+ node2: nil,
+ diff1: '<cite>Author Name</cite>',
+ diff2: nil,
+ },
+ {
+ node1: '<p class="footer">FOOTER</p>',
+ node2: '<div class="footer">FOOTER</div>',
+ diff1: 'p',
+ diff2: 'div',
+ }
+ ]
```
+
+ The structure of each hash inside the array is:
+
+ node1: [Nokogiri::XML::Node] left node that contains the difference
+ node2: [Nokogiri::XML::Node] right node that contains the difference
+ diff1: [Nokogiri::XML::Node|String] left difference
+ diff2: [Nokogiri::XML::Node|String] right difference
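Given that structure, a caller might turn the array into a readable report. The sketch below is an assumption-laden illustration: the sample hashes are copied from the example above as plain strings (with the real gem the values would come from `CompareXML.equivalent?(doc1, doc2, {verbose: true})` and could be `Nokogiri::XML::Node` objects), and `describe` is a hypothetical helper, not part of CompareXML.

```ruby
# Sample data copied from the example above, using plain strings.
discrepancies = [
  { node1: '<title>TITLE</title>', node2: '<title>ANOTHER TITLE</title>',
    diff1: 'TITLE', diff2: 'ANOTHER TITLE' },
  { node1: '<cite>Author Name</cite>', node2: nil,
    diff1: '<cite>Author Name</cite>', diff2: nil }
]

# Hypothetical formatter (NOT part of CompareXML): renders one discrepancy,
# substituting a placeholder when one side has no counterpart.
def describe(discrepancy)
  left  = discrepancy[:diff1].nil? ? '(missing)' : discrepancy[:diff1].to_s
  right = discrepancy[:diff2].nil? ? '(missing)' : discrepancy[:diff2].to_s
  "#{left} != #{right}"
end

report = discrepancies.map { |discrepancy| describe(discrepancy) }
puts report
```

Calling `to_s` on each difference keeps the formatter usable whether the value is a string or a node, since both respond to `to_s`.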
## Contributing
\ No newline at end of file