# Caracal
Caracal is a Ruby library for dynamically creating professional-quality Microsoft Word documents (.docx) using an HTML-style syntax.
## Installation
Add this line to your application's Gemfile:

    gem 'caracal'

Then execute:

    $ bundle install
## Overview
Many people don't know that .docx files are little more than a zipped collection of XML documents that follow the Office Open XML (OpenXML or OOXML) standard. This means constructing a .docx file from scratch actually requires the creation of several files. Caracal shields users from this process by providing a simple set of Ruby commands and HTML-style syntax for generating Word content.
For each Caracal request, the following document structure will be created and zipped into the final output file:

    example.docx
      |- _rels
      |- docProps
        |- app.xml
        |- core.xml
      |- word
        |- _rels
          |- document.xml.rels
        |- media
          |- image001.png
          |- image002.png
          ...
        |- document.xml
        |- fontTable.xml
        |- footer.xml
        |- numbering.xml
        |- settings.xml
        |- styles.xml
      |- [Content_Types].xml
### File Descriptions
The following provides a brief description for each component of the final document:
**_rels/.rels**
Defines an internal identifier and type for global content items. *This file is generated automatically by the library based on other user directives.*
**docProps/app.xml**
Specifies the name of the application that generated the document. *This file is generated automatically by the library based on other user directives.*
**docProps/core.xml**
Specifies the title of the document. *This file is generated automatically by the library based on other user directives.*
**word/_rels/document.xml.rels**
Defines an internal identifier and type for all external content items (images, links, etc). *This file is generated automatically by the library based on other user directives.*
**word/media/**
A collection of media assets (each of which should have an entry in document.xml.rels).
**word/document.xml**
The main content file for the document.
**word/fontTable.xml**
Specifies the fonts used in the document.
**word/footer.xml**
Defines the formatting of the document footer.
**word/numbering.xml**
Defines ordered and unordered list styles.
**word/settings.xml**
Defines global directives for the document (e.g., whether to show background images, tab widths, etc). Also establishes compatibility with older versions of Word.
**word/styles.xml**
Defines all paragraph and table styles used throughout the document. Caracal adds a default set of styles to match its HTML-like content syntax. These defaults can be overridden.
**[Content_Types].xml**
Pairs extensions and XML files with schema content types so Word can parse them correctly. *This file is generated automatically by the library based on other user directives.*
## Units
OpenXML uses a few basic units.
**Points**
Most spacing declarations are measured in full points.
**Half Points**
All font sizes are measured in half points. A font size of 24 is equivalent to 12pt.
**Eighth Points**
Borders are measured in 1/8 points. A border size of 4 is equivalent to 0.5pt.
**Twips**
A twip is 1/20 of a point. Word documents are printed at 72dpi. 1in == 72pt == 1440 twips.
**Pixels**
In Word documents, pixels are equivalent to points.
**EMUs (English Metric Unit)**
EMUs are a virtual unit designed to facilitate the smooth conversion between inches, millimeters, and pixels for images and vector graphics. 1in == 914400 EMUs == 72dpi x 100 x 127.
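The unit relationships above can be collected into a small conversion helper. The following is only a sketch: the `OpenXmlUnits` module and its method names are hypothetical and are not part of Caracal. It simply encodes the ratios stated in this section.

```ruby
# Hypothetical helper module (not part of Caracal) that converts
# human-friendly measurements into the units described above.
module OpenXmlUnits
  TWIPS_PER_POINT = 20       # a twip is 1/20 of a point
  POINTS_PER_INCH = 72       # 1in == 72pt
  EMUS_PER_INCH   = 914_400  # 1in == 914400 EMUs

  module_function

  # Inches to twips (page sizes, margins).
  def inches_to_twips(inches)
    (inches * POINTS_PER_INCH * TWIPS_PER_POINT).round
  end

  # Points to twips (paragraph spacing).
  def points_to_twips(points)
    (points * TWIPS_PER_POINT).round
  end

  # Points to half points (font sizes).
  def points_to_half_points(points)
    (points * 2).round
  end

  # Points to eighth points (border sizes).
  def points_to_eighth_points(points)
    (points * 8).round
  end

  # Points (or pixels, which Word treats as points) to EMUs.
  def points_to_emus(points)
    (points * EMUS_PER_INCH / POINTS_PER_INCH).round
  end
end

OpenXmlUnits.inches_to_twips(8.5)          # => 12240
OpenXmlUnits.points_to_half_points(12)     # => 24
OpenXmlUnits.points_to_eighth_points(0.5)  # => 4
OpenXmlUnits.points_to_emus(72)            # => 914400
```

Rounding to whole numbers is a design choice of this sketch, made on the assumption that the target attributes expect integer values.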
## Syntax
In the following examples, the variable `docx` is assumed to be an instance of Caracal::Document.

    docx = Caracal::Document.new('example_document.docx')
### File Name
The final output document's file name can be set at initialization or via the `file_name` method.

    docx = Caracal::Document.new('example_document.docx')
    docx.file_name 'example_document.docx'

The current document name can be returned by invoking the `name` method:

    docx.name    # => 'example_document.docx'

*The default file name is caracal.docx.*
### Page Size
Page dimensions can be set using the `page_size` method. The method accepts two parameters for controlling the width and height of the document.

*Pages default to the United States standard Letter, portrait dimensions (8.5in x 11in).*

    # options via block
    docx.page_size do
      width   12240    # sets the page width. units in twips.
      height  15840    # sets the page height. units in twips.
    end

    # options via hash
    docx.page_size width: 12240, height: 15840
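For page sizes specified in millimeters, such as ISO A4 (210mm x 297mm), the twip values have to be computed first. A minimal sketch, assuming a hypothetical `mm_to_twips` helper that is not part of Caracal:

```ruby
# Hypothetical helper (not part of Caracal): converts millimeters to twips
# using 1in == 25.4mm == 1440 twips.
PAGE_TWIPS_PER_INCH = 1440
PAGE_MM_PER_INCH    = 25.4

def mm_to_twips(mm)
  (mm / PAGE_MM_PER_INCH * PAGE_TWIPS_PER_INCH).round
end

# ISO A4, portrait orientation.
a4 = { width: mm_to_twips(210), height: mm_to_twips(297) }
# a4 => { width: 11906, height: 16838 }

# The hash has the same keys as the hash form shown above, so it could be
# passed along as: docx.page_size a4
```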
The `page_size` command will produce the following XML in the `document.xml` file:
### Page Margins
Page margins can be set using the `page_margins` method. The method accepts four parameters for controlling the margins of the document.

*Margins default to 1.0in for all sides.*

    # options via block
    docx.page_margins do
      left    720     # sets the left margin. units in twips.
      right   720     # sets the right margin. units in twips.
      top     1440    # sets the top margin. units in twips.
      bottom  1440    # sets the bottom margin. units in twips.
    end

    # options via hash
    docx.page_margins left: 720, right: 720, top: 1440, bottom: 1440
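Because page width and margins share the same unit, the usable content width follows by subtraction. The sketch below uses a hypothetical `content_width_twips` helper (not part of Caracal) with the example values from this section and the page size section.

```ruby
# Hypothetical helper (not part of Caracal): computes the width left over
# for content once the left and right margins are subtracted from the page.
def content_width_twips(page_width:, left:, right:)
  width = page_width - left - right
  raise ArgumentError, 'margins exceed the page width' unless width.positive?

  width
end

usable = content_width_twips(page_width: 12240, left: 720, right: 720)
# usable => 10800 twips, i.e. 7.5in at 1440 twips per inch
```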
The `page_margins` command above will produce the following XML in the `document.xml` file:
### Page Breaks
Page breaks can be added via the `page` method. The method accepts no parameters.

    docx.page    # starts a new page.
The `page` command will produce the following XML in the `document.xml` file:
### Page Numbers
Page numbers can be added to the footer via the `page_numbers` method. The method accepts an optional parameter for controlling the alignment of the text.

*Page numbers are turned off by default.*

    # no options
    docx.page_numbers true

    # options via block
    docx.page_numbers true do
      align :right    # controls text alignment. defaults to :center.
    end

    # options via hash
    docx.page_numbers true, align: :right
The default command will produce the following `footer.xml` file contents.
*It will also automatically add the correct notation to the `w:sectPr` node of the `document.xml` file.*
### Fonts
Fonts are added to the font table file by calling the `font` method and passing the name of the font. At present, Caracal only supports declaring the primary font name.

    docx.font name: 'Arial'

    docx.font do
      name 'Droid Serif'
    end
These commands will produce the following `fontTable.xml` file contents:
### Styles
Style classes can be added using the `style` method. The method accepts several optional parameters to control the rendering of text using the style.

    # options via block
    docx.style do
      type      :paragraph     # :paragraph or :table
      id        'Heading1'     # sets the internal identifier for the style.
      name      'heading 1'    # sets the friendly name of the style.
      color     '333333'       # sets the text color. values in hex RGB.
      font      'Droid Serif'  # sets the font family.
      size      28             # sets the font size. units in half points.
      bold      false          # sets the font weight.
      italic    false          # sets the font style.
      underline false          # sets whether or not to underline the text.
      align     :left          # sets the alignment. accepts :left, :center, :right, and :both.
      top       100            # sets the spacing above the paragraph. units in twips.
      bottom    0              # sets the spacing below the paragraph. units in twips.
      spacing   360            # sets the spacing between lines. units in twips.
    end
The `style` command above would produce the following XML:
### Paragraphs
Text can be added using the `p` method. The `p` method takes either a string and an optional `style` option or a block of `text`-like commands.

Text within a `p` block can be further defined using the `text` and `link` methods. The `text` method takes a text string and the optional parameters `style`, `color`, `size`, `bold`, `italic`, and `underline`. See below for details on the `link` method.

    docx.p 'some text', style: 'my_style'

    docx.p do
      text 'Here is a sentence with a ', style: 'my_style'
      link 'link', 'https://www.example.com'
      text ' to something awesome', color: '555555', size: 16, bold: true, italic: true, underline: true
    end
A `p` block might yield the following XML:
Here is a sentence with a
link
to something awesome.
### Headers
Headers can be added using the `h1`, `h2`, etc. methods. Text within a header block can be further defined using the `text` method.

*Ultimately, headers are just paragraphs that use header styles.*

    docx.h3 'Heading 3'
The `h3` block above will yield the following XML:
Heading 3
### Links
Links can be added inside paragraphs by using the `link` method. The method accepts several optional parameters for controlling the style and behavior of the link.

*At present, all links are assumed to be external.*

    # no options
    docx.p do
      link 'Example Text', 'https://www.example.com'
    end

    # options via block
    docx.p do
      link 'Example Text', 'https://www.example.com' do
        style     'my_style'  # sets the style class. defaults to nil.
        color     '0000ff'    # sets the color of the text. defaults to 1155cc.
        size      24          # sets the font size. units in half-points. defaults to nil.
        bold      false       # sets whether or not the text will be bold. defaults to false.
        italic    false       # sets whether or not the text will be italic. defaults to false.
        underline true        # sets whether or not the text will be underlined. defaults to true.
      end
    end

    # options via hash
    docx.p do
      link 'Example Text', 'https://www.example.com', color: '0000ff', underline: false
    end
The `link` command with default properties will produce the following XML output:
Example Text
*Caracal will automatically generate the relationship entries required by the OpenXML standard.*
### Images
Images can be added by using the `img` method. The method accepts several optional parameters for controlling the style and placement of the asset.

    # options via block
    docx.img image_url('example.png') do
      width   396     # sets the image width. units specified in pixels.
      height  216     # sets the image height. units specified in pixels.
      align   :right  # controls the justification of the image. default is :left.
      top     10      # sets the top margin. units specified in pixels.
      bottom  10      # sets the bottom margin. units specified in pixels.
      left    10      # sets the left margin. units specified in pixels.
      right   10      # sets the right margin. units specified in pixels.
    end

    # options via hash
    docx.img image_url('example.png'), width: 396, height: 216, align: :right
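Since `width` and `height` are given explicitly, a source image that is larger than the space available needs to be scaled beforehand. The following sketch uses a hypothetical `fit_width` helper (not part of Caracal) to shrink pixel dimensions to a maximum width while keeping the aspect ratio.

```ruby
# Hypothetical helper (not part of Caracal): scales pixel dimensions down to
# fit max_width while preserving the aspect ratio. Images that already fit
# are returned unchanged.
def fit_width(width, height, max_width)
  return [width, height] if width <= max_width

  scale = max_width.to_f / width
  [max_width, (height * scale).round]
end

w, h = fit_width(792, 432, 396)
# w => 396, h => 216

# The results could then be used with the hash form shown above, e.g.
#   docx.img image_url('example.png'), width: w, height: h
```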
The `img` command with default properties will produce the following XML output:
*Caracal will automatically generate the relationship entries required by the OpenXML standard.*
### Rules
Horizontal rules can be added using the `hr` method. The method accepts several optional parameters for controlling the style of the rule.

    # no options
    docx.hr    # defaults to a thin, single line.

    # options via block
    docx.hr do
      color   '333333'  # controls the color of the line. defaults to auto.
      line    :double   # controls the line style (single or double). defaults to single.
      size    8         # controls the thickness of the line. units in 1/8 points. defaults to 4.
      spacing 4         # controls the spacing around the line. units in points. defaults to 1.
    end

    # options via hash
    docx.hr color: '333333', line: :double, size: 8, spacing: 2
The `hr` command with default properties will produce the following XML output:
### Ordered Lists
Ordered lists can be added using the `ol` and `li` methods. The `li` method substantially follows the same rules as the `p` method; here, simpler examples are demonstrated.

    docx.ol do
      li 'First item'
      li 'Second item'
    end
The `ol` and `li` commands with default properties will produce the following XML (assuming the ordered list styles have the abstractNumId=2 in the `numbering.xml` file).
First item
Second item
### Unordered Lists
Unordered lists can be added using the `ul` and `li` methods. The `li` method substantially follows the same rules as the `p` method; here, simpler examples are demonstrated.

    docx.ul do
      li 'First item'
      li 'Second item'
    end

The `ul` and `li` commands with default properties will produce the following XML (assuming the unordered list styles have the abstractNumId=1 in the `numbering.xml` file).
First item
Second item
### Tables
Tables can be added using the `table` method. The method accepts several optional parameters to control the layout and style of the table cells.

    docx.table data, border: 8 do
      cell_style rows(0), background_color: '4a86e8', bold: true
    end
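The `data` argument is not defined in the example above. The sketch below assumes it is a two-dimensional array (an array of rows, each an array of cell values) and builds it from hypothetical record hashes; the `records` variable and its keys are illustrative only.

```ruby
# Assumption: `data` is an array of rows, each row an array of cell values.
# The records and their keys are hypothetical sample data.
records = [
  { field: 'After-Hours trading', response: 'Yes', perf: 'B', data: 'B', state: 'published' }
]

# Header row first, so that rows(0) in the example above targets it.
header = ['Field', 'Response', 'Perf. Quality', 'Data Quality', 'State']

data = [header] + records.map do |record|
  record.values_at(:field, :response, :perf, :data, :state)
end
```

Keeping the header as the first row means the `cell_style rows(0), ...` call styles the column titles rather than a data row.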
Given a data structure with two rows and five columns, the `table` method would produce XML for the following table:

| Field | Response | Perf. Quality | Data Quality | State |
|---|---|---|---|---|
| After-Hours trading | Yes | B | B | published |
### Line Breaks
Line breaks can be added via the `br` method. The method accepts no parameters.

    docx.br    # adds a blank line using the default paragraph style.

The `br` command will produce the following XML:
## Template Rendering
Caracal includes [Tilt](https://github.com/rtomayko/tilt) integration to facilitate its inclusion in other frameworks. Rails integration can be added via the [Caracal-Rails](https://github.com/trade-informatics/caracal-rails) gem.
## Defaults
[Unsure how best to handle this without code exploration. Not a critical element for the first version.]
## Contributing
1. Fork it ( https://github.com/trade-informatics/caracal/fork )
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request
## Why is It Called Caracal?
Because my son likes caracals. :)
## Inspiration
A tip of the hat to the wonderful PDF generation library [Prawn](https://github.com/prawnpdf/prawn).
## License
Copyright (c) 2014 Trade Informatics, Inc
[MIT License](https://github.com/trade-informatics/caracal/blob/master/LICENSE.txt)