# Caracal
Caracal is a Ruby library for dynamically creating professional-quality Microsoft Word documents (.docx) using an HTML-style syntax.
## Installation
Add this line to your application's Gemfile:

    gem 'caracal'

Then execute:

    $ bundle install

## Overview
Many people don't know that .docx files are little more than a zipped collection of XML documents that follow the Office Open XML (OpenXML or OOXML) standard. Constructing a .docx file from scratch therefore requires creating several files. Caracal hides this process behind a simple set of Ruby commands and an HTML-style syntax for generating Word content.
For each Caracal request, the following document structure will be created and zipped into the final output file:

    |- _rels
      |- .rels
    |- docProps
      |- app.xml
      |- core.xml
    |- word
      |- _rels
        |- document.xml.rels
      |- media
        |- image001.png
        |- image002.png
      |- document.xml
      |- fontTable.xml
      |- footer.xml
      |- numbering.xml
      |- settings.xml
      |- styles.xml
    |- [Content_Types].xml

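As a rough illustration of the layout above, the sketch below models the package as a nested Hash and flattens it into the entry paths a zip archive would hold. The constant `PACKAGE_LAYOUT` and the helper `part_paths` are hypothetical names used only for this example and are not part of Caracal. The `_rels/.rels` entry is included on the assumption that the package follows the usual OOXML packaging conventions.

```ruby
# Illustrative model of the package layout (not Caracal's internal representation).
# A Hash value is a folder; nil marks a file.
PACKAGE_LAYOUT = {
  '_rels'    => { '.rels' => nil },
  'docProps' => { 'app.xml' => nil, 'core.xml' => nil },
  'word'     => {
    '_rels'         => { 'document.xml.rels' => nil },
    'media'         => { 'image001.png' => nil, 'image002.png' => nil },
    'document.xml'  => nil,
    'fontTable.xml' => nil,
    'footer.xml'    => nil,
    'numbering.xml' => nil,
    'settings.xml'  => nil,
    'styles.xml'    => nil
  },
  '[Content_Types].xml' => nil
}.freeze

# Flattens the nested layout into slash-separated entry paths.
def part_paths(tree, prefix = nil)
  tree.flat_map do |name, children|
    path = [prefix, name].compact.join('/')
    children.is_a?(Hash) ? part_paths(children, path) : [path]
  end
end

puts part_paths(PACKAGE_LAYOUT)
```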
### File Descriptions
The following provides a brief description for each component of the final document:

**_rels/.rels**

Defines an internal identifier and type for global content items. *This file is generated automatically by the library based on other user directives.*

**docProps/app.xml**

Specifies the name of the application that generated the document. *This file is generated automatically by the library based on other user directives.*

**docProps/core.xml**

Specifies the title of the document. *This file is generated automatically by the library based on other user directives.*

**word/_rels/document.xml.rels**

Defines an internal identifier and type for all external content items (images, links, etc). *This file is generated automatically by the library based on other user directives.*

**word/media/**

A collection of media assets (each of which should have an entry in document.xml.rels).

**word/document.xml**

The main content file for the document.

**word/fontTable.xml**

Specifies the fonts used in the document.

**word/footer.xml**

Defines the formatting of the document footer.

**word/numbering.xml**

Defines ordered and unordered list styles.

**word/settings.xml**

Defines global directives for the document (e.g., whether to show background images, tab widths, etc). Also establishes compatibility with older versions of Word.

**word/styles.xml**

Defines all paragraph and table styles used throughout the document. Caracal adds a default set of styles to match its HTML-like content syntax. These defaults can be overridden.

**[Content_Types].xml**

Pairs extensions and XML files with schema content types so Word can parse them correctly. *This file is generated automatically by the library based on other user directives.*
## Units
OpenXML uses a few basic units.

**Points**

Most spacing declarations are measured in full points.

**Half Points**

All font sizes are measured in half points. A font size of 24 is equivalent to 12pt.

**Eighth Points**

Borders are measured in 1/8 points. A border size of 4 is equivalent to 0.5pt.

**Twips**

A twip is 1/20 of a point. Word documents are printed at 72dpi. 1in == 72pt == 1440 twips.

**Pixels**

In Word documents, pixels are equivalent to points.

**EMUs (English Metric Units)**

EMUs are a virtual unit designed to facilitate smooth conversion between inches, millimeters, and pixels for images and vector graphics. 1in == 914400 EMUs == 72pt x 12700 EMUs per point.
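The relationships between these units can be captured in a few lines of Ruby. The sketch below is a minimal helper, not part of Caracal: the module name `Units` and its method names are assumptions made for illustration, and the arithmetic follows directly from the ratios listed above.

```ruby
# Hypothetical conversion helpers (not part of the Caracal API).
module Units
  POINTS_PER_INCH = 72
  TWIPS_PER_POINT = 20
  EMUS_PER_INCH   = 914_400

  module_function

  # Font sizes are declared in half points.
  def points_to_half_points(points)
    (points * 2).round
  end

  # Border sizes are declared in eighths of a point.
  def points_to_eighth_points(points)
    (points * 8).round
  end

  # Page sizes and margins are declared in twips.
  def points_to_twips(points)
    (points * TWIPS_PER_POINT).round
  end

  def inches_to_twips(inches)
    (inches * POINTS_PER_INCH * TWIPS_PER_POINT).round
  end

  # Image and vector dimensions are expressed in EMUs.
  def inches_to_emus(inches)
    (inches * EMUS_PER_INCH).round
  end
end

Units.points_to_half_points(12)     # => 24
Units.points_to_eighth_points(0.5)  # => 4
Units.inches_to_twips(1)            # => 1440
Units.inches_to_emus(1)             # => 914400
```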
## Syntax
In the following examples, the variable `docx` is assumed to be an instance of `Caracal::Document`.

    docx = Caracal::Document.new('example_document.docx')

### File Name
The final output document's file name can be set at initialization or via the `file_name` method.

    docx = Caracal::Document.new('example_document.docx')
    docx.file_name 'example_document.docx'

The current document name can be returned by invoking the `name` method:

    docx.name    # => 'example_document.docx'

*The default file name is caracal.docx.*
### Page Size
Page dimensions can be set using the `page_size` method. The method accepts two parameters for controlling the width and height of the document.
*Pages default to the United States standard Letter size in portrait orientation (8.5in x 11in).*

    # options via block
    docx.page_size do
      width  12240    # sets the page width. units in twips.
      height 15840    # sets the page height. units in twips.
    end

    # options via hash
    docx.page_size width: 12240, height: 15840

The `page_size` command writes these dimensions to the `document.xml` file.
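The twip values passed to `page_size` can be derived from physical paper dimensions. The following sketch shows one way to compute them, assuming 1440 twips per inch as described under Units. The helper `page_size_in_twips` and its `landscape` option are hypothetical and do not belong to Caracal.

```ruby
TWIPS_PER_INCH = 1440
MM_PER_INCH    = 25.4

# Hypothetical helper: converts paper dimensions in inches to twips.
# Swaps width and height when landscape orientation is requested.
def page_size_in_twips(width_in, height_in, landscape: false)
  width  = (width_in * TWIPS_PER_INCH).round
  height = (height_in * TWIPS_PER_INCH).round
  width, height = height, width if landscape
  { width: width, height: height }
end

letter = page_size_in_twips(8.5, 11)
letter[:width]   # => 12240
letter[:height]  # => 15840

# A4 is 210mm x 297mm.
a4 = page_size_in_twips(210 / MM_PER_INCH, 297 / MM_PER_INCH)
a4[:width]       # => 11906
a4[:height]      # => 16838
```

The resulting hash has the same keys as the hash form of `page_size`, so it could presumably be passed along as `docx.page_size(**letter)`.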
### Page Margins
Page margins can be set using the `page_margins` method. The method accepts four parameters for controlling the margins of the document.
*Margins default to 1.0in for all sides.*

    # options via block
    docx.page_margins do
      left   720     # sets the left margin. units in twips.
      right  720     # sets the right margin. units in twips.
      top    1440    # sets the top margin. units in twips.
      bottom 1440    # sets the bottom margin. units in twips.
    end

    # options via hash
    docx.page_margins left: 720, right: 720, top: 1440, bottom: 1440

The `page_margins` command above writes these margins to the `document.xml` file.
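Because page size and margins share the same unit, the usable content width is a simple subtraction. The sketch below illustrates that arithmetic. `content_width` is a hypothetical helper, not a Caracal method.

```ruby
# Hypothetical helper: width left for content once the side margins
# are subtracted from the page width. All values are in twips.
def content_width(page_width:, left:, right:)
  width = page_width - left - right
  raise ArgumentError, 'margins exceed the page width' unless width.positive?

  width
end

# Letter page (12240 twips wide) with 0.5in side margins.
content_width(page_width: 12_240, left: 720, right: 720)  # => 10800 (7.5in)
```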
### Page Breaks
Page breaks can be added via the `page` method. The method accepts no parameters.

    docx.page    # starts a new page.

The `page` command adds a page break to the `document.xml` file.
### Page Numbers
Page numbers can be added to the footer via the `page_numbers` method. The method accepts an optional parameter for controlling the alignment of the text.
*Page numbers are turned off by default.*

    # no options
    docx.page_numbers true

    # options via block
    docx.page_numbers true do
      align :right    # controls text alignment. defaults to :center.
    end

    # options via hash
    docx.page_numbers true, align: :right

The default command generates the contents of the `footer.xml` file.
*It will also automatically add the correct notation to the `w:sectPr` node of the `document.xml` file.*
### Fonts
Fonts are added to the font table file by calling the `font` method and passing the name of the font. At present, Caracal only supports declaring the primary font name.

    # options via hash
    docx.font name: 'Arial'

    # options via block
    docx.font do
      name 'Droid Serif'
    end

These commands generate the corresponding entries in the `fontTable.xml` file.
### Styles
Style classes can be added using the `style` method. The method accepts several optional parameters to control the rendering of text using the style.

    # options via block
    docx.style do
      type      :paragraph      # :paragraph or :table
      id        'Heading1'      # sets the internal identifier for the style.
      name      'heading 1'     # sets the friendly name of the style.
      color     '333333'        # sets the text color. values in hex RGB.
      font      'Droid Serif'   # sets the font family.
      size      28              # sets the font size. units in half points.
      bold      false           # sets the font weight.
      italic    false           # sets the font style.
      underline false           # sets whether or not to underline the text.
      align     :left           # sets the alignment. accepts :left, :center, :right, and :both.
      top       100             # sets the spacing above the paragraph. units in twips.
      bottom    0               # sets the spacing below the paragraph. units in twips.
      spacing   360             # sets the spacing between lines. units in twips.
    end

The `style` command above adds a matching style definition to the `styles.xml` file.
### Paragraphs
Text can be added using the `p` method. The `p` method takes either a string and a `style` option or a block of `text`-like commands.
Text within a `p` block can be further defined using the `text` and `link` methods. The `text` method takes a text string and the optional parameters `style`, `color`, `size`, `bold`, `italic`, and `underline`. See below for details on the `link` method.

    docx.p 'some text', style: 'my_style'

    docx.p do
      text 'Here is a sentence with a ', style: 'my_style'
      link 'link', 'https://www.example.com'
      text ' to something awesome', color: '555555', size: 16, bold: true, italic: true, underline: true
    end

A `p` block like the one above yields a single paragraph in the XML whose text runs read "Here is a sentence with a" and "to something awesome", with the link between them.
### Headers
Headers can be added using the `h1`, `h2`, etc. methods. Text within a header block can be further defined using the `text` method.
*Ultimately, headers are just paragraphs that use header styles.*

    docx.h3 'Heading 3'

The `h3` command above yields a paragraph in the XML that contains the text "Heading 3" and uses the corresponding header style.
### Links
Links can be added inside paragraphs by using the `link` method. The method accepts several optional parameters for controlling the style and behavior of the link.
*At present, all links are assumed to be external.*

    # no options
    docx.p do
      link 'Example Text', 'https://www.example.com'
    end

    # options via block
    docx.p do
      link 'Example Text', 'https://www.example.com' do
        style     'my_style'    # sets the style class. defaults to nil.
        color     '0000ff'      # sets the color of the text. defaults to 1155cc.
        size      24            # sets the font size. units in half points. defaults to nil.
        bold      false         # sets whether or not the text will be bold. defaults to false.
        italic    false         # sets whether or not the text will be italic. defaults to false.
        underline true          # sets whether or not the text will be underlined. defaults to true.
      end
    end

    # options via hash
    docx.p do
      link 'Example Text', 'https://www.example.com', color: '0000ff', underline: false
    end

With default properties, the `link` command produces a hyperlink in the XML whose text is "Example Text".
*Caracal will automatically generate the relationship entries required by the OpenXML standard.*
### Images
Images can be added by using the `img` method. The method accepts several optional parameters for controlling the style and placement of the asset.

    # options via block
    docx.img image_url('example.png') do
      width  396       # sets the image width. units specified in pixels.
      height 216       # sets the image height. units specified in pixels.
      align  :right    # controls the justification of the image. default is :left.
      top    10        # sets the top margin. units specified in pixels.
      bottom 10        # sets the bottom margin. units specified in pixels.
      left   10        # sets the left margin. units specified in pixels.
      right  10        # sets the right margin. units specified in pixels.
    end

    # options via hash
    docx.img image_url('example.png'), width: 396, height: 216, align: :right

With default properties, the `img` command adds the image markup to the document.
*Caracal will automatically generate the relationship entries required by the OpenXML standard.*
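Image dimensions are given in pixels, while OpenXML expresses image extents in EMUs. The sketch below shows the conversion under the assumption stated in the Units section that pixels are equivalent to points (72 per inch). The helper `pixels_to_emus` is hypothetical, and whether Caracal performs the conversion in exactly this way is an assumption.

```ruby
EMUS_PER_INCH   = 914_400
PIXELS_PER_INCH = 72                                  # assumes pixels == points
EMUS_PER_PIXEL  = EMUS_PER_INCH / PIXELS_PER_INCH     # 12_700

# Hypothetical helper: converts a pixel measurement to EMUs.
def pixels_to_emus(pixels)
  (pixels * EMUS_PER_PIXEL).round
end

pixels_to_emus(396)  # => 5029200 (width of the example image)
pixels_to_emus(216)  # => 2743200 (height of the example image)
```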
### Rules
Horizontal rules can be added using the `hr` method. The method accepts several optional parameters for controlling the style of the rule.

    # no options
    docx.hr    # defaults to a thin, single line.

    # options via block
    docx.hr do
      color   '333333'   # controls the color of the line. defaults to auto.
      line    :double    # controls the line style (single or double). defaults to single.
      size    8          # controls the thickness of the line. units in 1/8 points. defaults to 4.
      spacing 4          # controls the spacing around the line. units in points. defaults to 1.
    end

    # options via hash
    docx.hr color: '333333', line: :double, size: 8, spacing: 2

With default properties, the `hr` command adds a thin, single-line rule to the document.
### Ordered Lists
Ordered lists can be added using the `ol` and `li` methods. The `li` method substantially follows the same rules as the `p` method; here, simpler examples are demonstrated.

    docx.ol do
      li 'First item'
      li 'Second item'
    end

With default properties, the `ol` and `li` commands produce one numbered paragraph per item ("First item" and "Second item"), assuming the ordered list styles have abstractNumId=2 in the `numbering.xml` file.
### Unordered Lists
Unordered lists can be added using the `ul` and `li` methods. The `li` method substantially follows the same rules as the `p` method; here, simpler examples are demonstrated.

    docx.ul do
      li 'First item'
      li 'Second item'
    end

With default properties, the `ul` and `li` commands produce one bulleted paragraph per item ("First item" and "Second item"), assuming the unordered list styles have abstractNumId=1 in the `numbering.xml` file.
### Tables
Tables can be added using the `table` method. The method accepts several optional parameters to control the layout and style of the table cells.

    docx.table data, border: 8 do
      cell_style rows(0), background_color: '4a86e8', bold: true
    end

Given a data structure with two rows and five columns, the `table` method produces table XML whose cells contain text such as "Perf. Quality", "Data Quality", and "After-Hours trading".
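The shape of `data` is not spelled out above. A reasonable assumption is an Array of rows, each row being an Array of cell values. The sketch below builds such a structure with two rows and five columns and checks that every row has the same number of cells. The cell values and the helper `rectangular?` are hypothetical.

```ruby
# Hypothetical table data: two rows, five columns.
# The first row would be the one targeted by rows(0) in the example above.
data = [
  ['Column 1', 'Column 2', 'Column 3', 'Column 4', 'Column 5'],
  ['Value 1',  'Value 2',  'Value 3',  'Value 4',  'Value 5']
]

# Hypothetical sanity check: true when every row has the same cell count.
def rectangular?(rows)
  rows.map(&:size).uniq.size <= 1
end

raise ArgumentError, 'rows must have equal length' unless rectangular?(data)

puts "#{data.size} rows x #{data.first.size} columns"
```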
### Line Breaks
Line breaks can be added via the `br` method. The method accepts no parameters.

    docx.br    # adds a blank line using the default paragraph style.

The `br` command adds a blank line to the XML of the document.
## Template Rendering
Caracal includes [Tilt](https://github.com/rtomayko/tilt) integration to facilitate its inclusion in other frameworks. Rails integration can be added via the [Caracal-Rails](https://github.com/trade-informatics/caracal-rails) gem.
## Defaults
[Unsure how best to handle this without code exploration. Not a critical element for the first version.]
## Contributing
1. Fork it ( https://github.com/trade-informatics/caracal/fork )
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request
## Why is It Called Caracal?
Because my son likes caracals. :)
## Inspiration
A tip of the hat to the wonderful PDF generation library [Prawn](https://github.com/prawnpdf/prawn).
## License
Copyright (c) 2014 Trade Informatics, Inc
[MIT License](https://github.com/trade-informatics/caracal/blob/master/LICENSE.txt)