# Caracal

Caracal is a Ruby library for dynamically creating professional-quality Microsoft Word documents (.docx) using an HTML-style syntax.

## Installation

Add this line to your application's Gemfile:

    gem 'caracal'

Then execute:

    $ bundle install

## Overview

Many people don't know that .docx files are little more than a zipped collection of XML documents that follow the Office Open XML (OpenXML or OOXML) standard. This means constructing a .docx file from scratch actually requires the creation of several files. Caracal shields users from this process by providing a simple set of Ruby commands and HTML-style syntax for generating Word content.

For each Caracal request, the following document structure will be created and zipped into the final output file:

    example.docx
      |- _rels
      |- docProps
        |- app.xml
        |- core.xml
      |- word
        |- _rels
          |- document.xml.rels
        |- media
          |- image001.png
          |- image002.png
          ...
        |- document.xml
        |- fontTable.xml
        |- footer.xml
        |- numbering.xml
        |- settings.xml
        |- styles.xml
      |- [Content_Types].xml

### File Descriptions

The following provides a brief description for each component of the final document:

**_rels/.rels**
Defines an internal identifier and type for global content items. *This file is generated automatically by the library based on other user directives.*

**docProps/app.xml**
Specifies the name of the application that generated the document. *This file is generated automatically by the library based on other user directives.*

**docProps/core.xml**
Specifies the title of the document. *This file is generated automatically by the library based on other user directives.*

**word/_rels/document.xml.rels**
Defines an internal identifier and type for all external content items (images, links, etc). *This file is generated automatically by the library based on other user directives.*

**word/media/**
A collection of media assets (each of which should have an entry in document.xml.rels).

**word/document.xml**
The main content file for the document.

**word/fontTable.xml**
Specifies the fonts used in the document.

**word/footer.xml**
Defines the formatting of the document footer.

**word/numbering.xml**
Defines ordered and unordered list styles.

**word/settings.xml**
Defines global directives for the document (e.g., whether to show background images, tab widths, etc). Also establishes compatibility with older versions of Word.

**word/styles.xml**
Defines all paragraph and table styles used throughout the document. Caracal adds a default set of styles to match its HTML-like content syntax. These defaults can be overridden.

**[Content_Types].xml**
Pairs extensions and XML files with schema content types so Word can parse them correctly. *This file is generated automatically by the library based on other user directives.*

## Units

OpenXML uses a few basic units.

**Points**
Most spacing declarations are measured in full points.

**Half Points**
All font sizes are measured in half points. A font size of 24 is equivalent to 12pt.

**Eighth Points**
Borders are measured in 1/8 points. A border size of 4 is equivalent to 0.5pt.

**Twips**
A twip is 1/20 of a point. Word documents are printed at 72dpi. 1in == 72pt == 1440 twips.

**Pixels**
In Word documents, pixels are equivalent to points.

**EMUs (English Metric Unit)**
EMUs are a virtual unit designed to facilitate the smooth conversion between inches, millimeters, and pixels for images and vector graphics. 1in == 914400 EMUs == 72dpi x 100 x 127.

## Syntax

In the following examples, the variable `docx` is assumed to be an instance of Caracal::Document.

    docx = Caracal::Document.new('example_document.docx')

### File Name

The final output document's file name can be set at initialization or via the `file_name` method.

    docx = Caracal::Document.new('example_document.docx')

    docx.file_name 'example_document.docx'

The current document name can be returned by invoking the `name` method:

    docx.name    # => 'example_document.docx'

*The default file name is caracal.docx.*

### Page Size

Page dimensions can be set using the `page_size` method. The method accepts two parameters for controlling the width and height of the document.

*Pages default to the United States standard Letter, portrait dimensions (8.5in x 11in).*

    # options via block
    docx.page_size do
      width  12240    # sets the page width. units in twips.
      height 15840    # sets the page height. units in twips.
    end

    # options via hash
    docx.page_size width: 12240, height: 15840

The `page_size` command writes the corresponding XML to the `document.xml` file.

### Page Margins

Page margins can be set using the `page_margins` method. The method accepts four parameters for controlling the margins of the document.

*Margins default to 1.0in for all sides.*

    # options via block
    docx.page_margins do
      left   720     # sets the left margin. units in twips.
      right  720     # sets the right margin. units in twips.
      top    1440    # sets the top margin. units in twips.
      bottom 1440    # sets the bottom margin. units in twips.
    end

    # options via hash
    docx.page_margins left: 720, right: 720, top: 1440, bottom: 1440

The `page_margins` command above writes the corresponding XML to the `document.xml` file.

### Page Breaks

Page breaks can be added via the `page` method. The method accepts no parameters.

    docx.page    # starts a new page.

The `page` command writes the corresponding XML to the `document.xml` file.

### Page Numbers

Page numbers can be added to the footer via the `page_numbers` method. The method accepts an optional parameter for controlling the alignment of the text.

*Page numbers are turned off by default.*

    # no options
    docx.page_numbers true

    # options via block
    docx.page_numbers true do
      align :right    # controls text alignment. defaults to :center.
    end

    # options via hash
    docx.page_numbers true, align: :right

The default command generates the contents of the `footer.xml` file. *It will also automatically add the correct notation to the `w:sectPr` node of the `document.xml` file.*

### Fonts

Fonts are added to the font table file by calling the `font` method and passing the name of the font. At present, Caracal only supports declaring the primary font name.

    docx.font name: 'Arial'

    docx.font do
      name 'Droid Serif'
    end

These commands generate the corresponding entries in the `fontTable.xml` file.

### Styles

Style classes can be added using the `style` method. The method accepts several optional parameters to control the rendering of text using the style.

    # options via block
    docx.style do
      type      :paragraph     # :paragraph or :table
      id        'Heading1'     # sets the internal identifier for the style.
      name      'heading 1'    # sets the friendly name of the style.
      color     '333333'       # sets the text color. values in hex RGB.
      font      'Droid Serif'  # sets the font family.
      size      28             # sets the font size. units in half points.
      bold      false          # sets the font weight.
      italic    false          # sets the font style.
      underline false          # sets whether or not to underline the text.
      align     :left          # sets the alignment. accepts :left, :center, :right, and :both.
      top       100            # sets the spacing above the paragraph. units in twips.
      bottom    0              # sets the spacing below the paragraph. units in twips.
      spacing   360            # sets the spacing between lines. units in twips.
    end

The `style` command above would produce the corresponding XML style definition.

### Paragraphs

Text can be added using the `p` method. The `p` method takes either a string and a `style` option or a block of `text`-like commands. Text within a `p` block can be further defined using the `text` and `link` methods. The `text` method takes a text string and the optional parameters `style`, `color`, `size`, `bold`, `italic`, and `underline`. See below for details on the `link` method.
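The `size` option, like the spacing, margin, and border values elsewhere in this document, is given in the raw OpenXML units described under Units. The following is a minimal sketch of a conversion helper built only from the ratios stated there. It is not part of Caracal: the module name `DocxUnits` and all of its method names are hypothetical, chosen here for illustration.

```ruby
# Hypothetical helper module; NOT part of Caracal's API.
# It only encodes the unit ratios listed in the Units section.
module DocxUnits
  TWIPS_PER_POINT = 20       # a twip is 1/20 of a point
  POINTS_PER_INCH = 72       # 1in == 72pt
  EMUS_PER_INCH   = 914_400  # 1in == 914400 EMUs

  module_function

  # Font sizes are expressed in half points (12pt => 24).
  def half_points(points)
    (points * 2).round
  end

  # Border and rule sizes are expressed in eighths of a point (0.5pt => 4).
  def eighth_points(points)
    (points * 8).round
  end

  # Twips from a measurement in points.
  def twips_from_points(points)
    (points * TWIPS_PER_POINT).round
  end

  # Twips from a measurement in inches (1in => 1440).
  def twips_from_inches(inches)
    (inches * POINTS_PER_INCH * TWIPS_PER_POINT).round
  end

  # EMUs from a measurement in inches.
  def emus_from_inches(inches)
    (inches * EMUS_PER_INCH).round
  end
end
```

Under these assumptions, `DocxUnits.half_points(8)` gives 16, the `size` value for 8pt text, and `DocxUnits.twips_from_inches(8.5)` gives 12240, the width of a Letter page. The examples in the rest of this document pass raw unit values directly.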

    docx.p 'some text', style: 'my_style'

    docx.p do
      text 'Here is a sentence with a ', style: 'my_style'
      link 'link', 'https://www.example.com'
      text ' to something awesome', color: '555555', size: 16, bold: true, italic: true, underline: true
    end

A `p` block might yield XML containing the text "Here is a sentence with a link to something awesome."

### Headers

Headers can be added using the `h1`, `h2`, etc. methods. Text within a header block can be further defined using the `text` method.

*Ultimately, headers are just paragraphs that use header styles.*

    docx.h3 'Heading 3'

The `h3` block above will yield XML containing the text "Heading 3".

### Links

Links can be added inside paragraphs by using the `link` method. The method accepts several optional parameters for controlling the style and behavior of the link.

*At present, all links are assumed to be external.*

    # no options
    docx.p do
      link 'Example Text', 'https://www.example.com'
    end

    # options via block
    docx.p do
      link 'Example Text', 'https://www.example.com' do
        style     'my_style'    # sets the style class. defaults to nil.
        color     '0000ff'      # sets the color of the text. defaults to 1155cc.
        size      24            # sets the font size. units in half-points. defaults to nil.
        bold      false         # sets whether or not the text will be bold. defaults to false.
        italic    false         # sets whether or not the text will be italic. defaults to false.
        underline true          # sets whether or not the text will be underlined. defaults to true.
      end
    end

    # options via hash
    docx.p do
      link 'Example Text', 'https://www.example.com', color: '0000ff', underline: false
    end

The `link` command with default properties will produce XML output containing the text "Example Text".

*Caracal will automatically generate the relationship entries required by the OpenXML standard.*

### Images

Images can be added by using the `img` method. The method accepts several optional parameters for controlling the style and placement of the asset.

    # options via block
    docx.img image_url('example.png') do
      width 396     # sets the image width.
                    # units specified in pixels.
      height 216    # sets the image height. units specified in pixels.
      align :right  # controls the justification of the image. default is :left.
      top 10        # sets the top margin. units specified in pixels.
      bottom 10     # sets the bottom margin. units specified in pixels.
      left 10       # sets the left margin. units specified in pixels.
      right 10      # sets the right margin. units specified in pixels.
    end

    # options via hash
    docx.img image_url('example.png'), width: 396, height: 216, align: :right

The `img` command with default properties will produce the corresponding XML output.

*Caracal will automatically generate the relationship entries required by the OpenXML standard.*

### Rules

Horizontal rules can be added using the `hr` method. The method accepts several optional parameters for controlling the style of the rule.

    # no options
    docx.hr    # defaults to a thin, single line.

    # options via block
    docx.hr do
      color   '333333'  # controls the color of the line. defaults to auto.
      line    :double   # controls the line style (single or double). defaults to single.
      size    8         # controls the thickness of the line. units in 1/8 points. defaults to 4.
      spacing 4         # controls the spacing around the line. units in points. defaults to 1.
    end

    # options via hash
    docx.hr color: '333333', line: :double, size: 8, spacing: 2

The `hr` command with default properties will produce the corresponding XML output.

### Ordered Lists

Ordered lists can be added using the `ol` and `li` methods. The `li` method substantially follows the same rules as the `p` method; here, simpler examples are demonstrated.

    docx.ol do
      li 'First item'
      li 'Second item'
    end

The `ol` and `li` commands with default properties will produce XML containing the items "First item" and "Second item" (assuming the ordered list styles have the abstractNumId=2 in the `numbering.xml` file).

### Unordered Lists

Unordered lists can be added using the `ul` and `li` methods.
The `li` method substantially follows the same rules as the `p` method; here, simpler examples are demonstrated.

    docx.ul do
      li 'First item'
      li 'Second item'
    end

The `ul` and `li` commands with default properties will produce XML containing the items "First item" and "Second item" (assuming the unordered list styles have the abstractNumId=1 in the `numbering.xml` file).

### Tables

Tables can be added using the `table` method. The method accepts several optional parameters to control the layout and style of the table cells.

    docx.table data, border: 8 do
      cell_style rows(0), background_color: '4a86e8', bold: true
    end

Given a data structure with two rows and five columns, the `table` method would produce XML for a table with the following content:

| Field | Response | Perf. Quality | Data Quality | State |
|---|---|---|---|---|
| After-Hours trading | Yes | B | B | published |

### Line Breaks

Line breaks can be added via the `br` method. The method accepts no parameters.

    docx.br    # adds a blank line using the default paragraph style.

The `br` command will produce the corresponding XML output.

## Template Rendering

Caracal includes [Tilt](https://github.com/rtomayko/tilt) integration to facilitate its inclusion in other frameworks.

Rails integration can be added via the [Caracal-Rails](https://github.com/trade-informatics/caracal-rails) gem.

## Defaults

[Unsure how best to handle this without code exploration. Not a critical element for the first version.]

## Contributing

1. Fork it ( https://github.com/trade-informatics/caracal/fork )
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request

## Why is It Called Caracal?

Because my son likes caracals. :)

## Inspiration

A tip of the hat to the wonderful PDF generation library [Prawn](https://github.com/prawnpdf/prawn).

## License

Copyright (c) 2014 Trade Informatics, Inc

[MIT License](https://github.com/trade-informatics/caracal/blob/master/LICENSE.txt)