== Welcome to Tartan

Tartan is a general purpose text parsing engine whose main target is wiki
text parsing (see c2.com[http://c2.com/cgi/wiki?WikiWikiWeb] and
Wikipedia[http://en.wikipedia.org/wiki/Wiki]). It doesn't implement one
specific mark-up but instead provides a way to specify a variety of mark-ups.
So Tartan is a bit more "involved" than a purpose-built parser like
RedCloth[http://whytheluckystiff.net/ruby/redcloth/] or
BlueCloth[http://www.deveiate.org/projects/BlueCloth], but it provides the
following benefits:

1. separates the specific wiki syntax specification from the implementation
2. allows layering and extension of parsing rules
3. allows multiple output formats from the same syntax specification

The current implementation of Tartan is in Ruby and includes a full Markdown
parser (described in YAML). The format of the parsing specification has been
created with an eye to having a language-independent definition of wiki (and
possibly other) mark-ups. That's a lofty goal, and Tartan hasn't quite gotten
there yet, but we think there's a clear path. In any case, even if it is only
available in Ruby, it will hopefully be helpful for projects that need to do
more than just convert wiki text directly into HTML.

== Usage

So, really all you want to do is generate HTML from Markdown text. Here's how
you do it:

  require 'tartan_markdown'
  html = TartanMarkdown.new("* howdy\n* doody").to_html
  # => "<ul><li>howdy</li><li>doody</li></ul>"

Other parsers would have similar names and the same usage. In particular, you
need to require the parser class file, create a new instance of the parser,
and call to_html on that instance. Other output methods, say to_xml, would be
called in the same way on the parser instance.

=== Layering Parsers

You can add parsing syntax to existing parsers. This is done by building up a
set of parser specifications that work together. The Tartan distribution has
a specification for Markdown and a specification for table mark-up. You can
combine them by creating a new class that layers the tables onto the Markdown
definition, as follows, in a file called tartan_markdown_tables.rb:

  require 'tartan_markdown_def'
  require 'tartan_table_def'

  class TartanMarkdownTables < Tartan
    include TartanMarkdownDef
    include TartanTableDef
  end

In another file you could use this new parser:

  require 'tartan_markdown_tables'
  html = TartanMarkdownTables.new("[|*happy*||**days**|]").to_html
  # => "<table><tr><td><em>happy</em></td><td><strong>days</strong></td></tr></table>"

== The Parsing Specification

=== Overall Structure

Each parser is made up of a parsing definition and optional helper methods.
The specification is defined in YAML and the helper methods are defined in a
parser definition class. The parsing definition in YAML has the following
general structure:

  block:
    - <parsing rule>
    - <parsing rule>
  <context name>:
    - <parsing rule>

So the parsing rules are defined as a set of contexts, and each context is a
list of parsing rules. The base context defaults to block; that is, the
parser starts with the block context, which may point the parser off to other
contexts to parse blocks of the parsed text. More on this after the
explanation of the parsing rules.

==== Parsing Rules

The following is a simple parsing rule to match paragraphs and mark them up
in HTML:

  title: paragraph
  match: "/(^[^\n]+$\n)+^[^\n]+$/m"
  html:
    start_mark: <p>
    end_mark: </p>

A paragraph, in this case, is any grouping of non-blank lines. The parser
repeatedly applies the match regular expression and, wherever it matches, the
html output sub-rule puts <p> and </p> around the text that is matched as a
paragraph.

If we also wanted to mark off blocks of code that are indented by, say, 2 or
more spaces at the beginning of the line, we could use the following rule:

  title: code
  match: "/(^[ ]{2,}\S.+?$\n)+^[ ]{2,}\S.+?$/m"
  html:
    start_mark: <pre><code>
    end_mark: </code></pre>
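
To get a feel for what the two match expressions accept, here is a minimal
sketch that uses them as plain Ruby regexps, outside of Tartan. The wrap
helper and the sample strings are made up for illustration, and the marks are
passed in by hand; this is not how Tartan applies its rules internally.

```ruby
# Sketch only: the two match expressions written as plain Ruby regexps.
PARAGRAPH = /(^[^\n]+$\n)+^[^\n]+$/m
CODE      = /(^[ ]{2,}\S.+?$\n)+^[ ]{2,}\S.+?$/m

# Hypothetical helper (not part of Tartan): surround every match of
# pattern in text with the given start and end marks.
def wrap(text, pattern, start_mark, end_mark)
  text.gsub(pattern) { |matched| "#{start_mark}#{matched}#{end_mark}" }
end

prose   = "first line\nsecond line"
snippet = "  x = 1\n  y = 2"

puts wrap(prose, PARAGRAPH, "<p>", "</p>")
puts wrap(snippet, CODE, "<pre><code>", "</code></pre>")
```

Note that, as written, both expressions need at least two consecutive lines
to match: a string such as "just one line" is not matched by the paragraph
expression.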

When we add the code rule, the ordering becomes important. If we put the
paragraph rule first, it will gobble up both the paragraphs and the code
blocks, since it just looks for groups of non-blank lines. To prevent this,
we need to put the code rule first. So the overall definition would be:

  block:
    - title: code
      match: "/(^[ ]{2,}\S.+?$\n)+^[ ]{2,}\S.+?$/m"
      html:
        start_mark: <pre><code>
        end_mark: </code></pre>
    - title: paragraph
      match: "/(^[^\n]+$\n)+^[^\n]+$/m"
      html:
        start_mark: <p>
        end_mark: </p>
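
The effect of rule ordering can be tried out with a small sketch in plain
Ruby. It assumes a simplified model in which the rules of a context are tried
in list order and the first one whose match expression matches wins; Tartan's
actual engine may work differently. The rule hashes and the
first_matching_title helper are hypothetical and exist only for this example.

```ruby
# Sketch only: a simplified first-match-wins model, not Tartan's real engine.
PARAGRAPH_RULE = { title: "paragraph", match: /(^[^\n]+$\n)+^[^\n]+$/m }
CODE_RULE      = { title: "code",      match: /(^[ ]{2,}\S.+?$\n)+^[ ]{2,}\S.+?$/m }

# Hypothetical helper: return the title of the first rule, in list order,
# whose match expression matches the text, or nil when no rule matches.
def first_matching_title(rules, text)
  rule = rules.find { |candidate| candidate[:match] =~ text }
  rule && rule[:title]
end

indented = "  x = 1\n  y = 2"
plain    = "first line\nsecond line"

# Paragraph rule listed first: it also claims the indented block.
p first_matching_title([PARAGRAPH_RULE, CODE_RULE], indented)

# Code rule listed first: the indented block is treated as code,
# while ordinary text still falls through to the paragraph rule.
p first_matching_title([CODE_RULE, PARAGRAPH_RULE], indented)
p first_matching_title([CODE_RULE, PARAGRAPH_RULE], plain)
```

Under this model the paragraph expression matches indented lines too, because
it only looks for non-blank lines, which is why the more specific code rule
has to come first.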

== The Name

Tartan is intended to weave together different parsing elements. It's
intended to be an alternative to both RedCloth[http://www.redcloth.org/] and
BlueCloth. Tartan is a kind of cloth that weaves different colors together in
an interesting pattern.