Livetext: A smart processor for text

Livetext is simply a tool for transforming text from one format into another. The source file has commands embedded in it, and the output is dependent on those commands.

Why is this special? It's very flexible, very extensible, and it's extensible in Ruby.



Why Livetext?

Livetext grew out of several motivations. One was a desire for a markup language that would permit me to write articles (and even books) in my own way and on my own terms. I've done this more than once (and I know others who have, as well).

I liked Softcover, but I found it to be very complex. I never liked Markdown much -- it is very dumb and not extensible at all.

I wanted something that had the basic functionality of all my ad hoc solutions but allowed extensions. Then my old solutions would be like subsets of the new format. This was a generalization similar to the way we began several years ago to view HTML as a subset of XML.



What is Livetext really?

Here goes:



How does it work?

A Livetext file is simply a text file which may have commands interspersed. A command is simply a period followed by a name and optional parameters (at the beginning of a line).

The period is configurable if you want to use another character. The names are (for now) actual Ruby method names, so names such as to_s and inspect are currently not allowed.
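To make the link between command names and Ruby method names concrete, here is a minimal sketch of a dispatcher. The class MiniProcessor, its shout command, and process_line are hypothetical names invented for this illustration; this is not Livetext's actual engine, and it omits error handling.

```ruby
# Hypothetical sketch, not Livetext's real implementation.
class MiniProcessor
  attr_reader :output

  def initialize(sigil = ".")
    @sigil  = sigil   # the configurable command character
    @output = []
  end

  # A sample command, invoked in the text as ".shout some words"
  def shout(args)
    @output << args.join(" ").upcase
  end

  def process_line(line)
    pattern = /\A#{Regexp.escape(@sigil)}([A-Za-z_]\w*)\s*(.*)\z/
    if (m = pattern.match(line.chomp))
      # The command name is used directly as a Ruby method name.
      public_send(m[1], m[2].split)
    else
      @output << line.chomp   # ordinary text passes through unchanged
    end
  end
end
```

Because the name is handed straight to public_send in this sketch, a command called to_s or inspect would collide with a method that every Ruby object already has, which is one way to see why such names are off limits.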

Currently I am mostly emitting "dumb HTML" or Markdown as output. In theory, you can write code (or use someone else's) to manipulate text in any way and output any format. Technically, you could even emit PDF, PNG, or SVG formats.

It's possible to embed comments in the text, or even to pass them through to the output in commented form.

The command .end is special, marking the end of a body of text. Some commands may operate on a block of lines rather than just a few parameters. (A text block is like a here-document.) There is no method name corresponding to the .end command.
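As a rough illustration of the here-document idea, the following sketch gathers the lines of a text block up to the terminating .end line. The function read_body is a hypothetical name, and the assumption that .end must stand alone on its line is mine, not something the engine is known to require.

```ruby
# Hypothetical helper: collect the lines of a text block.
# Consumes lines from the front of the array, up to and including
# the terminator, and returns the body without the terminator.
def read_body(lines, sigil = ".")
  terminator = "#{sigil}end"
  body = []
  while (line = lines.shift)
    return body if line.chomp == terminator
    body << line.chomp
  end
  raise "Missing #{terminator}"   # ran out of input before the terminator
end
```

After the call, the array holds only the lines that follow the block, so ordinary processing can resume from there.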

The file extension I've chosen is .lt (though this may change). Note: The source for this README is a .lt file which uses its own little ad hoc library (called readme.rb). Refer to the repo to see these.



Syntax, comments, and more

At first, my idea was to provide predefined commands and allow user-defined commands, with the two kinds distinguished by a leading . or .. marker. So the single and double dots are both legal.

However, my concept at present is that the double dots (currently unused) will be used for subcommands.

User-defined commands may be added to the standard namespace marked with a period. They may also be preceded by a specified character other than the period and thus stored in their own namespace. More on that later.

When a leading period (or double period) is followed by a space, that line is a comment. When it is followed by a name, that name is typically understood to be a method name. Any remaining text on the line is treated as a parameter list to be accessed by that method. Some methods accept multiple lines of text, terminated by a .end tag.
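The rules in the preceding paragraph can be sketched as a small line classifier. The function name classify and the symbols it returns are assumptions made for this example only; the sketch also treats single and double dots identically, since the double dot has no separate meaning yet.

```ruby
# Hypothetical classifier; the return shapes are illustrative only.
def classify(line, sigil = ".")
  dots = "(?:#{Regexp.escape(sigil)}){1,2}"   # one or two sigil characters
  case line.chomp
  when /\A#{dots}\s/
    [:comment]                                # sigil followed by a space
  when /\A#{dots}([A-Za-z_]\w*)\s*(.*)\z/
    [:command, $1, $2.split]                  # sigil followed by a name
  else
    [:text, line.chomp]                       # anything else is plain text
  end
end
```

For instance, classify(".chapter 1 Walden") yields the command name "chapter" with the arguments "1" and "Walden", while classify(". just a note") is reported as a comment.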



Boldface and italics

Very commonly we want to format short words or phrases in italics, boldface, or a monospaced (fixed width) font. The Markdown spec provides ways to do this that are fairly intuitive; but I personally don't like them. My own notation works a different way.

First of all, note that these don't work across source lines; they're strictly intra-line. You may need (for example) an italicized phrase that spans across a newline; at present, you'll need a workaround for that.

I find that most short items I want to format are single tokens. Therefore I use a prefixed character in front of such a token: Underscore for italics, asterisk for boldface, and backtick for "code font." The formatting ends when the first blank space is encountered, without any kind of suffixed character. (This behavior may change to include certain punctuation marks as terminators.)

Of course, there are cases where this won't work; a formatted string may contain spaces, or it may exclude characters before the blank space. In this case, we can use an opening parenthesis after the prefix and a closing parenthesis at the end of the string.

This means that it can be difficult to include a left paren inside a formatted token. I'm thinking about that. It also means that a "literal" prefix character must be escaped.
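Here is a minimal sketch of how this notation could be turned into HTML. It is not the real _formatting method: the name format_line, the choice of the b, i, and tt tags, the backslash as the escape character, and the rule that a prefix only counts at the start of a word are all assumptions made for the example.

```ruby
# Hypothetical mapping from prefix character to HTML tag.
FORMAT_TAGS = { "_" => "i", "*" => "b", "`" => "tt" }.freeze

def format_line(line)
  # A prefix at the start of a word, followed either by "(...)" or by
  # a run of non-blank characters that ends at the first blank.
  formatted = line.gsub(/(?<!\S)([_*`])(?:\(([^)]*)\)|(\S+))/) do
    tag  = FORMAT_TAGS[$1]
    text = $2 || $3
    "<#{tag}>#{text}</#{tag}>"
  end
  # Finally, turn escaped prefixes such as \* into literal characters.
  formatted.gsub(/\\([_*`])/) { $1 }
end
```

A prefix character standing alone (followed by a blank or the end of the line) matches neither form, so this sketch leaves it untouched.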

This is all summarized in this example (taken from one of the testcases):

Test: 015_basic_formatting

Input:

     Here are examples of *boldface and _italics and `code
     as well as *(more complex) examples of _(italicized text)
     and `(code font).

     Here are some random punctuation marks: # . @ * _ ` : ; % ^ & $

     Oops, forgot to escape these: * _ `

Output:

     Here are examples of boldface and italics and code
     as well as more complex examples of italicized text
     and code font.

     Here are some random punctuation marks: # . @ * _ ` : ; % ^ & $

     Oops, forgot to escape these: * _ `



Standard methods

The module Livetext::Standard contains the set of standard or predefined methods. Their names are essentially the same as the names of the dot-commands, with occasional exceptions. (For example, it is impractical to use the name def as a method name, so we use _def instead.) Here is the current list:

     comment    Start a comment block
     errout     Write an error message to STDERR
     sigil      Change the default sigil from . to some other character
     _def       Define a new method inline
     set        Assign values to variables for later interpolation
     include    Include an outside text file (to be interpreted as Livetext)
     mixin      Mix this file of Ruby methods into the standard namespace
     copy       Copy this input file verbatim (no interpretation)
     r          Pass a single line through without processing
     raw        Pass this special text block (terminated with __EOF__) directly into output without processing



Examples from the tests

Here are some tests from the suite. The file name reflects the general purpose of the test.

Test: 001_hello_world

Input:

     Hello,
     world!

Output:

     Hello,
     world!

Test: 002_comments_ignored_1

Input:

     . Comments are ignored
     abc 123
     this is a test
     . whether at beginning, middle, or
     more stuff
     still more stuff
     . end of the file

Output:

     abc 123
     this is a test
     more stuff
     still more stuff

Test: 003_comments_ignored_2

Input:

     .. Comments (with a double-dot) are ignored
     abc 123
     this is a test
     .. whether at beginning, middle, or
     more stuff
     still more stuff
     .. end of the file

Output:

     abc 123
     this is a test
     more stuff
     still more stuff

Test: 004_sigil_can_change

Input:

     . This is a comment
     .sigil #
     # Comments are ignored
     abc 123
     this is a test
     . this is not a comment
     # whether at beginning, middle, or
     more stuff
     .this means nothing
     still more stuff
     # end of the file

Output:

     abc 123
     this is a test
     . this is not a comment
     more stuff
     .this means nothing
     still more stuff

Test: 005_block_comment

Input:

     .comment
     This is
     a comment
     .end
     abc 123
     xyz
     .comment
     And so is this.
     .end

     one more time
     .comment
     And so is this
     .end

Output:

     abc 123
     xyz

     one more time

Test: 006_def_method

Input:

     abc
     123
     .def foobar
     ::STDERR.puts "This is the"
     ::STDERR.puts "foobar method"
     .end
     xyz
     .foobar
     xyzzy
     123

Output:

     abc
     123
     xyz
     xyzzy
     123

Test: 007_simple_vars

Input:

     Just
     some text.
     .set name=GulliverFoyle,nation=Terra
     Hi, there.
     $name is my name, and $nation is my nation.
     I'm $name, from $nation.
     That's all.

Output:

     Just
     some text.
     Hi, there.
     GulliverFoyle is my name, and Terra is my nation.
     I'm GulliverFoyle, from Terra.
     That's all.
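The variable handling in test 007 can be approximated in a few lines of Ruby. This is only a sketch: parse_set and substitute_vars are hypothetical names, and leaving an unknown $name untouched is my assumption rather than documented behavior.

```ruby
# Hypothetical: turn the argument string of .set into a hash.
def parse_set(data)
  data.split(",").map { |pair| pair.strip.split("=", 2) }.to_h
end

# Hypothetical: replace each $name with its value, if one is defined.
def substitute_vars(line, vars)
  line.gsub(/\$([A-Za-z_]\w*)/) do
    name = $1
    vars.fetch(name, "$" + name)   # unknown variables are left as-is
  end
end
```

With vars = parse_set("name=GulliverFoyle,nation=Terra"), the call substitute_vars("I'm $name, from $nation.", vars) produces the corresponding output line of the test.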

Test: 008_simple_include

Input:

     Here I am
     trying to
     include
     .include simplefile.inc
     I hope that
     worked.

Output:

     Here I am
     trying to
     include
     a simple
     include file.
     I hope that
     worked.

Test: 009_simple_mixin

Input:

     Here I am
     testing a simple mixin
     .mixin simple_mixin
     Now call it:
     .hello_world
     That's all.

Output:

     Here I am
     testing a simple mixin
     Now call it:
     Hello, world.
     That's all.

Test: 010_simple_copy

Input:

     The copy command
     copies any file
     without interpretation,
     such as:
     .copy simplefile.inc
     That is all.

Output:

     The copy command
     copies any file
     without interpretation,
     such as:
     a simple
     include file.
     That is all.

Test: 011_copy_is_raw

Input:

     A copy command
     does not interpret its input:
     .copy rawtext.inc
     That's all.

Output:

     A copy command
     does not interpret its input:
     This is not a comment:
     .comment woohoo!
     This is not a method:
     .no_such_method
     That's all.

Test: 012_raw_text_block

Input:

     This text block will be passed thru
     with no interpretation or processing:
     .raw
     .comment
     This isn't a
     real comment.
     .end  This isn't picked up.

     .not_a_method

     And this stuff won't be munged:
     `alpha _beta *gamma
     Or this:
     `(alpha male) _(beta max) *(gamma rays)
     __EOF__

     I hope that worked.

Output:

     This text block will be passed thru
     with no interpretation or processing:
     .comment
     This isn't a
     real comment.
     .end  This isn't picked up.

     .not_a_method

     And this stuff won't be munged:
     `alpha _beta *gamma
     Or this:
     `(alpha male) _(beta max) *(gamma rays)

     I hope that worked.



Writing custom methods

Suppose you wanted to write a method called chapter that would simply output a chapter number and title with certain heading tags and a horizontal rule following. There is more than one way to do this.

The simplest way is just to define a method inline with the rest of the text. Here's an example.

     .comment
     This example shows how to define
     a simple method "chapter" inline
     .end
   
     . This is also a comment, by the way.
     .def chapter
        params = _args
        raise "chapter: expecting at least two args" unless params.size > 1
        num, *title = params     # Chapter number + title
        title = title.join(" ")  # Join all words into one string
        text = <<-HTML
        <h3>Chapter #{num}</h3>
        <h2>#{title}</h2>
        <hr>
        HTML
        _puts text
     .end
     . Now let's invoke it...
     .chapter 1 Why I Went to the Woods
     It was the best of times, and you can call me Ishmael. The clocks
     were striking thirteen.

What can we see from this example? First of all, notice that the part between .def and .end (the body of the method) really is just Ruby code. The method takes no parameters because parameter passing is handled inside the Livetext engine: the words following the command name are split into an array, and the instance variable @args is set to that array. We usually refer to @args only through the method _args, which returns it.

The _args method is also an iterator. If a block is attached, that block will be called for every argument.
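The dual behavior of _args (plain accessor, or iterator when given a block) might look something like the sketch below. ArgsDemo is a hypothetical stand-in class, not the real Livetext::Helpers code.

```ruby
# Hypothetical stand-in showing the accessor/iterator pattern.
class ArgsDemo
  def initialize(args)
    @args = args
  end

  # Without a block: return the array of arguments.
  # With a block: call the block once for every argument.
  def _args
    return @args unless block_given?
    @args.each { |arg| yield arg }
  end
end
```

So a method body can write either params = _args or _args {|arg| ... }, depending on which reads better.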

We then create a string using these parameters and write it out using the _puts method. This really does do a puts call, but it applies it to wherever the output is currently being sent (defaulting to STDOUT).
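One plausible way to get this behavior is to keep the current output destination in an instance variable and forward to it. The class OutputDemo and the @output variable are assumptions for this sketch; the real helpers may be organized differently.

```ruby
require "stringio"

# Hypothetical sketch of output helpers that follow the current destination.
class OutputDemo
  def initialize(output = $stdout)
    @output = output   # defaults to standard output
  end

  def _puts(*args)
    @output.puts(*args)    # a real puts, sent to the current destination
  end

  def _print(*args)
    @output.print(*args)   # same, but without the trailing newline
  end
end
```

Passing a StringIO (or an open File) instead of the default redirects everything the helpers write, without any change to the methods that call them.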

All the "helper" methods start with an underscore so as to avoid name collisions. These are all stored in the Livetext::Helpers module (which also has some methods you will never use).

Here is the HTML output of the previous example:

     <h3>Chapter 1</h3>
     <h2>Why I Went to the Woods</h2>
     <hr>
     It was the best of times, and you can call me Ishmael. The clocks
     were striking thirteen.

What are some other helper methods? Here's a list.

     _args               Returns an array of arguments for the method (or an enumerator for that array)
     _data               A single "unsplit" string of all arguments in raw form
     _body               Returns a string (or enumerator) giving access to the text block (preceding .end)
     _puts               Write a line to output (STDOUT or wherever)
     _print              Write a line to output (STDOUT or wherever) without a newline
     _formatting         A function transforming boldface, italics, and monospace (Livetext conventions)
     _var_substitution   Substitute variables into a string
     _passthru           Feed a line directly into output after transforming and substituting

Note that the last three methods are typically not called in your own code. They could be, but it remains to be seen whether something that advanced is useful.



More examples

Suppose you wanted to take a list of words, more than one per line, and alphabetize them. Let's write a method called alpha for that. This exercise and the next one are implemented in the test suite.

Test: 013_example_alpha

Input:

     .def alpha
        text = _body.join
        text.gsub!(/\n/, " ")
        words = text.split.sort
        words.each {|w| _puts "    #{w}" }
     .end
     Here is an alphabetized list:

     .alpha
     fishmonger anarchist aardvark glyph gryphon
     halcyon zymurgy mataeotechny zootrope
     pareidolia manicotti quark bellicose anamorphic
     cytology fusillade ectomorph
     .end

     I hope that worked.

Output:

     Here is an alphabetized list:

         aardvark
         anamorphic
         anarchist
         bellicose
         cytology
         ectomorph
         fishmonger
         fusillade
         glyph
         gryphon
         halcyon
         manicotti
         mataeotechny
         pareidolia
         quark
         zootrope
         zymurgy

     I hope that worked.

I'll let that code stand on its own. Now suppose you wanted to allow columnar output. Let's have the user specify a number of columns (from 1 to 5, defaulting to 1).

Test: 014_example_alpha2

Input:

     .def alpha
        cols = _args.first
        cols = "1" if cols == ""
        cols = cols.to_i
        raise "Columns must be 1-5" unless cols.between?(1,5)
        text = _body.join
        text.gsub!(/\n/, " ")
        words = text.split.sort
        words.each_slice(cols) do |row|
          row.each {|w| _print '%-15s' % w }
          _puts
        end
     .end
     Here is an alphabetized list:

     .alpha 3
     fishmonger anarchist aardvark glyph gryphon
     halcyon zymurgy mataeotechny zootrope
     pareidolia manicotti quark bellicose anamorphic
     cytology fusillade ectomorph
     .end

     I hope that worked a second time.

Output:

     Here is an alphabetized list:

     aardvark       anamorphic     anarchist
     bellicose      cytology       ectomorph
     fishmonger     fusillade      glyph
     gryphon        halcyon        manicotti
     mataeotechny   pareidolia     quark
     zootrope       zymurgy

     I hope that worked a second time.

What if we wanted to store the code outside the text file? There is more than one way to do this.

Let's assume we have a file called mylib.rb in the same directory as the file we're processing. (Issues such as paths and security have not been addressed yet.) We'll stick the actual Ruby code in here (and nothing else).

   # File: mylib.rb
   
   def alpha
     cols = _args.first
     cols = "1" if cols == ""
     cols = cols.to_i
     raise "Columns must be 1-5" unless cols.between?(1,5)
     text = _body.join
     text.gsub!(/\n/, " ")
     words = text.split.sort
     words.each_slice(cols) do |row| 
       row.each {|w| _print '%-15s' % w }
       _puts 
     end
   end

Now the .lt file can be written this way:

    .mixin mylib
    Here is an alphabetized list:
   
    .alpha 3
    fishmonger anarchist aardvark glyph gryphon
    halcyon zymurgy mataeotechny zootrope
    pareidolia manicotti quark bellicose anamorphic
    cytology fusillade ectomorph
    .end
   
    I hope that worked a second time.

The output, of course, is the same.
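For the curious, here is one way a mixin of this kind could be implemented. It is a sketch under stated assumptions: MixinDemo, its simplified _puts, and the use of instance_eval are my own choices for illustration and are not taken from the Livetext source.

```ruby
require "tempfile"

# Hypothetical stand-in for the processor; not Livetext's real class.
class MixinDemo
  attr_reader :output

  def initialize
    @output = []
  end

  # Simplified helper: collect output lines instead of writing them.
  def _puts(str = "")
    @output << str
  end

  # Evaluate a file of plain method definitions in the context of this
  # object, so that each "def" becomes a method callable on it.
  def mixin(path)
    instance_eval(File.read(path), path)
  end
end

# Usage: write a tiny library to a temporary file, mix it in, call it.
Tempfile.create(["simple_mixin", ".rb"]) do |file|
  file.write("def hello_world\n  _puts 'Hello, world.'\nend\n")
  file.flush
  demo = MixinDemo.new
  demo.mixin(file.path)
  demo.hello_world
  puts demo.output.inspect   # prints ["Hello, world."]
end
```

Since the file is evaluated as ordinary Ruby with full access to the object, this approach shares the path and security concerns mentioned earlier.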

There is an important feature that has not yet been implemented (the require method). Like Ruby's require, it will grab Ruby code and load it; however, unlike mixin, it will load it into a customized object and associate a new sigil with it. So for example, the command .foobar would refer to a method in the Livetext::Standard module (whether predefined or user-defined). If we did a require on a file and associated the sigil # with it, then #foobar would be a method on that new custom object. I will implement this soon.
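Since this feature does not exist yet, the following is purely a speculative sketch of the idea of one handler object per sigil. Every name here (SigilDispatcher, register, dispatch, StandardCommands, CustomCommands) is hypothetical, and the eventual design may look quite different.

```ruby
# Speculative sketch: each sigil maps to its own handler object.
class SigilDispatcher
  def initialize(standard)
    @handlers = { "." => standard }   # the period is the standard namespace
  end

  # Hypothetical counterpart of a future ".require file sigil".
  def register(sigil, handler)
    raise ArgumentError, "sigil #{sigil.inspect} is already in use" if @handlers.key?(sigil)
    @handlers[sigil] = handler
  end

  # Send a command line to the handler that owns its leading sigil;
  # anything else is returned unchanged as plain text.
  def dispatch(line)
    handler = @handlers[line[0]]
    match   = /\A.([A-Za-z_]\w*)\s*(.*)\z/.match(line)
    return line unless handler && match
    handler.public_send(match[1], *match[2].split)
  end
end

class StandardCommands
  def foobar(*_args)
    "standard foobar"
  end
end

class CustomCommands
  def foobar(*_args)
    "custom foobar"
  end
end
```

With a CustomCommands object registered under #, the lines .foobar and #foobar reach two different methods of the same name, which is the separation of namespaces described above.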



Issues, open questions, and to-do items

This list is not prioritized yet.

  1. Add versioning information
  2. Clean up code structure
  3. Add RDoc
  4. Think about command line executable
  5. Write as pure library in addition to executable
  6. Package as gem
  7. Document: require, include, copy, mixin, errout, and others
  8. Need much better error checking and corresponding tests
  9. Worry about nesting of elements (probably mostly disallow)
  10. Think about UTF-8
  11. Document API fully
  12. Add _raw_args and let _args honor quotes
  13. Support quotes in .set values
  14. Support "namespaced" variables (.set code.font="whatever")
  15. Support functions ($$func) including namespacing
  16. Create predefined variables and functions (e.g., $_source_file, $(_line), $$_today)
  17. Support markdown-style bold/italics? (_markdown replaces _formatting method)
  18. Allow turning on/off: formatting, variable interpolation, function interpolation?
  19. .require with file and sigil parameters
  20. Comments passed through (e.g. as HTML comments)
  21. .run to execute arbitrary Ruby code inline?
  22. Concept of .proc (guaranteed to return no value, produce no output)?
  23. Exceptions??
  24. Ruby $SAFE levels?
  25. Warn when overriding existing names?
  26. Think about passing data in (erb replacement)
  27. Allow custom ending tag on raw method
  28. Ignore first blank line after .end? (and after raw-tag?)
  29. Allow/encourage custom passthru method?
  30. Must have sane support for CSS
  31. Support for Pygments and/or other code processors
  32. Support for gists? arbitrary links? other remote resources?
  33. Small libraries for special purposes (books? special Softcover support? blogs? PDF? RMagick?)
  34. Experiment with idea of special libraries having pluggable output formats (via Ruby mixin?)
  35. Imagining a lib that can run/test code fragments as part of document generation
  36. Create vim (emacs?) syntax files
  37. Someday: Support other languages (Elixir, Python, ...)
  38. .pry method?
  39. .irb method?
  40. Other debugging features
  41. Feature to "break" to EOF?
  42. .meth? method ending in ? takes a block that may be processed or thrown away (.else perhaps?)
  43. .dump to dump all variables and their values
  44. .if and .else?
  45. Make any/all delimiters configurable
  46. HTML helper? (in their own library?)