CAST -- Ruby's C parsing dog. Woof.

CAST parses C code into an abstract syntax tree (AST), lets you break it, then vomit it out as code. The parser does C99.

Library Overview

Usage

If there's a parse error, #parse raises a ParseError (which has a nice error message in #message).

The Parser

I bet you said "why you l4m3r n00b, that's a statement that multiplies a by b and throws away the answer -- now go take your meaningless snippetage to your computing 101 class and let me finish hurting this Java(TM) programmer." Well, you'd be both mean and wrong. It was, of course, a trick question. I didn't say whether either of a and b is a type! If only a is a type, it's actually a declaration. And if b is a type, it's a syntax error.
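To make the three cases concrete, here's a minimal sketch in plain Ruby. It is not CAST code: classify_star is a made-up helper, and the statement is assumed to have the shape "a * b;". It just applies the rules from the paragraph above to a set of known type names.

```ruby
require 'set'

# Hypothetical helper, not part of CAST: decide what the C fragment
# "lhs * rhs;" means, given the identifiers known to be type names.
def classify_star(lhs, rhs, type_names)
  if type_names.include?(rhs)
    :syntax_error   # per the text: if the right-hand name is a type, no dice
  elsif type_names.include?(lhs)
    :declaration    # "T * b;" declares b as a pointer to T
  else
    :expression     # plain multiplication, result thrown away
  end
end

puts classify_star('a', 'b', Set.new)          # prints "expression"
puts classify_star('a', 'b', Set['a'])         # prints "declaration"
puts classify_star('a', 'b', Set['a', 'b'])    # prints "syntax_error"
```

Same five characters of C, three different answers, and the only thing that changed was the set of type names.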

So, the parser's gonna need to know which identifiers are type names. This is one of the bits of state that a Parser keeps. Here's the complete list (um, of two):

A Node::Pos has three read-write attributes: #filename, #line_num and #col_num. The defaults are nil, 1 and 0, respectively.

The Nodes

Node
- TranslationUnit
- Declaration
- Declarator
- FunctionDef
- Parameter
- Enumerator
- MemberInit
- Member
- Statement
  - Block
  - If
  - Switch
  - While
  - For
  - Goto
  - Continue
  - Break
  - Return
  - ExpressionStatement
- Label
  - PlainLabel
  - Default
  - Case
- Expression
  - Comma
  - Conditional
  - Variable
  - UnaryExpression
    - PostfixExpression
      - Index
      - Call
      - Dot
      - Arrow
      - PostInc
      - PostDec
    - PrefixExpression
      - Cast
      - Address
      - Dereference
      - Sizeof
      - Plus
      - Minus
      - PreInc
      - PreDec
      - BitNot
      - Not
  - BinaryExpression
    - Add
    - Subtract
    - Multiply
    - Divide
    - Mod
    - Equal
    - NotEqual
    - Less
    - More
    - LessOrEqual
    - MoreOrEqual
    - BitAnd
    - BitOr
    - BitXor
    - ShiftLeft
    - ShiftRight
    - And
    - Or
  - AssignmentExpression
    - Assign
    - MultiplyAssign
    - DivideAssign
    - ModAssign
    - AddAssign
    - SubtractAssign
    - ShiftLeftAssign
    - ShiftRightAssign
    - BitAndAssign
    - BitXorAssign
    - BitOrAssign
  - Literal
    - StringLiteral
    - CharLiteral
    - CompoundLiteral
    - IntLiteral
    - FloatLiteral
- Type
  - IndirectType
    - Pointer
    - Array
    - Function
  - DirectType
    - Struct
    - Union
    - Enum
    - CustomType
    - PrimitiveType
      - Void
      - Int
      - Float
      - Char
      - Bool
      - Complex
      - Imaginary
- NodeList
  - NodeArray
  - NodeChain

The last two (the NodeLists) represent lists of nodes. Methodwise, they try to behave like normal Ruby ::Arrays. Implementationwise, a NodeChain is a doubly linked list, whereas a NodeArray is an array. NodeChains may be more efficient when adding things at the beginning of a LARGE list.
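If the linked-list bit is hazy, here's a toy doubly linked list in plain Ruby. To be clear, TinyChain is an invented illustration and says nothing about how NodeChain is really written; it only shows why adding at the front needn't care how long the list is.

```ruby
# Illustrative only: a minimal doubly linked list, not CAST's NodeChain.
class TinyChain
  include Enumerable

  Link = Struct.new(:value, :prev, :next)

  attr_reader :length

  def initialize
    @first = @last = nil
    @length = 0
  end

  # Prepending rewires a couple of links, whatever the list length.
  def unshift(value)
    link = Link.new(value, nil, @first)
    @first.prev = link if @first
    @first = link
    @last ||= link
    @length += 1
    self
  end

  # Appending is the mirror image.
  def push(value)
    link = Link.new(value, @last, nil)
    @last.next = link if @last
    @last = link
    @first ||= link
    @length += 1
    self
  end

  # Including Enumerable plus #each gives Array-ish methods like #map and #to_a.
  def each
    return to_enum(:each) unless block_given?
    link = @first
    while link
      yield link.value
      link = link.next
    end
  end
end

chain = TinyChain.new
chain.push(:b).push(:c).unshift(:a)
p chain.to_a   # prints [:a, :b, :c]
```

The price is that indexing into the middle means walking the links, which is where an array-backed list wins.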

Attributes

If you're a duck-typing purist, then sorry for the cardiac arrest you're now experiencing. CAST does pay attention to the class of Node objects for quite a few things. This is the cleanest way to distinguish, e.g., an Add from a Subtract (which both have the same methods but represent very different things). It just seems impractical (and unnecessary) to allow duck typing in this situation.
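A small sketch of what class-based dispatch looks like in practice. The classes below are stand-ins written for this example (the operand names expr1 and expr2 are assumptions too), so treat it as the general idea and not as CAST's own code.

```ruby
# Stand-in classes for illustration; not CAST's real node classes.
class BinaryExpression
  attr_reader :expr1, :expr2   # hypothetical operand names

  def initialize(expr1, expr2)
    @expr1 = expr1
    @expr2 = expr2
  end
end

class Add < BinaryExpression; end
class Subtract < BinaryExpression; end

# Both classes respond to exactly the same methods, so the class is the
# only thing that tells them apart.
def evaluate(node)
  case node
  when Add      then node.expr1 + node.expr2
  when Subtract then node.expr1 - node.expr2
  else raise ArgumentError, "unhandled node class: #{node.class}"
  end
end

puts evaluate(Add.new(5, 3))        # prints 8
puts evaluate(Subtract.new(5, 3))   # prints 2
```

A duck-typed version would have nothing to quack at: Add and Subtract look identical from the outside.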

This is not the same as "declarator.type.to_s == 'const int *'"; that'd require you to guess how to_s formats its strings (most notably, the whitespace).

Fields and children

Each concrete Node class has a member for each bit of important C stuff it pertains to. I know you peeked at the big list below, so you know the kind of thing I mean.

But these aren't defined as attrs as you normally do in Ruby -- they're fields. If a node has a field foo, it means there's a setter #foo= and getter #foo. (A field foo? means the setter is #foo= and the getter is #foo?.) Some fields are even more special: child fields. These form the tree structure, and always have a Node or nil value.
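For the curious, here's roughly how such a field declaration could be put together in plain Ruby. This is a guess at the general shape, not CAST's actual implementation: SketchNode, the field class method and its child: option are all invented for this sketch, and it ignores inherited fields.

```ruby
# A guess at the general shape, not CAST's actual implementation.
class SketchNode
  # Per-class list of declared fields.
  def self.fields
    @fields ||= []
  end

  # Declare a field. A trailing "?" on the name only changes the getter;
  # the setter is always "name=" without the question mark.
  def self.field(name, child: false)
    base = name.to_s.chomp('?')
    fields << { name: base, child: child }
    define_method(name) { instance_variable_get("@#{base}") }
    define_method("#{base}=") { |value| instance_variable_set("@#{base}", value) }
  end

  # Only child fields holding a non-nil value count as children.
  def children
    self.class.fields
        .select { |f| f[:child] }
        .map { |f| instance_variable_get("@#{f[:name]}") }
        .compact
  end
end

class SketchReturn < SketchNode
  field :expr, child: true
  field :reviewed?
end

ret = SketchReturn.new
ret.expr = SketchReturn.new
ret.reviewed = true
p ret.reviewed?         # prints true
p ret.children.length   # prints 1
```

Because the class keeps a list of its fields, generic methods can loop over them without knowing anything about the particular node class.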

Why divulge these bizarre internal secrets? Because these Node methods are defined in terms of fields and children:

Then there's the tree-twiddling methods, which only ever yield/return/affect (non-nil) children.

If you're walking the tree looking for nodes to move around, don't forget that modifying the tree during traversal is a criminal offence.
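One way to stay on the right side of the law is to walk first and mutate afterwards. The sketch below uses a throwaway TreeNode struct invented for the example (it has nothing to do with CAST's classes); the two-pass pattern is the point.

```ruby
# Illustrative tree, unrelated to CAST's classes.
TreeNode = Struct.new(:name, :children) do
  # Yield this node, then each descendant, parents before children.
  def each_preorder(&block)
    return to_enum(:each_preorder) unless block
    block.call(self)
    children.each { |child| child.each_preorder(&block) }
  end
end

tree = TreeNode.new('root', [
  TreeNode.new('keep', []),
  TreeNode.new('drop', [TreeNode.new('drop', [])]),
  TreeNode.new('keep', [])
])

# Pass 1: walk the tree without touching it and remember what to change.
parents_to_prune = tree.each_preorder.select do |node|
  node.children.any? { |child| child.name == 'drop' }
end

# Pass 2: the traversal is over, so mutating is now safe.
parents_to_prune.each do |parent|
  parent.children.reject! { |child| child.name == 'drop' }
end

p tree.each_preorder.map(&:name)   # prints ["root", "keep", "keep"]
```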

And now, the episode you've been waiting for: THE FIELD LIST! (Cue music and dim lights.)

Node Construction

They're for losers, though. What you really want to do is make Nodes by parsing C code. Each class -- even the abstract classes like Statement -- has a .parse method:

Need to tell it to treat WaffleIron as a type name? All those parse methods use C.default_parser:

In fact, there's also C.parse(str, parser=nil), which is an alias for C::TranslationUnit.parse(str, parser).
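The optional-parser fallback can be pictured like this. Everything in the Sketch module is a stand-in written for this example: the method names mirror the text, but the bodies are invented, and the type_names accessor is an assumption about what a parser's type-name state might look like.

```ruby
# Stand-ins for illustration; the bodies are invented, not CAST's.
module Sketch
  class Parser
    attr_reader :type_names   # assumed name for the known-type-names state

    def initialize
      @type_names = []
    end

    # Pretend to parse: just report what the parser knew at the time.
    def parse(source)
      { source: source, type_names: type_names.dup }
    end
  end

  # One shared parser, created on first use.
  def self.default_parser
    @default_parser ||= Parser.new
  end

  # With no parser given, fall back to the shared default.
  def self.parse(source, parser = nil)
    (parser || default_parser).parse(source)
  end
end

Sketch.default_parser.type_names << 'WaffleIron'
p Sketch.parse('WaffleIron x;')[:type_names]                       # prints ["WaffleIron"]
p Sketch.parse('WaffleIron x;', Sketch::Parser.new)[:type_names]   # prints []
```

Anything taught to the default parser sticks around for later calls, while a parser passed in explicitly starts from its own state.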

Yes, all that talk in the intro about doing parser = C::Parser.new; parser.parse(...) was actually all a charade to make you type more. I so own you.

Open Issues

To Do

Contact

You can spam me at george.ogata@gmail.com. It'd help if you prefixed the subject with "[cast] " so I can easily distinguish CAST spam from fake Rolex spam.
