Ruby / Rails Jobs

Previous Post: Announcing: From .NET to Rails - January 13, Chicago
Next Post: Google Video API for Ruby

Recursive Descent Parser for Ruby

Permanent Link  |   spacer Del.icio.us  |   spacer Cosmos  |   spacer Digg

Sometimes strange things happen. I've been developing a small, basic recursive descent parser for Ruby called RDParse. Just before writing this post I decided to Google that name, and lo and behold the first result is a Ruby recursive descent parser called RDParse, created by Dennis Ranke, that I posted to Code Snippets for posterity several months ago. Since both of these libraries are unlikely to be used at once and that Dennis doesn't seem to be maintaining his version, I've decided to stick with RDParse as the name of mine for now.

You can download my RDParse as rdparse.rb.txt, just rename it at your end if you want to use it. To use it, you'd do something like this (no syntax coloring as the Syntax gem doesn't appear to cope with the complexity):

require 'rdparser'

parser = RDParser.new do |g|
  g.main                'line(s)'
  g.line                'expression separator(?) comment(?)'
  g.comment             '"#" rest_of_line'
  g.rest_of_line        /.+$/
  g.separator           /;/
  g.expression          'term operation expression | term'
  g.term                'number | variable | string | brkt_expression'
  g.brkt_expression     '"(" expression ")"'
  g.number              /d+(.d+)?/
  g.operation           /[+-*/]/
  g.variable            /[a-z][a-z0-9]*/
  g.string              %r(["'](.*?[^\]|.*?)["'])
end

content = %q{
  (34 - 3) * 42;   # Comment here..
  "a" + "bcd"
}

syntax_tree = parser.parse(:main, content)
puts RDParser.text_syntax_tree(syntax_tree)

Here, a grammar is defined within the RDParser.new block, although it can also be passed in as a parameter in a hash. This grammar (for a nonsense language that can only perform basic expressions) is used to fuel a parser that generates the following syntax tree from the code in the content variable:

line
  expression
    term
      brkt_expression
        "(" => (
        expression
          term
            number => 34
          operation => -
          expression
            term
              number => 3
        ")" => )
    operation => *
    expression
      term
        number => 42
  separator => ;
  comment
    "#" => #
    rest_of_line => Comment here..
line
  expression
    term
      string => "a"
    operation => +
    expression
      term
        string => "bcd"

It's simple but effective. My initial version had callbacks and some other features that I discovered I didn't really need (and which I'd poorly implemented anyway). This version just does the basics, lexing and enough parsing to get a tree.

I'm not releasing it as a Gem or on RubyForge yet as it has a long way to go, but for anyone interested in this stuff, see what grammars you can knock up! Next step.. error reporting.

This entry was posted on Monday, November 13th, 2006 at 7:54 am and is filed under Miscellaneous, Ruby Tricks. You can leave a response, or trackback from your own site.

One Response to “Recursive Descent Parser for Ruby”

  1. jay Says:
    November 15th, 2006 at 10:19 am

    Could RDParse also benefit from the so-called packrat parsing algorithm in some way?
    meta-meta.blogspot.com/2006/04/packrat-parsers.html

Leave a Reply


gipoco.com is neither affiliated with the authors of this page nor responsible for its contents. This is a safe-cache copy of the original web site.