Ragel State Machine Compiler
What is Ragel?
Ragel compiles executable finite state machines from regular languages. Ragel
targets C, C++, Objective-C, D, Java and Ruby.
Ragel state machines can not only recognize byte sequences, as regular
expression machines do, but can also execute code at arbitrary points
in the recognition of a regular language. Code is embedded using inline
operators that do not disrupt the regular language syntax.
The core language consists of standard regular expression operators (such as
union, concatenation and Kleene star) and action
embedding operators. The user's regular expressions are compiled to a
deterministic state machine and the embedded actions are associated with the
transitions of the machine. Understanding the formal relationship
between regular expressions and deterministic finite automata is key to using
Ragel effectively.
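To make the relationship between a regular expression and its deterministic machine concrete, here is a minimal hand-written C sketch. It is not Ragel output. It assumes the pattern `[0-9]+ ( '.' [0-9]+ )?` and a single "digit" action that only counts how often it runs; the names `match_decimal`, `ST_START` and so on are hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical hand-written DFA for the pattern [0-9]+ ( '.' [0-9]+ )?
 * ST_INT and ST_FRAC are the accepting states. */
enum { ST_START, ST_INT, ST_DOT, ST_FRAC, ST_ERR };

/* Returns 1 if s[0..len) matches the pattern, 0 otherwise.
 * *digits receives the number of times the "digit" action ran,
 * i.e. the number of digit transitions taken before stopping. */
int match_decimal(const char *s, size_t len, int *digits)
{
    int st = ST_START;
    *digits = 0;

    for (size_t i = 0; i < len; i++) {
        char c = s[i];
        int is_digit = (c >= '0' && c <= '9');

        switch (st) {
        case ST_START:
        case ST_INT:
            if (is_digit) {
                (*digits)++;          /* action attached to the transition */
                st = ST_INT;
            } else if (c == '.' && st == ST_INT) {
                st = ST_DOT;          /* '.' is only legal after a digit */
            } else {
                st = ST_ERR;
            }
            break;
        case ST_DOT:
        case ST_FRAC:
            if (is_digit) {
                (*digits)++;          /* same action, different transition */
                st = ST_FRAC;
            } else {
                st = ST_ERR;
            }
            break;
        default:
            st = ST_ERR;
            break;
        }

        if (st == ST_ERR)
            return 0;
    }
    return st == ST_INT || st == ST_FRAC;
}
```

The point of the sketch is that the action lives on a transition rather than on a state: the same counting code runs on every digit edge, whichever state the edge leaves from.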
Ragel also provides operators that let you control any non-determinism that
you create, construct scanners, and build state
machines using a statechart model. It is also possible to influence the
execution of a state machine from inside an embedded action by jumping or
calling to other parts of the machine, or reprocessing input.
Ragel provides a very flexible interface to the host language that
attempts to place minimal restrictions on how the generated code is
integrated into the application. The generated code has no
dependencies.
action dgt      { printf("DGT: %c\n", fc); }
action dec      { printf("DEC: .\n"); }
action exp      { printf("EXP: %c\n", fc); }
action exp_sign { printf("SGN: %c\n", fc); }
action number   { /*NUMBER*/ }

number = (
    [0-9]+ $dgt ( '.' @dec [0-9]+ $dgt )?
    ( [eE] ( [+\-] $exp_sign )? [0-9]+ $exp )?
) %number;

main := ( number '\n' )*;

Ragel compiles a specification like this into C code such as the following excerpt:

st0:
    if ( ++p == pe )
        goto out0;
    if ( 48 <= (*p) && (*p) <= 57 )
        goto tr0;
    goto st_err;
tr0:
    { printf("DGT: %c\n", (*p)); }
st1:
    if ( ++p == pe )
        goto out1;
    switch ( (*p) ) {
        case 10: goto tr5;
        case 46: goto tr7;
        case 69: goto st4;
        case 101: goto st4;
    }
    if ( 48 <= (*p) && (*p) <= 57 )
        goto tr0;
    goto st_err;
What kind of task is Ragel good for?
- Writing robust protocol implementations.
- Parsing data formats.
- Lexical analysis of programming languages.
- Validating user input.
Features
- Construct finite state machines using:
- regular language operators
- state chart operators
- a scanner operator
- some mix of the above
- Embed actions into machines in arbitrary places.
- Control non-determinism using guarded operators.
- Minimize state machines using Hopcroft's algorithm.
- Visualize output with Graphviz.
- Use byte, double byte or word-sized alphabets.
- Generate C, C++, Objective-C, D, Java or Ruby
code with no dependencies.
- Choose from table or control flow driven state machines.
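To illustrate what "table driven" means in the last feature above, here is a small hand-written C sketch of a table-driven recognizer, as opposed to a control-flow (goto) driven one. It is an assumption-laden illustration, not code generated by Ragel: the pattern `[+\-]? [0-9]+`, the character classes, and the names `classify`, `next_state` and `match_signed_int` are all hypothetical.

```c
#include <stddef.h>

/* Character classes that partition the input alphabet. */
enum { CL_SIGN, CL_DIGIT, CL_OTHER, NUM_CLASSES };

/* States of the machine; S_DIGITS is the only accepting state. */
enum { S_START, S_SIGN, S_DIGITS, S_ERR, NUM_STATES };

/* Map an input byte to its character class. */
static int classify(unsigned char c)
{
    if (c == '+' || c == '-') return CL_SIGN;
    if (c >= '0' && c <= '9') return CL_DIGIT;
    return CL_OTHER;
}

/* Transition table: next_state[current state][character class]. */
static const unsigned char next_state[NUM_STATES][NUM_CLASSES] = {
    /* S_START  */ { S_SIGN, S_DIGITS, S_ERR },
    /* S_SIGN   */ { S_ERR,  S_DIGITS, S_ERR },
    /* S_DIGITS */ { S_ERR,  S_DIGITS, S_ERR },
    /* S_ERR    */ { S_ERR,  S_ERR,    S_ERR },
};

/* Returns 1 if s[0..len) matches [+\-]? [0-9]+, 0 otherwise. */
int match_signed_int(const char *s, size_t len)
{
    int st = S_START;

    /* The driver loop is the same for every machine; only the table changes. */
    for (size_t i = 0; i < len && st != S_ERR; i++)
        st = next_state[st][classify((unsigned char)s[i])];

    return st == S_DIGITS;
}
```

In a table-driven machine the states and transitions are data and one generic loop interprets them, whereas a control-flow driven machine encodes each state as a label and each transition as a jump.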
Publications
[1] Adrian D. Thurston. "Parsing Computer Languages with an
Automaton Compiled from a Single Regular Expression." In 11th International
Conference on Implementation and Application of Automata (CIAA 2006), Lecture
Notes in Computer Science, volume 4094, pp. 285-286, Taipei, Taiwan, August
2006. pdf.
Documentation, Editors and Mailing List
Ragel has a user guide available in PDF format as well as a man page.
Major version number releases
contain language changes. See the ChangeLog and Release Notes for details.
If you use Vim, there is a syntax file ragel.vim
for your editing pleasure. If you use TextMate there is a Ragel bundle
Ragel.tmbundle.
The Ragel mailing list is available here: ragel-users.
Ask for help, post parsing problems, or tell us what you think of Ragel.
Links
- NEW: The OverSIP Project includes several parsers written in Ragel.
- Connecting Ragel to Bison in C++
- Stream Parser with Ragel and Ruby
- Does your project use Ragel? Make the docs sexy with Dexy.
- A simple intro to writing a lexer with Ragel
- Jitify Web Acceleration with lexers in Ragel.
- Calculating the mean selector specificity, using a Ragel-generated
CSS3 parser: project, blog post.
- Article on the use of Ragel in embedded systems.
- Lighttpd sandbox (which will become 2.0) uses Ragel. link.
- There are Ragel lexers for the Pygments syntax highlighting system. link.
- An implementation of W3C Selectors in Java. link.
- Source code line counting. link.
- Screenplay typesetting. link.
- EaRing, an assembler using Ragel and Lemon. link.
- An include file dependency scanner, mostly for C files. link.
- CroMo: Morphological analysis of Croatian and other languages. link.
- An ESI server derived from Mongrel. link.
- A little assembler that uses Ragel for scanning and Lemon for parsing. link.
- An article in Japanese on using Ragel. link.
- Using Ragel for scanning wikitext. link.
- devChix article: A Hello World for Ruby on Ragel (updated).
- A Brazilian Portuguese translation of the devChix article:
Um Hello World para o Ruby em Ragel 6.0
- Perian is a QuickTime component that
adds native support for many popular video formats.
- ABNF Generator is a tool which
accepts grammars in ABNF and outputs Ragel definitions.
- Qfsm is a graphical tool
for designing state machines. It includes a Ragel export feature.
- Layout Expression Language (part of Profligacy)
is for building Swing GUIs with JRuby.
- RaSPF is an SPF library in C.
- appid: single-pass application protocol identification.
- Utu: internet communication with cryptographically enforced identity,
reputation and retribution.
- Lib2geom: a computational geometry library for Inkscape.
- A JSON parser for Pike. link
- json: A JSON parser and generator for Ruby.
- SuperRedCloth is a snappy implementation of Textile for Ruby.
- Zed Shaw on Ragel State Charts. link
- RFuzz is an HTTP destroyer.
- Hpricot
is an HTML parsing and manipulation library for Ruby.
- Mongrel
is an HTTP library and server for Ruby.
- Using Ragel and Xcode. link.
Examples
Clang: a scanner for a simple C-like language. clang.rl
Mailbox: parses Unix mailbox files. It breaks files into separate messages,
and each message into its headers and body. mailbox.rl
AwkEmu: performs the basic parsing that the awk program performs
on input. awkemu.rl
Atoi: converts a string to an integer.
atoi.rl
Concurrent: performs multiple independent tasks concurrently.
concurrent.rl
StateChart: the Atoi machine built with the named state and transition list paradigm.
statechart.rl
GotoCallRet: demonstrates the use of fgoto, fcall, fret and fhold.
gotocallret.rl
Params: parses command line arguments.
params.rl
RagelScan: scans ragel input files.
rlscan.rl
CppScan: a C++ scanner that uses the longest-match method of scanning.
cppscan.rl
Download
Source Code Repository:
git://git.complang.org/ragel.git
Tar.Gz: The latest release is ragel-6.8.tar.gz (sig).
Older: Previous versions are available here.
Debian:
The homepage for the Debian package of Ragel is
here.
It is by Robert Lemmen.
OpenPKG: Ragel has been included in the OpenPKG project.
FreeBSD: A port
for Ragel is available in the FreeBSD ports system.
NetBSD: There is a package
for Ragel in the pkgsrc database.
Mac OS X: A port
is available in the MacPorts repository.
Crux: A port for the
Crux
Linux distribution is available
here.
Gentoo: A Gentoo port
is available.
Suse: Packages for Suse can be found here.
Windows: Ragel can be compiled using Cygwin or MinGW. Binaries compiled
with Visual Studio are available
here (6.7), provided by Joseph Goettgens. (landing)
Redhat/Fedora: A package for Ragel is available in Fedora Extras.
Slackware: A package is available at LinuxPackages.net
Version: 6.8
Date: Feb 11, 2013
Change Log: ChangeLog
Tar.Gz: ragel-6.8.tar.gz (sig)
Zip: ragel-6.8.zip (sig)
User Guide PDF: ragel-guide-6.8.pdf
The public key for package signing is here.
License
Ragel is released under the GNU General Public License. A copy of
the license is included in the distribution. It is also available from
GNU.
Note: parts of Ragel output are copied from Ragel source covered by the
GNU GPL. As a special exception to the GPL, you may use the parts of Ragel
output copied from Ragel source without restriction. The remainder of Ragel
output is derived from the input and inherits the copyright status of the
input file. Use of Ragel makes no requirements about the license of generated
code.
Credits
Ragel was written by Adrian Thurston.
It was originally developed in early 2000 and was first released in
January 2002. Many
people have contributed feedback, ideas and code. Please have a look at the CREDITS file.