introducing esprima: blazing-fast javascript parser

December 13th, 2011 Tags: coding, esprima, javascript, parser

In a nutshell, Esprima (esprima.org) is a JavaScript parser written in pure JavaScript. In the near future, it will expand into something even cooler, but as of now it’s just a parser. It uses the common recursive descent approach. The main parsing routine is not machine generated; everything is written by hand. The output of the parser is a syntax tree in JSON, in a format compatible with the Mozilla Parser API.
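To make the recursive descent idea concrete, here is a minimal sketch for arithmetic expressions only. It is not Esprima's code: the `tokenize` and `parseExpression` helpers are hypothetical names made up for this illustration, and the grammar is a tiny subset. Only the node shapes (`Literal`, `Identifier`, `BinaryExpression`) are borrowed from the Mozilla Parser API.

```javascript
// Toy recursive descent parser for expressions such as "1 + 2 * x".
// Illustration only: the helper names are hypothetical, and only the
// node shapes mimic the Mozilla Parser API.

function tokenize(source) {
  // Numbers, identifiers, and a few punctuators; anything else
  // (whitespace included) is silently skipped in this sketch.
  return source.match(/\d+(?:\.\d+)?|[A-Za-z_$][\w$]*|[-+*\/()]/g) || [];
}

function parseExpression(source) {
  const tokens = tokenize(source);
  let index = 0;

  function peek() { return tokens[index]; }
  function next() { return tokens[index++]; }

  // Primary ::= Number | Identifier | "(" Additive ")"
  function parsePrimary() {
    const token = next();
    if (token === undefined) throw new SyntaxError('Unexpected end of input');
    if (token === '(') {
      const expr = parseAdditive();
      if (next() !== ')') throw new SyntaxError('Expected )');
      return expr;
    }
    if (/^\d/.test(token)) return { type: 'Literal', value: Number(token) };
    if (/^[A-Za-z_$]/.test(token)) return { type: 'Identifier', name: token };
    throw new SyntaxError('Unexpected token ' + token);
  }

  // Multiplicative ::= Primary (("*" | "/") Primary)*
  function parseMultiplicative() {
    let left = parsePrimary();
    while (peek() === '*' || peek() === '/') {
      const operator = next();
      left = { type: 'BinaryExpression', operator, left, right: parsePrimary() };
    }
    return left;
  }

  // Additive ::= Multiplicative (("+" | "-") Multiplicative)*
  function parseAdditive() {
    let left = parseMultiplicative();
    while (peek() === '+' || peek() === '-') {
      const operator = next();
      left = { type: 'BinaryExpression', operator, left, right: parseMultiplicative() };
    }
    return left;
  }

  const tree = parseAdditive();
  if (index < tokens.length) throw new SyntaxError('Unexpected token ' + peek());
  return tree;
}

// Print the syntax tree as JSON.
console.log(JSON.stringify(parseExpression('1 + 2 * x'), null, 2));
```

Each grammar rule maps to one function, and operator precedence falls out of which function calls which: the additive rule calls the multiplicative rule, so `2 * x` ends up nested inside the `+` node.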

The code is designed to be educational (no funky obfuscated tricks only a JavaScript ninja can decipher), self-explanatory (the terminology matches the official 258-page specification), and highly performant (it can tear apart the jQuery source code, and not the minified version, in less than 0.1 sec). It’s always challenging to pick the sweet spot that nails all three objectives, though I hope Esprima hits an optimal compromise.

As with any complex parser, unit testing is an integral part of the development. To ensure faithful compatibility with the Mozilla Parser API, hundreds of its tests have been imported as well. All in all, there are over a thousand tests. In addition, there is a benchmark suite consisting of the most common JavaScript libraries out there. The performance of various web browsers running the benchmark suite is depicted in the following chart (shorter is better). The test machine is a late-2010 iMac with a 3 GHz Intel Core i3.

If you think it’s not fast enough, wait for the improvements being made to the major JavaScript engines out there. Preliminary tests showed that the V8 engine in Chrome 17 (dev channel) executes the benchmark suite 1.7 times faster than Chrome 15. Related to that, JavaScriptCore in WebKit nightly improves the benchmark running time by 25% (and it keeps getting faster). In addition, Firefox 9 will feature type inference, which shows a 65% performance win when running the same benchmark suite.

What about mobile devices? As expected, they are rather slower at this kind of job, limited pretty much by CPU power. Some running times for the benchmark suite: 5.8 sec for Amazon Kindle Fire, 7.9 sec for Apple iPad 2, 12.8 sec for Nexus S, and 17.9 sec for Nokia N9.

Since Esprima is written in JavaScript, it runs wherever there is a decent implementation of JavaScript. Supported browsers are (among others) IE 8+, Firefox 3.5+, Safari 4+, Chrome 7+, and Opera 10.5+. As expected, Esprima can also be used in Node.js applications by installing the esprima package using npm.

The best way to try Esprima is right in the browser via the online syntax parser demo. Type in your code, and voila! Esprima will show you the corresponding syntax tree almost right away. There is also the operator precedence demo, inspired by a similar earlier demo. Besides checking whether one expression is equivalent to another, the demo also rewrites your expression as if you had written it with brackets to enforce the intended precedence, as illustrated in the accompanying screenshot.
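The bracket rewriting mentioned for the operator precedence demo can be sketched in a few lines. This is not the demo's actual code: `bracketize` and `precedenceSample` are hypothetical names, and the tree is written by hand in Mozilla Parser API shapes so the sketch does not depend on any parser.

```javascript
// Render an expression tree with explicit brackets around every binary
// operation, making the grouping implied by operator precedence visible.
// `bracketize` is a hypothetical helper, not part of Esprima.
function bracketize(node) {
  switch (node.type) {
    case 'Literal':
      return String(node.value);
    case 'Identifier':
      return node.name;
    case 'BinaryExpression':
      return '(' + bracketize(node.left) + ' ' + node.operator + ' ' +
        bracketize(node.right) + ')';
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

// Hand-written tree for "1 + 2 * 3": the multiplication binds tighter,
// so it sits below the addition in the tree.
const precedenceSample = {
  type: 'BinaryExpression',
  operator: '+',
  left: { type: 'Literal', value: 1 },
  right: {
    type: 'BinaryExpression',
    operator: '*',
    left: { type: 'Literal', value: 2 },
    right: { type: 'Literal', value: 3 }
  }
};

console.log(bracketize(precedenceSample)); // (1 + (2 * 3))
```

Two expressions whose fully bracketed forms are identical group their operands the same way, which is one simple way to compare them.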

Compared to other parsers, Esprima is one of the fastest. There is a whole speed comparison page which puts Esprima head-to-head against parse-js (famously known as part of UglifyJS), ZeParser, and Narcissus. Since Esprima does not yet output location information (see issue #6) the way ZeParser and Narcissus do, a pure speed benchmark is only fair between Esprima and parse-js. Here is the result, tested with different (stable version) browsers. Still not impressed? With the upcoming Chrome 17, Esprima will actually be 2x faster than parse-js.

So which parser should you pick? Narcissus has been around for a while, so its stability and correctness are well tested. It also supports various JavaScript extensions, as well as features from ES.next. Neither ZeParser nor parse-js is new anymore, so they are more battle-hardened than Esprima. Since the excellent minifier UglifyJS is based on parse-js, I would not be shocked if there are tons of peculiar JavaScript syntax that parse-js can handle really well. At the end of the day, I still hope that, as the new kid on the block, Esprima is attractive enough since it’s readable, easy to follow, heavily unit tested, and yet carries out the parsing task at blazing speed. Thus, if you feel adventurous, give Esprima a try!

Besides parsing code, Esprima can also optionally collect comments (see issue #71). Since this involves some extra steps, expect a minor performance penalty if you enable it. Once the comments are extracted, a bit of additional cross-referencing will allow you to associate certain comment blocks with parts of the code. This is extremely valuable for an automatic documentation tool.
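As a rough illustration of that cross-referencing step, the sketch below pairs each code item with the comment that sits directly in front of it. The data shapes are assumptions made up for this example (a `range: [start, end]` pair of character offsets on every comment and node, comments sorted by position), and `attachLeadingComments` is a hypothetical helper. None of this is Esprima's actual output format.

```javascript
// Pair each node with the comment that immediately precedes it.
// Assumed (hypothetical) shapes: comments and nodes both carry
// `range: [start, end]` character offsets, and comments are sorted.
function attachLeadingComments(source, comments, nodes) {
  return nodes.map(function (node) {
    // Candidate: the last comment that ends before the node starts.
    const before = comments.filter(function (c) {
      return c.range[1] <= node.range[0];
    });
    const candidate = before[before.length - 1];
    // Accept it only when nothing but whitespace separates the two.
    const gap = candidate ? source.slice(candidate.range[1], node.range[0]) : '';
    const doc = candidate && /^\s*$/.test(gap) ? candidate.value : null;
    return { name: node.name, doc: doc };
  });
}

// Sample input; offsets are computed rather than hard-coded.
const commentText = '/* Adds two numbers. */';
const addText = 'function add(a, b) { return a + b; }';
const subText = 'function sub(a, b) { return a - b; }';
const sampleSource = [commentText, addText, subText].join('\n');

function rangeOf(text) {
  const start = sampleSource.indexOf(text);
  return [start, start + text.length];
}

const sampleComments = [{ value: 'Adds two numbers.', range: rangeOf(commentText) }];
const sampleNodes = [
  { name: 'add', range: rangeOf(addText) },
  { name: 'sub', range: rangeOf(subText) }
];

// `add` gets the comment as its doc; `sub` has no adjacent comment.
console.log(attachLeadingComments(sampleSource, sampleComments, sampleNodes));
```

A documentation tool could then emit the `doc` text next to each function name. A real implementation would also need to decide how to treat trailing comments and comments nested inside function bodies.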

To keep an eye on Esprima development, go to its project page, watch the issue tracker for future plans, and join the discussion on the mailing list.

Get the code and express yourself!

P.S.: Special thanks to Thomas Aylott, Yusuke Suzuki, and Axel Rauschmayer for the useful initial discussion, suggestions, and feedback.

Related Posts

math expression evaluator in javascript: part 2 (parser)
math evaluator in javascript: part 1 (the tokenizer)
matching a decimal digit
parsing: imperative vs declarative

MySchizoBuddy

Can this be used as a learning tool for creating a parser for your own programming language?

ariya

An easier start would be math expression parser (see the Related Posts).

NiKo

Nice! I’ve just added test file syntax checking to CasperJS thanks to esprima, it works pretty well.

Tim M (www.mysparebrain.com/)

Nice one – I’ve worked on UglifyJS (github id schmerg) so it’s nice to see both a new fresh implementation (makes testing easier when there are more implementations) and the combination of pride-in-your-new-thing and due-respect-for-others.

The uglifyjs module has some nice walker routines that make it easy to write single routines that will walk the parse tree performing various actions – this makes sense as uglifyjs is fundamentally about modifying the parse tree, but it’s something worth doing nicely in a parser.

I see your parser discards comments – I presume this is in the tokenisation phase. Uglifyjs does this too, but for some features people want to put special hints for optimisations in comments. I know having all comments present in full in the parse tree would be a pain, but if you’re thinking of adding location information to the parse info, it might be an idea to make it easy to do things like “look for the comment before a statement” by keying the location back to a raw token stream index or similar.

ariya

Thanks for the feedback! As for comments and location info, check the issue tracker. They are being worked on (to a certain extent). The use case that you mentioned will be easily supported. Suggestions are welcome.

who is ariya

I am a passionate technologist, well versed in various hardware and software systems.
I am the author of PhantomJS and Esprima.
I believe in sharing and openness, I contribute to open-source projects: WebKit, Qt, KDE.
Follow me on Twitter: @ariyahidayat.
Imprint

This is a personal blog. All opinions expressed in this blog are my own and do not necessarily represent the official view of my employer.

Certain links, including hypertext links, in my blog will take you outside my blog. Links are provided for your convenience and inclusion of any link does not imply endorsement or approval of the linked site, its operator or its content. I am not responsible for the content of any website outside of my blog.

gipoco.com is neither affiliated with the authors of this page nor responsible for its contents. This is a safe-cache copy of the original web site.
