2 Nov, 2012

by Carl Mäsak

Macros progress report: D2 merged

The grant that I'm currently working on, the macros grant, has now reached its D2 milestone. That is, the so-called "unquotes" work as advertised in Rakudo:

macro apply($code, $argument) {
    quasi {
        {{{$code}}}( {{{$argument}}} );
    }
}

apply sub ($t) { say $t }, "OH HAI";    # prints "OH HAI"

Macros are routines, and so they take parameters. The above apply macro takes some $code and an $argument, and calls the former with the latter. It's as if, when the macro expansion is all done, what's left in the code is the following line:

(sub ($t) { say $t })("OH HAI");    # prints "OH HAI"

Of course, we never actually see this line, and in the compiler it's never textually substituted like that, because the substitution all happens on the level of syntax trees, not on the level of text.

The new thing in this picture is the {{{ }}} thingies: the so-called unquotes. Back in my last progress report, we still didn't have unquote support in nqp. Now we do. In fact, we got unquote support already back in August. That, it turns out, was the easy bit.

Then the conceptual problems appeared. For a few months, whenever I thought about macros, my brain would melt trying to think about those problems. It took quite a while to go from unquotes existing to them being actually useful. What follows is an explanation of the problem and its solution.

The problem is one of context. By that, we mean the variable bindings seen by a piece of code. Psychologically, we expect a piece of code to see the variables in its lexical environment, that is, all the variables declared in all surrounding blocks.

my $a;

sub f {
    my $b;
    sub {
        my $c;
        # $a, $b, $c visible here
    }
}

The exciting and highly useful thing about closures is that they honor this expectation, while simultaneously being first-class values that you can pass around between parts of your program. This combination of static bindings and dynamic function values is so powerful that you can use it to emulate the object encapsulation so espoused by OO enthusiasts.

In the above case, the sub f implicitly returns its inner sub, which can be transported across the Russian tundras, stored in a dank wine cellar for 75 years before being uncorked... but when finally called, it will still remember its $a, $b, and $c bindings. That's because closures aren't just containers of statements. They also hold a reference to an OUTER block through which variable lookups can be made.

(And in the above case, $b and $c are properly encapsulated. $a isn't, since it's globally visible.)
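As a minimal sketch of that encapsulation claim, here is a closure-based counter in Perl 6. The names make-counter, increment, and current are made up for this illustration. The $count variable can only be reached through the two subs that closed over it:

```raku
# Hypothetical example; make-counter and its keys are illustrative names.
sub make-counter() {
    my $count = 0;    # reachable only through the closures below
    return %(
        increment => sub () { ++$count },
        current   => sub () { $count },
    );
}

my %counter = make-counter();
%counter<increment>();
%counter<increment>();
say %counter<current>();    # 2

# A second counter gets its own, independent $count.
my %other = make-counter();
say %other<current>();      # 0
```

Each call to make-counter creates a fresh $count, and each pair of subs keeps a reference to the block it was created in, which is the OUTER link described above.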

We want macros to behave the same. That is, quasi blocks should behave like closures.

macro f {
    my $a = "OH HAI";
    quasi {
        say $a;
    }
}

my $a = "B... BOOOOM!";
f;      # OH HAI

It's the same principle: after the f; call has been conceptually replaced by say $a; this code should still remember its context, its origins, namely the macro body. The fact that say $a; doesn't print "B... BOOOOM!", from the variable in the mainline scope, is part of what's called hygiene. Hygiene means that just like with closures, bindings inside are isolated from bindings outside by default.

(The term "hygiene" is often conflated in people's minds with the term "AST-based macros". The two are not the same. AST-based macros are necessary but not sufficient for hygiene. End of rant.)

But wait a minute. These two situations are obviously very similar. In the case of the closure, we know that the closure must keep an OUTER reference to remember its context. What is it in the macro case that remembers the context?

The quasi construct generates an AST, a syntax tree, that then gets spliced into the mainline code where the f; call used to be. This AST must be the vessel for the context information. So, just like a closure is a bunch of statements plus a context, an AST object must be a tree plus the context information. If the AST didn't have a context, the above macro expansion couldn't be hygienic.
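The idea of "a tree plus a context" can be modeled with a toy class. This is only a sketch of the concept: ToyAST, expand, and expand-without-context are invented names, and scopes are modeled as plain hashes, which is not how Rakudo represents them.

```raku
# Toy model only; none of these names are Rakudo internals.
class ToyAST {
    has Str $.variable;    # the "tree": a single variable lookup
    has %.context;         # bindings visible where the quasi was written
}

# Hygienic expansion: the lookup goes through the context carried by
# the AST, and the mainline bindings are ignored.
sub expand(ToyAST $ast, %mainline) {
    return $ast.context{$ast.variable};
}

# What would happen if the AST carried no context of its own.
sub expand-without-context(ToyAST $ast, %mainline) {
    return %mainline{$ast.variable};
}

my %macro-body = '$a' => "OH HAI";
my %mainline   = '$a' => "B... BOOOOM!";

my $quasi-ast = ToyAST.new(variable => '$a', context => %macro-body);
say expand($quasi-ast, %mainline);                    # OH HAI
say expand-without-context($quasi-ast, %mainline);    # B... BOOOOM!
```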

We must perform unholy surgery on the block that eventually results from the quasi AST. The block will naturally have mainline context, but we want to recontext it to have macro context. So in the Rakudo macro expansion code, there is some code that transplants the context from the AST object to the new block. It involves a Rakudo-specific op called perl6_get_outer_ctx. It's only used for this code path.

This much was clear already when I was merging D1. Now for the new complications.

Macro expansions consist of two stages of substitution, and this is what makes them useful:

Unquotes are replaced by ASTs, typically arguments originating from the outside of the macro.
The macro invocation is replaced by the AST returned from applying the macro to its arguments.

When implementing D1, I sorted out my thoughts by writing lots of ruminating gists. During this phase of the work, I've composed fewer gists, but an unexpected thing happened: the more time passed, the more I realized how much I had misnamed the variables in the macro code I had contributed to Rakudo.

It wasn't that I was careless about naming when I first wrote that code. Instead, my understanding of the macro domain had shifted so much that the choices of names I had made started to feel wrong. Today I landed a long-awaited refactor which not only unified the three macro-invocation code paths, but also fixed all the now slightly-off variable names. Quite a relief.

Here's part of what changed. During D1, a lot of AST objects in source ended up being called quasi ASTs. Nowadays, the following distinction is made:

quasi ASTs are what quasi blocks generate. Naturally.
argument ASTs are the things that the parser generates as it parses the macro arguments, just before it invokes the macro.
macro ASTs are what's returned from a macro, to be spliced back into the mainline code.

There is overlap. A macro AST has been generated as either a quasi AST or an argument AST at some point or other. But the focus here is where the ASTs are coming from. And it turns out that matters a lot. Quasi ASTs and argument ASTs are quite different. Hence the need for precision.

By the way, there is possibly a fourth kind of AST, one that we don't have yet, but that is totally possible once people start building macro libraries and stuff:

synthetic ASTs, syntax trees built up programmatically from individual AST nodes or smaller ASTs.

Don't know yet if that's going to become a reality. Until it does, quasi blocks fill much the same role.

Once we had unquotes working in Rakudo, the one glaring omission was that the unquotes didn't behave hygienically. Which was a shame because, again, people really expect hygiene to work:

macro test($value) {
    my $a = "B... BOOOOM!";
    quasi {
        say {{{$value}}};
    }
}

my $a = "OH HAI";
test $a;    # OH HAI nowadays, used to B... BOOOOM!

Just as the quasi AST should remember its own original context, so should the argument AST that ends up in $value. It used not to, and so the context it got was the quasi's, resulting in "B... BOOOOM!" above. A little ironic that it was the successful recontexting of the quasi AST that messed things up for the argument AST.

For months I struggled with the problem of how to recontext the argument ASTs. I developed a solution in a branch, which finally worked as it should, except that it still didn't recontext the ASTs properly! Argh!

My plan of attack had been to set the context at the time of unquote evaluation, as the quasi is evaluated when running the macro. The other day, jnthn pointed out that this approach may be overly complicated: maybe the context could be set at the time of argument collection, just before calling the macro. This is definitely simpler. Not least because at this point, the parser actually is in the context it wants to set! And in particular, no block surgery was needed this time.

I tried it. It worked. This solution almost feels too simple, and I'm not sure yet it will let us do all the things we want to do. But all the tests pass, and I have hammered this solution with tricky situations that might break, and it's holding up so far. So, we now have hygienic macros with unquotes in Rakudo.
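The difference between the two timings can be sketched with a toy model. Everything here is an assumption made for illustration: scopes are plain hashes, an argument is just a variable name, and unquote-late and collect-argument are invented names. The real work happens on syntax trees inside the compiler.

```raku
# Toy model; none of these names come from Rakudo.
my %quasi-scope  = '$a' => "B... BOOOOM!";   # bindings inside macro test
my %caller-scope = '$a' => "OH HAI";         # bindings at the call site

# Old behavior: the argument is just a name, resolved wherever it lands.
sub unquote-late(Str $name, %landing-scope) {
    return %landing-scope{$name};
}

# New behavior: while collecting the argument, wrap it in a closure
# over the scope that is current at that moment.
sub collect-argument(Str $name, %scope-at-call-site) {
    return sub () { %scope-at-call-site{$name} };
}

say unquote-late('$a', %quasi-scope);    # B... BOOOOM!

my $thunk = collect-argument('$a', %caller-scope);
say $thunk();                            # OH HAI
```

Capturing the context early means nothing has to be repaired afterwards: by the time the argument reaches the quasi, it already knows where it came from.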

Here are the macro-related gists that I wrote during this period. They are in various states of obsolescence at this point, but still potentially informative:

Proceedings: What are macros?
macros use case by FROGGS
It's all about context
pack, unpack, pack, unpack

The other artifacts that have emerged since D1 are as follows:

A new spectest file. Also, thanks to a suggestion by moritz++, the macro spectest files are now much better named.
A number of commits to the nom branch of Rakudo:
- can parse unquotes in quasis
- backpedal on throwing an exception
- <statementlist>, not <EXPR>
- implement unquote splicing
- X::TypeCheck::MacroUnquote -> X::TypeCheck::Splice
- throw X::TypeCheck::Splice everywhere
- make comment more precise
- refactor
- wrap macro-arg ASTs in thunks
- unify macro code paths
- make macro expansion ignore empty ASTs
A number of commits to the nqp project:
- added QAST::Unquote
- added .evaluate_unquotes method to QAST nodes
- add .evaluate_unquotes to BVal and Block
- shallow-clone nodes with kids
Two more deliveries of the macros talk: one at French Perl Workshop in Strasbourg, and one at YAPC::Europe in Frankfurt.

And once again, it's time to glance at what's ahead in the grant work. D3 promises to deliver hygiene. As explained above, D2 already provides this; I actually could have declared the milestone D2 finished at the point I got unquotes working in Rakudo (in August), but it felt slightly disingenuous to do so, because unquotes aren't really useful until they're fully hygienic. Anyway, half of D3 ends up being already done. What still needs to be implemented is the COMPILING:: pseudopackage, which gives the macro author several ways to opt out of hygiene. This is sometimes very powerful, even if it makes sense for it not to be the default.

My grant reports have been sparse lately. I'm hopeful that the wait until the next one won't be as long.

26 Oct, 2012

by Carl Mäsak

Sweet ports

I had invited jnthn over because I had invested in a bottle of nice port wine that I really wanted to try, but one whole bottle was too much for me. For both of us, during the course of an afternoon, it was about right.

So that was the premise. Because we both like bad puns — jnthn admittedly more than I do — we decided to make it a "porting hackathon", where we would port some piece of software to Perl 6 while sipping the sweet beverage from Portugal.

Which piece of software did we end up porting?

JSON::Path from CPAN. It's a module that does for JSON what XPath did for XML.

Well, actually JSON::Path is not very tied to the JSON format at all. It just expects your data to be a hierarchy of arrays, hashes, and scalar values, just like JSON.

Here are some JsonPath examples, just to give you a taste of it:

$.class.student[0].name           name of first student in class hash
$['class']['student'][0]['name']  same, with an alternate notation
$.beers[*]                        all beers
$.beers[*].name                   names of all beers
$..author                         recursively find all 'author' entries

See here for a more detailed specification.
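To make the last example concrete, here is a rough plain-Perl 6 sketch of what $..author asks for. It is not how JSON::Path does it; find-recursive is a hypothetical helper and the data is made up.

```raku
# Hypothetical helper, not part of JSON::Path.
sub find-recursive($node, Str $key) {
    my @found;
    if $node ~~ Associative {
        # Collect a match at this level, then keep digging.
        @found.push($node{$key}) if $node{$key}:exists;
        @found.append(find-recursive($_, $key)) for $node.values;
    }
    elsif $node ~~ Positional {
        @found.append(find-recursive($_, $key)) for $node.list;
    }
    return @found;
}

my %library =
    books => [
        { author => "A", title => "First" },
        { author => "B" },
    ],
    magazine => { author => "C" };

# Hash order is not guaranteed, so sort before printing.
say find-recursive(%library, 'author').sort.join(", ");    # A, B, C
```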

Any nice insights along the way?

Yes, we found a pattern which we really liked. Not sure what it's called, or if it has a name. We're certainly not the first to come across it — it's probably well-known in FP circles. But as soon as we came up with the idea, the rest of the design basically fell into place.

Perhaps the easiest way to explain the pattern is to say this: our action methods make little subroutines.

method command:sym<.>($/) {
    my $key = ~$<ident>;
    make sub ($next, $current, @path) {
        $next($current{$key}, [@path, "['$key']"]);
    }
}

This is especially apt, because the JsonPath language is all about how to find data in a hierarchical data structure. So each little subroutine outlines how one particular piece of syntax digs further down into the data structure. In the above case, it's saying that a JsonPath fragment such as

.foo

will be translated to Perl 6 code such as

$current{'foo'}

All the other action methods also return little anonymous subs like this one. To make it all work, the grammar is structured so that the fragments end up nesting around each other from left to right:

token commandtree {
    <command> <commandtree>?
}

The resulting AST comes out looking like a Matryoshka doll of commandtree nodes. Or a chain where each link is generated by its own specific rule. The links of the chain are forged together by the commandtree action method, which uses assuming to pre-set the $next parameter of the anonymous functions:

method commandtree($/) {
    make $<command>.ast.assuming(
        $<commandtree>
            ?? $<commandtree>[0].ast
            !! #`[create a function that returns the result]
    );
}

And that's it. Long ago in the mists of computer history, AI researchers must have felt that intelligence would sprout from the depths of LISP because they knew about patterns like this one: how to build up complex behavior by forging together little links in a chain, each one a simple "atom" of behavior. At least that's how it felt to see this experiment work.
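For readers who want to see the whole chain run end to end, here is a cut-down sketch of the pattern. It is not the ported module: MiniPath, MiniPathActions, and query are invented names, the grammar handles only '$' followed by '.key' steps, and the subs drop the @path bookkeeping. It is written for a current Rakudo, where an optional subrule yields a single match and .made is the usual spelling of .ast.

```raku
# A cut-down sketch of the pattern; not the real JSON::Path port.
grammar MiniPath {
    token TOP         { '$' <commandtree> }
    token commandtree { <command> <commandtree>? }
    token command     { '.' <ident> }
}

class MiniPathActions {
    method TOP($/) { make $<commandtree>.made }

    # Each command makes a sub that digs one level down, then hands
    # the result to $next, the rest of the chain.
    method command($/) {
        my $key = ~$<ident>;
        make sub ($next, $current) { $next($current{$key}) };
    }

    # Forge the links together: pre-set $next to the inner chain, or
    # to an identity function when this is the last link.
    method commandtree($/) {
        my $next = $<commandtree>
            ?? $<commandtree>.made
            !! sub ($result) { $result };
        make $<command>.made.assuming($next);
    }
}

# Assumes $path parses; there is no error handling in this sketch.
sub query(Str $path, %data) {
    my $finder = MiniPath.parse($path, actions => MiniPathActions.new).made;
    return $finder(%data);
}

my %data = beer => { name => "Porter" };
say query('$.beer.name', %data);    # Porter
```

Parsing happens once and produces a single callable, so the same compiled chain could be applied to many data structures.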

It also made me tweet this:

[Tweet, text partly lost: "...real coding, not some watered-down attempts at coding."]

Yes, but how was the port?

Oh, it was pretty nifty. I've decided to provisionally like port wine after this. It's fruity and sweet, and goes really well with cheese, or quality chocolate.

Here's a picture of the finished bottle, along with all the tests passing:

Our finished module can be found on GitHub.

31 Jul, 2012

by Carl Mäsak

July 31 2012 — the finished game

All last days of a project should be like this.

No stress. No looming deadline, just a normal deadline. No tearing one's hair. No failing test, just 144 really good passing ones.

First, I fixed something that I realized earlier today: I don't need to special-case the wrapper methods for take and put_thing_on to handle the tiny disk and its interaction with the bounded Hanoi context. Now that all the disks and rods are objects in the game, I'll just add more hooks. Nice.

Then it was time to split the big script file into smaller modules and test files.

$ wc bin/crypt 
  3483   8026 103863 bin/crypt

100 KB! All of it written this month... that means I've netted around 3 KB of code per day. Hm, that actually sounds about right.

Anyway, some splitting later...

$ wc `find bin lib t -type f`
   334   1007  12198 bin/crypt
   938   1931  25364 lib/Adventure/Engine.pm
   276    612   9013 lib/Crypt/Game.pm
     9     31    274 lib/Event.pm
   327   1021   9950 lib/Hanoi/Game.pm
   309   1010   9726 t/hanoi.t
   714   1227  16398 t/crypt.t
   633   1359  17366 t/adventure-engine.t
  3540   8198 100289 total

The script file completely deflated; it now only contains the MAIN routine for the game. Even it could be largely factored out into an Adventure::Engine::REPL or something, but right now it's a little too coupled with crypt for me to attempt that.

Much of the general logic went into Adventure::Engine, not surprisingly. Crypt::Game now mostly contains the world-building logic and some wrapper methods.

The test files also distribute easily. 54 hanoi.t tests. The crypt.t and adventure-engine.t tests were all mixed together in one MAIN multi, but they were easy to tease apart. (A deeper issue there is that some tests in crypt.t would need to be duplicated, anonymized ("de-crypted", heh), and put into adventure-engine.t so that it gets better coverage. I might get to that.)

If you're wondering why things got smaller from this, I think it was because some amount of indentation actually disappeared from the test files. Those tests were all in separate MAIN routines previously — indented one level. Now they're not.

But, just splitting things apart wasn't enough. I had promised to move Adventure::Engine into its own repository as an independent module, so I did that. It can now be found at github/masak/Adventure-Engine.

Then I took the chance to publish crypt as well. I published it as Crypt::Game, which was probably a mistake from a module-naming perspective. I'll investigate tomorrow what it takes to rename the module Game::Crypt.

Both modules can be found at modules.perl6.org, of course. You should be able to install them with Panda now, though I haven't tried.

The contest for finding bugs in the adventure game is now closed as well, and I do have a winner — to be announced. If anyone wants to sneak in at the end with lots of bug reports and suggestions for improvements, I won't be impossible. The later they are, though, the more awesome they have to be in order for me not to consider them to be too late.

...and that concludes this day's work, and this blogging month. Thanks for following along this far. I'll probably sum up the month and what I learned, when I've regained enough strength to do that.

Strangely Consistent

Musings about programming, Perl 6, and programming Perl 6
