First, ANTLR. This is now close to the final 3.1 release – as it has been for the last few months to be honest. But hopefully, by the time you read this, ANTLR 3.1 will be out in its full glory. The main changes have been in the syntax for tree grammars (tighter) and, no doubt, numerous bug fixes.
But with the DLR, it’s been all change. My original calculator examples just don’t work with the latest DLR. They don’t even compile. The internal structure of the DLR has been moved around an awful lot – not really what you would expect in a transition from a first beta to a second, but it has. I suspect this has more than a little to do with the impact that IronRuby and Silverlight have had on a DLR originally designed to run IronPython. Still, if we end up with a DLR that’s better able to run a variety of dynamic languages, then I’m all for it.
In the last article, we ended up with a very simple calculator that could handle expressions like (1 + 2) / (3 + 4). I did this by implementing a simple lexer/parser in ANTLR3 and hooking it up to just about the simplest DLR layer I could get away with via an ANTLR3 ‘tree grammar’. Just to summarize, a ‘lexer’ takes in an input stream of characters and assembles complete ‘tokens’ as output to the ‘parser’. So the lexer might take the single characters ‘1’, ‘2’, ‘3’ and pass a complete ‘INT’ token ‘123’ to the parser. Next, the parser assembles the tokens from the lexer into an Abstract Syntax Tree (AST). For example, the expression ‘1 + 2’ becomes a tree structure with ‘+’ as the root node and ‘1’ and ‘2’ as the child nodes (or leaves). Finally, the ‘tree grammar’ allows you to ‘walk the tree’ and do something with it. This is where the interface between the DLR and ANTLR actually happens. My contention is that ANTLR3 is the best way to go about this – it reduces the ‘impedance mismatch’ between the DLR and the grammar considerably.
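To make the three stages concrete, here is a minimal hand-written sketch in Python. It is not ANTLR output and not the code from the download: the names lex, parse and walk and the tuple-based tree are made up for illustration, and the walker simply evaluates the tree where the real tree grammar would hand each node over to the DLR.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def lex(text):
    """Lexer: assemble single characters into complete (kind, lexeme) tokens."""
    tokens = []
    for lexeme in re.findall(r"\d+|[-+*/()]", text):
        kind = "INT" if lexeme.isdigit() else "OP"
        tokens.append((kind, lexeme))
    return tokens

def parse(tokens):
    """Parser: assemble tokens into an AST built from nested tuples."""
    pos = 0

    def peek():
        # Lexeme of the next token, or None at end of input.
        return tokens[pos][1] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        kind, lexeme = advance()
        if kind == "INT":
            return ("INT", int(lexeme))
        if lexeme == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")

    def term():
        node = atom()
        while peek() in ("*", "/"):
            op = advance()[1]
            node = (op, node, atom())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    return expr()

def walk(node):
    """Tree walker: visit each node and 'do something' (here, evaluate it)."""
    if node[0] == "INT":
        return node[1]
    op, left, right = node
    return OPS[op](walk(left), walk(right))

tree = parse(lex("1 + 2"))   # '+' at the root, 1 and 2 as leaves
result = walk(parse(lex("(1 + 2) / (3 + 4)")))
```

The point of the split is that each stage only knows about the one before it: the walker never sees characters, and the lexer knows nothing about trees.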
Now, the next steps in the calculator example will be to add variables and then function calls. The first thing I want to do is allow statements like:
x = 1
and
(x + 1)/(x - 1)
This will be followed by ‘built-in’ function calls like:
y = ln(x)
and then ‘user defined’ function calls like this:
def square(a)
    return a * a
end
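As a rough idea of what variables and built-in calls add to a tree walker, here is a small Python sketch. It is an assumption-laden illustration, not the MyL code: the node kinds ASSIGN, VAR and CALL, the evaluate function and the dictionary used as a variable store are all hypothetical, and user-defined functions are left out.

```python
import math
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# Hypothetical table of 'built-in' functions, keyed by name.
BUILTINS = {"ln": math.log}

def evaluate(node, env):
    """Walk a tuple-based AST, using env to hold variable values."""
    kind = node[0]
    if kind == "INT":
        return node[1]
    if kind == "VAR":
        return env[node[1]]            # look the variable up by name
    if kind == "ASSIGN":
        _, name, value = node
        env[name] = evaluate(value, env)
        return env[name]
    if kind == "CALL":
        _, name, arg = node
        return BUILTINS[name](evaluate(arg, env))
    # Anything else is treated as a binary operator node.
    _, left, right = node
    return OPS[kind](evaluate(left, env), evaluate(right, env))

env = {}
# x = 3
evaluate(("ASSIGN", "x", ("INT", 3)), env)
# (x + 1)/(x - 1)
quotient = evaluate(("/", ("+", ("VAR", "x"), ("INT", 1)),
                          ("-", ("VAR", "x"), ("INT", 1))), env)
# y = ln(x)
evaluate(("ASSIGN", "y", ("CALL", "ln", ("VAR", "x"))), env)
```

The sketch assigns 3 to x rather than 1 so that (x + 1)/(x - 1) does not divide by zero.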
I’ve included the code for the first part (assignment statements) this month and I’ll explain how it all works next month. However, if you want to play with the code in the meantime, here are a few notes:
First, I’ve combined the lexer and parser into a single ANTLR3 file, MyL.g. This comprises three sections:
a set of instructions to ANTLR3 telling it what language to output, and so on.
the grammar (marked by //**** PARSER ****//)
the lexer (marked by //**** LEXER ****//).
Secondly, the tree grammar is in MyLTree.g.
Thirdly, the main entry point is in MyL.cs – start here by setting a breakpoint if you want to follow what’s going on using the debugger.
Lastly, the main work of connecting up the lexer/parser/tree grammar/DLR is done in MyLLanguageContext.cs in the routine ParseSourceCode.
I’ll go through what’s happening in these files in (much) more detail next month. See Part Three.
Dermot Hogan is the chief architect of the Ruby In Steel IDE and he is currently involved in the design and implementation of the new Sapphire language for the Dynamic Language Runtime.