Puzzle: operator precedence
-
Let's say you're a totally theoretical programmer implementing a totally theoretical simple programming language for doing calculations. Let's say you've built it up to the point where you have tokenized and built syntax trees. For instance:
a = 3 * 2 - 1
That is tokenized into 7 tokens. The syntax tree is built recursively. First the "a" is consumed and interpreted as an identifier (variable) to be evaluated at runtime. Then the syntax tree is called recursively on the rest of the tokens, starting from =. The equals is consumed and interpreted as an assignment operation. Assignment operations have a left hand side and a right hand side. The left hand side is set to the expression in progress, in this case the identifer 'a'. The right hand side is set to another recursive call to the syntax tree builder. Eventually the literals and operators are consumed and interpreted as a syntax tree with this structure:
a = (3 * (2 - 1))
That operator precedence is a side effect of the recursive way the tree is built, rather than an intentional operator precedence. So let's say you're this programmer writing this bit of code, what is a good way to impose operator precedence within this syntax tree building process?
-
One common method is to have a grammar with a hierarchy of nonterminals to express precedence and associativity, e.g. something like this:
Nobody in his right mind would build a parser by hand-built recursive descent, though.
By the way, an equally interesting problem is the dual problem of how to pretty-print ASTs with a minimal number of parentheses (considering operator precedence and associativity).
One can even go so far and have only one grammar from which both a parser and a pretty-printer with that property can be derived, but only insane people would ever do that.
-
One common method is to have a grammar with a hierarchy of nonterminals to express precedence and associativity, e.g. something like this:
Nobody in his right mind would build a parser by hand-built recursive descent, though.
By the way, an equally interesting problem is the dual problem of how to pretty-print ASTs with a minimal number of parentheses (considering operator precedence and associativity).
One can even go so far and have only one grammar from which both a parser and a pretty-printer with that property can be derived, but only insane people would ever do that.
-
@Klaus said in Puzzle: operator precedence:
Nobody in his right mind would build a parser by hand-built recursive descent, though.
Why not?
@Horace said in Puzzle: operator precedence:
Why not?
Because look-ahead is too difficult to implement, and grammar refactorings like left-factoring suck. If parsing speed isn't super-important, use a parser combinator library. Otherwise, use a parser generator such as ANTLR. I'm a big fan of SGLR parsing - it allows you to express your grammar in the most direct way possible and avoids the complications of a separate scanner.
-
It's just a simple language to do calculations and logical operations on literals or input variables. No functions and the only flow control is if...else. Structure is limited to parentheses and blocks (to group the if...else). I know there are libraries available for this sort of thing, but my hand coded implementation is less than 1000 lines of code and has everything I need. Or at least, so far so good. I solved the problem of this thread by calling a precedence imposer method on binary operations after the recursive tree builder returns the right hand side. If the right hand side of the operation is itself a binary operation, like so:
a <op1> (b <op2> c)
then the precedence imposer compares precedence of op1 and op2 and if op1 precedence is >= op2, the operation is re-organized as
(a <op1> b) <op2> c
Seems to be working so far. This has been a fun little project, as evidenced that I'm working on it on a weekend.
-
According to the google calculator (math expressions typed into the search box), the exponent operator has different associativity than minus.
4-3-2 is calculated as (4-3)-2
4^3^2 is calculated as 4^(3^2)Python appears to share this. That throws a wrench into my rules that all equal precedence operators are computed left to right.