osa1 github gitlab twitter cv rss

Top-down expression parsing is easy

January 29, 2015 - Tagged as: en, haskell, lua, parsing.

I recently fixed language-lua’s 2-years-old expression parsing bug. Previously it was using Parsec’s expression parser, which is actually horrible because it can’t handle chained unary operators.

Two weeks ago I decided to take a look into Lua’s original implementation, and in about an hour or so the algorithm was crystal clear to me. I immediately implemented it and closed the 2-years-old bug report.

This implementation is essentially a port of Lua’s expression parser. Recently I thought about the algorithm and I was wondering if this has a name – the algorithm looked pretty obvious to me once I understand and given how much we know about parsing I thought this should have a name.

I found this algorithm named “precedence climbing”. This is almost the same algorithm, only difference is that instead of using lookahead I’m just consuming the binary operator and returning it to the caller(which is parsing an expression with lower precedence than current parser) if precedence is lower. Associativity handling is also different(I use different left and right precedences to handle associativity) but the idea is really the same.

Now, there is also another algorithm called Pratt, and I can’t read the original paper(paywall), but according to this LtU discussion it should also be similar. Indeed, this explanation of it looks pretty similar, and this StackOverflow answer says that Lua’s implementation is “Pratt style parsing”.

So it seems like we have two, or maybe one since they’re actually very similar, solution(s) to solve top-down expression parsing problem and Haskell implementation using Parsec is possible in only 12 lines of code.

A challenge

One challenge might be to modify Parsec’s expression parser so that internally it generates a Pratt/precedence climbing parser. I’m hoping to spare some time to work on this.