In this post, I'll touch on some of advantages and limits of this approach.

Recursive descent parsers are limited to LL(k) grammars, which means that left-recursive grammars cannot be directly handled. It is necessary to rewrite such grammars to remove left recursion.

Consider the common problem of parsing expressions, e.g. Boolean expressions. A typical formulation of the grammar could be:

Expr := AndExpr | OrExpr | NegExpr | Variable | True | False | "(" Expr ")" AndExpr := Expr "AND" Expr OrExpr := Expr "OR" Expr NegExpr := "~" Expr | Expr

Note there is an ambiguity, as

`"A AND B OR C"`could be parsed as

`(A AND B) OR C`or as

`A AND (B OR C)`. In order to lift this ambiguity, priorities can be assigned to rules. Similarly for associativity, allowing to parse

`"A AND B AND C"`as

`(A AND B) AND C`or

`A AND (B AND C)`.

Such a grammar, if implemented naively using a recursive descent parser, would loop endlessly on some inputs: The parser would try to match rule

`AndExpr`, which would try to match

`Expr`, which would again try to match

`AndExpr`...

An alternative formulation of the grammar can be used that will generate semantically equivalent results

Expr := OrExpr OrExpr := AndExpr "OR" OrExpr | AndExpr AndExpr := NegExpr "AND" AndExpr | NegExpr NegExpr := "~" AtomExpr | AtomExpr AtomExpr := Variable | True | False | "(" Expr ")"

Note the hierarchy between

`Expr`,

`OrExpr`,

`AndExpr`,

`NegExpr`and

`AtomExpr`. The left-most rule appearing on an alternative in the right-hand side of each definition is always a "lower" rule:

`Expr`refers to

`OrExpr`,

`OrExpr`to

`AndExpr`,

`AndExpr`to

`NegExpr`,

`NegExpr`to

`AtomExpr`.

`AtomExpr`refers to the higher rule

`Expr`in

`"(" Expr ")"`, but that's OK as

`Expr`does not appear as the left-most rule (the token for a left parenthesis does).

What I like with this formulation is that it captures both associativity and operator precedence within the system of rules.

It is common to split parsers into two stages, using a lexer that processes the input stream in linear time, producing a pre-digested stream of so-called tokens. While this has undeniable advantages when it comes to performance, it prevents composing grammars and rule-specific tokenization. This is why most languages have keywords reserved for future use, and this also explains why you can't use identifiers that are also keywords even in situations where no ambiguity arises.

For these reasons, I personally prefer avoiding the separate lexing stage if performance constraints allow for it.

Regarding performance, it's important to write a recursive descent parser in such a way that decisions on the rule to try can always be done looking at a finite number of tokens.

In the example below, the second implementation performs significantly better:

let (|OrExpr|_|) s = match s with | AndExpr (e1, Token ("OR", OrExpr(es, s))) -> Some (e1 :: es, s) | AndExpr (e, s) -> Some ([e], s) | _ -> None

let (|OrExpr|_|) s = match s with | AndExpr (e1, s) -> let rec work s es = match s with | Token ("OR", AndExpr(e, s)) -> work s (e :: es) | _ -> (List.rev es, s) let es, s = work s [] Some(e1 :: es, s) | _ -> None

To see what happens, consider the input "~X". This is a "degenerate"

`OrExpr`with a single term. The first implementation will go all the way down through each rule to identify a

`NegExpr`, promoting it to an

`AndExpr`, then will look for "OR", which isn't there. The first match case will therefore fail, and the second will be tried, redoing the work. The same problem exists at all levels in

`AndExpr`,

`NegExpr`and

`AtomExpr`. What makes matters worse is that this kind of "degenerate" case is actually the normal common case in typical expressions.

The second implementation uses the functional equivalent of a while loop, going as far as possible. The trick here is to factorize common prefixes into a single branch, thus avoiding redoing the work in case of a match failure.

An important problem with my simple example code is the lack of error messages. Those are important from a usability point of view, but I'm currently not too sure how to do this with manually-written parsers.

To summarize, using this technique requires some amount of manual work, which is the price to pay for not using smarter parsing frameworks. I wouldn't advise it over using parser generators, but it's interesting nonetheless. I can see it having uses for extremely small grammars, for instance when parsing fragments in a document. Otherwise, go for a parser generator such as fslex+fsyacc or ANTLR (which generates recursive descent parsers), or combinator-based parsers such as FParsec.