Wednesday, August 3, 2011

The forgotten control flow construct

There is a truly powerful control flow construct that many programmers have forgotten. It helps implement complex flows that ifs, whiles and matches can't handle. Not even throwing and catching exceptions can measure itself to this construct.

You may wonder what kind of situation would require something that traditional constructs can't handle. Here is an example taken from Go To Statement Considered Harmful: A Retrospective.

int parse()
{
    Token   tok;

reading:
    tok = gettoken();
    if (tok == END)
        return ACCEPT;
shifting:
    if (shift(tok))
        goto reading;
reducing:
    if (reduce(tok))
        goto shifting;
    return ERROR;
}

Let's see how using a while turns out:

type Token = END

type ParseState = Reading | Shifting | Reducing | Accepted | Errored
type Status = ACCEPT | ERROR

let parse gettoken shift reduce () =
    let mutable state = Reading
    let mutable tok = Unchecked.defaultof<Token>
    
    while state <> Accepted && state <> Errored do
        state <-
            match state with
            | Reading ->
                tok <- gettoken()

                if tok = END then
                    Accepted
                else
                    Shifting

            | Shifting ->
                if shift tok then
                    Reading
                else
                    Reducing

            | Reducing ->
                if reduce tok then
                    Shifting
                else
                    Errored

            | Errored | Accepted -> failwith "Invalid state"

    match state with
    | Errored -> ERROR
    | Accepted -> ACCEPT
    | _ -> failwith "Non-final state"

(Note that I have introduced extra parameters gettoken, shift and reduce; I wanted to be able to write concise code that syntactically correct without explicitly defining these functions).

It looks OK, but we had to introduce an extra type, ParseState. There are also some issues with mutability and initialization of tok.

If you have some experience with functional programming, you are probably thinking that recursion is the right tool. I will leave writing a recursive version of parse to the reader, and jump to a slightly different idea that many might have missed.

type Token = END | SOMETHINGELSE

type Status = ACCEPT | ERROR

let parse gettoken shift reduce () =
    let rec do_read() =
        match gettoken() with
        | END -> ACCEPT
        | _ as tok -> do_shift tok

    and do_shift tok =
        if shift tok then
            do_read()
        else
            do_reduce tok

    and do_reduce tok =
        if reduce tok then
            do_shift tok
        else
            ERROR

    do_read()

do_read, do_shift and do_reduce are mutually recursive functions. All recursive calls are in tail position, which makes it possible to avoid stack overflows (on some platforms...).

Here you have it: calls in tail position are basically safe versions of goto.

The fantastic control flow construct that many have forgotten is simply the procedure call. Every programmer uses it a countless number of times every working day, but fear of recursion and stack overflows have kept many from using it in situations where it fits well.