You may wonder what kind of situation would require something that traditional constructs can't handle. Here is an example taken from Go To Statement Considered Harmful: A Retrospective.
int parse() { Token tok; reading: tok = gettoken(); if (tok == END) return ACCEPT; shifting: if (shift(tok)) goto reading; reducing: if (reduce(tok)) goto shifting; return ERROR; }
Let's see how using a while turns out:
type Token = END type ParseState = Reading | Shifting | Reducing | Accepted | Errored type Status = ACCEPT | ERROR let parse gettoken shift reduce () = let mutable state = Reading let mutable tok = Unchecked.defaultof<Token> while state <> Accepted && state <> Errored do state <- match state with | Reading -> tok <- gettoken() if tok = END then Accepted else Shifting | Shifting -> if shift tok then Reading else Reducing | Reducing -> if reduce tok then Shifting else Errored | Errored | Accepted -> failwith "Invalid state" match state with | Errored -> ERROR | Accepted -> ACCEPT | _ -> failwith "Non-final state"
(Note that I have introduced extra parameters gettoken, shift and reduce; I wanted to be able to write concise code that syntactically correct without explicitly defining these functions).
It looks OK, but we had to introduce an extra type, ParseState. There are also some issues with mutability and initialization of tok.
If you have some experience with functional programming, you are probably thinking that recursion is the right tool. I will leave writing a recursive version of parse to the reader, and jump to a slightly different idea that many might have missed.
type Token = END | SOMETHINGELSE type Status = ACCEPT | ERROR let parse gettoken shift reduce () = let rec do_read() = match gettoken() with | END -> ACCEPT | _ as tok -> do_shift tok and do_shift tok = if shift tok then do_read() else do_reduce tok and do_reduce tok = if reduce tok then do_shift tok else ERROR do_read()
do_read, do_shift and do_reduce are mutually recursive functions. All recursive calls are in tail position, which makes it possible to avoid stack overflows (on some platforms...).
Here you have it: calls in tail position are basically safe versions of goto.
The fantastic control flow construct that many have forgotten is simply the procedure call. Every programmer uses it a countless number of times every working day, but fear of recursion and stack overflows have kept many from using it in situations where it fits well.
17 comments:
I'm actually quite the fan of recursive programming, so your mutually recursive solution at the end was the first to leap to my mind.
While recursive functions are awesome, people don't often think of them as a method of storing state. Using them to store state can be abused just like most things. Adding else cases to the if's in the original goto code makes it look very similar to the recursive code. There is a fine line between creating functions to organize your code for readability and creating functions to represent state in an algorithm (like this). It's also fun to over analyze this classic code example.
As soon as I started reading your article, the note on Message Processing in Expert F# 2.0@p.379 flashed in my mind... I knew where you were going.
int parse()
{
Token tok;
while ((tok = gettoken()) != END)
while (!shift(tok))
if (!reduce(tok))
return ERROR;
return ACCEPT;
}
@gradbot: What would be the better alternative? At least in the functional solution, it's clear who sets tok and who reads it. That's not obvious at all in the original statement of the algorithm.
@Michael: Your algorithm does not behave in the same way as the original statement. In your code, shift and reduce must be able to handle the case where tok == END, whereas the original algorithm returns immediately.
In general, trying to encode a state machine that's represented using an automaton in the design by a bunch of whiles and ifs in code is tricky. Add or remove an arrow here and there in the design, and suddenly it's not so clear anymore how to reflect that in your code and still have confidence that the code and the design agree.
@David and Erik: Always nice to see one's ideas have been had by other people before, it shows I'm not crazy :)
@David, do you know if the note is also in the first edition of the book? I don't own the second edition :(
Ah, never mind the first part of my comment to Michael, his code is correct. I was fooled by the indentation.
Yeah, sorry about that. The title of your post caught my eye while I was at work and I couldn't resist reading it and then posting, but I couldn't take the time to explain myself properly. And blogger messed up the formatting :).
I do like your mutually tail recursive solution, and some might say that the original goto implementation is nicer to read than my while loop code (even with the formatting). I was just trying to make the point that the iterative implementation didn't need to be as complex as you showed.
If you want to go somewhere, goto is the best way to get there. — Ken Thompson
Goto is used >90k times in the 2.6.38 linux kernel, a 10x increase over 2.4.0
http://temp.thera.be/kernelcurses.png
@Maht: I'm not debating the usefulness of goto, I think everyone agrees it's useful sometimes, especially in C which lacks exceptions.
The point I'm trying to make is that in languages which allow function definitions anywhere and provide tail call removal, function calls offer the functionality you want from goto that you can't get from if, while and exceptions.
It's been a while since I read it, but Guy Lewis Steele, Jr.'s Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO comes to mind. Along with the other "Lambda Papers", available from readscheme.org.
@Tommy McGuire: Interesting paper (which was written before I was born :)
My blog post basically duplicates the content of section E in that paper. The paper reproduces an example from Yourdon of a state machine with 23 states, which is a bit more interesting than the small read-shift-reduce example shown here.
Since I didn't know about Expert F# first edition, here's a screenshot of the Nota bene:
http://goo.gl/QIETU
Joh, I'm a failed academic; all I do is point to stuff other people have written.
In this case, what you have found is a very neat idea that deserves more exposure.
@David: Thanks. I found the note in the first edition, at page 381.
@Tommy McGuire: A "failed academic"? Don't be so hard on yourself ;) Thanks for pointing me to that paper, it was an interesting read.
@Joh I agree that the functional implementation is more readable and still provides the same performance as a goto. Tailcall optimizations are what really allow us to transform gotos into functional code. I got carried away with the code and separated the two concerns out (http://fssnip.net/6Z).
Revisiting this older blog entry here and after doing a going through most of Scott Wlaschin's excellent FSharpForFunAndProfit series, in particular the Designing with Types series I noticed something.
In your first example the two "required" failwith is an indication that Accepted and Errored have been added to ParseState due to the while loop.
Splitting that into it's own type, let's say FinalState reveals the problem. We must have those invalid routing because the while loop is an expression on state and we need a way to break out. Errored/Accepted have been artificially added to make the code possible - possible because in F# we can't break out early (no break/return statements).
Incidentally, Michael McMullen's code is doing exactly that which makes it possible to return the one type. That or we would have had to use a separate mutable boolean for the control which is just as smelly.
Removing the two cases from ParseState leads us inevitably to Joh's second solution (unless we give up figuring it out) which is F# idiomatic. I don't think we can find a "correct" (in the sense that I mean) that doesn't involve a state machine, but would be happy to be proven wrong.
Post a Comment