The typical imperative loop idiom is demonstrated by this C function for finding the sum of the nodes in a list.
| typedef struct node {
    int          value;
    struct node* next;
} Node;
int
sumlist(const Node* the_list)
{
    int     sum = 0;
    const Node* l;          /* const, to match the_list */
    for(l = the_list; l != 0; l = l->next)
    {
        sum = sum + l->value;
    }
    return sum;
} | 
This shows two major features of imperative loops. There is a loop variable, here l, that passes through a sequence of values, one for each iteration. There are values that are passed from one iteration to the next through one or more other variables, here just sum. These are global to the body of the loop in order to persist from one iteration to another. So the iterations are communicating with each other via side-effects.
Both of these features are anathema to functional programming. You can't have variables changing from one iteration to the next like that. So for and while loops and anything similar are out of the question.
Instead in functional programming, loops are done with recursion within a function. The idea is that the computation within the recursive function represents the computation of one iteration of the loop. The next iteration is started by calling the function again. Values are communicated between two iterations via the arguments passed to the function. The rebinding of the argument variables to different values on different calls is the closest you will get to changing the value of a variable in pure functional programming.
Here is the same function in SML. In the first version I've written the case expression out in full.
| fun sumlist the_list =
(
    case the_list of
      []        => 0
    | (v::rest) => v + (sumlist rest)
) | 
It can be made neater by merging the case expression with the function definition.
| fun sumlist []        = 0
|   sumlist (v::rest) = v + (sumlist rest) |
The algorithm is described by the following specification.
To find the sum of the elements in a list:
If the list is empty then define the sum to be 0.
If the list is not empty then we can split it into a first element and some remaining elements. The sum is the value of the first element plus the sum of the remaining elements.
This is a "divide and conquer" style of specification. We reduce the problem of summing the list to a smaller problem by splitting it into two pieces. SML provides a cheap mechanism to split the first element off from a list. Figure 2-2 shows this division.
To find the sum of the list [1, 2, 3, 4, 5] we reduce it to the problem of finding the sum of the list [2, 3, 4, 5] and adding 1 to it. This continues until we get to an empty list in which case its sum is known trivially. Then we can complete the additions.
The problem with this algorithm is that the addition cannot be completed until the sum of the remaining elements is known. This leaves a trail of pending additions which is as long as the list itself. Each pending addition means there is an incomplete function call taking up stack space. So the stack space consumed is proportional to the length of the list. The iterative algorithm in C above takes up constant stack space. When imperative programmers point to the superiority of iterative loops this is a major complaint they make against recursion.
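The trail of pending additions can be seen by expanding a small call by hand. The following is only an illustrative sketch: the expansion in the comment is written out manually, and the name six is introduced here just for the example.

```sml
(* The non-tail-recursive version from above. *)
fun sumlist []        = 0
|   sumlist (v::rest) = v + (sumlist rest)

(* Hand expansion of a small call. No addition can be completed
   until the innermost call has returned.

     sumlist [1, 2, 3]
   = 1 + (sumlist [2, 3])
   = 1 + (2 + (sumlist [3]))
   = 1 + (2 + (3 + (sumlist [])))
   = 1 + (2 + (3 + 0))
   = 6
*)
val six = sumlist [1, 2, 3]
```

At the deepest point there are three additions waiting, one for each element of the list.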
But this problem is not a problem with recursion itself as much as how it is implemented in imperative languages like C. We can sum the list with recursion in constant stack space too using tail recursion.
A tail call to a function is one that is the last step made in the execution of the calling function. In other words, the return value of the calling function will unconditionally be the return value of the called function. Since there will be no more execution in the calling function its stack space can be reclaimed before the tail call. This eliminates the accumulation of stack space during the loop. This is called the tail call optimisation.
Here is the same function in SML taking care to use tail recursion.
| fun sumlist the_list =
let
    fun loop []        sum = sum
    |   loop (v::rest) sum = loop rest (sum+v)
in
    loop the_list 0
end | 
The first argument to the loop function is the remainder of the list to be summed. The sum variable accumulates the result. The initial call to loop supplies initial values for these variables. Each subsequent call to the function passes updated values for these variables. When the remainder of the list is empty then the value of the sum variable is the sum of the list elements. Figure 2-3 shows the function calls in a data flow diagram.
Each iteration of the loop function is an operation that shortens the list by one and adds the removed element to the sum. When the list is reduced to an empty list then the accumulated sum is the answer.
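For comparison with the earlier version, here is a hand-written trace of the tail-recursive loop. It is a sketch for illustration only. The name big is hypothetical, and the list length is an arbitrary choice made for the example.

```sml
fun sumlist the_list =
let
    fun loop []        sum = sum
    |   loop (v::rest) sum = loop rest (sum+v)
in
    loop the_list 0
end

(* Hand trace of a small call. Each line is a complete call with
   nothing left pending behind it.

     sumlist [1, 2, 3]
   = loop [1, 2, 3] 0
   = loop [2, 3]    1
   = loop [3]       3
   = loop []        6
   = 6
*)

(* A longer list of ones, built with List.tabulate. *)
val big = sumlist (List.tabulate (100000, fn _ => 1))
```

The running total travels forward in the sum argument, so no call needs to wait for the result of the next one.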
I emphasised the word unconditionally in the definition of tail calls above because sometimes what is a tail call can be obscured. For example if there is an exception handler surrounding the tail call then this implies that the calling function may sometimes still have work to do handling an exception from the called function so it can't be a tail call. You may need to watch out for this in loops.
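A minimal sketch of the exception handler case follows. The function names sum_handled and sum_hoisted are hypothetical, and returning 0 on Overflow is an arbitrary choice made only to keep the two versions comparable.

```sml
(* NOT a tail call: the handler stays active until the recursive
   call returns, so every iteration leaves something behind. *)
fun sum_handled the_list =
let
    fun loop []        sum = sum
    |   loop (v::rest) sum = (loop rest (sum + v)) handle Overflow => 0
in
    loop the_list 0
end

(* Tail call restored: a single handler wraps the whole loop. *)
fun sum_hoisted the_list =
let
    fun loop []        sum = sum
    |   loop (v::rest) sum = loop rest (sum + v)
in
    (loop the_list 0) handle Overflow => 0
end
```

Both functions give the same answers, but only the second keeps the recursive call as the last step of loop.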
The first reaction of a C programmer to using recursion everywhere is to object to the performance of using recursion instead of iteration. But the programmer's intuition on what is expensive and what isn't is based on how C works and is pretty much useless when it comes to functional programming because it works so differently.
In this section I want to emphasise the equivalence of tail recursion and iteration by working back from recursion to iteration. This will not only show that tail recursion is as efficient as iteration but will also provide an intuition that will be useful for bringing across imperative idioms to functional programming.
If you have studied a little assembly language you will know of the variety of machine instructions for jumping around in a program. The Intel architecture has the simple unconditional JMP instruction. This corresponds to the goto statement in C. You would expect that a goto translates to a single JMP instruction. For calling functions there is the CALL instruction which works like JMP except that it saves a return address on the stack. This allows the calling function to continue execution after the called function has finished.
But when we have tail recursion there is nothing to return to. By definition the calling function has completed. So instead of using a CALL, we should be able to use a JMP instruction to implement the tail call. In other words, a tail call is equivalent to a goto.
I'll demonstrate this equivalence by manually translating the sumlist function to C. Here is the original in SML.
| fun sumlist the_list =
let
    fun loop []        sum = sum
    |   loop (v::rest) sum = loop rest (sum+v)
in
    loop the_list 0
end | 
In C, using the Node type in the section called The Basics, I get the (somewhat literal) code:
| int
sumlist(const Node* the_list)
{
    const Node* list;       /* args to loop */
    int         sum;    
    list = the_list;        /* args to the first call */
    sum  = 0;
    goto loop;              /* a tail call to loop */
loop:
    if (list == 0)
    {
        return sum;         /* value returned from loop */
    }
    else
    {
        int         v    = list->value;
        const Node* rest = list->next;
        list = rest;        /* new args for the tail call */
        sum  = sum + v;
        goto loop;
    }
} | 
Since all calls to loop are tail calls I can use gotos instead of function calls and use a label for loop. This translation simultaneously incorporates the tail call optimisation and the inlining of the loop function. A good SML compiler can be expected to perform these optimisations as a matter of course and generate machine code as good as C for the sumlist function.
Just to belabor the point, Figure 2-4 shows the equivalence graphically.
Part (a) of the figure shows a function f in a recursive loop being called. In the middle of its execution it calls itself recursively. This continues until one of the invocations chooses not to call itself. Then the invocation returns and the second half of the previous invocation executes. This continues until all invocations have returned. Part (b) shows what we have with tail recursion. There is no second half: the returned value from one invocation becomes the returned value from the previous one and eventually the returned value from the entire function. Looking at the diagram we see that the cascade of returns is redundant. In part (c) the last invocation returns directly for the whole loop. With a bit of inlining the recursion has become just a sequence of executions of the body of the function f joined by goto statements, in other words conventional imperative iteration.
The structure of the code in Figure 2-3 is such a common pattern that there is a standard built-in function to implement it. It is called List.foldl, but it is also directly callable as foldl[1]. Actually there are two variants, foldl and foldr, depending on whether you want to read the list from left to right or right to left. Normally you should read the list left to right for efficiency.
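The difference between the two variants is easiest to see with an operation where the order matters. As a small illustration (the names reversed and copied are made up for this example), consing each element onto the accumulator shows the order in which the elements are visited.

```sml
(* foldl visits 1, then 2, then 3, so the last element visited
   ends up at the front of the result. *)
val reversed = foldl (op ::) [] [1, 2, 3]

(* foldr visits 3, then 2, then 1, which rebuilds the list in its
   original order. *)
val copied = foldr (op ::) [] [1, 2, 3]
```

Here reversed is [3, 2, 1] and copied is [1, 2, 3]. For an operation like addition, where the order does not matter, both variants give the same answer.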
The function for summing the list using foldl can now be written
| fun sumlist the_list = foldl (fn (v, sum) => v + sum) 0 the_list | 
The first argument to foldl is a function that performs the body of the loop. It takes a pair of arguments: the first is a value from the list and the second is the accumulated value. It must then compute a new accumulated value. The second argument to foldl is the initial value for the accumulator and the third is the list. The foldl function takes care of all of the iteration over the list.
In general the expression foldl f i l corresponds to the data flow diagram in Figure 2-5. In this diagram I have represented the calling of the function f by an @ operator. This applies the function to the pair of the list element and the accumulated value. These two values are always supplied in a single argument as a tuple with the list element first and the accumulated value second.
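To make the correspondence with the earlier loop concrete, here is a sketch of how a function with the behaviour of foldl could be written. The name my_foldl is hypothetical and this is not claimed to be the library's actual source, only one plausible definition using the same accumulator pattern.

```sml
fun my_foldl f init the_list =
let
    (* acc plays the role that sum played in sumlist *)
    fun loop []        acc = acc
    |   loop (v::rest) acc = loop rest (f (v, acc))
in
    loop the_list init
end

(* The function receives the pair (list element, accumulated value). *)
val total = my_foldl (fn (v, sum) => v + sum) 0 [1, 2, 3]
```

The call to loop in the second clause is a tail call, so this runs in constant stack space just as the hand-written version does.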
There are further abbreviations you can do in the foldl call. A function that just adds two integers together can be derived directly from the addition operator.
| fun sumlist the_list = foldl (op +) 0 the_list | 
The notation (op +) makes the addition operator into a function that can be passed around like any other. The type of the function is declared in the standard INTEGER signature as (int * int) -> int which means it takes a pair of integers as its argument, just as needed by foldl.
This notation will only work if the compiler can work out from the context that it is the integer addition operator that is needed, rather than real or some other type. It can do this in this case because the initial value is known to be the integer zero. If you wrote foldl (op +) 0.0 the_list then it would know to use the real addition operator. You can't write a sum function that can sum either lists of integers or lists of reals.
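The following sketch shows the ways the type of the operator can get pinned down. The names sumreals, addreal and sumwith are hypothetical. The last one is just one possible workaround for the restriction: the caller passes in the addition function and the zero, so the helper itself never needs to know the element type.

```sml
(* The real initial value 0.0 selects real addition. *)
fun sumreals the_list = foldl (op +) 0.0 the_list

(* An explicit type constraint has the same effect where the
   context alone would default to int. *)
val addreal : real * real -> real = op +

(* A helper that is generic because the caller supplies the
   operator and the initial value. *)
fun sumwith add zero the_list = foldl add zero the_list

val total_int  = sumwith (op +) 0   [1, 2, 3]
val total_real = sumwith addreal 0.0 [1.5, 2.5]
```

Each use of (op +) is still resolved to a single type at the point where it is written. The helper only moves that decision to the caller.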
The order of the arguments to foldl is meaningful. You can use currying to omit arguments from the right to create partially satisfied functions. For example the expression foldl (op +) 0 represents a function that will take a list and return its sum. You can write
| val sumlist = foldl (op +) 0 | 
which binds the name sumlist to this partially satisfied function. When you write sumlist [1, 2, 3] you satisfy all of the arguments for the foldl and it executes. Similarly you could define
| val accumlist = foldl (op +)
val sumlist = accumlist 0 |
and if you wrote accumlist x [1, 2, 3] you would be accumulating the sum of the list elements onto the value of x. (The compiler will default accumlist to do integer addition in the absence of any type constraints saying otherwise).
As a general rule when choosing the order of arguments, if you want to make currying useful, then place the argument that varies the least first and the most varying last. The designers of foldl judged that you are more likely to want to apply the same function to a variety of lists than apply a variety of functions to a particular list. You can think of the first arguments as customisation arguments so that foldl (op +) 0 customises foldl to sum lists as opposed to foldl (op * ) 1 which multiplies list elements together. (The space in (op * ) is needed because SML would otherwise read *) as the end of a comment.)
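Here is a short sketch that pulls these curried definitions together. The names product, sixteen, scale and double are invented for the example. The multiplication operator is written with a space before the closing parenthesis so that the compiler does not mistake it for the end of a comment.

```sml
val accumlist = foldl (op +)        (* customised with the function only *)
val sumlist   = accumlist 0         (* ...and then with the initial value *)
val product   = foldl (op * ) 1

(* Accumulate the list elements onto a starting value of 10. *)
val sixteen = accumlist 10 [1, 2, 3]

(* The same rule applied to a new function: the factor varies least
   so it comes first, and the list comes last. *)
fun scale factor the_list = map (fn v => factor * v) the_list
val double = scale 2
```

With these definitions, product [1, 2, 3, 4] gives 24 and double [1, 2, 3] gives [2, 4, 6].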
A finite state machine, or FSM, is a common design technique for describing repetitive behaviour. The FSM passes through a series of discrete states in response to its inputs. As it makes the transition from one state to another it performs some output action. This may continue forever or there may be an end state. The word finite in the name refers to the finite number of different discrete states that the machine can be in, not how long the machine runs for.
Figure 2-6 shows an FSM to count words in text. There are two operational states: in means the machine is inside a word, out means the machine is outside a word. The END state stops the machine. The text is supplied as a sequence of characters. Each character causes a transition to another state, which may be the same as the previous state.
If the machine is in the out state and it gets a white space character, represented by ws in the figure, then it stays in the out state. If it gets a non-white space character (and it's not the end of data, eod) then it changes to the in state because it has entered a word. A word is completed when there is a transition from the in state back to the out state upon a white space character. The [incr] notation means that the count of words is incremented during the transition.
If the machine gets an end-of-data condition then it stops at the END state. A word has to be counted if the machine was in a word at the time.
If you were to write this FSM in C you might implement the states with small pieces of code joined with goto statements. It might be something like:
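| #include <ctype.h>          /* for isspace() */

int
word_count(const char* text)
{
    int     count = 0;
    char    c;
out:                        /* outside a word */
    c = *text++;
    if (!c) goto eod;
    if (!isspace((unsigned char)c)) goto in;
    goto out;
in:                         /* inside a word */
    c = *text++;
    if (!c)
    {
        count++;            /* the text ended inside a word */
        goto eod;
    }
    if (isspace((unsigned char)c))
    {
        count++;            /* the word has just ended */
        goto out;
    }
    goto in;
eod:
    return count;
} | 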
(This is a bit ugly but it's a literal translation of the design and it should generate nice fast machine code if you care.)
Now that we know that tail recursion in functional programming is equivalent to goto in imperative programming we can write the same algorithm directly in SML. The set of states will correspond to a set of mutually tail-recursive functions. Here is the word counter function.
| fun word_count text =
let
    fun out_state []        count = count
    |   out_state (c::rest) count =
    (
        if Char.isSpace c
        then
            out_state rest count
        else
            in_state rest count
    )
    and in_state []        count = count + 1
    |   in_state (c::rest) count =
    (
        if Char.isSpace c
        then
            out_state rest (count + 1)
        else
            in_state rest count
    )
in
    out_state (explode text) 0
end | 
The two state functions are part of a mutually recursive pair joined by the and keyword. For convenience I've represented the text as a list of characters. The built-in explode function makes a list of characters from a string and the built-in Char.isSpace tests if the character is white space. The output from the loop is in the accumulator count. It gets incremented whenever we leave the in state. In place of an explicit eod state we just return the accumulated count.
Here is the main function that calls the word_count function.
| fun main(arg0, argv) =
let
    val cnt = word_count "the quick brown fox";
in
    print(concat["Count = ", Int.toString cnt, "\n"]);
    OS.Process.success
end | 
It counts the words in the foxy message and prints the result. To print I've used the built-in concat function which concatenates a list of strings into a single string and Int.toString to make a string from an integer.
Alternatively you can represent the state by a state variable and have a single loop. The word_count function then becomes:
| fun word_count text =
let
    datatype State = In | Out
    fun loop Out []        count = count
    |   loop Out (c::rest) count =
    (
        if Char.isSpace c
        then
            loop Out rest count
        else
            loop In rest count
    )
    |   loop In []        count = count + 1
    |   loop In (c::rest) count =
    (
        if Char.isSpace c
        then
            loop Out rest (count + 1)
        else
            loop In rest count
    )
in
    loop Out (explode text) 0
end | 
In this code I've used a datatype to define some states. This use of a datatype is equivalent to an enumeration type in C or C++. Apart from that there is little difference in such a small example. We now have just one tail-recursive function that takes a current state argument. I've used pattern matching in the function definition to recognise all of the combinations of state and character list.
Concerning performance, using explode requires copying the string to a list before counting. If you were dealing with long strings and were worried about the amount of memory needed for two copies then you could try just subscripting the string using String.sub. This may well run slower though since there is bounds-checking on each subscript call.
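As a rough sketch of the subscripting approach, the state functions can carry an index into the string instead of the remaining list. The name word_count_sub is hypothetical. The structure is otherwise the same pair of mutually tail-recursive functions, and no claim is made here about how its speed compares with the explode version.

```sml
fun word_count_sub text =
let
    val len = size text

    (* i is the index of the next character to examine *)
    fun out_state i count =
        if i >= len
        then count
        else if Char.isSpace (String.sub (text, i))
        then out_state (i + 1) count
        else in_state (i + 1) count

    and in_state i count =
        if i >= len
        then count + 1              (* the text ended inside a word *)
        else if Char.isSpace (String.sub (text, i))
        then out_state (i + 1) (count + 1)
        else in_state (i + 1) count
in
    out_state 0 0
end
```

The test against len takes the place of the empty list pattern, so String.sub is never called with an index that is out of range.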
This word counting example is only meant to demonstrate a simple state machine, and it is a bit of overkill. The shortest piece of code to count words in a string uses String.tokens:
| fun word_count text = length(String.tokens Char.isSpace text) | 
[1] This function is called reduce in some languages.