osa1

Lazy lists in Lua and eliminating tail-calls with continuations

May 20, 2012 - Tagged as: lua, en.

I was reading some papers about parsing with continuations and I realized that I had never implemented continuations in any language. Since I’m interested in Lua nowadays, I want to implement it in Lua.

After a while, I realized that you could use continuations to eliminate recursive calls to prevent stack overflows. Now, in a language like Lua, we have tail-call optimization(TCO) and recursive calls in tail positoins are not problem, but even in languages that don’t have TCO, you can easily convert tail calls to loops with help of continuations. And I found the underlying idea of this is pretty similar to lazy-lists. Now I’m going to try to explain how.

For those who want to see the code, here’s the gist.

Nowadays most of modern languages(functional ones or not), have some primitives for lazy evaluation(like Python’s generators), and I’m not an expert on Lua but AFAIK, Lua’s coroutines can be used for lazy evaluation. But even if you use a language that doesn’t have any non-strict primitives, you can have some lazy structures(ie. in JavaScript).

The main idea of this is that you can always pass functions in a form that holds the function itself and a list of arguments that will be passed to function. And when you want to evaluate the result of the function call, you just call the function in that form with the arguments. Note that by “function itself”, I mean the function callback. Most modern languages nowadays have functions as first-class values or at least some kind of function pointers/referances, so this is not a problem.

I’ll give the code in Lua. We’ll call the structure that holds a function callback(or whatever your language call it) and a list of parameters, thunk¹ In Lua, this will be a table with two keys: f and args. Here’s an example thunk.

t = { f = print, args = {"first arg", " second arg"} }

So when we want to run the function in the thunk, we just call evalThunk, which is pretty straightforward to implement:

function evalThunk(t)
    return t.f(unpack(t.args))
end
> evalThunk(t)
first arg        second arg

Basically, when we want to make a function non-strict, we just create a thunk of it, with f = function itself, and the args = the args we want to pass to the function when we run it. Here’s a helper for it:

function makeThunk(f, args)
    return { tag = "thunk", f = f, args = args }
end

Now, with the help of this two functions, we can create a infinite-length lazy linked-lists. Each node in our linked-lists will have two keys: first and rest. first will have the value of that node, and rest will have the next node connected to that node. Here’s a helper to create linked-list nodes.

function cons(first, rest)
    return { first = first,
             rest  = rest }
end

In lazy-lists, we always have the rest part of the list as unevaluated thunks. To traverse the list to some point, we need to evaluate the nodes we passed, and when we evaluate this nodes, we just replace the thunks with evaluated values. Because we don’t want to evaluate the same node again and again for eact iteration.

function evalPart(t, p)
    if t == nil then
        return nil
    elseif type(t[p]) == "table" and t[p].tag == "thunk" then
        t[p] = evalThunk(t[p])
    end
    return t[p]
end
function first(t)
    return evalPart(t, "first")
end
function rest(t)
    return evalPart(t, "rest")
end

first and rest functions are returning the first and rest parts of our linked-list nodes. We’re expecting this parts to be unevaluated thunks. So if they’re thunks, we just evaluate them and replace the return value with thunks so that we don’t need to re-evaluate everytime we get a node.

And with the last helper, we can start creating infinite lazy-lists in Lua:

function nth(t, n)
    if n == 0 then
        return first(t)
    end
    return nth(rest(t), n-1)
end

This function is just getting the n. node in a linked list. And while getting this node, it’s evaluating all the thunk on the way with help of rest function. This is important because as you’ll see, we will be creating the rest of the lists while we’re traversing it.

Let’s see an example of infinite lazy-list that creates a list of factorials, starting from 1!:

function fact(n, f)
    n = n or 1
    f = f or 1
    return cons(n, makeThunk(fact, {n*f, f+1}))
end

This function returns a linked-list that contains factorial of 1, and then a thunk of the same factorial function, but with different parameters. The function call in the rest part of the thunk will return the next factorial, and then the same function as a thunk in rest part of the thunk, in a recursive fashion. Note that our nth function is tail-recursive so we’ll never have stack overflows while traversing the list(it’s also trivial to implement an iterative loop version of it).

> a = fact()
> print(nth(a, 1))
1
> print(nth(a, 2))
2
> print(nth(a, 10))
3628800
> print(nth(a, 20))
2.4329020081766e+18
> print(nth(a, 30))
2.6525285981219e+32
> print(nth(a, 40))
8.159152832479e+47
> print(nth(a, 50))
3.0414093201713e+64
> print(nth(a, 250))
inf

So it’s obviously generating all the factorials while we traverse the list. You can also try to see that every thunk is evaluated only once by adding some print functions to thunks.

Here’s a lazy-list that contains all fibonacci numbers:

function fib(a, b)
    a = a or 0
    b = b or 1
    return cons(a, makeThunk(fib, {b, a+b}))
end

You can also easily define functions that take linked-lists and map/filter and return the result as lazy-lists. I added them the to the gist.

Now, how’s that related with continuations that eliminate tail-calls? Let’s work on an exmple:

function sum(n, cont)
    if n <= 1 then
        return cont(1)
    end
    local function newCont(v)
        return cont(v+n)
    end
    return sum(n-1, newCont)
end

I found JavaScript version of this example in Nathan’s University and I think it’s a typical usage of continuations. Instead of returning the result, we’re passing the result to the continuation function. And when we hit the buttom(ie. when n <= 1) we pass the last result to continuation function and return it.

I don’t explain why one would do that, you can read it from Nathan’s University.

If you run a function like this in JavaScript, you get a stack overflow after a while since this function is recursive. But with the help of thunks, you can eliminate the recursive call entirely, even in JavaScript. Let’s change it to:

function sum(n, cont)
    if n <= 1 then
        return makeThunk(cont, {1})
    end
    local function newCont(v)
        return makeThunk(cont, {v+n})
    end
    return makeThunk(sum, {n-1, newCont})
end

We replaced every function call with thunks. Now you can realize that makeThunk calls are just linking function calls together. For example, after calling sum(10, function(n) print("result: ", n) end), we get this thunk:

{ tag = "thunk", f = sum, args = {9, function(v) return makeThunk(cont, {v+10}) end}}

When we evaluate this thunk, we just get another thunk unless the first arg is not <= 1. Now we need a helper to evaluate the thunk, and then evaluate the thunk returned by the first thunk, until we evaluate all the thunks. This is what’s called “trampoline”:

function trampoline(thunk)
    while true do
        if type(thunk) ~= "table" then
            return thunk
        elseif thunk.tag == "thunk" then
            thunk = evalThunk(thunk)
        end
    end
end
> a = sum(10, function(n) print("result: ", n) end)
> print(a.tag)
thunk
> trampoline(a)
result:         55

We’re evaluating the thunks in a while loop. This is how you can eliminate tail calls. Now, this exactly looks like the lazy-list method I mentioned in this post. And we already used the same thunk structure for both of them.

The only difference is, in continuations we don’t save old thunks anywhere and just replace them with new continuations, and we only return a value other than new thunks when we finished the calculation(ie. when we reach the base case).

So essentially, both continuations and lazy-lists have the same idea. And these are very easy to implement even in langauges that doesn’t support any non-strict primitives.

For a better explanation of thunks, see SICP.↩︎