
An interesting case of closures: is a closed-over variable a reference or a value?

April 24, 2013 - Tagged as: en, lua, javascript, python.

I discovered an interesting behavior of JavaScript’s closures while writing a Node.js script.

This behavior is pretty easy to run into when writing a Node.js application: because of its callback-based asynchronous nature, you’ll be writing callbacks all the time. Let’s say I create a callback function that uses a variable defined in the outer scope, and then does something with that variable:

var callbacks = [];
var words = [ "foo", "bar", "baz" ];

for (var idx in words) {
    var say = "say " + words[idx];
    callbacks.push(function () {
        console.log(say);
    });
}

for (var idx in callbacks) {
    callbacks[idx]();
}

What I expect from this program is to print say foo\nsay bar\nsay baz, but instead it prints say baz\nsay baz\nsay baz. It’s as if the say variable used inside the callback is a reference and not a value. But it’s still strange, because the reference should be local to the for loop’s body, so each var say = ... assignment should create a separate reference.

I find this behavior very counterintuitive. Before moving on to ways to fix it, I tried the same program in several other languages.

Python also has this problem [1]:

callbacks = []

for i in ["foo", "bar", "baz"]:
    say = "say " + i
    def callback():
        print(say)
    callbacks.append(callback)

for c in callbacks:
    c()

This prints the same wrong output as the JavaScript version.

Lua, my favorite dynamic language, does great:

callbacks = {}

for _, v in pairs({ "foo", "bar", "baz" }) do
    local say = "say " .. v
    table.insert(callbacks, function () print(say) end)
end

for _, v in pairs(callbacks) do
    v()
end

It prints say foo\nsay bar\nsay baz as expected. Trying this in functional languages may be pointless, since their variables are not really variables (they’re immutable), but it may still be useful for demonstration purposes. Here’s Haskell code that works as expected:

module Main where

main = sequence_ callbacks
  where callbacks = map (putStrLn . ("say " ++ )) [ "foo", "bar", "baz" ]

Later I’ll show how to get JavaScript’s behavior in a language that handles this right. In Haskell it’s harder to get, because we need to use reference cells explicitly.

I think it’s more understandable in Python, because it doesn’t have variable declarations, i.e. we can’t reason about the say variable’s scope just by looking at it. In JavaScript, we have the var keyword, which indicates that a new variable is created in the scope. But it still works wrong.

Indeed, in JavaScript, the worst language ever, the var keyword is just like any other strange JavaScript feature and works in an unexpected way:

> for (var v in [ 1, 2, 3 ]) { console.log(v); }
0
1
2
> v
"2"

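So one explanation of this behavior may be this: in Python, nothing marks the scope of the variable, and here it really is global (the loop runs at module level), so inside the closure it works like a reference. In JavaScript, the var keyword is simply broken for this purpose: it is function-scoped, not block-scoped, so the loop body doesn’t get its own say and every iteration assigns to the same variable (and the variable inside the closure also works like a reference).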
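
The “works like a reference” part can be checked in isolation. Here’s a minimal Python 3 sketch (the names make_reader and read are made up for this illustration, they’re not part of the programs above): a closure looks up say when it is called, not when it is defined, so it sees whatever was assigned last.

```python
def make_reader():
    say = "say foo"

    def read():
        # `say` is looked up in the enclosing scope at call time,
        # not copied when `read` is defined.
        return say

    before = read()
    say = "say baz"  # rebind the variable after the closure exists
    after = read()
    return before, after


print(make_reader())  # ('say foo', 'say baz')
```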

Fixing it

Let’s fix that in JavaScript and Python.

var callbacks = [];
var words = [ "foo", "bar", "baz" ];

for (var idx in words) {
    var say = "say " + words[idx];
    callbacks.push((function (say) {
      return function () {
        console.log(say);
      }
    })(say));
}

for (var idx in callbacks) {
    callbacks[idx]();
}

Here we’re creating a new scope with a function (remember the JavaScript module pattern?) and passing the say variable to it. This guarantees that say is local to that function. The callback returned by the wrapper function still holds a reference, just like before, but it’s not shared with any other function.
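
The same wrapper trick carries over to Python. A minimal sketch, written for Python 3, where make_callback is a hypothetical helper name: each call to it creates a fresh scope with its own say, so the returned callbacks no longer share one variable. The callbacks return their string instead of printing it, just to make the result easy to check.

```python
def make_callback(say):
    # Each call gets its own local `say`; the inner function closes over it.
    def callback():
        return say
    return callback


callbacks = []
for word in ["foo", "bar", "baz"]:
    callbacks.append(make_callback("say " + word))

print([c() for c in callbacks])  # ['say foo', 'say bar', 'say baz']
```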

In Python, there’s a cleaner way to do the same thing:

callbacks = []

for i in ["foo", "bar", "baz"]:
    say = "say " + i
    def callback(say=say):
        print(say)
    callbacks.append(callback)

for c in callbacks:
    c()

Here the value is passed implicitly, as the parameter’s default. (To me it’s still very strange and it shouldn’t be working, but for now I’ll just keep this post short.)
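
A small Python 3 sketch to check how default arguments behave here (the names build and show are invented for this example): the default value is computed once, when the def statement runs, and stored on the function object, so later changes to the outer variable don’t affect it.

```python
def build():
    say = "say foo"

    def show(say=say):  # default is evaluated now: "say foo"
        return say

    say = "say baz"     # rebinding the outer variable changes nothing
    return show


show = build()
print(show())              # say foo
print(show.__defaults__)   # ('say foo',)
print(show("overridden"))  # overridden
```

The last call shows the trade-off of this trick: since say is an ordinary parameter, a caller can replace the captured value by passing an argument.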

Breaking it

Let’s reproduce JavaScript’s behavior in Haskell:

module Main where

import Data.IORef

printFromRef r = putStrLn =<< readIORef r

mkCallbacks (w:ws) = do
    ref <- newIORef w
    r   <- iter ref ws
    return $ printFromRef ref : r
  where iter ref []     = return []
        iter ref (w:ws) = do
          writeIORef ref w
          cs <- iter ref ws
          return $ printFromRef ref : cs

main = do
  callbacks <- mkCallbacks [ "foo", "bar", "baz" ]
  sequence_ callbacks

This code is that long because we need to create and pass references explicitly.


  1. Calling this behavior a problem may be a bit wrong; maybe it’s just a design decision. To me it’s a problem because this behavior is really counterintuitive.