25 February, 2010

Asynchronous callbacks in JavaScript

var Queue = (function () {
    "use strict";

    function make() {
        var results = [],
            stack = [],
            queue = [];

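        // Builds the control object handed to each Queue.sync function.
        // Its next() turns itself into a no-op after the first call; it drops
        // the finished function and starts the following one, or calls
        // queue() when nothing is left.
        function buildControl(queue) {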
            var obj = {
                next: function () {
                    obj.next = function () {};
                    stack.shift();

                    var first = stack[0];
                    if (first) {
                        first(buildControl(queue));
                    } else {
                        queue();
                    }
                }
            };
            return obj;
        }

        var obj = {
            make: make,

            // Adds func to the stack; it runs once every function added
            // before it has called next().
            sync: function (func) {
                if (typeof func === "function") {
                    stack.push(func);

                    if (stack.length === 1) {
                        func(buildControl(obj.async()));
                    }
                }
            },

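            // Registers func and returns a function that records its result.
            // Registered functions run in registration order, each one only
            // after its result has arrived.
            async: function (func) {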
                var index = queue.push(func) - 1;

                return function () {
                    if (index === null) {
                        return;
                    }

                    results[index] = Array.prototype.slice.call(arguments);
                    index = null;

                    for (var i = 0; i < queue.length; i += 1) {
                        if (results[i]) {
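                            if (typeof queue[i] === "function") {
                                // Remove the callback before calling it, so a
                                // nested flush cannot run it a second time.
                                var callback = queue[i];
                                delete queue[i];
                                callback.apply(null, results[i]);
                            }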
                        } else {
                            return;
                        }
                    }

                    queue.length = results.length = 0;
                };
            },

            // Runs func once everything registered before it has completed.
            run: function (func) {
                return obj.async(func)();
            }
        };
        return obj;
    }

    return make();
}());

JavaScript is normally synchronous: it executes each statement in sequence. Certain functions, however, operate asynchronously: they return immediately without waiting for their work to finish, so you need a way to get the results later. To deal with this, you usually pass in a callback function that is run when the asynchronous call completes.

This works rather well, until you try to mix asynchronous and synchronous code. Consider, for example, making multiple calls with XMLHttpRequest. The requests themselves are asynchronous, but you want them to run one after another, as if they were synchronous.

Here's an example that illustrates the problem. Assume the function getURL asynchronously loads a URL and passes its contents to a callback:

getURL("foo", function (data) {
    // do something with data here

    getURL("bar", function (data) {
        // do something with data here

        getURL("qux", function (data) {
            // do something with data here
        });
    });
});
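
If you want to try the snippets in this post without a network, a stand-in for getURL is enough. The version below is purely an assumption for illustration: the real getURL is not shown here, and the fake contents and the random delay are made up.

```javascript
// Hypothetical stand-in for getURL (not the real implementation).
// It "loads" a URL by waiting a short random time, then passes fake
// contents to the callback, so the completion order is unpredictable.
function getURL(url, callback) {
    var delay = Math.floor(Math.random() * 50); // 0-49 ms, made-up range
    setTimeout(function () {
        callback("contents of " + url);
    }, delay);
}

// Usage: the callback runs later, after the current code has finished,
// so "request started" is logged before "contents of foo".
getURL("foo", function (data) {
    console.log(data);
});
console.log("request started");
```

Because the delay is random, two calls started together can finish in either order, which is exactly the situation the rest of this post deals with.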

To run the requests in order, you normally have to nest each asynchronous call inside the callback of the previous one. It is often more convenient to write them linearly. With this library, you can:

Queue.sync(function (queue) {
    getURL("foo", function (data) {
        // do something with data here
        queue.next();
    });
});

Queue.sync(function (queue) {
    getURL("bar", function (data) {
        // do something with data here
        queue.next();
    });
});

Queue.sync(function (queue) {
    getURL("qux", function (data) {
        // do something with data here
        queue.next();
    });
});

As you can see, we're adding functions onto a queue. When an asynchronous call within the queue is done, it calls queue.next(), which then executes the next function in the queue.
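
To make the idea concrete, here is a stripped-down sketch of such a queue. It is not the library itself: makeSeries, add, and the hand-fired pending list are names I made up for this illustration, and the library's bookkeeping for Queue.run is left out.

```javascript
// Simplified sketch of a sequential task queue (not the library itself).
function makeSeries() {
    var tasks = []; // tasks[0] is the one currently running

    function start(task) {
        var called = false;
        task(function next() {
            if (called) { return; }   // ignore repeated calls to next()
            called = true;
            tasks.shift();            // the current task is finished
            if (tasks.length > 0) {
                start(tasks[0]);      // start the following task
            }
        });
    }

    return {
        add: function (task) {
            tasks.push(task);
            if (tasks.length === 1) { // nothing was running: start now
                start(task);
            }
        }
    };
}

// Usage: each task calls next() when its asynchronous work is done.
// Here the next() functions are saved and fired by hand instead.
var log = [];
var pending = [];
var series = makeSeries();

["foo", "bar", "qux"].forEach(function (name) {
    series.add(function (next) {
        log.push("start " + name);
        pending.push(next);
    });
});

pending[0](); // foo finishes, so bar starts; qux is still waiting
console.log(log.join(", ")); // start foo, start bar
```

The first task starts as soon as it is added; every later task starts only when the one before it calls next().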

An even better example is running calls in order from within a loop:

["foo", "bar", "qux"].forEach(function (url) {
    Queue.sync(function (queue) {
        getURL(url, function (data) {
            // do something with data here
            queue.next();
        });
    });
});

The above does the same thing as before, but uses a loop to generate the three calls. This makes it trivial to add new calls: just add a string to the array. This also opens up the possibility of dynamically creating an array, and then looping over it and executing asynchronous callbacks in order.

There is one problem with the previous three examples. Although each call to getURL is asynchronous, it does not start until the previous call has finished. It would be better to execute all the calls in parallel, but handle the results sequentially. Here's an example of how to do that:

["foo", "bar", "qux"].forEach(function (url) {
    getURL(url, Queue.async(function (data) {
        // do something with data here
    }));
});

This should produce the same results as the previous examples, but finishes sooner because the three calls execute in parallel. To be more specific, the total time is limited by the slowest call, rather than by the sum of all the calls.
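
As a rough worked example of that difference, the snippet below adds up some response times. The numbers are invented, and the calculation ignores any overhead and assumes the server really does handle the three requests at the same time.

```javascript
// Hypothetical response times in milliseconds (made-up numbers).
var latencies = { foo: 120, bar: 300, qux: 80 };

var sequentialTime = 0; // one call after another: the times add up
var parallelTime = 0;   // all calls at once: the slowest one dominates

Object.keys(latencies).forEach(function (url) {
    sequentialTime += latencies[url];
    parallelTime = Math.max(parallelTime, latencies[url]);
});

console.log(sequentialTime); // 500
console.log(parallelTime);   // 300
```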

So how does it work? Queue.async returns a function. When that function is called, it stores its arguments as the result, then runs, in the order the callbacks were registered, every callback whose result has arrived so far, making sure not to run the same callback twice. When it reaches a callback that doesn't have a result yet, it stops.
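
The mechanism can be sketched on its own in a few lines. This is a simplified illustration rather than the library code: makeOrdered, wrap, and record are hypothetical names, and unlike the library this version never resets its lists.

```javascript
// Simplified sketch of in-order delivery (not the library itself).
function makeOrdered() {
    var callbacks = [], // callbacks in registration order
        results = [],   // results[i] holds the arguments for callbacks[i]
        nextToRun = 0;  // index of the first callback that has not run

    return function wrap(callback) {
        var index = callbacks.push(callback) - 1;

        return function () {
            if (results[index]) { return; } // ignore a second result
            results[index] = Array.prototype.slice.call(arguments);

            // Run every callback whose turn has come; stop at the first gap.
            while (nextToRun < callbacks.length && results[nextToRun]) {
                var i = nextToRun;
                nextToRun += 1; // advance first, so a nested call cannot rerun it
                callbacks[i].apply(null, results[i]);
            }
        };
    };
}

// Usage: results arrive as qux, foo, bar but are delivered as foo, bar, qux.
var wrap = makeOrdered();
var delivered = [];
function record(data) { delivered.push(data); }

var onFoo = wrap(record);
var onBar = wrap(record);
var onQux = wrap(record);

onQux("qux data"); // held back: foo and bar have not arrived yet
onFoo("foo data"); // delivers foo; bar is still missing, so qux waits
onBar("bar data"); // delivers bar, then the waiting qux
console.log(delivered.join(", ")); // foo data, bar data, qux data
```

An empty argument list still counts as a result, because an empty array is truthy in JavaScript.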

Where I've found this most useful is when creating extensions for Google Chrome. In the Chrome extension system, almost all of the API functions are asynchronous: you have to pass in a callback if you want to retrieve the results of the function.

Unfortunately, nesting callbacks inside callbacks reduces readability, especially when the nesting is deep. Using the above construct, you can program in a linear style, thus avoiding race conditions while retaining the benefits of asynchronous code.

Also, if you want to push a synchronous function onto the queue, so that it runs once everything queued before it has finished, you can use Queue.run:

var results = {};

["foo", "bar", "qux"].forEach(function (url) {
    getURL(url, Queue.async(function (data) {
        results[url] = data;
    }));
});

Queue.run(function () {
    console.log(results);
});

The above executes three asynchronous calls in parallel, and when they're all done, it runs the function passed to Queue.run, which then logs the results.

Here's how you might do it if you didn't have Queue:

var results = {};

function end() {
    console.log(results);
}

var index = 3;

["foo", "bar", "qux"].forEach(function (url) {
    getURL(url, function (data) {
        results[url] = data;

        index -= 1;
        if (index === 0) {
            end();
        }
    });
});
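
If you go the manual route more than once, the countdown can be wrapped in a small helper. The sketch below is only an illustration: after is a name I made up, not part of Queue, and the completions are fired by hand in place of real getURL callbacks.

```javascript
// Hypothetical helper: returns a function that calls `done` on its
// count-th invocation.
function after(count, done) {
    return function () {
        count -= 1;
        if (count === 0) {
            done();
        }
    };
}

// Usage: three completions, fired by hand here.
var finished = false;
var oneDone = after(3, function () {
    finished = true;
});

oneDone();
oneDone();
console.log(finished); // false
oneDone();
console.log(finished); // true
```

Unlike Queue.run, this helper needs to know the number of calls up front, which is awkward when the list of URLs is built dynamically.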

Lastly, you can use Queue.make to create multiple queues. Ordinarily, if you use Queue.sync, only one function can execute at a time. But by using multiple queues, you can have one function executing per queue. This also lets you nest calls to Queue.sync:

var one = Queue.make(),
    two = Queue.make();

[".gif", ".jpg", ".png"].forEach(function (suffix) {
    one.sync(function (queue) {

        ["foo", "bar"].forEach(function (name) {
            two.sync(function (queue) {

                getURL(name + suffix, function (data) {
                    // do something with data here

                    queue.next();
                });
            });
        });

        two.run(queue.next);
    });
});

The above executes the calls to getURL one at a time, in this order:
"foo.gif", "bar.gif",
"foo.jpg", "bar.jpg",
"foo.png", "bar.png"

Why do you need two queues? Here is what the queue would look like if you used only one queue:

.gif foo bar .jpg .png foo bar foo bar

But by using two queues, it looks like this, which is what you want:

.gif foo bar .jpg foo bar .png foo bar

In this particular case, we could have used a single queue and it might have worked okay, but in a different case two queues might be necessary to avoid this issue.
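
The single-queue ordering shown above can be reproduced with a small model. It does not use the Queue library: it just assumes one first-in-first-out list where the first task runs as soon as it is added and every other task waits its turn. singleQueueOrder is a hypothetical name.

```javascript
// Simplified model of a single first-in-first-out queue (not the library).
function singleQueueOrder(suffixes, names) {
    var waiting = [],    // tasks that have been added but not run yet
        order = [],      // labels, in the order the tasks ran
        started = false; // becomes true once the first task has run

    function run(task) {
        order.push(task.label);
        if (task.body) { task.body(); } // a suffix task adds its name tasks
    }

    function add(label, body) {
        var task = { label: label, body: body };
        if (!started) {
            started = true;
            run(task);          // the first task runs as soon as it is added
        } else {
            waiting.push(task); // everything else goes to the back
        }
    }

    suffixes.forEach(function (suffix) {
        add(suffix, function () {
            names.forEach(function (name) { add(name); });
        });
    });

    // The remaining tasks run strictly in the order they were added.
    while (waiting.length > 0) {
        run(waiting.shift());
    }
    return order;
}

var order = singleQueueOrder([".gif", ".jpg", ".png"], ["foo", "bar"]);
console.log(order.join(" ")); // .gif foo bar .jpg .png foo bar foo bar
```

The name tasks for .jpg and .png end up at the back because those suffix tasks were already queued before their bodies ever ran.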