March 27, 2003

(closures) The reason for Parrot, part 2

Okay, the previous entry wasn't titled part 1, but...

Anyway, I promised Jon Udell more of an explanation as to why perl's not a good fit for .NET and the JVM. This applies to Python and Ruby too, in their own way, though I'm not going to speak directly for them.

This time around, let's talk about closures and their relationship to lexical variables.

A closure is reasonably simple, at least in perl terms. It's a subroutine that captures its environment. Generally they're anonymous, and generated at runtime. A sample might look like:

  sub make_closure {
    my $foo = 0;
    return sub { return ++$foo; }
  }

This sub is desperately simple, and all it does is return a reference to an anonymous subroutine that will return a monotonically increasing integer, one per invocation of the subroutine. The anonymous sub is a closure because it closes over its environment, capturing what was in scope at its creation and remembering it. Well, OK, arguably that's not desperately simple, so here's how it acts. If you had code that looks like:

  $sub_ref = make_closure();
  print $sub_ref->(), "\n";
  print $sub_ref->(), "\n";
  print $sub_ref->(), "\n";
  print $sub_ref->(), "\n";

It would print out the sequence 1, 2, 3, 4, each on its own line.

How it works is pretty simple. The first line calls make_closure, which returns a reference to an anonymous subroutine which increments and returns the value of the $foo in scope when the anonymous sub was created. The next four lines call it and print the results.

That's not the interesting part. The interesting part is code that looks like:

  $sub_ref = make_closure();
  $another_ref = make_closure();
  print $sub_ref->(), "\n";
  print $sub_ref->(), "\n";
  print $another_ref->(), "\n";
  print $another_ref->(), "\n";

because this prints out 1, 2, 1, 2. The reason this happens is that when make_closure is called, it instantiates a fresh version of $foo each time. Not too surprising, as this is what you'd expect from a sub--it instantiates the lexicals that belong inside it. The sub { return ++$foo; } part returns a reference to an anonymous subroutine--which for perl is inherently a closure--that accesses $foo. $foo is a lexical declared inside of make_closure, so since it's created anew each time make_closure is invoked, each new anonymous sub references a new version of $foo. There's no collision there.
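
The flip side--here's a quick sketch of it, with make_pair as a made-up name--is that two anonymous subs created in the same invocation share the one $foo, since they both close over the same lexical:

  sub make_pair {
    my $foo = 0;
    # both anonymous subs close over the one $foo
    return (sub { return ++$foo; }, sub { return $foo; });
  }

  my ($bump, $peek) = make_pair();
  $bump->();                # $foo is now 1
  $bump->();                # $foo is now 2
  print $peek->(), "\n";    # prints 2--both subs see the same variable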

How does this complicate, and slow down, the system?

Well, remember how variable allocation works in non-closure languages, such as C. Ponder the following C function:

  int *foo() {
    int bar = 1;     /* bar lives in foo's stack frame */
    return &bar;     /* returning a pointer to it is asking for trouble */
  }

What this function does is allocate a variable bar, give it a value of 1, and return a pointer to it. If you've ever done this in C, or languages that allocate lexicals like C does, you know what happens--the variable whose address this returns will soon turn to mush. Why? Because of the way C handles space allocation for lexical variables.

What C does is to use a stack, generally the system stack, to store lexicals. When you enter a C function, the function preamble reserves space for all its variables by adjusting the top of stack pointer a bit. The lexical variables are all in that chunk of stack that was just reserved. When the function exits, the stack pointer is readjusted and the stack space is freed up for the next function call.

Obviously, for closures to work, lexical variables--what most folks think of as stack-allocated variables--can't be allocated on the stack. Or if they are, the stack has to be a series of call frames, with each call allocating a new frame. Otherwise the variables the closure captures would just keep getting stomped on.
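
In Perl terms--a quick sketch, with make_counter and unrelated as made-up names--the captured lexical has to outlive the call that created it:

  sub make_counter {
    my $count = 0;             # lives in make_counter's call frame...
    return sub { ++$count };   # ...but the closure keeps $count alive
  }

  sub unrelated { my @scratch = (1 .. 100); return scalar @scratch; }

  my $counter = make_counter();   # make_counter has long since returned
  unrelated();                    # other calls come and go, reusing "stack" space
  print $counter->(), "\n";       # still prints 1--$count didn't get stomped on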

This slows down function calls in languages that support closures. In a non-closure language, all you have to do to reserve space is twiddle the stack pointer. Often there's not even a check to see if there's enough space on the stack--if too much space is reserved you fall off the end of the stack into unmapped memory, your program segfaults, and that's that. Leaving the function just means decrementing the stack pointer again. (In a language with variables that have active destructors, like C++, destructors may get called.)

For a language with closures, calling a function requires actively allocating a chunk of memory from the heap. While not horribly expensive, it does cost a lot more than just incrementing the stack pointer. Freeing up the frame for the lexicals also has to go through the heap's deallocation machinery, which is again more expensive than just decrementing the stack pointer. A language with closures makes garbage collection obligatory as well, though since the JVM and .NET already do this at least that's no extra expense.

Is this cost huge? No. But it does make function calls more expensive, which can be felt if you're using a system with a lot of small leaf functions, a common scheme in OO programming.

Closures are terribly useful, but there is that cost. If your language uses them, then of course the cost is justified, since it gets you the feature (closures) that you want. If you're working in a language that doesn't do closures the cost is useless overhead--you pay it because you must pay it (there's no good way to optimize the cost away) but you get no benefit from it.

That's why it's silly for the JVM or .NET to add support for closures. Java doesn't do closures, and none of the current .NET languages do closures. Wedge them into the engine and suddenly all the programs running on those VMs run slower, but get no benefit from it. Could they add that feature in? Sure. But why? To run perl or ruby? Is that really their target audience, and do they (Sun and/or Microsoft) really get any benefit from it?

Personally, I think not and not, respectively.

Posted by Dan at March 27, 2003 11:09 AM | TrackBack (5)
Comments

This is one of the better explanations of closures I've seen, from a non-lispy perspective. Of course, I already understand closures, but...

don't suppose you could do the same thing for continuations? Those, I still haven't wrapped my mind around.

Posted by: Todd Larason at March 27, 2003 02:27 PM

Thanks. :)

Continuations are next, after a day or three--there's a limit to how badly I want to make people's heads explode. I may need a diagram as well, I'm not sure.

Posted by: Dan at March 27, 2003 02:41 PM

That was a good explanation of closures. It's similar to some I've seen in JavaScript.

Although they are unlikely to be used by a significant percentage of .NET developers, F# and SML.NET do closures. There are a couple of PowerPoint presentations on their implementations (they are MS Research projects). The SML.NET slides do wish for closures in the runtime, but they were able to implement them anyway.

On continuations, the best resources I've come across are Dan's own questions on them in the ll1-discuss archives (answered by Steele and Graham, among others) and chapter 3 of Lisp in Small Pieces by Christian Queinnec.

Posted by: Robert Sayre at March 29, 2003 01:57 AM

Like I said, closures are doable, just slow if there's no engine support. I've no doubt they can be done (I even know how :) but there's definitely a performance impact.

Posted by: Dan at March 29, 2003 11:43 AM

Oh, I hadn't read part 1 yet. :)

Posted by: Robert Sayre at March 29, 2003 12:15 PM

The new version of C# contains support for closures.

The interesting part is that no changes are required on the virtual machine. It is very simple to implement, given:


delegate void Greet ();

Greet English ()
{
    int count = 1;

    return delegate {
        Console.WriteLine ("Hello {0}", count++);
    };
}

You can then do:


Greet a = English ();
a ();
a ();

And the output will be:


Hello 1
Hello 2

The trick is simple: a closure is implemented by creating an instance of a proxy class; this is all done by the compiler:

class GreetProxy {
    int local_count;

    GreetProxy (int count)
    {
        local_count = count;
    }

    void Run ()
    {
        Console.WriteLine ("Hello {0}", local_count++);
    }
}

The compiler changes the delegate creation to this:


new GreetProxy (count);

And any references to local variables or parameters are changed into references to an instance variable.

Posted by: Miguel de Icaza at March 30, 2003 07:49 PM

While that's interesting, what would be more interesting is if this:

Greet a = English ();
Greet b = English ();
a ();
b ();

printed

Hello 1
Hello 1

Does it? If not, it's not that useful unless C# doesn't have a good way to get a reference to an existing named function/sub/method/whatever. If that local_count variable is an object attribute (or property, or slot, or instance variable, or whatever the heck the language at hand calls a value tucked inside an object) I can see that it would work, as long as all the captured variables are accessed by reference rather than by value.

In this example, if count is allocated on the stack as a simple integer, rather than off the heap and accessed by reference, then these closures won't work right in the cases where there are multiple closures with reference to the same outer-scope values. That's not uncommon in the sort of perl code I write, and certainly necessary for complete closure support. (Whether complete closure support is necessary is a separate argument, of course)

I can also see this running into big problems if .NET has limits on the number of arguments that a function can take, since it's not uncommon to close across dozens of variables. Which, yes, can be sloppy, but there you go.

Posted by: Dan at March 31, 2003 11:11 AM

First of all: nice description.

However, I think you're missing something here -- it may not be necessary to pay the cost of slow allocation of stack frames in all cases. After all, even in a language that supports closures, the closures aren't always used -- even in Lisp I'd venture to say that there are more functions that DON'T use closures than there are that do.

Depending on the degree of dynamism in the language, it seems like the compiler could probably figure out which functions return closures and which local variables stand a chance of being bound up in a closure. We could then use traditional stack-based storage for everything we know WON'T be involved in a closure.

Miguel de Icaza's notes about C# describe one way of doing this... have the VM support only stack-based storage but allow the language to support closures by recognizing when they're there and allocating things on the heap instead (in .NET terms, creating an object). Java can do something along these lines using the "final" attribute on local variables and then using them in an anonymous inner class, but it's SEVERELY limited because the "final" status means the variable can't be modified after capturing it in the closure. For instance, the following creates a closure in Java:

// Untested code, please forgive typos.
public class ClosureTest {
    public abstract class IntFunction {
        public abstract int call();
    }
    public IntFunction makeClosure() {
        final int foo = 0;
        return new IntFunction() {
            public int call() {
                return foo; // can't say "return ++foo;"
            }
        };
    }
}

What I think would be ideal would be if the VM had support for this... perhaps providing two different primitives for function invocation, one that was stack allocated, and one which allowed for potential variable capture in a closure. Of course the compiler would have to be conservative... sometimes using the slower approach when a stack-based call would have worked, simply because the compiler couldn't be sure it was safe. But it's far better than ALWAYS paying the price, particularly in the OO world with lots of very short methods.

-- Michael Chermside

Posted by: Michael Chermside at April 4, 2003 08:35 AM

just a couple things to point out about using java inner classes as closures.

final only makes primitives "unmodifiable", but for Objects, it means you can't assign some other object to the captured variable. so if the Object you capture in an inner class has destructive methods (StringBuffer for example), then you can modify the object through its methods, but you can't re-assign a new object to the captured variable.

one possibility to fix the whole final problem is copying or cloning the captured value or object.

final int foo = 0;
return new IntFunction() {
    int myfoo = foo; // copies the foo value
    public int call() {
        return ++myfoo;
    }
};
Posted by: Dave Lee at April 5, 2003 03:19 PM

Here's a (probably too simple) example of why closures can be difficult to optimize away when they're not being used:


sub make_closure {
    my $foo = shift;
    return sub { return $foo->{n}++ };
}

sub do_something {
    my %thing;
    my $f = make_closure(\%thing);
    print $f->() for (1..10);
    return $f;
}

make_closure() accepts a reference to a variable outside of its scope. In order for $f to be useful after it is returned from do_something() (imagine do_something() was interesting enough for that to make sense), %thing must be allocated on the heap and not on the stack.

However, there's nothing about do_something which can be used to figure this out. This is more difficult when make_closure() and do_something() are in different files/modules, and/or if make_closure() hasn't been defined at the time do_something() is compiled. Or if make_closure() is actually AUTOLOAD and you don't even know where it is until the interpreter looks for it at runtime.
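
For instance, in a contrived sketch (mystery_sub and $saved_ref are made up for illustration), the code that ends up holding a reference isn't even known until runtime:

our $saved_ref;

sub AUTOLOAD {
    $saved_ref = shift;    # quietly hang on to a reference to the caller's lexical
}

sub do_something {
    my %thing;
    mystery_sub(\%thing);  # not defined anywhere--resolved via AUTOLOAD at runtime
    return;                # %thing has to live as long as $saved_ref does
}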

Alan

Posted by: Alan Ferrency at April 8, 2003 01:22 PM

> However, there's nothing about do_something which can be used to figure this out.

Yes, there is -- there's a reference to %thing that could possibly outlast the scope. This is quite independent of closures; given my %thing; foo(\%thing); foo might simply store its argument somewhere, requiring that %thing be allocated on the stack. *Most* lexicals, however, don't have their references stored or passed as args, and those lexicals can all be allocated on the stack.

Posted by: jqb at June 4, 2003 02:59 PM

> foo might simply store its argument somewhere, requiring that %thing be allocated on the stack.

Oops, I meant "allocated on the heap" here.
Sorry for the sloppy bandwidth waste.

Posted by: jqb at June 4, 2003 03:02 PM

The problem with lexicals in any system that supports perl (or python, or ruby) is that you can't truly store them on the stack since you can't know at compile time whether references leak out. With sufficient introspection just about anything you call can potentially take and hold references to objects. Certain languages can forbid this--C# may not allow redefining methods in the base String class at runtime (or maybe it does, I don't know) for example--but other languages do.

The make_closure call could, even if it completely failed to explicitly reference %thing, still take a reference to it by looking up the call chain at the lexicals in outer scopes. As, quite honestly, could overloaded methods. While you wouldn't expect + to take a reference to something, especially something it's not attached to, it certainly could if it wanted.
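
Here's a contrived sketch of the sort of thing that's legal (the Sneaky package is made up for illustration):

package Sneaky;
use overload '+' => sub { push @Sneaky::seen, $_[0]; return 42 };
sub new { return bless {}, shift }

package main;
my $obj = Sneaky->new();
my $sum = $obj + 1;    # even "+" can quietly keep a reference to $obj around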

That's one of the things that makes it difficult with Perl. If you allow for a more restrictive definition of things, it does become somewhat easier, though still potentially too complex to know for sure.

Posted by: Dan at June 4, 2003 03:10 PM