April 29, 2004

Multimethod dispatch for fun and profit

Sometimes the more things change the more they stay the same. Other times the more they stay the same the more they change.

Anyway, for those of you keeping track at home, yesterday we just officially gave in and declared that all non-load/store operations on PMCs are always and unconditionally dispatched through Parrot's binary-op multimethod dispatch system. Before you had do actually ask for it, now you get it whether you want it or not.

Multimethod dispatch, as you might remember, is a way of finding which function to run based not just on the function name but also the types of the parameters to the function. So rather than having a single add function, there's a whole list of add functions each with a separate function signature, and when we need to do an add we find the one that's got a signature closest to what we actually have.

Why, then, go MMD?

Well, to start with, we were pretty close to it already. Perl 5 and perl 6 both define their operator overloading functions as mostly MMD. (Perl 6 does it completely, while perl 5 just does it mostly, kinda, sorta) The base object class in parrot does MMD, and we want to make it so Python and Ruby objects descend from it. (Which is doable) So basically... most of the PMC types parrot deals with do MMD.

The way we were doing it was to have the PMC's vtable function do the dispatch if it wanted it. That had some problems, though, the biggest of which is that it set up a continuation barrier. Because of the way Parrot's implemented, continuations can't cross a parrot->C or C->Parrot boundary--that is, once we leave bytecode and enter C, or call into C from bytecode, we set up a wall that a continuation can't cross. (That's because we can't save and restore the system stack) So... if you overloaded addition in bytecode, you could take a continuation just fine, but as soon as you left the vtable function that continuation becomes unusable. Hardly a huge deal, but still... annoying. Also, this is a little slow. Not a lot slow, mind, but we dispatch to the vtable function and then to the mmd function, and if we're doing this 90% of the time anyway then that vtable function dispatch is just wasted overhead.

With MMD as the default we get to change some of that.

First, of course, we cut out the extra overhead in dispatch, since we skip the vtable dispatching altogether.

Second, since we dispatch directly from the op functions to the MMD function, we can check and see if it's a bytecode or C function, and if it's a bytecode function dispatch to it as if it were any other sub, so continuations taken will live longer and will be more usable. (Why you'd take a continuation from within an operator overloading function I don't know, but that's not my call)

Third, we get to shrink down the PMC vtable. Now it's only got get & set functions and some informational and metainformational functions in it. That makes instantiating classes a bit faster, since each class gets its own vtable. (Though if class creation time's a limiting factor in your code then, well, you're gonna scare me)

Fourth, we lift the artificial limit on the number of operators one can do MMD on. Before, you could only really do it on the operations that had functions in the vtable and, while there was a pretty comprehensive list, it was still pretty fixed. Now, well... there's nothing to stop you from adding another table to the list. As long as you've got a default function to call as an absolute fallback, you're fine. This lets us add in the hyper ops as overloadable without having to screw around with expanding the vtables, and that's cool.

There's still a lot of work to be done to make this work out--the vtables have to be chopped down, the ops need to use MMD exclusively, and we need to rejig the PMC preprocessor to take what were vtable functions and turn them into default functions for the PMC type, but... well, this can be done mostly automatically, so it's doable.

This would probably be a good place to plug Eric Kidd's paper Efficient Compression of Generic Function Dispatch Tables, so I will. Useful stuff, that.

So now Parrot's got continuations as the basis for most of its control flow and does MMD for all variable ops... Makes you wonder what's next. (If it turns out we use Lisp as an intermediate language, with annotated Sexprs (encoded in XML) as our AST, you'll know we've crossed over into the Twilight Zone...)

Posted by Dan at April 29, 2004 03:01 PM | TrackBack (0)

Didn't you mention, a long time ago, that had you been familiar with the Scheme 48 virtual machine, Parrot might have developed very differently..?

(cue Twilight Zone theme)

Posted by: Gnomon at April 30, 2004 05:49 PM