July 09, 2005

WWIDD: MMD everywhere

At the top of the list of things I'd do differently in Parrot's design is to fully embrace multiple dispatch. I mean fully. Any operation with two or more operands would be multiply dispatched, and this includes assignment.

When Parrot started, MMD just wasn't on the table. Oh, sure, there were overloaded operators, but that was all left-side-wins stuff, wedged into the PMC vtable. We introduced a standard MMD system, with the assumption that PMCs which wanted to do it would have their vtable slots use the standard system so there'd be cross-compatibility. That then, after a while, led to the realization that we could shrink PMC vtables a lot, simplify the internals, respect a default left-side-wins anyway, make things faster, and generally make things work nicely if we just went entirely MMD for binary operations with some proper defaulting. Unfortunately this was after we'd tossed the keyed versions of the different binary ops, a loss I still think is a waste. That's a rant for another day, though.
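To make the difference concrete, here is a minimal sketch of left-side-wins versus MMD with a left-side-wins default. None of this is Parrot's actual code: the `Obj` struct, the `T_INT`/`T_NUM` ids, and every function name are made up for illustration, and the single `add` pointer stands in for a whole vtable.

```c
#include <stddef.h>

/* Hypothetical type ids and object layout, for illustration only. */
enum { T_INT, T_NUM, T_COUNT };

typedef struct Obj Obj;
typedef double (*add_fn)(const Obj *left, const Obj *right);

struct Obj {
    int type;
    double val;   /* one payload slot keeps the sketch short */
    add_fn add;   /* stands in for the left operand's vtable slot */
};

/* Integer's own idea of addition: coerce both sides to integers. */
static double int_add(const Obj *left, const Obj *right) {
    return (double)((long)left->val + (long)right->val);
}

/* Floating-point addition, no coercion. */
static double num_add(const Obj *left, const Obj *right) {
    return left->val + right->val;
}

Obj make_int(long v)   { Obj o = { T_INT, (double)v, int_add }; return o; }
Obj make_num(double v) { Obj o = { T_NUM, v, num_add }; return o; }

/* Left-side-wins: only the left operand's type is consulted. */
double left_wins_add(const Obj *left, const Obj *right) {
    return left->add(left, right);
}

/* MMD: both types pick the function; empty entries stay NULL. */
static const add_fn add_table[T_COUNT][T_COUNT] = {
    [T_INT][T_NUM] = num_add,
};

/* Look up on both types, defaulting to left-side-wins when nothing is registered. */
double mmd_add(const Obj *left, const Obj *right) {
    add_fn fn = add_table[left->type][right->type];
    if (fn == NULL)
        fn = left->add;
    return fn(left, right);
}
```

In this sketch an integer 2 plus a float 1.5 goes through the Integer's own add under left-side-wins and loses the fraction, while the MMD table can register a better function for that specific pair and leave every other pair on the default.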

The one binary operation I failed to consider was assignment, and it's an important one, and Parrot didn't multiply dispatch it. Big mistake. (One that's fixable with the current Parrot design, interestingly enough, the same way that we could make the binary operations use MMD without affecting the bytecode in any way.)

Now, the whole point of MMD, at least as far as I'm concerned, is to allow you to cheat like hell when you know it's safe. That is, you provide a nice, generic, safe interface that, while potentially a little slow, lets you have those nice black-box data structures we're always told we really want (and never believe until it bites us hard, usually a year or two too late to fix the problem) while still being as fast as it possibly can be in those cases where we know we don't need to be careful.

For example, if we're assigning an Integer to an Integer there's really no need to go jumping through any sorts of hoops -- the Integer class knows what the internal structure of an Integer looks like (we can hope, at least, and worry about the authors of the class if it doesn't) so making function calls to get values is silly. That is, the assignment function for an Integer -> Integer assign should look like:

dest_pmc->cache.intslot = source_pmc->cache.intslot

rather than

dest_pmc->cache.intslot = source_pmc->vtable->get_integer(interpreter, source_pmc)

(modulo wrong off-the-cuff C indirect function call syntax). Of course you can't have the first form as the standard assignment function, since it's wrong for so many things. Indeed, the only thing you can really do is have the standard assignment look like:

dest_pmc->vtable->set_pmc(interpreter, dest_pmc, source_pmc)

with the set_pmc function for the destination then:

dest_pmc->cache.intslot = source_pmc->vtable->get_integer(interpreter, source_pmc)

Which is definitely sub-optimal in specific cases, while OK in the more generic case. We could, of course, throw a flag test in the assignment function for the Integer class, but we know from experience that flag tests are more expensive than they're generally worth, are an inextensible pain since they lead to if ladders at the top of functions, and are an indication that we should be doing MMD anyway.
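Here is a rough sketch of what multiply dispatched assignment could look like, with the direct slot copy registered for Integer <- Integer and the destination's generic set_pmc as the default for everything else. This is not Parrot source: the struct layout, the `TYPE_*` ids, `assign_table`, `mmd_assign`, and the `make_*` helpers are all assumptions for illustration, and the interpreter argument is dropped to keep it short.

```c
#include <stddef.h>

/* Hypothetical type ids; a real system would have many more. */
enum { TYPE_INTEGER, TYPE_FLOAT, TYPE_COUNT };

typedef struct PMC PMC;

typedef struct VTable {
    long   (*get_integer)(PMC *self);
    double (*get_number)(PMC *self);
    void   (*set_pmc)(PMC *dest, PMC *src);   /* generic, safe assignment */
} VTable;

struct PMC {
    int type;
    const VTable *vtable;
    union { long intslot; double numslot; } cache;
};

static long   integer_get_integer(PMC *self) { return self->cache.intslot; }
static double integer_get_number(PMC *self)  { return (double)self->cache.intslot; }
static long   float_get_integer(PMC *self)   { return (long)self->cache.numslot; }
static double float_get_number(PMC *self)    { return self->cache.numslot; }

/* Generic paths: ask the source for its value through its vtable. */
static void integer_set_pmc(PMC *dest, PMC *src) {
    dest->cache.intslot = src->vtable->get_integer(src);
}
static void float_set_pmc(PMC *dest, PMC *src) {
    dest->cache.numslot = src->vtable->get_number(src);
}

static const VTable integer_vtable = { integer_get_integer, integer_get_number, integer_set_pmc };
static const VTable float_vtable   = { float_get_integer, float_get_number, float_set_pmc };

PMC make_integer(long v) {
    PMC p; p.type = TYPE_INTEGER; p.vtable = &integer_vtable; p.cache.intslot = v; return p;
}
PMC make_float(double v) {
    PMC p; p.type = TYPE_FLOAT; p.vtable = &float_vtable; p.cache.numslot = v; return p;
}

/* Fast path: both sides are known to be Integer, so copy the slot directly. */
static void assign_integer_integer(PMC *dest, PMC *src) {
    dest->cache.intslot = src->cache.intslot;
}

typedef void (*assign_fn)(PMC *dest, PMC *src);

/* Only the pairs where cheating is safe get an entry; the rest stay NULL. */
static const assign_fn assign_table[TYPE_COUNT][TYPE_COUNT] = {
    [TYPE_INTEGER][TYPE_INTEGER] = assign_integer_integer,
};

/* Dispatch on (dest type, source type); default to the destination's set_pmc. */
void mmd_assign(PMC *dest, PMC *src) {
    assign_fn fn = assign_table[dest->type][src->type];
    if (fn == NULL)
        fn = dest->vtable->set_pmc;
    fn(dest, src);
}
```

The point of the table is that the type check happens once, in the lookup, rather than as a flag test at the top of every set_pmc. Adding another safe shortcut is a new table entry, not another rung on an if ladder.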

There you go. Straight assignment should've been MMD, but it wasn't, as much for hysterical raisins and evolving understanding as anything else, and it could be made multiply dispatched if

That then leads to the question 'should the full binary operation be MMD on the destination?' That's a valid question, since Parrot requires (or did require) that the destination for a binary operation exist. There's been much bitching about that on the list, but it's a pretty significant win in terms of temporary objects created (or, rather, not created) and going on about that's a topic for another WWIT anyway, so I'll stop with the explanation there.

So. Should "a = b + c" dispatch on just the types of b and c for the addition, and then on the result type and a for the assignment, or should there be just one big dispatch on the types of a, b, and c?

That's a good question. I dunno what the right answer is. Or, rather, both Yes and No are perfectly fine answers, and the best one is a matter of figuring out which would be the common usage and going with that.

My gut feel is that generally it's not a win, so the two-step dispatch is better. On the other hand, a good case could be made for doing the three-arg dispatch. Pleasantly, since as far as the bytecode is concerned It's All Magic Anyway, either one could be chosen and later on the other could be switched in.
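The two options can sit behind the same entry point, which is roughly what makes them swappable. A minimal sketch, assuming a toy `Value` type with two kinds and a destination that keeps its own type on assignment; every name here (`add_two_step`, `add_three_arg`, the tables) is hypothetical and not taken from Parrot:

```c
#include <stddef.h>

enum { T_INT, T_NUM, T_COUNT };   /* hypothetical type ids */

typedef struct {
    int type;
    union { long i; double n; } u;
} Value;

Value int_value(long v)   { Value r; r.type = T_INT; r.u.i = v; return r; }
Value num_value(double v) { Value r; r.type = T_NUM; r.u.n = v; return r; }

static double to_num(const Value *v) {
    return v->type == T_INT ? (double)v->u.i : v->u.n;
}

/* Step 1 of the two-step form: dispatch the addition on (b, c). */
typedef Value (*add_fn)(const Value *b, const Value *c);
static Value add_int_int(const Value *b, const Value *c) {
    return int_value(b->u.i + c->u.i);
}
static Value add_as_num(const Value *b, const Value *c) {
    return num_value(to_num(b) + to_num(c));
}
static const add_fn add_table[T_COUNT][T_COUNT] = {
    { add_int_int, add_as_num },
    { add_as_num,  add_as_num },
};

/* Step 2: dispatch the assignment on (a, result type). */
typedef void (*assign_fn)(Value *a, const Value *r);
static void assign_int_int(Value *a, const Value *r) { a->u.i = r->u.i; }
static void assign_int_num(Value *a, const Value *r) { a->u.i = (long)r->u.n; }
static void assign_num_any(Value *a, const Value *r) { a->u.n = to_num(r); }
static const assign_fn assign_table[T_COUNT][T_COUNT] = {
    { assign_int_int, assign_int_num },
    { assign_num_any, assign_num_any },
};

void add_two_step(Value *a, const Value *b, const Value *c) {
    Value r = add_table[b->type][c->type](b, c);
    assign_table[a->type][r.type](a, &r);
}

/* Three-arg form: one lookup on (a, b, c), no intermediate result. */
typedef void (*add3_fn)(Value *a, const Value *b, const Value *c);
static void add3_int_int_int(Value *a, const Value *b, const Value *c) {
    a->u.i = b->u.i + c->u.i;
}
static const add3_fn add3_table[T_COUNT][T_COUNT][T_COUNT] = {
    [T_INT][T_INT][T_INT] = add3_int_int_int,
};

/* Use a three-way entry if one exists, otherwise fall back to two steps. */
void add_three_arg(Value *a, const Value *b, const Value *c) {
    add3_fn fn = add3_table[a->type][b->type][c->type];
    if (fn != NULL)
        fn(a, b, c);
    else
        add_two_step(a, b, c);
}
```

The cost shows up in the table sizes: the two-step form needs two tables of N x N entries, the three-arg form one of N x N x N, which is part of why the answer depends on how often the three-way specializations would actually be used.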

Who knows, if Chip wants it, this could go into parrot now. Certainly's going to in my tree.

Posted by Dan at July 9, 2005 09:18 AM
Comments

Certainly's going to in my tree.

That's forking talk pardner!

Interesting to drop by every few weeks, read some great ideas and see your attitude (apparently) change. Reading SOTP just after you left I never would have thought you'd be talking about your tree.

Posted by: Fergal Daly at July 25, 2005 07:04 PM

That's forking talk pardner!

Why yes, yes it is. :)

Once the dust settled and some time passed, it seemed really stupid to abandon everything I'd done, so I snagged a copy of the tree and will slowly add in the things I want, and see where it goes.

Posted by: Dan at July 27, 2005 01:33 PM