May 27, 2003

What the heck is: Multimethod dispatch (MMD)

AKA Signature-based dispatch, and a number of other names.

This is a way for your program to have multiple subroutines, functions, or methods with the same name that differ only in the types of their signatures. That means you can have a subroutine foo that takes an integer, one that takes a string, and a third that takes a pointer to a structure of type bar. Same name, different parameter types. The compiler or your runtime then figures out which one to use. With statically typed languages it can be done at compile time, with dynamically typed languages it's done at runtime.

In fact, your code might look like:

sub foo(int a) {
print "An integer!";
}

sub foo(float a) {
print "A float!";
}

sub foo(string a) {
print "A string!";
}

What happens is that, when your code calls foo(), something (either the compiler or the runtime) takes a look at its parameter and decides which of your foo subs should be called. If the parameter's an int, the first foo is called. If it's a float, the second function is called, and if it's a string, the third function is called. This can be something of a disconcerting thing, if you're not used to it.

Now, before we go any further it's important to note that almost all languages that let you do this either require some sort of notation on the sub or method declaration, or have multimethod dispatch as a very fundamental construct, so either you've asked for it or you're expecting it. It's not, generally, something that springs itself on you. (If you were worried about accidentally doing this)

Why would you do this? Well...

If you've worked with a dynamically typed language, you've probably written code that looks something like:

method foo (object bar) {
if (bar.isa("integer")) {}
elsif (bar.isa("string")) {}
elsif (bar.isa("Thing")) {}
elsif (bar.isa("Widget") {}
}

where your method takes an object as a parameter, and at runtime you figure out what type of object it is. If you've done that, you've undoubtedly messed it up once or twice, and you've probably also used other people's code that does this and you've wanted to add a path for your object type, but couldn't.

If you've done operator overloading, you've probably also done MMD. Operator overloading (really a subject of another WTHI) is more or less when you can define how some basic operator such as addition or multiplication works for your class. Whenever you have this sort of overloading you're pretty much forced to do some sort of MMD if you want to Do The Right Thing, and that means checking the type of the other operand. Multiplication with an integer on the left and a matrix on the right does something very different than multiplication with a matrix on both sides. Or it does if you implement any sort of standard multiplication. Languages that do overloading without doing MMD make writing proper overloaded operators a pain.

Even in a language that has no concept of multimethod dispatch, it's still there, at least a little. Consider the following snippet of C:

c = a + b;

No biggie, right? Nothing multimethod there. Well... Try it. The C compiler will emit different code depending on the types of a and b. If they're both integers you'll get one set, if one of them is a float you'll get another, and if they're both floats you'll get a third. Multimethod dispatch, based on type. You just can't get to it in C.

One big benefit to multimethods is that they are generally (though not always) considered to live outside any namespace or object package, so you can add in your own foo method to any class, and it Just Works. Or, more to the point, you can add your "multiply(int, matrix)" function to the mix without having to thump the int class. (if you even can) One other benefit, in an OO language, is that generally the compiler or runtime will do the best it can. If you have a foo method that takes two strings, and instead call it with two objects that are subclasses of strings, the compiler (or runtime) will walk the inheritance tree of the arguments looking for the closest match, for some version of closest, and call that. If one can't be found, you get an error.

Now that you know what they are, there's the why question. Why do this?

Well, efficiency is one reason. You can have different versions of a routine that are optimized for the different types of the parameters. As in the C compiler case, you may want to emit different code depending on the operand types, to go as fast as possible.

Speed of dispatch is another--odds are the MMD code in the compiler or runtime is very well optimized for speed, and likely will outstrip anything you can do by hand. Even if your own code is faster, the next rev of the compiler or runtime may beat it, which is how these things often go.

Bug reduction is another. It's a lot less error prone to have one chunk of code, provided in the compiler or runtime, to figure out what routine to call based on the arguments than it is to have each of a thousand subs and methods hand-roll the code at their start. It's dull, boring code and very easy to get wrong.

Correctness is yet another reason. If you want your language to behave properly in the face of operator overloading (and you might not, depending on how you feel about overloading) you have to allow for very different behaviour based on the argument types. Matrix * Matrix does something very different than Matrix * Integer. (And different still from Integer * Matrix, but that's another story) This isn't even any sort of "I'm doing something wacky with operators!" it's just correctness.

Since multimethods generally live outside any class it makes doing proper operator overload extensions easier as well. While I know it "breaks encapsulation to inject a method into another class" you really do have to in many cases to get it right. Consider the matrix example we've been batting around. Integer * Matrix is an important case, but most operator overloading systems check the left-hand side to see what they should do. Since the Int class probably came standard, it undoubtedly has no idea what a matrix is, or what to do with it. Your Matrix class, on the other hand, does know what to do, so having the int*matrix version in the matrix class makes sense. It also means that when you create your own Foo class, which also knows about matrices (though matrix has no idea about Foo) you can have it provide code to handle the matrix case. While you can't handle all eventualities, with MMD you can at least handle more.

MMD. Interesting, useful, and in many cases what you're doing already, only in a less sucky way.

Posted by Dan at May 27, 2003 06:13 PM | TrackBack (0)
Comments

Dan -

I've been enjoying your WTFI columns, especially the issue on continuations awhile back -- keep it up! One suggestion on this one: your "method foo" example doesn't really distinguish between methods and multimethods. Sure, in some sense it's a difference of degree, but I think it's worth highlighting. First, while most people have worked in a language where they can do the former, relatively few have worked with full multimethod support. Second, the problem of finding the most appropriate method becomes vastly harder when dispatch considers more than the first parameter.

You might also mention "double dispatch", which you can use in C++ to effectively get multimethods: Call a virtual method on the first operand, which turns around and calls the appropriate method on the second.

/s

Posted by: Sean O. at May 27, 2003 06:45 PM

Good point. I'm going to let this sit for a bit, then go through and give it an edit. I'm not as happy with it as I might like.

Posted by: Dan at May 27, 2003 06:52 PM

Although multiplication is an interesting example for MMD, it's one of those ivory tower examples that mean something academics, but don't have a lot of resonance with "industrial programmers". Sort of like the Fibonacci sequence and recursion; when was the last time *you* needed to calculate F(20)?

What's a mundane example of MMD? So far, I haven't read much (elsewhere) to convince me that MMD is worthwhile unless I need to warp the language to get * and + to work on new object types.

In C++/Java, new() would qualify, since there are dozens of ways to instantiate an object based on initial conditions. But I don't see that kind of behavior in Perl because the fundemental types are richer (or, rather, there's less of a need to accomodate a static type system).

Posted by: ziggy at May 27, 2003 09:50 PM

Another example people like to use is shape intersection tests, where how you do it depends on the types of both potential intersectees. A perlier example is symmetric comparison -- without multimethods,
3 OP "3"
and
"3" OP 3
will mean different things.

/s

Posted by: Sean O. at May 27, 2003 11:09 PM

Another more industrial example of MMD is user interface. If you dispatch on both the event type and the object recieving the event, then your text pane does not have to go through the

if(leftClick)
elif(rightClick)
elif(escapeKey)

stuff...

Posted by: Boots at May 28, 2003 03:37 AM

Ziggy,

I'm begging for MMD for perl in my current mundane project. A perl 5 "Table Model" module. It's meant to model the type of table you'd eventually output as HTML,XHTML or PS.

The code to allow a user use any of the following variations of adding a row to a table:

# $table = Data::Table::Model->new();
#
# $table->add_row( $arrayref )
# $table->add_row( $arrayref,[$idx1, $idx2, $idx3 ] )
# $table->add_row( $obj,    [$meth1, $meth2,$meth3 ] )
# $table->add_row( $hashref,[$key1, $key2, $key3 ] )
# $table->add_row( $coderef )
# $table->add_row( $coderef,[$arg1, $arg2, $arg3 ] ) #aref is argument to coderef
# $table->add_row( $coderef, {key=>val, key1=>val1,key2=>val2})#href is argument to coderef

Let alone the variations for add_rows, add_col and add_cols.

I'm doing all this work so it's dead simple to create and use a table object from data sources you're likely to use like DBI or Text::CSV_XS.

It was a lot of work to find a strategy for doing this and to keep it maintainable and testable version. My solution basically involves
add_row using gotos to dispatch to the various internal methods after doing ref or isa on the input.

I'm really looking forward to the cleaner code perl6 should be able to afford me in this area

Dan, thanks for your work on Parrot and WTHI!

Posted by: clscott at May 28, 2003 11:00 AM

In p6 will multimethods have access to the private attributes/methods of the the invocants?

Posted by: John at May 28, 2003 12:38 PM

IIRC, the answer on private access is 'depends'. A method has access to the private attributes of the class in which it is declared. So if you declare a dozen foo methods, each with different signatures, inside the Bar class, they'd have access to the class attributes and private methods of Bar. They might have access to the Bar instance variables (though I'm not sure about that--you might only have access if the invocant is in the same class as the method is declared in), and they'll definitely not have access to the private attributes of objects from other classes, no matter where they are.

More succinctly, you don't have access to the invocant's private bits just because it's the invocant--you'll need to be in the invocant's class, too.

Of course, this is all subject to change, as Larry's not finished the object spec yet. And you'll be able to cheat and get around the rules if need be, of course... ;)

Posted by: Dan at May 28, 2003 02:31 PM

As I understand it you've just described what is usually refered to as overloading (and the same concept if often mis-taught as polymorphism!).

Multi-method dispatch is related to the concept of overloading, but it is a special case. It is where the type of the object that the method is being called on also determines the method to be called, as in CLOS.

So your example of addition where an overloaded + operator, used in a+b, would result in a.add(b) which requires mutlimethod dispatch on the type of a as well as b, but your first example is just overloading.

Posted by: Mark at June 4, 2003 02:05 PM

I put up a followup to this at archives/000198.html with more detail, but operator overloading is a subset of MMD, rather than a superset. (I'm a perl guy--I'm used to the invocant of a method being considered as one of the parameters, and dispatchable as such, so my examples tend to reflect that)

Posted by: Dan at June 4, 2003 02:22 PM

Ahh yes, you're a perl guy :-) I rarely use Perl and I tend to think in Java. I only really meant to mention that it was also called overloading (or ad-hoc polymorphism), but I wanted to qualify which bits were overloading.

But you're thinking Perl world, so all your examples were MMD, not overloading... doh!

Cheers

Posted by: Mark at June 4, 2003 04:40 PM

Well, as I said, overloading is a subset of MMD, generally a limited one since most languages that just support overloading don't seem to allow you to inject methods into other classes. (Which is necessary to properly handle non-commutative operations, if you want your magic widget which lives on the RHS of the operator to be dealt with properly)

Anyway, operator overloading is a subset of regular method overloading, as the overloaded operators are just methods with funky names. MMD, since it works on sub/function/procedure dispatch as well as method dispatch, is a superset of method overloading.

Posted by: Dan at June 4, 2003 05:40 PM