June 11, 2005

WWIT: All those opcodes

This is one that comes up with some frequency -- why the hell does Parrot have all those opcodes. It's wasteful!

Bullshit. What it is is fast.

Opcode functions have two points. The first it to provide basic functionality. The second is to provide a fast function call interface. We'll take the second point first.

Parrot's got three official ways to call a function. The first is with the basic parrot function call. You fill in the registers with your parameters, set the parameter counts, and invoke the sub PMC. Not slow, but not horribly fast either. People tend to gripe about that, but it is, bluntly, the lowest-overhead general purpose solution that we could get. Perl puts a lot of requirements on function calls, as do the other dynamic languages, and providing that information takes a little work. That's fine, it's not that big a deal. Languages also are under no obligation to respect the calling conventions for any function or subroutine that's not exposed as a parrot-callable function. That is, if you're writing a Java compiler, say, and don't like parrot's overhead then... don't respect our calling conventions. Or, more reasonably, internally use your own, whichever's best, and then provide versions with shim wrappers that do respect the conventions for other languages to call.

The second way to call is as a method. That's got all the overhead of a function call plus the search for the actual thing being called. This is not a problem, that's what you want with method calls -- it's what you asked for, after all. Given the dynamism inherent in perl, python, ruby and friends, there's no way around it.

The unofficial way is with the bsr/ret pair. Only suitable for internal functions, it's still quite fast and damned useful. There's no reason your compilers shouldn't use this internally where needed -- it's handy.

The fourth way is the opcode function. This is the absolute lowest-overhead way to invoke a function. Unfortunately those functions (right now at least) have to be written in C, but that's just fine. The important thing here is that they are very, very fast to invoke, since there's essentially no overhead. No putting things in registers, no setting up calling conventions, nothing -- they're just invoked.

In fact, one thing that was planned was that modules could provide things that looked like functions but were actually exported ops (using the loadable opcode library system). That is, your code did a foo(bar), but the compiler knew that foo was an op and emitted:

foo P16

or wherever the bar happened to be. Moreover, module authors could slowly migrate their code from an HLL, to C code via NCI, to opcode functions. In some cases compilers could actually generate opcode functions from the source, though that does require the compiler in question to be able to generate C code. (But, conveniently, parrot can generate C code from bytecode...) When you think about it, and you really should, there's no difference between an opcode function and a regular HLL function with a compile-time fixed signature (except for the difficulty in generating them from bytecode).

Just to be real clear, since it's important, opcodes are just library functions with very low call overhead. That's it. Nothing fancier than that. They're not massively special internal anything. They're just functions that are really cheap to call. Cutting down the number of opcode functions is not sensible -- it's foolish. Any library function that could reasonably have a fixed number of arguments and not need the calling conventions (and not need to be overridden) should be an opcode function.

Concentrate on the function part. Not the opcode part.

Posted by Dan at June 11, 2005 06:43 PM | TrackBack (0)
Comments