June 16, 2005

WCB: Loadable opcode libraries

Extensibility (to an extreme, perhaps) had always been one of the design goals of Parrot. This was on purpose -- if we learned nothing from history it's that people will take whatever you've got, break out the mutagens and gamma ray projectors, and have at it, because there's just no way you can anticipate everyone's needs in the future. So, rather than try and do that (we just looked at their needs in the present) we left a bunch of really big "WEDGE CLEVER THINGS IN HERE" spots into parrot.

Loadable opcode libraries are one of those spots.

A loadable opcode library is, basically, a library of opcode functions which are not built into parrot. The intention was that you could have a bunch of these sitting on disk as part of parrot's library, and load them on demand. (Either explicitly with code or, more likely, have parrot automatically load them for you based on metadata in the bytecode files) This dovetails nicely with the view that most of the opcode functions are just library functions with fast-path calling conventions. It also makes it possible to keep parrot's in-memory footprint as small as possible -- if you don't need the math or networking libraries, for example, you won't load them in, and don't pay the startup cost for them. (And yes, even if they're already in memory for other processes, there is a cost associated with loading them into your process)

What would you use them for?

Well, there were three big use cases.

First, there's the 'ancillary opcode/runtime library' case. The transcendental math ops were in this case -- they'd look like they were in the base set, but they'd only get loaded in if your bytecode used them, otherwise they wouldn't.

Second was the 'extras for languages/alternate bytecode loaders' case. That is, if you were a language compiler writer and you found that there were some operations that you needed that were essentially fundamental you could package them up into an opcode library and make sure code you emitted loaded the library up. (Again, probably using the metadata embedded in the bytecode files) This does require that the libraries be available to whoever gets your bytecode files, but that's not really a big deal -- this isn't going to be too common, and I don't think it's particularly onerous to have to install the, say, Prolog runtime libraries to run programs that have Prolog components. The same thing goes if you're writing an alternate bytecode loader -- it may well be a lot easier for the JVM/.NET/Z-machine bytecode loader to have a full library of JVM/.NET/Z-machine ops in an op library and just use those instead of recompiling the from the source bytecode to parrot's bytecode.

Third was the fast extension function case. This is one where your extension module explicitly declared that a number of its functions were actually opcode functions rather than traditional parrot functions. This was supposed to be fully supported. It wasn't supposed to be the general case, of course, since in general there's too much uncertainty around perl / python / ruby programs to do this as a regular optimization, but if you explicitly declare that a function is fixed at load time and can't be changed, well... that's OK.

Additionally, and very importantly, the list of opcodes was supposed to be per-sub. That is, rather than having one big table of opcode functions, mapping opcode number to function pointers, each subroutine would have its own table. (A table that might be shared, of course -- there's no point in having separate tables if they're all the same) This is a requirement for precompiled bytecode libraries to work, since it'd be really bad if you had a global opcode table but two separate bytecode libraries that each used separate extra opcode function libraries that mapped to the same opcode numbers. (That could be avoided by rewriting the bytecode when we load it, which we don't want to do, or by having a global opfunc registry, which we don't want either. Doing it this way is safest and easiest overall)

Posted by Dan at June 16, 2005 08:40 PM | TrackBack (0)
Comments