Squawks of the Parrot: WWIT: Universal bytecode

July 06, 2005

WWIT: Universal bytecode

From the very start, we declared that we'd have a mostly-universal bytecode format. That is, assuming you built parrot to use 32-bit integers for opcodes, bytecode you built on one machine would run anywhere. Not necessarily without translation, but parrot would provide that translation automatically.
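To make "translation" concrete, here's a minimal sketch of the simplest case: a stream of 32-bit opcodes written on a machine with one byte order and read on a machine with another. This is not parrot's actual bytecode file format or loader. The function names, and the assumption that the file records the byte order it was written in, are made up for illustration.

```python
import struct

def decode_opcodes(raw: bytes, source_order: str) -> list[int]:
    """Decode unsigned 32-bit opcodes written in source_order ('little' or 'big').

    Hypothetical helper: assumes the bytecode file tells us its byte order.
    """
    if source_order not in ("little", "big"):
        raise ValueError("source_order must be 'little' or 'big'")
    if len(raw) % 4 != 0:
        raise ValueError("opcode stream must be a multiple of 4 bytes")
    prefix = "<" if source_order == "little" else ">"
    count = len(raw) // 4
    return list(struct.unpack(f"{prefix}{count}I", raw))

def to_native(raw: bytes, source_order: str) -> bytes:
    """Re-encode the opcode stream in the byte order of the machine running this."""
    ops = decode_opcodes(raw, source_order)
    return struct.pack(f"={len(ops)}I", *ops)
```

Reading a file that's already in native order comes out byte-for-byte unchanged, which is the "not necessarily without translation" part: you only pay for the swap when the platforms actually differ.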

Why? Simple. Binary distributions and multi-platform shared installs.

Now, before you get your freak on over the lack of source, a traditional perl community hot-button issue, remember that parrot's a multi-language interpreter, which means you might not have the compiler for the language in question. Just because you've got parrot with the perl and python compiler modules doesn't mean you've got the ruby, prolog, C#, BASIC, and INTERCAL modules installed, so you're kind of out of luck there even if you do have the source. (And the source can be embedded in the bytecode as metadata.)

There are also times when binary installs are better, even with completely internal distributions -- you don't have to worry nearly so much that Joe from Accounting will use his "mad programming skillz" to helpfully fix the bugs in the app you're deploying. (You know those bugs -- the ones keeping Joe from trashing the database and destroying everything done since the last backup.)

And, of course, combined with a linker, being able to do universal bytecode means you can link your program into one big file with all the bytecode for all the libraries built in and distribute it so people only need a base parrot install and nothing else. (Or you can then run it through the bytecode->executable converter to get a single-file executable.) You are, of course, completely responsible for the social implications of that (including the legal bits). We only do the technical bits; social things are your problem.

The multi-platform shared install is important as well, though just how important is something you don't tend to notice until you've had to manage a shared install of some application that's used on multiple operating systems and hardware platforms. (Though this is becoming less common as the various hardware platforms and operating systems die out.) That is, you've got a shared app install on some NFS-mounted volume somewhere and all the systems on the network use the shared install.

Now, of course, for this to work you need to build your main system (which in this case would be parrot) on all the different platforms, which is a pain. It also means that all binaries need to be built on all platforms, which is a really big pain when upgrade time comes if you've got modules that have binaries.

Parrot's NCI system should make modules with a C component less common, but it's still handy to do compilation to bytecode, and universal bytecode means you get to do this once and deploy it portably. This is useful in cases where compilation from source is slow (if, say, you've got a language with a strong optimizer, a slow compiler, or a compiler that triggers degenerate behaviour in parrot somewhere) or where the compiler module itself is platform dependent but the output bytecode isn't.

Anyway, the more you can share across platforms without having to do anything at all special, the easier it is to pass things around and make everyone's life easier. That's a good thing, so far as I'm concerned.

Posted by Dan at July 6, 2005 02:53 PM

Comments

Dan:

One thing I actually wondered the other day wrt this: what about PMCs? You mentioned Parrot's NCI system briefly (which I think includes PMCs), but that didn't really cover my question.

As I understand it, languages *should* be implemented using dynclasses, which means that you have to find a way to have the PMCs installed. Or is this not the case? Should compiler writers try to write all their classes in PIR?

Thanks.

Posted by: Matt Diephouse at July 6, 2005 03:31 PM

Well, for a lot of languages they won't need custom PMCs -- the basic parrot set should be enough in many cases. For those cases where it isn't enough you hopefully implement your data types in terms of parrot classes and objects. If that won't work then you go with custom PMCs written in C.

I was hoping we'd see hybrid PMCs -- that is, language specific PMCs that were fully implemented both in pure bytecode and in C. You'd use the C version if you could, otherwise you'd use the bytecode version, much the same way that there are a number of perl modules that are either pure perl or a mix of perl and XS, depending on which you use.

There probably ought to be facilities in parrot itself, or in its library loading and management code, to handle this. (Likely in the library management code, as I'm not sure it's something that the core engine needs to deal with.)
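The "use the C version if you can, otherwise the bytecode version" idea could look something like the sketch below. It's written in Python only to show the shape of the fallback. Nothing here is a real parrot facility: `pick_implementation` and the two loader callables are hypothetical, and a real library manager would have its own notion of what "the C version isn't available" means.

```python
from typing import Any, Callable, Tuple

def pick_implementation(
    native_loader: Callable[[], Any],
    bytecode_loader: Callable[[], Any],
) -> Tuple[str, Any]:
    """Prefer the native (C) implementation; fall back to pure bytecode.

    Both loaders are hypothetical callables supplied by the caller. The
    native one is assumed to raise ImportError or OSError when no build
    exists for this platform.
    """
    try:
        return "native", native_loader()
    except (ImportError, OSError):
        # No usable C build here, so use the portable version instead.
        return "bytecode", bytecode_loader()
```

The point of returning which version got picked is that the caller can log it; the two implementations are supposed to behave the same, so nothing else should need to care.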

Posted by: Dan at July 6, 2005 03:44 PM