June 10, 2005

WWIT: Generating executables

One of the things parrot is supposed to be able to do, and currently does (albeit with on-and-off breakage), is generate standalone executables for parrot programs. That is, you feed in source or bytecode and you get back something you can run the linker against to give you a standalone executable, something you can run without having parrot installed anywhere.
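A rough sketch of what that workflow looks like from the command line (the exact tool names shifted between Parrot releases; `pbc_to_exe` is the name the bytecode-to-executable utility eventually shipped under, so treat these invocations as illustrative rather than gospel):

```shell
# Compile PIR source down to Parrot bytecode.
parrot -o hello.pbc hello.pir

# Turn the bytecode into an object file and link it against the
# Parrot runtime, producing a standalone binary. Later Parrot
# releases bundled this step as the pbc_to_exe utility.
pbc_to_exe hello.pbc

# The result runs on a machine with no parrot installed at all.
./hello
```

The point of the second step is that the parrot runtime gets statically linked into the output, which is exactly where the size and upgrade tradeoffs discussed below come from.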

Interestingly, this is a somewhat controversial decision. I'll get to that in a minute.

The upsides to doing this are several:

  1. Distribution is easier
  2. No versioning problems
  3. Execution's faster
  4. Fewer resources used in multiuser situations

And of course the downsides:

  1. You can get a lot of big executables with a lot of overlap
  2. Some of the dynamic features (on the fly compilation, for example) are somewhat problematic
  3. Bugfix upgrades don't happen easily

Now, it's very important to keep in mind that generating executables is not a universal solution. That is, there are times when it's the right thing to do, times when it's the wrong thing to do, and times when it's kind of a wash.

Building executables has never been a general-purpose solution; in most cases the correct thing to do is either to run a program from the source or to run it from the compiled bytecode (and there are pluses and minuses to each of those). However...

The problem with all the 'scripting' languages is packaging and distribution. Not just in the commercial sense, which is what a lot of people think of (and which causes a lot of the knee-jerk reactions against it, I think), but in the general sense. If I have a program I want to distribute to multiple users, it's a big pain to make sure that everything my program needs is available, especially if I've followed reasonably good programming practice and actually split my code up into different files. In that case I have to distribute all the source to my program, plus a list of modules that the destination system has to have installed, along with their prerequisites, and possibly the correct version (or any version at all) of the driving language.

This isn't just a problem for people looking to distribute programs written in perl commercially, or looking to distribute them in a freeware/shareware setting. It happens institutionally, a lot. You may have ten, or a hundred, or ten thousand desktops that you need to distribute a program out to. The logistics of making sure everything is correct on all those desktops is a massive pain in the ass, made even more complex by the possibility that you've got multiple conflicting requirements across multiple apps. (That is, you've got one app that must have perl 5.6.1, another that has to have perl 5.8.x but not 5.8.0, a third that requires one particular version of GD, and a fourth that doesn't care but has been tested and certified with one particular set of modules and you can't, either by corporate policy or industry regulation, use anything else.)

That's when the whole "just install [perl|python|ruby] and the requisite modules" scheme really sucks. A lot. Pushing out large distributions with lots of files is a big pain, and pushing out several copies is even worse. Then there's the issue of upgrading all those desktops without actually breaking things. Ick.

This is where building standalone executables is a big win. Yeah, the resulting file may be 15M, but it's entirely self-contained. No worries that upgrading some random module will break things, no need to push out distributions with a half-zillion files, and if you want to hand your app to Aunt Tillie (or random Windows or Mac users) you've got just a single file. No muss, no fuss, no worries.

Yes, it does mean that end users can't upgrade individual modules to get bugfixes. Yes, it does mean the executables are big. Yes, it does mean there may be licensing issues. Yes, it does mean that pulling the source out may be problematic. Those are all reasons it's not a good universal solution, not a reason to not provide the facility for times it is. (That people have felt the need to roll their own distribution mechanisms to address this problem in the current incarnations of the languages is an indication that it is a real problem that needs addressing)

Like many other problems for which there were multiple competing implementations (like, say, events), Parrot provides a solution as part of the base system so folks can use their time reinventing other wheels more productively.

Posted by Dan at June 10, 2005 12:27 PM | TrackBack (0)

Unless, of course, you have Debian installed on all of those systems :-)

Posted by: anonymous at June 10, 2005 05:00 PM

Nope, that doesn't come anywhere close to cutting it, and neither does "unless, of course, you have OS X installed on all those systems". Debian actually makes it worse -- for these issues you want to actively avoid using a system-level packaged solution, since it usually makes things much more of a pain to deal with, being just one more thing you need to avoid.

Posted by: Dan at June 10, 2005 05:24 PM

Why do the executables have to be big?

Posted by: Toby at June 10, 2005 10:05 PM

The executables are big because they link in all of the parrot runtime library, which for me is 3M stripped (and 10M unstripped) plus whatever the bytecode compiles down to. This is because the code'll end up using pretty much all of parrot's facilities, so there's not much to leave out. (And because many linkers are pretty stupid and will yank in entire modules if a single routine from them is used)

This isn't too surprising, since runtime libraries are normally reasonably large. The C runtime library on my machine, for example, is 4M all by itself. The difference is that you can dynamically link to the C RTL and not include it, since it's heavily tested, very standard, hardly ever changes, and is universal; since parrot's runtime is none of those things yet, you need to statically link the parrot RTL in.

OTOH, disks are less than a dollar a gigabyte these days, so it's not a huge problem, and at 3M parrot takes the same space as 3 minutes of a good-quality MP3.

Posted by: Dan at June 11, 2005 11:31 AM

At my work we deployed AFS about 10 years ago to address this sort of thing. It's basically a larger hammer and solves the problem by letting you install many versions of the same damn thing (we've got about 8 versions of perl, from 4 on up, some duplicating versions with differing modules available). While disk is cheap, that many apps can add up and it's nice that they can be used on demand.

Granted, AFS introduces a new set of problems and quirks to deal with, but I don't think this is a problem that can be served perfectly by one solution. =)

Posted by: Ducky at June 12, 2005 01:38 AM

If I see one more person tell me to "just install x, y, and z" I'm going to start throwing things. It just ain't that easy outside the lab. I fought with this all through the last four years on the job, and last time I looked, I think I lost that fight.

Posted by: adamsj at June 12, 2005 09:53 AM