October 23, 2003

Parrot Forth

Or, "Fifth", I suppose, since it's not quite Forth. As I said earlier, I'm working on Parrot's Forth implementation. It was originally written by Jeff Goff, and the core (what I consider most of the tough bits) is done and has been working for ages; it's just that nobody noticed. The plan (subject to change if it doesn't pan out) is to use Parrot as the core engine for the language the big Work App uses. (The current engine is old and has a number of issues, not the least of which are a really primitive syntax and a simple integrated ISAM database with limits that we're hitting every week or three.)

This project's actually coming along nicely--I've a compiler of sorts for the language that'll translate it into perl code, and we'll use that as a fallback plan if need be--and I should be able to start emitting PIR code (Parrot's intermediate representation, the stuff we feed into IMCC) with about a week's more work. Unfortunately that's not nearly good enough to actually do anything, since most of the interesting functions of this language live in its runtime--specifically its screen and database handling.

I've got code that converts current-format databases over to Postgres databases, complete with triggers and views to preserve the ISAM flavor, and Parrot has interface libraries for both ncurses and Postgres. What I don't have is the library code that'll get between the compiled code and raw ncurses and Postgres, to make sure the semantics of the current language are preserved without having to emit great gobs of assembly for each statement compiled. I could write that library code in assembly, and Parrot assembly is awfully nice as assemblies go, but still... Don't think so.

The sensible thing, then, is to grab a language that compiles to Parrot and use that. I could use the language I'm writing the compiler for but, let's be honest, if it were good enough to write that sort of library code I wouldn't have to be writing a compiler to retarget the damn thing. (Well, OK, I would, as the database part of the code is still running us into walls, but the language makes COBOL look sophisticated.)

Parrot's got a number of partial and full languages that compile to it, but throwing away the gag languages (besides, Befunge doesn't support Parrot's calling conventions) it's down to either a nice compiled Basic or Forth and, for a number of reasons, I chose Forth. It's simple, I like it (I like Basic too, FWIW), and expanding it isn't a big deal for me, unlike with Basic, or at least with our current Basic implementation. (Which is nicely done, thanks to Clint Pierce, but the code requires more thought than my gnat-like attention span can muster at the moment.)

Now, the current Forth, as it stands, is only a partial implementation, with the lack of control flow its biggest issue. I've been throwing new core words into it all day, in between handling the fallout from Parrot's directory restructuring today. It's dead-easy, and with the cool assemble-and-go bits of Parrot (no need to even assemble the .pasm file, just feed it straight into Parrot and you're good) there's not even separate compile and run phases. Can't get much easier than that. I snagged a draft copy of the ANS Forth standard (Draft 6, from 1993, so it's not exactly up to date, but I don't have Starting Forth handy) and have been going for it with some glee.

With it working, at least partially, there comes the urge to tinker. Meddle, if you will, and alter the essential Forth-ness of the implementation. Having a combined int/float/string/PMC stack, rather than separate stacks, is the first big urge. Having strings as atomic data (rather than addresses and counts on the stack) is a second urge. Adding in OO features is the third. (OO Forth, after all, is less bizarre than an OO assembly.) And integrating into Parrot's calling conventions is a fourth. I think... I think I may well do them all.
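
To make the first two urges concrete, here's a rough sketch of a combined stack, written in Python rather than Parrot assembly. Everything in it (the `push` helper and the `word_*` names) is made up for illustration and isn't taken from the actual Parrot Forth source; it only shows ints, floats, and whole strings living as single cells on one stack.

```python
# Hypothetical sketch: one stack whose cells can be ints, floats, or strings.
# None of these names come from Parrot Forth; they are for illustration only.
stack = []

def push(value):
    """Push any value as a single cell, whatever its type."""
    stack.append(value)

def word_add():
    """+ : pop two numbers (int or float) and push their sum."""
    b = stack.pop()
    a = stack.pop()
    push(a + b)

def word_swap():
    """SWAP : exchange the top two cells, regardless of type."""
    b = stack.pop()
    a = stack.pop()
    push(b)
    push(a)

def word_strlen():
    """Replace the string on top of the stack with its length.

    The string is one atomic cell here, not an address/count pair.
    """
    s = stack.pop()
    if not isinstance(s, str):
        raise TypeError("expected a string on top of the stack")
    push(len(s))
```

With this, pushing `2` and `1.5` and calling `word_add()` leaves a single float on the stack, and a pushed string takes up exactly one cell, so `word_swap()` moves it as a unit.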

While I'm at it, I think I may well redo its compilation stage as well. Right now all the words with backing assembly just dispatch right to them, while the user-defined words are kept as a string, as if the user typed them in, and re-parsed and interpreted each time. Which is a clever way to do things that gets you up and running quickly, but as Parrot has the capability to generate bytecode on the fly, well... I think I might build a full-fledged compiler for this stuff. Which should also take very little time, be very compact, and awfully Forth-y. We'll see what tomorrow brings.
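
The difference between the two approaches can be sketched roughly like this, again in Python and with made-up names (`PRIMITIVES`, `user_source`, `interpret`, `compile_word`), none of which come from the real implementation. A list of resolved Python callables stands in for generated bytecode here; it is an assumption for illustration, not how Parrot does it.

```python
# Hypothetical sketch: re-parsing a word's source text on every call
# versus resolving its tokens once, up front.
stack = []

# Words with "backing assembly": dispatched to directly.
PRIMITIVES = {
    "dup": lambda: stack.append(stack[-1]),
    "*": lambda: stack.append(stack.pop() * stack.pop()),
    "+": lambda: stack.append(stack.pop() + stack.pop()),
}

user_source = {}  # name -> source text, re-parsed on every call
compiled = {}     # name -> callable, tokens resolved once

def interpret(text):
    """String approach: split and look up every token, every time."""
    for token in text.split():
        if token in PRIMITIVES:
            PRIMITIVES[token]()
        elif token in user_source:
            interpret(user_source[token])  # re-parse the stored text
        else:
            stack.append(int(token))       # anything else is an integer

def compile_word(name, text):
    """Compiled approach: resolve each token to a callable just once."""
    ops = []
    for token in text.split():
        if token in PRIMITIVES:
            ops.append(PRIMITIVES[token])
        elif token in compiled:
            ops.append(compiled[token])
        else:
            value = int(token)
            ops.append(lambda v=value: stack.append(v))

    def run():
        for op in ops:
            op()

    compiled[name] = run
    return run
```

Setting `user_source["square"] = "dup *"` and running `interpret("7 square")` parses `"dup *"` on every call, while `compile_word("square", "dup *")` does the lookups once and returns something that can be called repeatedly without touching the source text again.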

Posted by Dan at October 23, 2003 08:42 PM


FIFTH is a precision mathematical language in which the data types refer to quantity. The data types range from CC, OUNCE, SHOT, and JIGGER to FIFTH (hence the name of the language), LITER, MAGNUM and BLOTTO. Commands refer to ingredients such as CHABLIS, CHARDONNAY, CABERNET, GIN, VERMOUTH, VODKA, SCOTCH, and WHATEVERSAROUND.

The many versions of the FIFTH language reflect the sophistication and financial status of its users. Commands in the ELITE dialect include VSOP and LAFITE, while commands in the GUTTER dialect include HOOTCH and RIPPLE. The latter is a favorite of frustrated FORTH programmers who end up using this language.

Posted by: Chris at October 24, 2003 10:21 AM

Heh. I'd wondered how long before someone piped up with that. :)

Posted by: Dan at October 24, 2003 10:26 AM

You don't need very much of a standard Forth dictionary to have OO built with Forth. For example, Gforth comes with three object models: object.fs, oof.fs, and mini-oof.fs. Mini-oof.fs is only 12 lines of code.

Once you have CREATE and DOES>, you are pretty much there.

Posted by: Bruce at June 29, 2004 02:39 AM

The big issue with getting OO into the Forth (well, besides me just having the time to do it) is getting it integrated with Parrot's base OO system. To do this 'right' it'd be best if you could make method calls on any object, as well as make objects that can be used as objects by other Parrot code. That's the tricky part.

Posted by: Dan at June 29, 2004 09:26 AM