August 28, 2003

A dirty little secret of CS

Compilers are easy.

No, really, they are.

Not simple, mind, but easy. The two aren't the same thing, though often people conflate the two. (I'm not saying that optimizing compilers are simple, however. They might be, but as I've not written one yet I can't say for sure. The literature makes me think that they aren't, and I'm willing to trust it)

Raw parsing is a pain, but tools like yacc and Parse::RecDescent make that darned trivial for most things. Interpreting the parse tree you get back is, once again, really simple. As far as I can tell (and I'm 95% of the way through doing it for a language that I thought was going to be tricky) doing the transform from the tree to source of some sort, be it C, perl, assembly, or whatever, is also trivial in the majority of the cases, with the only trouble coming in for weird corner cases and source/destination semantic mismatches, but most of that can be dealt with by a bit of thought when generating the parse tree. (A good parse tree is 90% of the battle in handling things on the back end)

The runtime, honestly, is a much bigger hassle.

Posted by Dan at August 28, 2003 07:00 PM | TrackBack (0)
Comments

I agree: a good parse tree is essential for writing a good code generator.
What language are you writing a compiler for?

Posted by: Klaas-Jan at August 29, 2003 03:37 AM

As I understand it you are implementing a compiler for a simple primative language. Do Ada before you call compilers "easy".

Compiler literature has not changed much since the subject ceased to be of academic interest. At that time a "state of the art" computer might have as much as 64K. Compilers are much tougher when memory constraints force one to write them in 5+ passes.

Posted by: AC at August 29, 2003 02:59 PM

Dan, can you give an example of a good parse tree?

Posted by: chromatic at August 29, 2003 03:00 PM

Unless Ada's changed a bunch since I last saw it, I'd still call it easy. Not simple, though--it's a darned big language, with lots of nooks and crannies. The runtime support for Ada is, I expect, pretty darned extraordinary. Luckily I don't have to worry about that for what I'm doing, which is nice.

It's sort of a pity that compilers have fallen out of the realm of interesting research outside the optimizing stuff (and I'm not sure how much that's even being paid attention to) since what I've found is that the tools at hand to build compilers are... suboptimal. While I'll certainly take lex and yacc, or Parse::RecDescent, over fully hand-rolling the parsing part, they still leave a lot to be desired, and there doesn't seem to be much beyond that. Not that the transform from parse tree to code's that big a deal, but it's something that everyone seems to have to roll themselves, and there's just not much beyond "it's simple, just do it" in the literature. OTOH, maybe there isn't anything beyond "it's simple, just do it" to things....

Posted by: Dan at September 3, 2003 02:03 PM