June 30, 2005

WCB: Full bytecode metadata

One of the things that was on the list 'o things to be put into parrot was fully annotatable bytecode. This would allow you, if you were wondering, to associate any number of sections of metadata to your bytecode, and within those metadata segments associate any data you like to individual positions in the bytecode.

Or, more simply, your code can say "given I'm at offset X from the start of my bytecode segment, what data of type Y is associated with it?"

This facility was going to go in to support proper error message handling, so compilers could associate source line numbers and source line text with the bytecode generated for that source -- this way you could have code like:

  getmeta P1, [segment_offset,] "line_number"
  getmeta P2, [segment_offset,] "filename"
  getmeta P3, [segment_offset,] "source_text"
  getmeta P4, [segment_offset,] "column"

to get the line number, filename, source text, and column offset for either the current position or the position at the given offset (if it's passed in). Assuming, of course, that there are line_number, filename, source_text, and column metadata segments attached to the current bytecode. (This actually makes me think that exceptions should send along as part of themselves a thingie that represents the point in the bytecode that code can query for metadata. Ought to be able to do it for any continuation as well, for walking back up the call stack)

This is one of those things that I'd really prefer to be an object method thing, since I'm not picturing this ever being particularly speed-critical. (Yes, hell has frozen over -- I'm actually advocating an OO solution) The problem here is that when we need this information we likely don't have an object, or are operating at a level below the object system, so we're stuck welding the capabilities into the interpreter's bytecode system itself.

There was also going to be a corresponding setmetadata op to allow annotating a bytecode segment, though that would generally not be used by anything but a compiler module. (And odds are, given what we were seeing with folks writing compilers, that would mean adding directives to the assembler or PIR compiler, and defined annotations to the AST)

Would've been nicely useful for attaching all that info that compilers want to attach, and that runtimes like to have, without fluffing out the actual executed bytecode.

Posted by Dan at June 30, 2005 04:12 PM | TrackBack (0)