May 24, 2006

Passing parameters

It seems like a truism, but no matter what you do, if you're working at a low level you're going to find parameter passing to be a major pain in the neck. (by which I mean it's easy to find a common scenario where your parameter passing scheme is going to be annoying and/or slower than it ought to be) Tornado, of course, is no different.

If you're going to make function calls, you need to pass in parameters. That's pretty much a given. And parameters, well... they're a problem.

Basically, how do you pass in parameters? And how do you get them in when you're a function? Heck, what exactly counts as a function anyway? (That is, where's your boundary for a 'real' function, as opposed to a spot you just jump to?)

There are a few ways to pass them in. If you're running on a stack-based system you can push your parameters onto the stack and jump to your function, but there are problems there if you're putting other things onto the stack like the return address. It's also potentially kind of slow, since you've got the cost of putting something onto the stack, which involves checking for stack depth as well as actually putting the data onto the stack (which can be pricey if you need to extend the stack in a segmented stack system, or runs the risk of throwing an exception on stack exhaustion in a fixed-sized stack system)

If you're in a register-based you can stuff the parameters into registers, but then there's the issue of what happens if you have too many parameters to fit into registers? Where do they overflow?

And there's always the possibility of packaging up the parameters somewhere else, in some out-of-bad location regardless of the existence of a stack. That's got the mild benefit of a register machine, since it's out of band and presumably not going to give you stack sizing issues, but you need yet another structure to store your parameters in, and getting to them requires special ops. (Presumably register access in a register machine, or stack access in either a register or stack machine already works)

Tornado's got an extra problem in that it's heavily threaded and built for speed and safety. Now, most of Tornado's data structures are write-once (which makes life ever so much easier in a threaded environment, though GC is always rearing its ugly head) which means that if we hand-wave on the GC issues we don't have to make copies of things to send to remote threads -- we can send the things themselves. And since Tornado doesn't have any global writable data (remember, we can store data into global 'slots', but it's explicitly an obliteration of the existing entry, and an atomic operation) there's precious little shared state to deal with.

I'm also just fine putting a hard limit on parameter counts for function calls. Want to call with thirty parameters? Well... too bad, you can't. Or I'm at least fine saying you can't.

The interesting thing here (well, one of them at least) is that the speed of your function calls can have a significant effect on the style of programming people use on your system, and the style of programming people use can, in turn, alter how optimizations are made for call speed.

For example, in OO programming you want function (well, method, but...) calls to be fast, because people seem to insist on having a huge number of tiny little methods. Personally I blame the lack of automatic accessors as part of the language design, but that's neither here nor there. Anyway, if you look at OO code you'll see a half-gazillion methods that are one or two lines long. Method calls have to be fast, since if they're not your code just runs horribly slowly. And yeah, it means that you will go to some lengths for this, and sacrifice speed in other places to make your methods go faster. (In this case, it means probably going for a stack-based system, since stack systems are almost universally faster for small-count calls, even if being stack based costs you performance elsewhere)

On the other hand, if you're writing code in a language that doesn't do much in the way of function/procedure calling, it doesn't make nearly as much sense to optimize for calls. Does it really matter if your call takes 10 msec instead of 5 if the language or coding paradigm you're working towards will make maybe a dozen function calls over the lifetime of a block of code? Probably not, and that's fine.

For Tornado, if you think about it, subroutines are pretty heavyweight. Yeah, sure, your sub might just be:

a = b *c
a = a % 4
a = a ^ d

but if a, b, c, and d are 1024 element arrays, well... that could take a while. And no, I've no clue if that does anything useful or not, but it's not like the guts of a gaussian blur or RSA encoding function are all that fancy if you've got automatic vector operations.

Then there's the question "what are all these subs going to do?" I mean, we've got a relatively specialized processing engine, one with a programming style that's likely going to annoy most people, so is it too out of line to figure that most of the subroutines are going to be reasonably large computationally, if not in actual program text?

This also brings message pools into the equation. (No, really, it does) Should fetching a message be the same as getting called with parameters? If so, you could arguably have a bunch of threads waiting on a thread pool's message queue and another thread posting a message is the same as making a 'remote' procedure call into a thread in the pool. If your global data's read-only, who cares where you live outside of the obvious speed and latency issues?

It's very tempting to treat all function calls as if they were messages being sent to other threads. More to the point, exposing it so that functions create messages to make a call, wait on their return, extract the return values from the return message, and get messages in when they're being called.

Tempting, but... no, at least not user-visibly. Might do it under the hood but that's just an implementation detail, and for right now we're not letting those get exposed.

Anyway, what will we do? Again, looking over the list 'o features:

  • Known function signatures
  • Likely big-ish functions with little function calling
  • Might want to integrate with messages at some point
  • Everything's all isolated

Given all that, it seems clear we want to restrict messages and parameters to the same count of things. I'm thinking 16 is a good number -- certainly no more, as that's 64 bytes of data to copy for each function call. Each function activation gets its own activation record, which can store the function's pad, its registers, and the return address, and with that we won't need a stack. We're also going to strictly enforce declared in and out parameters, and leave the rest untouched, so if a call takes two parameters and returns one, only R0 and R1 go in, and only R0 is touched in the caller on return.

I think a small piece of Clever is in order here too -- when making a call the parameters go in the low registers (starting with 0) but when you're activated your parameters go in your high registers. That is, if function X takes one parameter, I put it in R0 when I make a call to X. When I'm in X, the parameter is in R16. Why? Well, assuming functions that don't have all that much code in them it means there's a good chance that most of your operations will be on the data you're passed in, and you'd rather not have to shuffle that around to keep it from being stomped on by a function call.

Messages will be limited to 16 elements as well, when they're constructed. This may end up being a bigger hassle, but for now I think it's OK. That's for another time, though.

This isn't, I'm sure, anyone's ideal scenario, but it seems like it'll handle the common case the way it ought to -- quickly, and with as little fuss as reasonable. Now to get it all implemented so it can be tested...

Posted by Dan at May 24, 2006 08:09 PM | TrackBack (0)

some tight shit can u show me how u can implement this using VB6

Posted by: paul at September 30, 2006 10:03 AM

VB6? Nope, sorry, though it'd be interesting to get this put together so Tornado could be a DLL on windows. There's no reason it couldn't be (and I do need to finish the windows port of the base code) I just don't have the expertise to make it happen.

Maybe we'll get lucky and someone who does have the requisite knowledge will be able to wrap it up when things are a little more stable.

Posted by: Dan at September 30, 2006 10:57 AM