May 31, 2006

Re-examining message passing

So, this morning when I got off the train it hit me that making message passing really really lightweight is a Good Thing. Yeah, I know, withhold your "Duh!"s for a moment.

For the most part, up until now, I'd been looking at the communication between threads as involving a relatively large quantity of data and, therefore, as relatively heavyweight. In that case a few extra cycles aren't that big a deal, since you're making few inter-thread calls and slinging great masses of data (well, 4K or more) between them to be processed. And that's a fine way to view things, since it matches a number of use cases well: large quantities of data matched with large amounts of processing on that data.

But...

It doesn't work at all well when your threads are communicating a lot but not exchanging all that much data. For example, you might be running a simulation where each 'object' is represented by a separate thread, running mostly independently, and interactions between the objects are done via passed messages. You know, like when you've got a few hundred objects moving at different rates, with different collision avoidance algorithms, and different preferred distances from other objects.
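A rough sketch of that kind of simulation, in Python purely for illustration. None of this is Tornado's interface: `SimObject`, `run_simulation`, and the mailbox layout are made-up names, and the "physics" is just a position plus a velocity. The point is the traffic pattern: lots of tiny messages (an id and a number), very little data.

```python
import queue
import threading


class SimObject(threading.Thread):
    """One simulated object: its own thread plus a mailbox of tiny messages."""

    def __init__(self, obj_id, position, velocity, steps):
        super().__init__()
        self.obj_id = obj_id
        self.position = position
        self.velocity = velocity
        self.steps = steps
        self.mailbox = queue.Queue()  # unbounded, so put() never blocks
        self.peers = []               # filled in before start()
        self.last_seen = {}           # peer id -> last position it reported

    def run(self):
        for _ in range(self.steps):
            self.position += self.velocity
            # Tell every peer where we are now: one small message each.
            for peer in self.peers:
                peer.mailbox.put((self.obj_id, self.position))
            # Read as many updates as we have peers. They may come from
            # different time steps, but each sender's messages stay in order.
            for _ in self.peers:
                sender, pos = self.mailbox.get()
                self.last_seen[sender] = pos


def run_simulation(specs, steps):
    """specs is a list of (start_position, velocity) pairs, one per object."""
    objects = [SimObject(i, pos, vel, steps) for i, (pos, vel) in enumerate(specs)]
    for obj in objects:
        obj.peers = [other for other in objects if other is not obj]
    for obj in objects:
        obj.start()
    for obj in objects:
        obj.join()
    return objects
```

With N objects and S steps that's N * (N - 1) * S messages carrying a couple of values apiece, so the per-message overhead ends up dominating the cost.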

Looking at threads like this, as producers and consumers of messages, rather than as potentially parallelized functions, has some other ramifications as well. For example, it means that we don't want to look at messages as if they were remote function calls, since they're really not. A function may well get a message somewhere other than at the beginning of the code, and may well send a message out somewhere other than at the end.
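To make that concrete, here's a small Python sketch (illustration only; `agent`, `drive`, and the message names are invented for the example). The thread's function does work before it hears anything, receives in the middle, sends in the middle, and keeps going after its last send.

```python
import queue
import threading


def agent(inbox, outbox, log):
    log.append("setup")                  # work before any message arrives
    outbox.put(("ready", None))          # a send that is not a return value
    kind, value = inbox.get()            # a receive that is not a call argument
    log.append(f"got {kind}")
    outbox.put(("partial", value * 2))   # send again, mid-function
    kind, value = inbox.get()
    log.append(f"got {kind}")
    outbox.put(("done", value + 1))
    log.append("cleanup")                # work continues after the last send


def drive():
    inbox, outbox, log = queue.Queue(), queue.Queue(), []
    worker = threading.Thread(target=agent, args=(inbox, outbox, log))
    worker.start()
    replies = [outbox.get()]
    inbox.put(("scale", 21))
    replies.append(outbox.get())
    inbox.put(("bump", 9))
    replies.append(outbox.get())
    worker.join()
    return replies, log
```

There's no single "call" here that lines up with a single "return", which is exactly why treating the exchange as a remote function call doesn't fit.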

Now, both ways are perfectly valid, and each of them has some benefits and drawbacks.

An RPC-style way of looking at threads makes for more easily auditable code. This is a good thing, since it means it's easier to do machine inspection, and it's easier for programmers to write correct code. (Not actually easy, of course. Nor likely, I suppose, but one can always hope.) Coding this way tends to reduce the number of deadlocks you run into as well, which is also a good thing. It makes the implementation relatively simple, too, since you can make function calls and message passing share code, and that code doesn't have to be hugely efficient. The downside there is that it makes message passing less handy as a tool for communication, both because of its awkwardness and its mild inefficiency. (Though whether anyone but me cares about the inefficiency is always an open question.)
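For comparison, a minimal sketch of the RPC-style view layered on the same kind of mailbox, again in Python and again with invented names (`server`, `call`, `demo`, and the handler table are assumptions for the example, not anything Tornado defines). Every message is a request that gets exactly one reply, so from the caller's side it looks like a plain function call.

```python
import queue
import threading


def server(inbox, handlers):
    """Receive at the top of the loop, reply at the bottom: one in, one out."""
    while True:
        request = inbox.get()
        if request is None:              # sentinel: shut the server down
            return
        name, args, reply_to = request
        reply_to.put(handlers[name](*args))


def call(inbox, name, *args):
    """Send a request and block until the single reply comes back."""
    reply_to = queue.Queue(maxsize=1)    # a fresh reply slot for every call
    inbox.put((name, args, reply_to))
    return reply_to.get()


def demo():
    inbox = queue.Queue()
    handlers = {"add": lambda a, b: a + b, "upper": str.upper}
    worker = threading.Thread(target=server, args=(inbox, handlers))
    worker.start()
    results = [call(inbox, "add", 2, 3), call(inbox, "upper", "ok")]
    inbox.put(None)
    worker.join()
    return results
```

The shape is easy to audit since each request pairs with one reply, but note the cost: a reply queue gets built for every call, which is the sort of overhead that's fine for a few big calls and annoying for a flood of small ones.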

A communicating-agents way of looking at threads makes for more interesting code, which is good. Your code can be more flexible (or at least it's easier to be flexible) and it fits a relatively poorly served programming paradigm. The downside is that auditing and writing the code is potentially a lot more difficult.

Given how I want Tornado to behave, and one of the places I want it to be well-suited for, we're heading toward the lightweight message passing scheme. More of a pain for me, but I think the wins are worth it.

Posted by Dan at May 31, 2006 07:44 PM
Comments

It sounds like you're moving towards an occam-style model of concurrency, with extremely lightweight message-passing communications and context switching. You might find it interesting to have a read through some of the introductory material on http://www.transterpreter.org/books/ (and if you're still interested after that, then look at http://occam-pi.org/ and http://www.wotug.org/ for what we've been up to in the intervening 20 years or so).

You're right that RPC-style communications can be used to guarantee freedom from deadlock; the client-server design pattern that's often used in occam-pi programs provides a generalised set of design rules that takes advantage of this.

It's nice to see other people getting into this area -- perhaps we might even see a Tornado presentation at CPA next year!

Posted by: Adam Sampson at June 1, 2006 08:38 AM