January 18, 2006

Looking sideways at paradigms

Or something like that. Gotta love the big words, y'know?

Anyway, as everyone's undoubtedly noticed, there's not been a whole lot here. Odds are, at least someone cares, too. :) The time's not been spent idly, though.

Besides the standard "do computer stuff for money" thing, one of the projects I've been involved with has spawned an interesting question -- what does a massively threaded vectorized virtual machine look like? That is, assuming you've got a lot of data, which you've split into pieces, and you need to do operations on all the bytes/words/longwords in those pieces, and the code processing each piece occasionally needs to talk to the same code processing a different piece (or potentially different code even)... what does it look like? What sorts of primitives do you build into the VM, how skewed should things get for speed, and what the heck do languages designed to work like this look like?
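To make the shape of the problem concrete, here is a minimal Python sketch of the setup described above: data split into pieces, the same element-wise operation applied across each piece, and each worker occasionally talking to a neighbour. Everything in it is an assumption made for illustration, not part of any real VM: the names `split`, `worker` and `run`, the "double every element" operation, and the ring of queues used as mailboxes.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


def split(data, n_pieces):
    """Split data into n_pieces contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def worker(chunk, inbox, outbox):
    """Process one piece, then swap a summary value with a neighbour."""
    # "Vector" step: the same operation applied to every element of the piece.
    doubled = [x * 2 for x in chunk]
    # "Talk to another piece" step: send our sum on, then wait for a sum
    # from the previous worker in the ring. Sending first avoids deadlock.
    outbox.put(sum(doubled))
    neighbour_sum = inbox.get()
    return doubled, neighbour_sum


def run(data, n_pieces=4):
    """Run one worker thread per piece, wired together in a ring."""
    chunks = split(data, n_pieces)
    mailboxes = [Queue() for _ in range(n_pieces)]
    # One thread per piece, so every worker can be blocked on its inbox
    # at the same time without starving the others.
    with ThreadPoolExecutor(max_workers=n_pieces) as pool:
        futures = [
            pool.submit(worker, chunks[i], mailboxes[i],
                        mailboxes[(i + 1) % n_pieces])
            for i in range(n_pieces)
        ]
        return [f.result() for f in futures]
```

Calling `run(list(range(8)), n_pieces=4)` gives each worker two elements and hands it the sum computed by the worker before it in the ring. The interesting design questions start exactly where this sketch stops: which of these steps (split, element-wise map, send, receive) deserve to be primitives, and which should be left to library code.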

I know there has been some work in this area, mostly in automatic extraction of parallelizable code from serial code, but that's not exactly what I'm interested in. Erlang has a certain appeal in that it's designed to be massively multithreaded, but there's no real vector stuff built into it. (I think. I'm still poking around at it.) There are things you can do in general with some of the languages, but pretty much all the languages I've come across at best make it so that the compiler can tease out any sorts of parallelizability rather than make it easy to write parallelizable code. (And no, the lazy functional languages aren't much use here. There's still ordering and dependency in the code; the languages just make it easier for the compiler and runtime to decide to not do things. Which is good, certainly, but not the same thing.) Occam has some appeal here, but as I remember it was targeted at really small systems the last time I came across it (granted, that was twenty years ago or something like that) so I'm not sure it's useful. Poking around the WoTUG website's on the big list o' things to do, though.

Even with the parallelizable stuff taken out of the loop, there's the question of vector or matrix operations, though this is much more commonly trodden ground, being the sorts of things people've been doing in Fortran for decades. Yay for numerical simulations, and more recently high-performance graphics operations. (There's a lot of this sort of thing involved when you're spinning a few million shaded triangles around at sixty frames a second)

Still, the two worlds don't seem to have met, or if they have I haven't found out where. Bizarrely, this is one area I think I can't steal thirty years of past practice from the Lisp guys. And yes, I'd be thrilled if someone'd fill me in. I don't mind doing my research, but I'm just as happy to have people point me to some giants whose shoulders I can stand on.

Anyway, that's from the language end of things, a place I'm not that strong in. For better or worse I'm a procedural language kind of guy, and I don't have the background to properly (or improperly) design a language for this. Standing joke or not, I really don't do syntax.

I do, however, do semantics, and that's the interesting bit. Most of what you want in an underlying engine is pretty standard -- flow control's pretty much the same no matter what you do, unless you want to get really funky. In this case funky is doing control flow checking against a vector rather than a scalar and spawning multiple threads, one for each value in the vector, but we won't go there because things get profoundly twisted very fast, and twisted isn't the point. Well, at least not the point of this. It can be someone else's point if they want it.
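For anyone curious what the "funky" case might mean in practice, here is a small Python sketch of branching on a vector: instead of one test picking one path, every value in the vector gets its own thread, and each thread takes whichever branch its value selects. The function `vector_branch` and its arguments are hypothetical names invented for this illustration, and threads from the standard library stand in for whatever a real VM would spawn.

```python
from concurrent.futures import ThreadPoolExecutor


def vector_branch(conditions, if_true, if_false):
    """Branch on a vector: one thread per value, each taking its own path.

    conditions is a sequence of truthy/falsy values. if_true and if_false
    are callables that receive the lane index and return that lane's result.
    """
    def run_lane(index, flag):
        # Each lane makes its own scalar decision based on its own value.
        branch = if_true if flag else if_false
        return branch(index)

    # max(1, ...) because ThreadPoolExecutor rejects a pool size of zero.
    with ThreadPoolExecutor(max_workers=max(1, len(conditions))) as pool:
        # pool.map returns results in lane order, whatever order they finish in.
        return list(pool.map(run_lane, range(len(conditions)), conditions))
```

For example, `vector_branch([True, False, True], lambda i: i * 10, lambda i: -i)` runs three lanes and collects one result per lane. Even this toy version hints at why things get twisted: the two branches can run for different lengths of time, and anything after the branch has to decide whether to wait for all the lanes to rejoin.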

That leaves an interesting question, though. What sorts of primitives should be provided? In this case I don't necessarily mean low-level VM operations, though since we're talking thread and vector operations they pretty much have to be.

This is the sort of stuff I've been poking at lately. It's been fun thinking about, and I expect I shall go on at some length about it over the next month or so. Lucky you!

Posted by Dan at January 18, 2006 02:11 PM

Have you taken a look at gpgpu.org? This site has a lot of information about running general purpose computing on graphics hardware. There's a "Categories" frame on the right hand side: try clicking on the "High Level Languages" link.

Posted by: Dave at January 18, 2006 04:10 PM

I'd not seen that. Poking around, it looks like I've got Cg and Sh as two potential languages. I'll have to take a look at those and see what's there to see.


Posted by: Dan at January 18, 2006 04:18 PM

I heard a talk from a guy from the MIT compiler group (or architecture group, maybe), I think at UPenn. It was on non-von Neumann architectures. They were talking about Cell as a hybrid of a serial and dataflow execution engine (as well as other hybrids various people tried). In Cell the instructions are essentially directed acyclic graphs that include which ALU gets the data next. Cell machine code might be interesting as a model for what a VM might look like.


Posted by: Garick at January 19, 2006 06:32 PM

Re Lisp, you might want to look at http://library.readscheme.org/page9.html, QLisp (low-level parallel computing support), MultiLisp and *Lisp (a dialect used on connection machines -- the CMU AI repository has a bitrotted emulator in CL). I particularly liked Henry Baker's paper on futures (http://citeseer.ist.psu.edu/baker77incremental.html). However, I believe later research has uncovered problematic interactions between futures and side-effects. I unfortunately don't know if any fix was suggested.

Posted by: pkhuong at February 3, 2006 11:45 PM

someone else poking at some of the same things:

vector ops are indeed interesting because they apply to a surprising number of different problems and actually simplify the representation of the algorithms

(note: i'm just an interested bystander)

Posted by: mrusoff at March 23, 2006 05:41 PM

Great reading, keep up the great posts.
Peace, JiggaDigga

Posted by: JiggaDigga at April 7, 2006 12:38 PM

Check out Fortress...


Posted by: Ed Price at April 7, 2006 10:08 PM