Since I seem to end up going on about a variety of less well-known computer tricks, I figure I might as well make it a semi-regular feature. So welcome to the first official entry in the "What the heck is" series. :)
These are, or will be, questions that crop up as part of Parrot development. They're the sorts of things that you might run across when writing interpreters, OS kernel code, language compilers, and other relative esoterica. These are things that nobody (well, to a first approximation, at least) does, so the concepts are at best really fuzzy for most folks, and in many cases (such as with the continuation) completely unknown. So, rather than just grumbling about the sad state of people's knowledge about low level hardware and software concepts, I should try and do something about it.
Today, we're going to talk about walking the system stack, something that comes up with garbage collection. (I should talk about garbage collection. Later. Hopefully I'll remember to make a link here)
The system stack, for folks that aren't familiar with it, is just a chunk of memory that your machine's CPU uses to hold temporary values. There's usually a CPU register dedicated to it, either by convention or as part of the hardware itself, so access to the stack is fast, something that's generally considered a good thing.
Walking the system stack is a process where you figure out where the stack starts and ends and looking at all the data on it. Not necessarily changing the data, mind (as it's often write-protected, at least in part) but just looking at it. This is generally considered mildly evil, in part because there's no use you can make of the data on the stack that doesn't break encapsulation, any data hiding, or code modularity. Still, there are often good reasons to do it.
Garbage collectors (or GCs), for example, are one of those reasons. The whole purpose of a GC is to see what data is in use and what isn't. That means if there could be a reference to data on the stack, perhaps because the language accessing the data uses the system stack for data storage. (C does this, for example, as do nearly all the compiled languages that don't support closures) You'd really hate to clean up after a variable you thought was dead because you didn't look at all the places that the variable could be referred to from.
How does one walk the stack? Well, you need to get the base of the stack, the spot where the stack is as empty as possible. Depending on the OS and CPU architecture, there may be an easy way to do this. If not, what you can do is get the address of a variable that's been allocated on the stack at the very beginning of your program, at a point where there can't possibly be anything of interest on the stack. Then, when you want to walk the stack, you just get the current stack pointer. Everything between the two is your stack data, and you just run through it like any other array of binary data. Depending on what you're looking for you may be able to cheat a bit--many systems require that stack data be aligned, so that a 4-byte pointer must be on a 4-byte boundary, which reduces the amount of data you have to look at. (And other systems don't put any requirements at all on the stack data, which makes things a bit of a pain)
And that's it. Nothing at all fancy--walking the stack is just a matter of figuring out where and how big the current stack is, and grovelling over it. A trick you should almost never do, but when you need to, well, you need to.Posted by Dan at April 26, 2003 05:04 PM | TrackBack (0)