I've been peeking around under the hood of perl again, which is always an interesting undertaking. It was ultimately prompted by this blog entry that Jeremy Zawodny made that had an off-hand comment about Devel::Size, a module I wrote a while back. (Hanging off my CPAN directory) It calculates the size used by data structures in perl, so you can figure out how much memory your multidimensional nested tied hash widget really uses.
Anyway, the results Jeremy got were less than stellar, which wasn't good, since it mean that either the module had a problem or that something really odd was going on. That prompted a lot of poking about, since I don't really like the idea of having broken modules out there. The reality, as it so often is, was rather more complex.
Part of the problem that Jeremy was seeing was due directly to Devel::Size. When it runs it uses a hash to track structures that it's seen so it doesn't count them multiple times, and so it doesn't get into infinite loops chasing the tails of circular references. Hashes use up great gobs of memory--on the order of 60 bytes per hash entry, not counting the size of the key or value--so if you're looking at a large structure Devel::Size will use up a lot of temporary memory. The docs made minor reference to this, but not nearly strongly enough (since fixed in the 0.55 release).
This is a problem if you go to double-check Devel::Size by looking at the output of ps, because if you do a ps after a Devel::Size check you'll find your memory usage is extraordinary relative to what Devel::Size reported, because your process' memory space has been blown out to handle all the (since freed) temporary storage Devel::Size uses. This is arguably a problem with Devel::Size, since you could easily say it ought not use nearly that much memory, and that's valid, but at the moment beside the point.
Another issue that it raised, as I went poking around to see where the memory was used that I couldn't find, is how the C RTL handles and tracks memory allocation, and how perl gets memory for its structures. When you ask for a chunk of memory from malloc, it has to hand you a naturally aligned chunk, which in the general case means the memory allocation gets padded out to a natural alignment boundary, usually 4 or 8 bytes depending on the CPU. And, since the RTL needs to track the size, generally an extra 4 or 8 byte length field is silently prepended to the chunk of memory you asked for. In the worst case, this means the memory allocation is 15 bytes larger than you asked for. On a 10M allocation that's noise, but on a 17 byte allocation it's a darned sight more than just noise.
And that's part of what's going on with memory usage. Each key has a structure attached to it, and because this structure's variable length, perl has to ask the system for a separate chunk for each one. The structure's only two ints, two bytes, and the string data, so for short keys you can image that there's a lot of overhead. Even on a minimal, no waste key (one that's two characters) you see a 25% overhead on 32-bit machines. Larger keys slowly ramp down the overhead, but a 3 character key has 13 bytes in the struct and another 7 bytes in overhead (padding and the length field) which is more than 33% wastage. Ick.
I'm not sure what, short of completely redoing the key allocation system, can be done for this, at least in perl 5. The key structs really can't come from an arena, as they're variable length, and throwing in a pointer to the key string just bloats out the struct by 4 bytes which isn't much of a win either. Something to ponder for Parrot, I guess.
Posted by Dan at February 23, 2003 01:45 PM | TrackBack (1)Maybe you could collect all 1, 2, 3, 4... size
keys together in a data structure. I guess this
would add a lot of overhead for dereferencing,
but you wouldn't loose the space.
While messing around with hashes for general perl use isn't really worth it, I've definitely considered doing so for my particular app. Given that the keys are all of fixed size (either 4 or 8 bytes, depending on your pointer size) and don't need to track anything about a key other than its existence, there are a number of optimizations I could do. I just haven't, out of sheer laziness. :)
If I do rework this, I'll probably go with either a pudgy B-tree or a plain sorted list of some sort, perhaps a sorted blocked linked list or something.
Posted by: Dan at May 1, 2003 05:16 PM