I wasn't too clear in the last post about the problems with finalization, so here's some more detail.
If you remember, there are three ways to handle finalization. You can call every finalizer method in a dying object's hierarchy, you can call the finalizer method like any other method and count on code to properly redispatch, or you can disallow finalization methods altogether and instead pass in relevant bits of the object to an external cleanup routine.
Now, let's first assume that we can do redispatch properly. That means with a class hierarchy like:
    A   B
     \ /
      C
If A, B, and C all have a method foo, doing SUPER.foo in C calls A's foo, and if that method then calls SUPER.foo we'll invoke B's foo. (Note that Perl 5's SUPER doesn't do that; you need to use NEXT instead.) This is an important assumption for systems with multiple inheritance. (With a single-inheritance system it isn't an issue.)
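This sibling-aware redispatch is exactly what Python's cooperative super() does: it follows the linearized order (C, A, B), not the static parent list. A small sketch, with an extra Base class added at the top so the chain has somewhere to stop:

```python
# Sibling-aware redispatch via Python's MRO. When called through C,
# A's super().foo() dispatches to B -- its sibling -- not to Base.
order = []

class Base:
    def foo(self):
        order.append("Base")

class A(Base):
    def foo(self):
        order.append("A")
        super().foo()   # goes to B when the object is a C

class B(Base):
    def foo(self):
        order.append("B")
        super().foo()

class C(A, B):
    def foo(self):
        order.append("C")
        super().foo()

C().foo()
print(order)  # ['C', 'A', 'B', 'Base']
```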
And, so we have a nice diagram to refer to, let's assume our class hierarchy looks like:
    A   B   A   E
     \ /     \ /
      C       D
       \     /
        \   /
         \ /
          F
Note that A is in the hierarchy twice. We'll assume, for the sake of argument, that all the classes have DESTROY methods.
So, how do we finalize an object of class F?
Well... we could invoke the object's DESTROY method the way we'd invoke any other method, which means we call F's DESTROY first. If all the methods redispatch via SUPER properly, we call them in the order F, C, A, B, D, E (if we prune the tree) or F, C, A, B, D, A, E (if we don't). For the sake of argument we'll assume that we do prune the tree, as that's the common thing to do.
There is, right there, a problem. If we call finalizers in normal method order, then A gets cleaned up before D has a chance to clean itself up--if D depends on its parent classes being still in good working order (not an unreasonable assumption, even in a finalizer) then you're hosed because it isn't--it already cleaned up after itself. The sensible thing to do in this case is to have a separate traversal scheme for finalization. Yeah, special cases are sub-optimal, but it beats screwing things up. Of course, if the DESTROY method in E reparents the object into the root set, well... you're in some serious trouble anyway.
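For what it's worth, Python's C3 linearization already prunes this particular hierarchy into a safe order: D comes before A, so D's finalizer still has a working parent class when it runs. A sketch, using an explicit destroy method (a stand-in name, since __del__ timing is its own can of worms):

```python
# Redispatch-style finalization over the diamond-of-diamonds hierarchy.
# Python's C3 MRO for F is (F, C, D, A, B, E) -- note D before A, which
# avoids the "parent finalized before child" hazard described above.
order = []

class Root:
    def destroy(self):
        pass  # end of the chain

class A(Root):
    def destroy(self):
        order.append("A"); super().destroy()

class B(Root):
    def destroy(self):
        order.append("B"); super().destroy()

class E(Root):
    def destroy(self):
        order.append("E"); super().destroy()

class C(A, B):
    def destroy(self):
        order.append("C"); super().destroy()

class D(A, E):
    def destroy(self):
        order.append("D"); super().destroy()

class F(C, D):
    def destroy(self):
        order.append("F"); super().destroy()

F().destroy()
print(order)  # ['F', 'C', 'D', 'A', 'B', 'E']
```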
The big problem with this scheme is one of trust--what happens if some DESTROY method, say C's, decides not to call SUPER.DESTROY? Well... then you get an object that's only partially finalized, and it's possible it'll leak some resources. (This, interestingly, is not common in practice, since usually each resource that needs active finalization has a wrapper object, so the worst that happens is that those objects get cleaned up by their own DESTROY rather than the finalizer of the wrapping object. This is a good reason to not have one object wrap multiple bare things.)
The big advantage to this scheme is that you can actually call your parent finalizers before you're done with your own finalization. At least, I assume this is an advantage. I dunno, I don't do objects.
The alternative means of doing finalization is to make sure we call all the finalizers for the object. No redispatch or anything; the object cleanup code just scans the object's class hierarchy and calls all the DESTROY methods itself. For our example class, we'd likely call the finalizers in F, C, D, A, B, E order, as we'd prune the tree and call from shallowest to deepest. If a class appears in the hierarchy at multiple depths, we use the deepest occurrence.
This alternate method actually works pretty well, since you're insulated from potential problems in the tree--you don't have to worry that your child classes might forget to redispatch or something. It can also be done reasonably quickly, as you can actually cache the traversal order and methods in the class somewhere and not have to do any dynamic lookups, but that's not likely a huge issue. (If you have issues with dispatch speed for finalization you've probably got bigger issues) There's still the issue of reparenting--it's possible once again for that pesky E class to reparent the object that's now mostly dead.
The third way is to attach a closure of some sort to the object, or to each class, that takes some of the data out of the object. When an object is slated for destruction you yank out the bits and call the closure. The advantage here is that, unless your object is self-referential, there's no way to reparent the thing. (And if it is, the assumption is that you're really out of luck and reparenting will just fail, or die horribly, or something of the sort)
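Python happens to offer a close analogue of this closure scheme in weakref.finalize, which is a reasonable way to see why reparenting can't happen: the cleanup callback captures only the bits it needs, never the object itself. A sketch (Wrapper and _cleanup are invented names; the "finalizer fires on del" behavior is CPython's refcounting, not a language guarantee):

```python
import os
import weakref

class Wrapper:
    def __init__(self, path):
        self.fh = open(path, "w")
        # Register a cleanup that captures the handle, NOT self --
        # capturing self would keep the object alive forever, and
        # not capturing it means the closure can't resurrect it.
        self._finalizer = weakref.finalize(self, Wrapper._cleanup, self.fh)

    @staticmethod
    def _cleanup(fh):
        fh.close()

w = Wrapper(os.devnull)
fh = w.fh
del w             # on CPython the object dies here and the closure runs
print(fh.closed)  # True -- the handle was closed behind the scenes
```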
Each of these methods is fine. Personally I prefer the automatic calling of the finalize methods, so I don't have to worry about forgetting or screwing things up. (There's that whole 'encapsulation' thing--how can a child class have any clue as to whether my parent class should or shouldn't clean up? It shouldn't know, as it's just none of its business) The nasty bit for Parrot is the mix'n'match issue.
I'd love to choose just one, but I can't--we've committed to making perl 5, perl 6, python, and ruby work. Perl 5 and python use the "you better delegate right" scheme, Perl 6 uses the "call 'em all!" scheme, and ruby uses the closure scheme. So... what, then, do we do if we mix it up?
For example, in our diagram, let's say that F and B are Python classes, A is a Ruby class, and the rest are Perl 6 or C++ classes. And, just to make it more difficult, B doesn't redispatch its DESTROY method. (After all, why should it? It's a top-level class.) Contrived? Maybe. We are pushing Parrot's interoperability, though, so someone'll do it. And even if it doesn't get so bad, there'll be plenty of perl 5/perl 6 mixing, and you'll likely see perl 6 on the top and bottom with perl 5 in the middle in a lot of cases. (As perl 6 is used for new code, and people start refactoring old code from the bottom up)
The table looks like:

    Class   Language       Finalization scheme
    A       Ruby           closure
    B       Python         manual redispatch (doesn't redispatch)
    C       Perl 6/C++     automatic
    D       Perl 6/C++     automatic
    E       Perl 6/C++     automatic
    F       Python         manual redispatch
The question, then, is... what gets called? And in what order?
The first thing that springs to mind is taking them by group--call the Python-style finalizers, then the automatic finalizers, then the Ruby finalizers. That, though, will destroy the object out of order--there'll be classes whose parent-class attributes are gone before the child class gets to do its thing. That's A Bad Thing. No joy there.
The next thing to do is assume that the C++/Perl 6 finalizers just automatically redispatch, as does the ruby finalizer. But... in that case, B's failure to redispatch means we never call E's finalizer. That's bad too.
The third thing to do is redispatch anyway if there are automatic finalizers left to run, even if a manual finalization doesn't redispatch. But... what if the finalization didn't redispatch on purpose? There may be a reason. (I'd not bet on a good one, but I don't get to not implement things based on judgement calls on other people's code, even if it really sucks. Still gotta make it work)
You could continue the redispatch when a manual method fails to redispatch, but only for the automatic methods, which wouldn't be too bad, but it still sits poorly with me. Ick. I don't, I'm afraid, have a good solution. There may not be one, in which case it's a matter of choosing the least bad. Or punting this to Larry, Guido, and Matz and letting them hash it all out.
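A toy dispatcher makes the trade-off between these last two options concrete. Everything here is invented for illustration--the scheme tags, the tuple encoding, the finalize driver--and the traversal order is the pruned shallowest-first one from earlier:

```python
# Mixed-scheme finalization over the pruned order F, C, D, A, B, E.
# Each entry: (class name, finalization scheme, does its manual
# finalizer redispatch?). B is the Python class that doesn't.
classes = [
    ("F", "manual",    True),   # Python-style, redispatches
    ("C", "automatic", None),
    ("D", "automatic", None),
    ("A", "closure",   None),   # Ruby-style, runs independently
    ("B", "manual",    False),  # Python-style, does NOT redispatch
    ("E", "automatic", None),
]

def finalize(classes, force_automatic):
    ran = []
    chain_alive = True
    for name, scheme, redispatches in classes:
        if scheme == "manual":
            if chain_alive:
                ran.append(name)
                if not redispatches:
                    chain_alive = False  # chain dies here
        elif scheme == "closure":
            ran.append(name)             # closures always fire
        else:  # automatic
            if chain_alive or force_automatic:
                ran.append(name)
    return ran

# Trust the chain completely: B's failure silently loses E's finalizer.
print(finalize(classes, force_automatic=False))  # ['F', 'C', 'D', 'A', 'B']
# Force the leftover automatic finalizers anyway: E gets cleaned up.
print(finalize(classes, force_automatic=True))   # ['F', 'C', 'D', 'A', 'B', 'E']
```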
Finalization. Who knew death would be so darned complex and annoying?

Posted by Dan at March 5, 2004 04:55 PM