One of the problems with objects is that, like cats, they occasionally leave dead things under the sofa that you need to explicitly clean up. Generally called finalization (or incorrectly called destruction, but that's a separate issue), this cleanup is a handy thing, as otherwise you'd have all sorts of crud building up as your program ran. Finalization's usually used for things that aren't memory--filehandles that need closing, database connections that need closing, handles on external library resources that need some sort of cleaning up, or weak references you want broken so other dead things can be found.
Like so many other object things, there's the inevitable fight over what the "right" way to do things is. So far as I can tell, there are three ways, to wit:
The Python/Perl 5 way (in that order, since Larry did swipe it from Guido) is to have the finalizer be a method just like any other method: you look for it and you call it if you find it. If you want your parent methods called, well... you'd darned well better redispatch the call or you're in a lot of trouble. Needless to say, people are sometimes in a lot of trouble. This is compounded in Perl 5 by a sub-optimal redispatch method, though the NEXT module's finally made it possible to do things right, if only in new code. (Perl's SUPER only looks at the parent classes of the current class, so if you're on the left-hand leg of a multiple-inheritance tree when you SUPER you'll never see the right-hand leg. I don't know if Python is similarly broken.)
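To make the redispatch problem concrete, here's a minimal Python sketch of a diamond-shaped hierarchy. The class names and the `finalization_order` helper are made up for illustration; the only real machinery is `__del__` and `super()`. For what it's worth, Python's `super()` walks the method resolution order of the object's actual class, so the right-hand leg does get visited, but only if every class along the way remembers to redispatch. The immediate finalization on `del` is a CPython refcounting behavior, not a language guarantee.

```python
import gc

calls = []  # records the order in which finalizers run


class Base:
    def __del__(self):
        calls.append("Base")


class Left(Base):
    def __del__(self):
        calls.append("Left")
        super().__del__()  # explicit redispatch to the next class in the MRO


class Right(Base):
    def __del__(self):
        calls.append("Right")
        super().__del__()


class Careful(Left, Right):
    def __del__(self):
        calls.append("Careful")
        super().__del__()


class Forgetful(Left, Right):
    def __del__(self):
        calls.append("Forgetful")  # no redispatch: the parents never clean up


def finalization_order(cls):
    """Create and drop one instance, returning the finalizers that ran."""
    calls.clear()
    obj = cls()
    del obj       # CPython finalizes here, when the last reference goes away
    gc.collect()  # a nudge for other implementations, not a guarantee
    return list(calls)


careful_order = finalization_order(Careful)
forgetful_order = finalization_order(Forgetful)
print(careful_order)    # ['Careful', 'Left', 'Right', 'Base']
print(forgetful_order)  # ['Forgetful']
```

One forgotten `super().__del__()` and everything above that class in the chain silently skips its cleanup, which is exactly the "lot of trouble" case.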
Both schemes, as they're actual object methods, have the interesting property of potentially ending up reparenting the dying object, thus making it not dead any more. This can be something of an issue if the object is half-dead when some finalization code decides that the object isn't really dead, and you have some half-deconstructed object lurching across the system like an extra from one of the Dawn of the Dead movies (or a bad freshman English essay). The Frankensteinian possibilities tend to keep people from doing this, though, something I approve of.
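The lurching-corpse scenario is easy to reproduce. Here's a small Python sketch, with a made-up `Zombie` class and a made-up `graveyard` list, in which the finalizer tears down part of the object and then stashes a reference to it somewhere that outlives it. This assumes CPython, where dropping the last reference runs `__del__` right away.

```python
import gc

graveyard = []  # a global that outlives the objects below


class Zombie:
    def __init__(self):
        self.handle = "open"

    def __del__(self):
        self.handle = None      # half the cleanup is done...
        graveyard.append(self)  # ...and now the object is reachable again


z = Zombie()
del z         # CPython runs __del__ here
gc.collect()  # a nudge for other implementations, not a guarantee

revived = graveyard[0]
print(revived.handle)  # None
```

The revived object is still a perfectly valid `Zombie` as far as the type system is concerned, but its handle is gone, so any code that assumes a functional object is in for a surprise.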
Ruby brings a third scheme to the table--rather than a method on the object, you have a finalization closure that gets called when the object dies and bits of the object are passed in. This way you have a means of cleaning up without actual access to the object, so there's no chance of bringing it back to life--if you want it not-dead you'll have to clone the thing rather than resurrect it.
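To keep the examples in one language, here's the same idea sketched in Python rather than Ruby, using the standard library's `weakref.finalize`. It isn't Ruby's mechanism, just a close analogue: the callback is registered against the object but is handed only the bits it needs, never the object itself. The `Connection` class, `close_handle` function, and `"db-1"` name are invented for illustration.

```python
import gc
import weakref

closed = []  # stands in for "resources we have released"


def close_handle(handle_name):
    # Receives only the data needed for cleanup, not the dying object.
    closed.append(handle_name)


class Connection:
    def __init__(self, handle_name):
        self.handle_name = handle_name
        # Pass the handle name, not self, so the callback cannot resurrect us.
        weakref.finalize(self, close_handle, handle_name)


conn = Connection("db-1")
del conn      # CPython collects the object here and fires the callback
gc.collect()  # a nudge for other implementations, not a guarantee
print(closed)  # ['db-1']
```

The catch is the same as in Ruby: if the callback or its arguments hold a strong reference back to the object, the object never dies and the finalizer never runs.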
Anyway, three ways to pick up the trash. Each can be dealt with, and each has its drawbacks when taken individually. The real fun comes in when you mix and match, because how then do you satisfy the constraints and expectations of all the classes in an inheritance hierarchy? More importantly, how the heck do you order them?
If someone's got a good answer (besides "don't do that!") I'd love to hear it...
Posted by Dan at March 1, 2004 05:17 PM
Well, I know how you do the first two. Two methods. Perhaps FINALIZE and DESTROY. If no DESTROY is given, it defaults to calling FINALIZE on itself then each of its parents' DESTROYs. If you want to take over your parents' destruction, provide your own DESTROY. Reminiscent of new and BUILD, no?
Posted by: Luke at March 1, 2004 10:40 PM
Finalization is a topic of debate right now in the CLR space and I wonder if the research that's going on could be of interest.
http://weblogs.asp.net/astopford/archive/2004/02/26/80506.aspx
Posted by: Andrew at March 2, 2004 04:07 AM
Luke, it's not nearly that simple--if it was this wouldn't be a problem. :) It gets really nasty when you mix and match schemes, because you don't want to destroy things in the wrong order, or hand off a partially destructed object to a destructor that assumes a functional object. I'll probably go on about that later.
Andrew: Thanks for the pointer. Looks like you .NET guys now get to fight the timely destruction battle. Ain't it fun? :) Alas, I think you're going to find that refcounting isn't going to get any better performance without a major overhaul in the way folks think about things. (Been there, done that, I've a good idea exactly where the problems are, unfortunately)
Posted by: Dan at March 2, 2004 06:24 AM
Hmmm... Intractable problems are usually a matter of perspective.
I'm actually not sure that I see this as a problem at all in fact, and you should just be able to treat it like a Gordian Knot of sorts.
Why not leave this up to the languages and simply require that they provide some boilerplate bytecode that finalizes correctly and sets a status indicating what kind of finalization it did?
Perl 5 is a great example. Ponie code that assumes its super-classes are also Ponie will be sorely mistaken if they are Python code that just happens to define a DESTROY. This is a problem the compiler writer has to solve, and I think Parrot can only provide assistance. Ultimately some practices are going to be deprecated in these languages, and until code complies it won't be able to play in the larger Parrot sandbox.
As for the living or dead state of an object after finalization... again, that's just status that the compiler's finalization code should indicate, no?
Now, what Parrot WILL have to handle is re-entering the finalization code after it begins, because a class in a Perl5/Python-like language has tried to invoke its parent's finalizer.
Well, if it's not that simple, then my gut tells me that there's no right answer. You've got to destroy them in some order, and child-first is the only order that makes sense in the general case. If a parent calls a method that a child has overridden in its destructor, that's its fault.
On the other hand, one might say that once a child has been destroyed, all of its methods die. That way when a parent calls one of its methods, it calls it in the current (or an ancestor) class, ignoring any methods that were previously overridden.
But the clarity with which I see the problem is telling me that I'm missing some key element. I'll wait for your alleged post.
Posted by: Luke at March 2, 2004 10:19 PM
In the CLR space research is being done so that ref counting can be used alongside the CLR's deterministic finalization. In this way objects finalize themselves (when used with the IFinalize interface). However, the addition of refcounting means that developers can choose not to use that IDispose and leave it up to the GC. The deterministic finalization also takes care of any objects that end up not being disposed because they were missed through circular references. These two methods combined mean that the thinking curve is reduced: if you have always used IDispose then fine, if not then fine also. Not sure if that is what you meant by thinking curve, or indeed if the research will be of help in shaping what Parrot will do when dealing with finalization, but it's interesting all the same ;-)
Posted by: Andrew at March 3, 2004 01:05 PM
Figuring out when to call destructors isn't the problem I've got. (Well, not now at least :) The issue I'm grappling with is which destructor(s) to call, and in which order, and how to handle redispatch to parent destructors. It's somewhat complex, and probably needs a diagram or two, so it's time to post more details, I think.
Posted by: Dan at March 3, 2004 01:29 PM