...having callback function arguments that do not take a corresponding invocation-specific data pointer.
You want to have a function that takes a function pointer, and have your library call that function at some point in the future if some event happens? Cool! Works for me. I like those. (Well, sorta. Event/callback/async programming is a pain.) However... the signature should never be:
int register_callback(func_pointer_t callback);
Bad! Bad programmer! No cookie! That signature should be:
int register_callback(func_pointer_t callback, void *extra_data);
Or, if you'd rather, take a struct that has the function pointer and callback data in it, if you don't want to manage the two pointers in your library. The signature for that callback pointer should be:
int callback_func(struct instance_data *lib_data, void *extra_data);
though I'm less adamant about that. Very very simple signature lists are best.
Why? Simple. While you may think that I'm going to have a custom callback routine with private embedded data in it all primed and ready for your particular call, you're wrong. This is C we're talking here--it's not like we have closures, so there's no way to have any sort of data bound at runtime. If I want to bind any data to the call it means I need to stick it in a global somewhere. Blech. Very, very ungood.
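To make that concrete, here's a minimal sketch of the two-pointer registration. Only register_callback and its (callback, extra_data) signature come from the text above; the one-argument callback type, the single registration slot, fire_event, struct counter and count_hit are all made up for illustration, and a real library would keep a list of registrations rather than one slot.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed callback type: the library hands the caller's pointer back. */
typedef int (*func_pointer_t)(void *extra_data);

/* One registration slot (hypothetical; a real library would keep many). */
struct registration {
    func_pointer_t callback;
    void *extra_data;
};

static struct registration slot;

/* Store the function together with the caller's data pointer. */
int register_callback(func_pointer_t callback, void *extra_data)
{
    if (callback == NULL)
        return -1;
    slot.callback = callback;
    slot.extra_data = extra_data;
    return 0;
}

/* Stand-in for "some event happened": invoke the stored callback. */
int fire_event(void)
{
    assert(slot.callback != NULL);
    return slot.callback(slot.extra_data);
}

/* Caller side: the state lives in a struct the caller owns, not a global. */
struct counter {
    int hits;
};

int count_hit(void *extra_data)
{
    struct counter *c = extra_data;
    return ++c->hits;
}
```

The caller registers with register_callback(count_hit, &some_counter), and every later fire_event() bumps that particular counter. No global on the caller's side, which is the whole point.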
It gets even worse when dealing with any sort of indirect access to the library--like, say, if you're trying to do this from an interpreter. For that to work without any sort of data pointer requires creating a custom C function, either at compile-time for the module (which requires having a C compiler of some sort handy) or at runtime (which requires the capability of creating new functions on the fly) neither of which is particularly desirable. (Parrot, for example, doesn't need a C compiler handy to interface to most C libraries)
Postgres, pleasantly, doesn't make this mistake. You should endeavor not to make it either.
If you want a really good reason, consider the following. Someone is writing an interface to your library for an interpreted language. Perl, Python, Ruby, Java, something on .NET--doesn't matter. The program runs, conceptually at least, on the interpreter. The interface writer wants to be able to write those callback functions in the interpreted language.
With a separate data parameter, it's easy. The interpreter builds some sort of closure structure, sets the callback function to be an entry point to the interpreter, and the callback data to be that closure structure. When the callback's made, the entry point function yanks all the info it needs out of the data parameter, sets up the world, and calls into the properly set up interpreter. While there may be a lot of really nasty funkiness going on in there, it's at least doable.
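A rough sketch of that arrangement, with a toy standing in for the interpreter. Everything here is an assumption for illustration: struct closure, interp_entry, the fixed-size table, fire and add_to_state are invented names, and a real binding would carry far more state than one captured int. What it shows is a single C entry point serving any number of pending callbacks, because each registration carries its own closure.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_pointer_t)(void *extra_data);

/* Toy interpreter-level closure: which "script function" to run,
   one captured value, and a pointer to the interpreter's state. */
struct closure {
    int (*script_fn)(int captured, int *interp_state);
    int captured;
    int *interp_state;
};

/* The one C entry point handed to the library for every registration. */
int interp_entry(void *extra_data)
{
    struct closure *cl = extra_data;
    assert(cl != NULL && cl->script_fn != NULL);
    return cl->script_fn(cl->captured, cl->interp_state);
}

/* Hypothetical library side: a small table of pending callbacks. */
#define MAX_CALLBACKS 8

static struct {
    func_pointer_t fn;
    void *data;
} table[MAX_CALLBACKS];
static int table_len;

/* Returns a handle (the table index) or -1 on failure. */
int register_callback(func_pointer_t callback, void *extra_data)
{
    if (callback == NULL || table_len == MAX_CALLBACKS)
        return -1;
    table[table_len].fn = callback;
    table[table_len].data = extra_data;
    return table_len++;
}

/* Stand-in for the library deciding an event has happened. */
int fire(int handle)
{
    assert(handle >= 0 && handle < table_len);
    return table[handle].fn(table[handle].data);
}

/* A "script function": adds its captured value to interpreter state. */
int add_to_state(int captured, int *interp_state)
{
    *interp_state += captured;
    return *interp_state;
}
```

Two closures registered against the same interp_entry stay independent, since the library passes back whichever data pointer went in with that registration.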
With no data parameter, though... you're stuck. The only way to do it, short of generating a new custom function pointer (which isn't that tough, but is painfully non-portable and something that gives most people a screaming fit to even think about), is to stuff the information you need for the callback into a C global somewhere. The problem there is that it means you can only have one pending callback (which is often suboptimal) and you've got potentially unpleasant threading issues. This is an especially egregious mistake with things like GUI interfaces where you may have dozens or hundreds of some sort of thing instantiated. At least there you've often got an OO interface, so there's the data in the objects, but even then it makes the low-level stuff annoying.
Generally people who use the libraries realize the problem straightaway, but the problem is that often the people using the libraries aren't the people writing the libraries...
Posted by Dan at December 15, 2003 01:22 PM
Also, your unregister_callback function, if you have one, needs to take the same data parameter. Otherwise, you screw over anyone who needs to add multiple callbacks from plugins or separate different libraries.
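One way the unregister point might look in code, as a sketch only. The table, registered_count and shared_handler are hypothetical names; the only thing taken from the comment is that unregister_callback receives the same data parameter as the registration did, so it can tell apart two registrations that share a function.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_pointer_t)(void *extra_data);

#define MAX_CALLBACKS 8

struct registration {
    func_pointer_t callback;
    void *extra_data;
};

static struct registration table[MAX_CALLBACKS];
static size_t table_len;

int register_callback(func_pointer_t callback, void *extra_data)
{
    assert(table_len <= MAX_CALLBACKS);
    if (callback == NULL || table_len == MAX_CALLBACKS)
        return -1;
    table[table_len].callback = callback;
    table[table_len].extra_data = extra_data;
    table_len++;
    return 0;
}

/* Remove the registration matching BOTH the function and the data
   pointer. Matching on the function alone could remove someone
   else's registration of the same handler. */
int unregister_callback(func_pointer_t callback, void *extra_data)
{
    size_t i;
    for (i = 0; i < table_len; i++) {
        if (table[i].callback == callback &&
            table[i].extra_data == extra_data) {
            table[i] = table[table_len - 1]; /* fill hole with last entry */
            table_len--;
            return 0;
        }
    }
    return -1; /* no such registration */
}

size_t registered_count(void)
{
    return table_len;
}

/* A handler that two different plugins both register. */
int shared_handler(void *extra_data)
{
    (void)extra_data;
    return 0;
}
```

With this shape, plugin A can pull its own registration of shared_handler without touching plugin B's.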
Posted by: Chris T at December 15, 2003 03:18 PM
As we learnt in the process of writing the libgtk+ libraries and related bindings, a simple data pointer is not enough, because there is still the issue of the lifetime of the data in the pointer.
So, in the general case, you need also an additional argument that is a pointer to a function that will be called with the user_data argument when the library is done with the callback.
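A minimal sketch of that extra argument, assuming a made-up widget type rather than any real gtk API. The names destroy_notify_t, struct widget, widget_set_callback, widget_destroy, struct click_state, on_click and free_click_state are all hypothetical. The library calls the destroy function with the user data exactly when it is finished with the callback.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*func_pointer_t)(void *extra_data);
typedef void (*destroy_notify_t)(void *extra_data);

/* Hypothetical library object holding one callback registration. */
struct widget {
    func_pointer_t callback;
    void *extra_data;
    destroy_notify_t destroy; /* may be NULL if the caller keeps ownership */
};

void widget_set_callback(struct widget *w, func_pointer_t callback,
                         void *extra_data, destroy_notify_t destroy)
{
    assert(w != NULL);
    /* Release data from any earlier registration before replacing it. */
    if (w->destroy != NULL)
        w->destroy(w->extra_data);
    w->callback = callback;
    w->extra_data = extra_data;
    w->destroy = destroy;
}

/* The library is done with the callback: tell the owner of the data. */
void widget_destroy(struct widget *w)
{
    assert(w != NULL);
    if (w->destroy != NULL)
        w->destroy(w->extra_data);
    w->callback = NULL;
    w->extra_data = NULL;
    w->destroy = NULL;
}

/* Caller side: heap-allocated state handed over to the library. */
struct click_state {
    int clicks;
    int *freed_flag; /* only here so the sketch can observe the free */
};

int on_click(void *extra_data)
{
    struct click_state *s = extra_data;
    return ++s->clicks;
}

void free_click_state(void *extra_data)
{
    struct click_state *s = extra_data;
    *s->freed_flag = 1;
    free(s);
}
```

The caller can allocate a click_state, hand it over with free_click_state, and forget about it; the memory goes away when the widget does.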
BTW: .Net/Mono don't need the extra data pointer, since the runtime is perfectly capable of generating the proper function at runtime (though we still need a mechanism to manage the lifetime: the VM can't guess it, so the called library needs to specify it).
I didn't rant about the teardown aspect of things, but I should've--it's as important as the callback data pointer, though it's not as often necessary. (Most of the stuff I need callbacks for are one-shot things, or stuff that has to be explicitly torn down, so the implicit teardown functions are less likely. YMMV, of course)
Posted by: Dan at December 16, 2003 02:00 PM
Paolo, I'm not sure I understand why a dispose-data callback is necessary -- are you trying to handle the case where someone releases the object that contains the callback, but doesn't own the callback?
Dan, perhaps you'll have an unfortunate opportunity to address teardown in a separate rant someday. :)
The teardown callback becomes necessary in those cases where you're handing off an object or some other piece of data to a library, and the library later (independent of your program code) destroys that data or is otherwise done with it.
This'll happen a lot with GUI code (and I presume specifically with gtk) where you'll create, say, a button, attach a callback to the button (along with callback data) and then stick the button in a window. Once you put it in the window you lose track of it, and can even drop all references to the button if you want. Later on the window gets destroyed and all stuff attached to the window goes away as well, at which point you need to destroy your object somehow. That's where the destruction callback comes in.
This'll happen with non-GUI libraries that have what're essentially external references to your data--you need to be told when those references go away, hence the destruction callback.
This is one of those places where a master, overarching, unified GC system can come in handy. Some day maybe we'll even have one. (I fully expect to be reading about experimental yet futile attempts at this well after I retire :)
Posted by: Dan at December 16, 2003 02:30 PM
With no data parameter, though... you're stuck.
Nah... that's when you reach for ffcall's trampoline and callback.
Software engineering... a self-inflicted maze of twisty little passages.