Configuration systems are a collection of data, a tree of rules, and a set of templates. (The templates are arguably just funny rules, but it's convenient to treat template instantiators and rules as separate things.) Which rules get fired off, and when, depends in large part on the data. Let's, then, consider the data.
For a configure system we probably don't care much about what type an individual data element is. int, float, boolean, string, whatever -- ultimately these things will get turned into strings and splatted into templates, so adopting a dynamic-language-style morph-on-demand basic data type is a reasonable thing. So we'll do that.
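To make "morph-on-demand" a little more concrete, here's a rough sketch of what such a data type might look like. The `Value` class, the `fill` helper, and the `@name@` marker syntax are all made up for illustration; nothing above commits to any of them.

```python
class Value:
    """A data element that keeps whatever it was given and converts on demand."""

    def __init__(self, raw):
        self.raw = raw

    def as_str(self):
        # Assumption: booleans render as 1/0 so they can drop straight into a header.
        if isinstance(self.raw, bool):
            return "1" if self.raw else "0"
        return str(self.raw)

    def as_int(self):
        return int(self.raw)

    def as_bool(self):
        # Assumption: a handful of "falsy" spellings are recognised for strings.
        if isinstance(self.raw, str):
            return self.raw.strip().lower() not in ("", "0", "no", "false")
        return bool(self.raw)


def fill(template, data):
    """Replace each @name@ marker with the string form of the matching value."""
    for name, value in data.items():
        template = template.replace("@" + name + "@", value.as_str())
    return template


header = fill(
    "#define HAS_FORK @has_fork@\n#define INT_SIZE @int_size@",
    {"has_fork": Value(False), "int_size": Value(4)},
)
```

The point is only that the template code never asks what type something is; it asks for a string and gets one.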
There are things we do care about aside from the actual value of a data element, things we use to decide what rules need firing off. For example, we care if the element:
1. Has a value
2. Has changed
3. Is a default value
All of these things are needed to help decide which rules to fire off.
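As a sketch, the per-element bookkeeping (does it have a value, has it changed, is it still a default) could be as small as this. The `Element` name, its fields, and the `UNSET` sentinel are my own inventions here, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Any

UNSET = object()  # sentinel: tells "no value yet" apart from None, 0 or ""


@dataclass
class Element:
    name: str
    value: Any = UNSET
    changed: bool = False      # has the value changed since rules last looked?
    is_default: bool = False   # did the value come from shipped seed data?

    @property
    def has_value(self) -> bool:
        return self.value is not UNSET


int_type = Element("int_type", "int", is_default=True)  # seeded, safe to use
has_fork = Element("has_fork")                           # nothing known yet
```

A separate sentinel matters because `None`, `0`, and the empty string are all perfectly good configuration values.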
#1 is straightforward enough. If a data element has no value, then we can't use it yet. That means when we're ordering rules to see which we're going to run, we need to run the ones that set the value before we run the ones that use it.
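Ordering rules so that setters run before users is a topological sort. Here's a minimal sketch using Python's standard `graphlib` module; the rule names and the `(needs, provides)` layout are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules: name -> (elements it needs, elements it provides).
RULES = {
    "pick_int_type": ({"cc", "int_size"}, {"int_type"}),
    "probe_int_size": ({"cc"}, {"int_size"}),
    "find_compiler": (set(), {"cc"}),
}


def run_order(rules):
    """Return rule names ordered so every setter runs before its users."""
    # Map each element to the rule that sets it.
    provider = {}
    for rule, (_, provides) in rules.items():
        for element in provides:
            provider[element] = rule
    # A rule depends on whichever rules provide the elements it needs.
    # Elements with no provider are assumed to come from seed data.
    graph = {
        rule: {provider[e] for e in needs if e in provider}
        for rule, (needs, _) in rules.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = run_order(RULES)
```

If two rules each need what the other provides, `static_order()` raises `graphlib.CycleError`, which is probably what you want a solver to tell you anyway.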
#2 is also straightforward. If a rule depends on X, and X changes, then we probably need to re-run the rule. The exception is that we don't need to run the rule if it doesn't produce anything we need. For example, if rule Foo depends on X and provides Y, and we don't need Y, then there's no point in firing the rule off, assuming we don't allow side-effects. (Which we don't, but I'll get to that another time.) Whether a data element has changed can also be an interesting question -- who decides? Does the solver engine decide, by comparing the old and new values? Or does a rule affirmatively declare that it's returning X and yes, in fact, X should be considered changed? (And what about the cases where it returns X, with a changed value, but says it hasn't changed? We'll get to that another time too.)
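The re-run test can be sketched in a few lines. The function names and the set-based rule layout below are assumptions, and `engine_says_changed` shows only one of the possible answers to the "who decides?" question, not a decision.

```python
def should_fire(rule, changed, wanted):
    """Decide whether a rule is worth re-running.

    rule    -- (needs, provides) pair of sets of element names
    changed -- names of elements whose values have changed
    wanted  -- names of elements something downstream actually needs
    """
    needs, provides = rule
    # Skipping a rule nobody needs is only safe if rules have no side effects.
    return bool(needs & changed) and bool(provides & wanted)


def engine_says_changed(old, new):
    # One possible policy: the solver compares old and new values itself.
    return old != new


foo = ({"X"}, {"Y"})  # rule Foo depends on X and provides Y
```

With this, `should_fire(foo, {"X"}, {"Y"})` is true, while `should_fire(foo, {"X"}, {"Z"})` is false because nobody wants Y.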
#3 is somewhat less straightforward. Configure systems generally don't start from nothing when they run -- there's seed data that has adequate but not necessarily optimal values that you can prime the system with. For example, you may not know what the best C integer data type to use is, but it's certainly safe to default to "int". The system may probe for a better value, perhaps long or int64 or something, but in its probing it can use int until it finds a better answer. Or you may, by default, assume you can't fork (a safe assumption) but you can always come back and probe to see if you can.
The utility here is that you can ship safe defaults, enough to get the system in a state to go find better answers. Sure, maybe you've filled in the defaults with lowest-common-denominator answers from the C89 and ANSI operating system standards, but that's enough to get you in a position to figure out what the actual useful answers are.
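Here's a sketch of the seed-then-probe idea. The `configure` function and the probe signature are hypothetical, and the lambdas are stand-ins for real probes, which would compile and run little test programs.

```python
def configure(defaults, probes):
    """Start from safe seed values, then let probes try to improve on them.

    defaults -- dict of element name -> safe fallback value
    probes   -- dict of element name -> function(config) returning a better
                value, or None when nothing better was found
    """
    config = dict(defaults)  # copy, so the shipped defaults stay untouched
    for name, probe in probes.items():
        better = probe(config)  # a probe may lean on the seed values so far
        if better is not None:
            config[name] = better
    return config


defaults = {"int_type": "int", "can_fork": False}
probes = {
    "int_type": lambda cfg: "long",  # pretend the probe found something better
    "can_fork": lambda cfg: None,    # pretend the probe learned nothing new
}
result = configure(defaults, probes)
```

A probe that comes up empty costs nothing: the safe default just stays in place.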
Alternately you could mark values loaded up from the default cache as changed, but... maybe not. There are actually cases where you don't want to mark a default value as changed.
Consider, for example, the person who's actually packaging up things for distribution. We've kind of hand-waved up until now on how these 'trivial' shell scripts, batch files, or default config.h (or whatever) headers and such get created, but this is a good time to un-wave the hands.
The packager could run the build on each OS you want to support and collect up all the results. But... ewww. That means the packager needs to be minimally fluent with each system, and have access to each system, and if you lose access (or lose the person with fluency) you have a system fall off the list. Not good.
On the other hand, the packager has a perfectly fine configuration system handy. If there's also a set of default values for each system (which someone else can build, can be handed down from maintainer to maintainer, or set to something really simple) then the packager can just load up the defaults, generate files, and then go on to the next system. If you treat the defaults as unchanged then, if your rules are correct, you won't need to actually run any probes and you're good.
When the person doing the end build gets the package, they run the generated simple shell scripts, which build the configurator with the defaults. The configurator then loads in the default values, starts probing, and rebuilds itself. Same templates, same rules; the only difference is that the packager doesn't do the probe and rebuild step, since they don't have to. (Or, rather, if they do it's a sign there's a missing default.)
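The "missing default" check on the packager's side might look something like this. It's a sketch under the assumption that a rule only has to fire when something it provides has no value yet; `rules_to_fire` and the rule names are invented.

```python
def rules_to_fire(rules, defaults):
    """List rules that would still have to run given only the shipped defaults.

    rules    -- dict of rule name -> (needs, provides), both sets of names
    defaults -- dict of element name -> default value, all treated as unchanged
    """
    known = set(defaults)
    return sorted(
        name
        for name, (_, provides) in rules.items()
        if not provides <= known  # something this rule sets has no default
    )


rules = {
    "pick_int_type": (set(), {"int_type"}),
    "probe_fork": (set(), {"can_fork"}),
}
complete = rules_to_fire(rules, {"int_type": "int", "can_fork": False})
incomplete = rules_to_fire(rules, {"int_type": "int"})
```

An empty list means the packager can go straight to generating files; anything else names the rules whose outputs need a default added.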
None of this is all that interesting, but it's handy to have as a base when looking at the more interesting bits. Which we'll do later.
Posted by Dan at November 15, 2007 06:57 PM
Hey Dan, good to see you writing again! Your stuff made me think and reconsider a lot, so I hope you will write more.
Posted by: tom at February 21, 2008 07:27 AM