December 24, 2003

Blog spam... I give

I already have the MT-Blacklist plugin installed, but manual updates to the blacklist aren't cutting it--too much would've-been-caught crap's making it through. Time to break down and automatically update the blacklist. Luckily someone's already got a tool. (Yeah, it's python, but I honestly don't care what's in the Happy Fun Ball so long as it works)

At least I get comments mailed so the crap gets seen quickly and doesn't linger, but I'd rather it not be up in the first place.

Posted by Dan at 11:18 AM | Comments (9) | TrackBack

Atomic feeds

Well, looks like the Atom spec's hit 0.3, and MovableType 2.65's got support for it built in. Since I just yanked the blog up to 2.65 because of the nasty XMLRPC bug in previous versions (I don't know what the bug was, but I do dislike providing remotely exploitable services) I went ahead and thumped the main index template to add in the autodiscovery link stuff, and dropped the new atom template into the list of autogenerated files. Dunno if anyone cares, but since it took all of 10 minutes total, and I've this conceit that people actually read this thing for the information that's hidden between the rants, I went ahead and put it in.

I probably ought to revisit the whole blog notification thing, but not for a while. Maybe in a few weeks.

Posted by Dan at 10:20 AM | Comments (2) | TrackBack

December 23, 2003

A server in my pocket!

Or almost, at least. (Maybe if I get the wireless network adapter for the Gameboy. (I may port Parrot to the Gameboy, too--it's got the power and enough memory, and I've a Gameboy emulator for OS X. Maybe on the plane to NordU) Which, interestingly, would make it worthy of the name server, as it's got a 33MHz processor and at least 32M of RAM, which makes it more powerful than the PDP 11/73 server box I did a lot of work on in ages past. And more powerful than the DG/UX box that replaced it, though the Gameboy has no external storage. But I digress) At the moment I'm sitting here in a coffee shop working on, well, work. The always-interesting[1] work project, a combo compiler and runtime library for DecisionPlus, an antique "4GL" that has more than a passing resemblance to PIR except that PIR's a bit higher level in spots. (It has functions) In a terminal window I'm waiting on a rebuild of Postgres 7.4, since I forgot to build the perl and python extensions. (Rendezvous and SSL are being built in too, since I'm at it) I've also rebuilt parrot from the latest CVS sources, and I've got a working set of runtime libraries for at least low-level access to Postgres and ncurses, with some higher-level wrappers around ncurses.

This is where we cue the old fogey moment. I'm sitting here working on a machine that would blow the doors off anything that I'd used for a good portion of my life (including the System/390 machine that UConn had as its primary CS training machine, though I'd bet it had better IO throughput). A machine that I sometimes grumble about because it's too damn slow. (Which it is, in part because the frontside bus only runs at 100MHz, (Only!) but mainly because laptop hard drives are slow to conserve power)

But... I've a full build of Postgres, and sufficient disk space to hold the entire work database. I've all the compilers and runtime libraries I need to develop new code--hell, I've got enough power handy, and all the resources I need, to completely develop the system I'm working on. Hell, I've got enough power in this little thing to run the entire company, albeit somewhat slowly.

There's just something somewhat surreal about that.
[1] Granted, sometimes for very small values of interesting...

Posted by Dan at 01:14 PM | Comments (2) | TrackBack

December 22, 2003

One man, one vote

It works for the Patrician, why not for us?

Diebold's been in the news again, this time because the most recent California election (and Terminator references aren't really appropriate -- it seems more like something out of a Philip K. Dick novel) used machines whose software wasn't certified by the state elections board. (Most, but not all, of the machines met the federal standards, but that doesn't really matter as it wasn't a federal election. The state wins here) Including LA county, which last I heard has just a few people living in it. LA's machines were neither state nor federally certified, so arguably the election results could be called into question if someone with standing wanted to do so.

Like any self-respecting geek, I'm well aware of the problems with the current setup of electronic voting machines in the US. We've got machines whose internals can't be inspected (and therefore can't be verified), with hardware that allows external connections, sold by companies with strong political motivations, in some cases (Diebold in particular) with top executives who have strongly partisan positions, performing an activity that's arguably the single most important function in a republic. (It's not that machine-based voting's anything new--I'm from Connecticut and we've been using mechanical voting machines for somewhere around the past century, give or take a bit. With those, though, someone can look in and inspect the gears and cogs and see if they're in proper order, and tampering is both difficult to do and really difficult to do in a widespread way)

This doesn't, however, engender the expected moral outrage. Far from it. I've actually been waiting for the state to propose these, so I can go up and testify. Something like:

Ladies and Gentlemen, I'm here to testify in favor of the proposed electronic voting system before you. Not, I should point out, because I think it's trustworthy, accurate, and tamper-proof. Far from it. These systems are a piece of crap. Crap, though, which can be hacked en masse from a distance, and I figure at some point I'll feel the need to boot one or more of you out of office, and having these machines in place is the easiest way to make that happen.

So if you hear I've been arrested you'll know why. :)

If I can ponder that, though, so can other people. All the testimony and complaint I've heard has been in the typical condescending "You're an ignorant buffoon, let me educate you even though I know you can't comprehend the workings of my magnificent brain" way that geek arguments tend to have. Y'know what? That's a load of crap. I'll bet a dollar that some folks in power not only realize this, but chose these systems specifically because they can be hacked into, queried, and altered with no audit trail left or record of the original votes. (I'd bet a dollar that it's already happened more than once, as well) Hell, I wouldn't be at all surprised if the lack of verifiable audit trail in the designs of at least one of the voting systems is on purpose.

This is politics. There's a lot of power and a lot of money involved. That tends to attract a considerable number of people who are, for lack of a better word, scum. Clever, suave, intelligent (yes, as smart as you, possibly more so), slick scum. Getting the right people in power, often not yourself, is a very good way to increase your score, build your empire, or get off making all the puppets and pawns dance to your tune.

Don't ever forget that. They certainly won't.

Posted by Dan at 01:02 PM | Comments (2) | TrackBack

December 18, 2003


I swear, text will be the death of me. (If you thought it was objects, nope--they're obnoxious, overblown, and the OO Kool-Aid tastes of almonds, but hardly a full-blown nemesis or anything)

As an example, a page that Kim found.

While on the one hand I do really like the shape and form of the alphabets and writing (no surprise, I'm a font magpie too) the implications of actually processing text in these languages are painful to think of. That's even ignoring the issues of rendering or OCRing these sorts of languages. (One big screaming example--you'll note on that page that the trailing sigma has a separate character in the Unicode set, but it should be treated as a plain sigma for text searching reasons. And imagine what should happen to the sigma character if you substr a string and the last character happens to be a sigma that was, up until a moment ago, in the middle of the word. Then you concat a space and another word for display....)
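To make the sigma headache concrete, here's a rough sketch in C, working on already-decoded code points. The two code point values are the real Unicode ones (U+03C3 for plain sigma, U+03C2 for final sigma); the function names are made up for illustration, and the "end of the buffer means end of the word" rule is a deliberate simplification of what Unicode actually asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIGMA       0x03C3u  /* GREEK SMALL LETTER SIGMA */
#define FINAL_SIGMA 0x03C2u  /* GREEK SMALL LETTER FINAL SIGMA */

/* Fold final sigma to plain sigma so both forms match when searching. */
static uint32_t fold_sigma(uint32_t cp) {
    return cp == FINAL_SIGMA ? SIGMA : cp;
}

/* Compare two code point arrays, treating the two sigma forms as equal. */
static int search_equal(const uint32_t *a, size_t alen,
                        const uint32_t *b, size_t blen) {
    if (alen != blen) return 0;
    for (size_t i = 0; i < alen; i++)
        if (fold_sigma(a[i]) != fold_sigma(b[i])) return 0;
    return 1;
}

/* Display fix-up after a substr: if the buffer (assumed to hold one word)
   now ends in a plain sigma that follows at least one other character,
   switch it to the final form. Simplified on purpose. */
static void fix_trailing_sigma(uint32_t *s, size_t len) {
    assert(s != NULL || len == 0);
    if (len > 1 && s[len - 1] == SIGMA)
        s[len - 1] = FINAL_SIGMA;
}
```

Even this toy version needs two different views of the same string, one for searching and one for display, which is rather the point.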

Posted by Dan at 01:51 PM | Comments (2) | TrackBack

December 16, 2003

Bloody stupid things to do when writing C libraries, #43

Failing to fully specify what happens with input buffers. Returned buffers too.

For example, ponder a hypothetical (yeah, right) library routine--let's call it, say, new_form which takes a NULL (rather than NUL, which is different, but you knew that) terminated buffer of field pointers. You call it and the fields in the buffer are now part of a brand-spanking new form. Yay, us. Anything that handles even part of form and field stuff is welcome, as it's a pain (though I could rant about ncurses for a while. But not today) to handle.

But... what happens to that buffer? Is it now the property of the form library? Did it make a copy and now I can delete my copy? If the docs don't say, I don't know.

Looking at the source is no help--I don't care what the source says, I care about what guarantees are made by the API. Just because the library makes a copy now (it doesn't, I looked, but that's a separate rant) doesn't mean that it has to in the future. Someone might, at some random time after now, decide to change it, since the behaviour's unspecified and as such open to change. I'm perfectly fine with that, too, as flexibility is good, and I for one have no compunctions about viciously and egregiously changing behaviours that I've promised are undefined. (I take a certain glee in it, in fact :) But I should at least know what behaviours are undefined, dammit! Don't make me guess.

APIs are promises of behavior. Be specific with your promises, and clear about what you're not promising.
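For what it's worth, here's a sketch of what a fully specified version might look like. Everything in it (demo_form_new, demo_form_free, the struct names) is hypothetical and is not how any real form library behaves; the copy-the-array contract is just one reasonable choice. The part that matters is the comment block, which says who owns what and what is explicitly left undefined.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_field { int id; };        /* hypothetical field type */

struct demo_form {
    struct demo_field **fields;       /* the form's own copy of the array */
    size_t count;
};

/*
 * demo_form_new: build a form from a NULL-terminated array of field pointers.
 *
 * Promised:
 *   - The pointer array is copied. The caller still owns its array and may
 *     free or reuse it as soon as this call returns.
 *   - The fields themselves are NOT copied; they must outlive the form.
 *   - Returns NULL on allocation failure, leaving the input untouched.
 * Not promised (undefined, may change):
 *   - What happens if one field is attached to two forms.
 */
struct demo_form *demo_form_new(struct demo_field *const *fields) {
    assert(fields != NULL);
    size_t n = 0;
    while (fields[n] != NULL) n++;

    struct demo_form *form = malloc(sizeof *form);
    if (form == NULL) return NULL;
    form->fields = malloc((n + 1) * sizeof *form->fields);
    if (form->fields == NULL) { free(form); return NULL; }
    memcpy(form->fields, fields, (n + 1) * sizeof *form->fields);
    form->count = n;
    return form;
}

/* Releases the form and its copy of the array, never the fields. */
void demo_form_free(struct demo_form *form) {
    if (form == NULL) return;
    free(form->fields);
    free(form);
}
```

With that comment in place the caller knows it can free its own array right after the call, and knows which corner it shouldn't lean on.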

Posted by Dan at 01:55 PM | Comments (0) | TrackBack

December 15, 2003

Bloody stupid things to do when writing C libraries, #78

...having callback function arguments that do not take a corresponding invocation-specific data pointer.

You want to have a function that takes a function pointer, and have your library call that function at some point in the future if some event happens? Cool! Works for me. I like those. (Well, sorta, event/callback/async programming is a pain) However.... the signature should never be:

int register_callback(func_pointer_t callback);

Bad! Bad programmer! No cookie! That signature should be:

int register_callback(func_pointer_t callback, void *extra_data);

Or, if you'd rather, take a struct that has the function pointer and callback data in it, if you don't want to manage the two pointers in your library. The signature for that callback pointer should be:

int callback_func(struct instance_data *lib_data, void *extra_data);

though I'm less adamant about that. Very very simple signature lists are best.

Why? Simple. While you may think that I'm going to have a custom callback routine with private embedded data in it all primed and ready for your particular call, you're wrong. This is C we're talking here--it's not like we have closures, so there's no way to have any sort of data bound at runtime. If I want to bind any data to the call it means I need to stick it in a global somewhere. Blech. Very, very ungood.
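A minimal sketch of the good version, using the two signatures above. The registry, fire_event, struct counter, count_event, and the contents of struct instance_data are all invented for illustration; a real library would have its own event machinery.

```c
#include <assert.h>
#include <stddef.h>

struct instance_data { int event_code; };   /* hypothetical library-side data */

typedef int (*func_pointer_t)(struct instance_data *lib_data, void *extra_data);

#define MAX_CALLBACKS 8

static struct { func_pointer_t fn; void *extra_data; } slots[MAX_CALLBACKS];
static size_t slot_count;

/* Store the function and the caller's data pointer together. */
int register_callback(func_pointer_t callback, void *extra_data) {
    if (callback == NULL || slot_count == MAX_CALLBACKS) return -1;
    slots[slot_count].fn = callback;
    slots[slot_count].extra_data = extra_data;
    slot_count++;
    return 0;
}

/* Call every registered callback, handing each its own data pointer back. */
void fire_event(int event_code) {
    struct instance_data lib_data = { event_code };
    for (size_t i = 0; i < slot_count; i++)
        slots[i].fn(&lib_data, slots[i].extra_data);
}

/* Caller side: one callback function, any number of registrations,
   each with its own state and not a global in sight. */
struct counter { int hits; int last_event; };

int count_event(struct instance_data *lib_data, void *extra_data) {
    struct counter *c = extra_data;
    assert(c != NULL);
    c->hits++;
    c->last_event = lib_data->event_code;
    return 0;
}
```

Register count_event twice with two different struct counter pointers and each registration keeps its own tally, which is exactly what the no-data-pointer signature can't give you.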

It gets even worse when dealing with any sort of indirect access to the library--like, say, if you're trying to do this from an interpreter. For that to work without any sort of data pointer requires creating a custom C function, either at compile-time for the module (which requires having a C compiler of some sort handy) or at runtime (which requires the capability of creating new functions on the fly) neither of which is particularly desirable. (Parrot, for example, doesn't need a C compiler handy to interface to most C libraries)

Postgres, pleasantly, doesn't make this mistake. You should endeavor to not make it either.

If you want a really good reason, consider the following. Someone is writing an interface to your library for an interpreted language. Perl, Python, Ruby, Java, something on .NET--doesn't matter. The program runs, conceptually at least, on the interpreter. The interface writer wants to be able to write those callback functions in the interpreted language.

With a separate data parameter, it's easy. The interpreter builds some sort of closure structure, sets the callback function to be an entry point to the interpreter, and the callback data to be that closure structure. When the callback's made, the entry point function yanks all the info it needs out of the data parameter, sets up the world, and calls into the properly set up interpreter. While there may be a lot of really nasty funkiness going on in there, it's at least doable.
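Roughly, and with a toy stand-in for both the library and the interpreter (every name here is made up, and a real interpreter's closure structure would hold a good deal more), the entry-point trick looks like this:

```c
#include <assert.h>
#include <stddef.h>

/* --- stand-in for the C library (hypothetical) --- */
typedef int (*lib_callback_t)(int event_code, void *extra_data);

static lib_callback_t pending_fn;
static void *pending_data;

void lib_on_event(lib_callback_t fn, void *extra_data) {
    pending_fn = fn;
    pending_data = extra_data;
}

int lib_deliver(int event_code) {
    return pending_fn ? pending_fn(event_code, pending_data) : -1;
}

/* --- stand-in for the interpreter (hypothetical) --- */
struct toy_interp { int calls_run; };

struct toy_closure {
    struct toy_interp *interp;   /* which interpreter to re-enter */
    int sub_id;                  /* which interpreted routine to call */
    int captured;                /* data bound at run time */
};

/* Pretend to run interpreted routine number sub_id. */
static int toy_run_sub(struct toy_interp *interp, int sub_id,
                       int captured, int arg) {
    interp->calls_run++;
    return sub_id * 1000 + captured + arg;
}

/* The one C entry point handed to the library for every callback.
   Everything it needs comes out of the data parameter. */
int interp_trampoline(int event_code, void *extra_data) {
    struct toy_closure *cl = extra_data;
    assert(cl != NULL && cl->interp != NULL);
    return toy_run_sub(cl->interp, cl->sub_id, cl->captured, event_code);
}
```

The interface writer compiles interp_trampoline once and never needs a C compiler again; each new callback written in the interpreted language is just another struct toy_closure.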

With no data parameter, though... you're stuck. The only way to do it, short of generating a new custom function pointer (which isn't that tough, but is painfully non-portable and something that gives most people a screaming fit to even think about) is to stuff the information you need for the callback into a C global somewhere. The problem there is that it means you can only have one pending callback (which is often suboptimal) and you've got potentially unpleasant threading issues. This is an especially egregious mistake with things like GUI interfaces where you may have dozens or hundreds of some sort of thing instantiated. At least there you've often got an OO interface, so there's the data in the objects, but even then it makes the low-level stuff annoying.

Generally people who use the libraries realize the problem straightaway, but the problem is that often the people using the libraries aren't the people writing the libraries...

Posted by Dan at 01:22 PM | Comments (6) | TrackBack

Cyan, cyan, wherefore art thou cyan?

Or, rather, where art thou, since I know why cyan is cyan.

For some reason, I can't get ncurses to make anything foreground cyan. Magenta, red, yellow (well, OK, icky brown), green, white, black, blue... no problem. Just not cyan. Which is really strange. Change the definition of the color to anything else and it works. Hell, I can make the background cyan OK. Just not the foreground. (It shows as the immediately previously defined foreground color, or black if there wasn't one) Setting the foreground to cyan and then setting inverse video doesn't work either. More bizarrely, though, black foreground on cyan background in inverse video works! And displays what I want.

It's just really, really bizarre. Happens on both OS X locally on its terminal and the Linux boxen around the office with the GNOME terminal on a local X station, so it isn't my code so far as I can tell. ncurses just won't do foreground cyan.

Have I ever mentioned that computers really suck sometimes?

Update: Argh! It works from C, but not from parrot, but once again only foreground cyan. Ghods, I hate computers some days...

Update2: Turns out there was a bug in IMCC that prevented a literal 6 from being put in register I6 (and 5 into I5, and 7 into I7) that was causing this. Don't ask, it's been fixed.

Posted by Dan at 01:03 PM | Comments (0) | TrackBack

December 12, 2003

Voice over IP. Or not.

Sounds like VoIP has hit the mainstream--there were stories both on All Things Considered and Marketplace last night and, while NPR isn't quite Fox News (Something I'm very happy about) it's a lot more mainstream than slashdot. VoIP is the next big thing in phone service, or so you'd think.

And, to an extent, it is. I know how the phone system is set up now (My dad worked for AT&T for decades) and it's a marvel in some ways. Phone systems, as they stand now, are circuit-switched networks; that is, when you make a call to someone else there's a circuit set up and, for all intents and purposes, you've a single pair of wires running from your phone to the person you're talking to. (Well, sorta. The phone company's been doing call multiplexing on trunk lines between central offices for decades, and on some local lines for quite a while as well. Close enough, though) You make a call and the line between you and your local office's phone switch is energized, and when you dial the switch then connects to an outbound trunk line, which connects to another switch, which connects to another, and another, until you're finally connected to the local office switch of the person you're calling, which then sends its ring signal to the callee's line. When they pick up the circuit's complete and you talk. If, anywhere along the line, there is no free wire between switches you get the fast busy signal.

As you might expect, this is kind of wasteful, and awfully expensive. If there are only, say, 100 circuits on the trunk between your local office and the next office down the line then only 100 people who are connected to that switch can make or receive a non-local call at any one time. If they're all engaged and person 101 tries to make a call... no joy. That leaves the phone company with an interesting dilemma--to avoid those nasty "all circuits are busy" tones you want a lot of lines between offices, but trunk lines are expensive, and come in fixed quantities. (sets of 25 IIRC, but I may not) If all the trunks are full, you need to run another line, and that's expensive since these things go long distances (tens or hundreds of miles, on poles and under roads).

Even for 'short' hauls it's pricey--you can just imagine how much it costs to dig up a few miles of street in Boston, London, or Paris to run an extra trunk between local offices. Not to mention the costs of the phone switches, which can cost megabucks (literally) when they're full. It's a tradeoff between service and cost. Phone companies, when they're not being run by hyperactive crack monkeys with 10 second attention spans, tend towards massive overkill when they do lay extra cable--while 20 miles of trunk line is expensive, the cable itself is by far the least expensive part of the job, and it's a lot cheaper to lay an extra trunk line when you've got the street ripped up than ripping the street up next year when you need another. (You end up with lots of cable pairs unconnected on either end, but that's fine too, you connect 'em when you need 'em)

So the phone system is, or at least used to be, a really big end-to-end circuit switched network. Run completely non-computerized, back in the day, BTW, which is damned impressive. (Remember the rotary dial days, where the switch on the other end had to count the clicks to figure out what numbers you were dialing) The phone number system, at least under the North American Numbering Plan, facilitated this. (1-A[01]A-EEE-NNNN where A is the area code, E is the exchange, and N is the number within the exchange) If the first number was a 1, it knew you were making a long-distance call and you were switched to the long-distance trunk. If the second digit after that was a 0 or a 1 you were dialing an area code, and you got switched to the real long distance lines. Then came the exchange (the local switch) and then finally the number within the exchange. All electromechanical at one point, with great masses of relays, transistors, and hum. Lots of hum. Damned impressive when you're 6, that's for sure.
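For the curious, the routing decision described above fits in a few lines of C. This is only a sketch of the old 1-A[01]A-EEE-NNNN pattern: the names are invented, it takes the whole digit string at once where a real switch worked a digit at a time, and treating a bare seven digits as a local call is my assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum call_kind { CALL_LOCAL, CALL_LONG_DISTANCE, CALL_INVALID };

struct dialed {
    enum call_kind kind;
    char area[4];       /* A[01]A, empty for local calls */
    char exchange[4];   /* EEE */
    char number[5];     /* NNNN */
};

/* Classify a dialed digit string: a leading 1 means long distance, and a
   0 or 1 as the middle digit of the next three marks an area code. */
struct dialed parse_dialed(const char *digits) {
    struct dialed d = { CALL_INVALID, "", "", "" };
    assert(digits != NULL);
    size_t len = strlen(digits);
    for (size_t i = 0; i < len; i++)
        if (!isdigit((unsigned char)digits[i])) return d;

    if (len == 7) {                                  /* EEE-NNNN */
        memcpy(d.exchange, digits, 3);
        memcpy(d.number, digits + 3, 4);
        d.kind = CALL_LOCAL;
    } else if (len == 11 && digits[0] == '1' &&
               (digits[2] == '0' || digits[2] == '1')) {
        memcpy(d.area, digits + 1, 3);               /* A[01]A */
        memcpy(d.exchange, digits + 4, 3);           /* EEE */
        memcpy(d.number, digits + 7, 4);             /* NNNN */
        d.kind = CALL_LONG_DISTANCE;
    }
    return d;
}
```

The middle-digit rule is what let a pile of relays tell an area code from an exchange without looking anything up.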

Anyway, those days are long-gone--it's all electronic and optical, and AT&T (pre-breakup) was running fiber optic cable between offices and multiplexing calls across them decades ago. Fiber takes up less space under the street (and in the building) and you can run more calls across it than copper, and over the decades they've been stuffing more and more data across each fiber strand. The equipment's also a lot smaller--what used to take a full building full of wire and relays can now fit in a closet somewhere. The phone companies have also been recently moving towards IP as a transport for voice calls over the long-haul links as well, since it means only one set of equipment rather than two sets, and they can better manage their fiber assets. (You don't have to divide the fiber between phone and data)

VoIP to the end user, though... that's new. Other folks have gushed about it, and it is neat from an "oooh, shiny!" geek-magpie standpoint. Heck, we run it at the office, where the phone system's all VoIP. It's not bad, though I could grumble. All I could think about as I was listening to the analysts going on about how keen this stuff was were the ways it could go wrong.

VoIP, for one thing, is a lot less reliable in emergencies than POTS. The phone system as it stands now is self-powered--each office has a (potentially massive) battery room and generator setup with fuel to power the phones for a week or more if the power goes out. (Which has happened around here in New England, when hurricanes and ice storms have swept through, though the power grid's pretty reliable) If the line between you and the office is intact, you can pick up the phone and you will get a dial tone, and be able to call out. (The 99.999% uptime mandated by regulation may have something to do with that) VoIP, well... do you have a UPS that'll keep you going for a week? Day? Heck, an hour? If not, how do you plan on calling 911? (Cell? Hah! Cell service doesn't have the same reliability guarantees and many cell towers are not provisioned with emergency power. Besides, try making a cell call in Manhattan--I have a less than 50% success rate on a normal day. Now imagine none of the land lines work because everyone's VoIP system's without power...)

Besides the whole "power in an emergency" thing, if someone's phone has an IP address, well... Look, Ma, a DDoS Attack target! Which will happen, guaranteed. With some frequency.

Then, of course, there's the issue of quality. Really want your phone call quality to go to hell because your roommate is surfing the latest pr0n sites, or has BitTorrent going snagging eleven zillion megabytes of warez?

Finally... Sobig. Remember that one? The virus that installed back door compromise software on millions of IP-connected PCs around the world that now distribute offers for penis enlargement and launch DDoS attacks on folks who complain? Well, three words, baby: Distributed VoIP Spam. SpamAssassin watches my inbox, but there's not much it can do about my phone.

Ah, it's a brave new world, so I ought not complain. Want a cracker? (Mmmm, soma and soylent green, two great tastes...)

Posted by Dan at 02:18 PM | Comments (4) | TrackBack