Yeah, I'm horribly geeky, but I leave apachetop running in a screen session. Most of what comes across isn't particularly interesting -- RSS feed fetches, web crawlers, people not getting what they expect when they try to link to images on the webserver, and cretins trying to spam the referrer log. Every once in a while, though, a bizarre referring link comes across. In this case it was http://www.newscientist.com/feed.ns;jsessionid=PMHFBACKODPD?index=online-news
Now, don't get me wrong, I really like New Scientist magazine, but I'm pretty sure there's no reason for them to be linking to me, not even on my most overblown ego days. My first thought was that this was just someone with an HTTP_REFERER scrambler plugin installed. (I've seen some of those. I assume people install them so their trail on the 'net is scrambled.)
Nope, turns out not to be the case. If the browser ID string is to be believed, it's actually the Yahoo feed seeker, and while I can't know for sure, the IP address resolves back to one of Yahoo's. So the seeker's putting bogus info in its referer header. It's marked as the 2.0 tester, so it may well be that the thing's in beta and it's a bug, or it lies to see if the feed provider's spamming search engines. (I've done that in a previous job, Internet ages ago. It was kind of interesting to see how the various web spammers behaved when you did that.)
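For anyone who wants to run the same sort of check on their own logs, here's a minimal sketch of it in Python. It assumes Apache's "combined" log format. The sample log line, the /index.rdf path, the ExampleFeedSeeker agent string, the 192.0.2.10 address, and the example.com domain are all made up for illustration. Only the referer URL comes from the actual hit.

```python
import re
import socket

# Apache "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical log line: everything except the referer URL is invented.
SAMPLE_LINE = (
    '192.0.2.10 - - [29/Jan/2006:11:20:00 -0500] '
    '"GET /index.rdf HTTP/1.1" 200 5120 '
    '"http://www.newscientist.com/feed.ns;jsessionid=PMHFBACKODPD?index=online-news" '
    '"ExampleFeedSeeker/2.0"'
)


def parse_line(line):
    """Return the log fields as a dict, or None if the line doesn't match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def hostname_in_domain(hostname, domain):
    """True if hostname is the domain itself or a subdomain of it."""
    hostname = hostname.rstrip(".").lower()
    domain = domain.lower()
    return hostname == domain or hostname.endswith("." + domain)


def reverse_lookup(ip):
    """Reverse-resolve an IP to a hostname (network call); None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


if __name__ == "__main__":
    record = parse_line(SAMPLE_LINE)
    print(record["host"], record["agent"], record["referer"])
```

To check a real hit, feed the parsed host to reverse_lookup() and then hostname_in_domain() with whatever domain the crawler claims to belong to. A reverse lookup alone only shows what the owner of the address block chose to publish, so it's a hint rather than proof, which is about as far as "I can't know for sure" goes.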
It probably doesn't matter, and I hope they're doing it for a good reason (like spam control) and not just because someone made a whoops or (worse) decided that lying about where they came from was a clever thing to do. (I mean, sheesh -- you're identifying yourself as a feed crawler. That's hardly anonymous.)
Posted by Dan at January 29, 2006 11:24 AM