November 11, 2003

It's the googleware, stupid!

It's never good to leave things hanging (nor, I suppose, is it that great an idea to link to yourself) so I did the sensible thing--empirical testing.

Threw up a new image, mentioned it on IRC, and a few folks went to look. No googlebot.

Downloaded Opera and installed it, telling it that I was OK with Google's adware/spyware stuff. Threw up another image and looked at it with the new install of Opera, which I then shut down and haven't fired up since.

The result? Five minutes and 38 seconds after my look at the image, here comes googlebot! There are exactly two hits on the URL: one from me with Opera and one from crawler9.google.com, if the PTR record for 64.68.87.66 is to be believed. With a browser string of Mediapartners-Google/2.1, no less (plus the googlebot URL, as is normal for googlebot hits). So...

It's the googleware.

Which, FWIW, I'm fine with. The Opera splash was pretty explicit about sending info to Google, and it's pretty clear they have. I don't otherwise use Opera, so it's not like any other SuperSecretInfo is leaking out. (Not that I have any, of course, and claims to the contrary are lies. Lies, I tell you, lies!)

OTOH, it might be worth noting if you do have SuperSecretInfo and use the free version of Opera, which would potentially be Somewhat Unwise.
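For anyone who wants to repeat the log check, here is a rough Python sketch of it. It assumes an Apache-style "combined" access log, which may not match your server. The helper names (hits_for, ptr_checks_out), the image path, the 192.0.2.10 address, the timestamps, and the Opera agent string are all made up for illustration; only 64.68.87.66 and the Mediapartners-Google/2.1 string come from the hits described above. ptr_checks_out is one way to decide whether a PTR record "is to be believed": look up the name, then make sure the name resolves back to the same address.

```python
import re
import socket
from datetime import datetime

# Apache "combined" log format (an assumption; adjust for your server).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_for(lines, path):
    """Return (timestamp, ip, user_agent) for each request to path, oldest first."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("path") == path:
            ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
            hits.append((ts, m.group("ip"), m.group("agent")))
    return sorted(hits)


def ptr_checks_out(ip):
    """Forward-confirm a PTR record: the name it returns must resolve back to ip."""
    try:
        name = socket.gethostbyaddr(ip)[0]
        return ip in socket.gethostbyname_ex(name)[2]
    except OSError:
        # No PTR record, or the name does not resolve: treat as unverified.
        return False


# Hypothetical log lines, spaced to mirror the gap described in the post.
SAMPLE = [
    '192.0.2.10 - - [11/Nov/2003:14:00:00 -0500] "GET /test2.png HTTP/1.1" '
    '200 1234 "-" "Opera (placeholder agent string)"',
    '64.68.87.66 - - [11/Nov/2003:14:05:38 -0500] "GET /test2.png HTTP/1.0" '
    '200 1234 "-" "Mediapartners-Google/2.1"',
    '192.0.2.10 - - [11/Nov/2003:14:06:00 -0500] "GET /index.html HTTP/1.1" '
    '200 999 "-" "Opera (placeholder agent string)"',
]

hits = hits_for(SAMPLE, "/test2.png")
gap = hits[1][0] - hits[0][0]
for ts, ip, agent in hits:
    print(ts.isoformat(), ip, agent)
print("gap between first and second hit:", gap)
```

Feed hits_for the lines of your real access log instead of SAMPLE, and call ptr_checks_out on the crawler's address if you want more than the PTR record's word for who it was.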

Posted by Dan at November 11, 2003 02:36 PM
Comments

I love a good experiment--thanks for doing it! Sounds like IRC has nothing to do with it then. I searched for Mediapartners and it looks like that is the bot name for the content-targeted ads that Google does. So the content-targeted ads in Opera are what cause the fetch. Here are a couple of pages that I rooted out:
https://www.google.com/adsense/faq-tech
https://www.google.com/adsense/faq
One of the questions in the FAQ (#6), plus the fact that the user agent has a different name, makes me think that these pages only get fetched for showing ads in Opera/AdSense. Sounds like they don't make their way into Google's main index.

Thanks again for checking this out.

Posted by: PaulR at November 11, 2003 03:50 PM

At the moment they likely only do ad-targeting, and they may always only do ad-targeting. OTOH, there were a lot of companies roaming the 'Net that "would never compromise users' privacy" or some such stuff, and we all know how those turned out. :) I'm not sure about those FAQs, though--they seem targeted at people who've signed up for the ad program, something I've definitely not done.

Anyway, I don't much care, as I'm not using the well-noted spyware stuff, and I was mostly curious and a bit surprised at its speed. I'd be more worried if I was using Google spyware-enhanced browsers regularly, and far more worried if I was using them and accessing hidden pages or going to sites that did stupid things like encode SSNs, account numbers, or whatnot in the URL. (Though I'd be pretty worried doing that anyway.)

Posted by: Dan at November 11, 2003 04:55 PM

Did you enable the PageRank features? GoogleBar sends the URL of all viewed pages to Google when you have that one enabled. (It has to do so in order to display the PageRank.) If Google got the URL because of a PageRank request they might just as well index it.

Posted by: flgr at November 11, 2003 10:25 PM

All I did was check the "I want Google rather than generic adware" box on Opera's startup. Dunno what that enabled. Don't care, honestly, as I was just seeing what would happen. Whatever's set up as the default is what happened.

Posted by: Dan at November 12, 2003 12:05 PM

seems like they are having enough trouble without
looking for more URLs to crawl

but how about browsing to localhost sometime?
just a thought.

Posted by: hotlinks at November 21, 2003 01:30 AM