In Praise of Logical AND. In Censure of Invasive Cookies. (Shallow Thoughts)

Akkana's Musings on Open Source, Science, and Nature.

Tue, 05 Aug 2008

In Praise of Logical AND. In Censure of Invasive Cookies.

The tech press is in a buzz about the new search company, Cuil (pronounced "cool"). Most people don't like it much, but are using it as an excuse to rhapsodize about Google and why they took such a commanding lead in the search market, PageRank and huge data centers and all those other good things Google has.

Not to run down PageRank or other Google inventions -- Google does an excellent job at search these days (sometimes spam-SEO sites get ahead of them, but so far they've always caught up) -- but that's not how I remember it. Google's victory over other search engines was a lot simpler and more basic than that. What did they bring?

Logical AND.

Most of you have probably forgotten it since we take Google so for granted now, but back in the bad old days when search engines were just getting started, they all did it the wrong way. If you searched for red fish, pretty much all the early search engines would give you all the pages that had either red or fish anywhere in them. The more words you added, the less likely you were to find anything that was remotely related to what you wanted.

Google was the first search engine that realized the simple fact (obvious to all of us who were out there actually doing searches) that what people want when they search for multiple words is only the pages that have all the words -- the pages that have both red and fish. It was the search engine where it actually made sense to search for more than one word, the first where you could realistically narrow down your search to something fairly specific.

Even today, most site searches don't do this right. Try searching for several keywords on your local college's web site, or on a retail site that doesn't license Google (or Yahoo or other major search engine) technology.

Logical and. The killer boolean for search engines.

(I should mention that Dave, when he heard this, shook his head. "No. Google took over because it was the first engine that just gave you simple text that you could read, without spinning blinking images and tons of other crap cluttering up the page." He has a point -- that was certainly another big improvement Google brought, which hardly anybody else seems to have realized even now. Commercial sites get more and more cluttered, and nobody notices that Google, the industry leader, eschews all that crap and sticks with simplicity. I don't agree that's why they won, but it would be an excellent reason to stick with Google even if their search results weren't the best.)

So what about Cuil? I finally got around to trying it this morning, starting with a little "vanity google" for my name. The results were fairly reasonable, though oddly slanted toward TAC, a local astronomy group in which I was fairly active around ten years ago (three hits out of the first ten are TAC!)

Dave then started typing colors into Cuil to see what he would get, and found some disturbing results. He has Firefox' cookie preference set to "Ask me before setting a cookie" -- and it looks like Cuil loads pages in the background, setting cookies galore for sites you haven't ever seen or even asked to see. For every search term he thought of, Cuil popped up a cookie request dialog while he was still typing.

Searching for blu wanted to set a cookie for bluefish.something.
Searching for gre wanted to set a cookie for www.gre.ac.uk.
Searching for yel wanted to set a cookie for www.myyellow.com.
Searching for pra wanted to set a cookie for www.pvamu.edu.

Pretty creepy, especially when combined with Cuil's propensity (noted by every review I've seen so far, and it's true here too) for including porn and spam sites. We only noticed this because he happened to have the "Ask me" pref set. Most people wouldn't even know. Use Cuil and you may end up with a lot of cookies set from sites you've never even seen, sites you wouldn't want to be associated with. Better hope no investigators come crawling through your browser profile any time soon.

Tags: , ,
[ 10:10 Aug 05, 2008    More tech | permalink to this entry ]