The Use of Search History

Today's big deal seems to be a plethora of stories about new AI techniques to be applied to search. There are some good points there, about notions like using facial recognition to better understand photographs, or applying natural language techniques so that the search engines can understand real language instead of requiring "keywordese".

That said, there's still a lot of room for improvement in pure brute-force techniques, if you accept the notion of tracking previous searches. A simple case is progressive search refinement -- essentially playing Blind Man's Bluff with the search results, where you could say "cooler" and "warmer" as you wander through pages, gradually triangulating toward the more relevant-looking pages and downgrading the groupings that seem less relevant. Another is remembering my previous searches *and* feeding them into the PageRank algorithms, so that pages I previously found useful would gain network weight, along with pages related to them. (It's possible that the latter is already being done, but I have no evidence of it.)
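
To make the "cooler"/"warmer" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under some loud assumptions: each result is described by a small set of terms, and a vote on one page nudges the weight of every term on that page. The function name `refine`, the term sets and the `step` parameter are all made up for the example; no real search engine is claimed to work this way.

```python
from collections import defaultdict


def refine(results, feedback, step=1.0):
    """Re-rank results using 'warmer' / 'cooler' votes.

    results:  dict mapping URL -> set of terms describing that page
              (a hypothetical stand-in for whatever the engine knows)
    feedback: list of (url, vote) pairs, where vote is "warmer" or "cooler"
    """
    term_weight = defaultdict(float)

    # Each vote raises or lowers the weight of every term on the voted page.
    for url, vote in feedback:
        delta = step if vote == "warmer" else -step
        for term in results[url]:
            term_weight[term] += delta

    def score(url):
        # A page scores the sum of the weights of its terms.
        return sum(term_weight[term] for term in results[url])

    # sorted() is stable, so tied pages keep their original order.
    return sorted(results, key=score, reverse=True)
```

Saying "warmer" about one page pulls up every page that shares its terms, and "cooler" pushes the related grouping down, which is roughly the triangulation described above.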

Another subtle but serious improvement would be to make the Google Toolbar smarter about paying attention to how I *use* the search results. A page that I click on would be weighted higher than one I skipped. The *last* one I clicked on would be weighted higher than the rest, on the theory that this is typically the one that provided the right answer. Also, pages that I left open for a meaningful length of time would have their weights increased over ones that I closed quickly. This would essentially build PageRank ratings automatically, working around the fact that most people aren't going to go to the effort of clicking on the "good" and "bad" icons.
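
The three signals above (clicked at all, clicked last, left open a while) are simple enough to sketch. The Python below is a rough illustration, not a description of anything the Toolbar actually does: the `Visit` record, the function name `implicit_weights`, the bonus values and the 30-second dwell threshold are all assumptions picked for the example.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    url: str
    dwell_seconds: float  # how long the page stayed open (assumed measurable)


def implicit_weights(shown, visits, click_bonus=1.0, last_bonus=1.0,
                     dwell_threshold=30.0, dwell_bonus=0.5):
    """Turn click behaviour on one results page into per-URL weights.

    shown:  list of URLs that appeared in the results
    visits: list of Visit records, in the order the pages were clicked
    """
    weights = {url: 0.0 for url in shown}

    for visit in visits:
        if visit.url not in weights:
            continue  # ignore clicks on pages outside this result set
        # Any clicked page outranks an unclicked one.
        weights[visit.url] += click_bonus
        # Pages left open for a while get an extra boost.
        if visit.dwell_seconds >= dwell_threshold:
            weights[visit.url] += dwell_bonus

    # The last click is presumed to be the one that answered the question.
    if visits and visits[-1].url in weights:
        weights[visits[-1].url] += last_bonus

    return weights
```

For instance, with results `["a", "b", "c"]`, a 5-second visit to `a` followed by a 2-minute visit to `c` gives `a` a weight of 1.0, `c` a weight of 2.5, and leaves `b` at 0.0. How such weights would be folded into the ranking itself is left open here.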

All of these ideas would be controversial, to be sure -- the privacy implications are quite real, especially in light of last week's AOL search-history debacle. Still, this is an area where there is at least a genuine economic tradeoff of privacy for utility, so the potential privacy loss buys something of value. (As opposed to many modern privacy intrusions, which have no value at all to the person intruded upon.) I don't take it for granted that everyone would be repulsed by the potential privacy loss -- heck, I'm not even sure which way I'd come down on it. If a company made a real effort to protect the privacy of the recorded history (say, by storing URLs as encoded hashes rather than as plaintext, so that my search history could not simply be read out later), I could easily see myself accepting the risk...
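
For what it's worth, the hashed-history idea can be sketched in a few lines of Python using the standard `hmac` and `hashlib` modules. This is a minimal illustration under stated assumptions: the class name `HashedHistory` and its methods are invented for the example, and the secret key is assumed to be kept somewhere safer than the history itself.

```python
import hashlib
import hmac


class HashedHistory:
    """Stores per-URL weights keyed by a keyed hash, never by the URL text."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key   # assumed to be stored apart from the history
        self._weights = {}       # hex digest -> accumulated weight

    def _digest(self, url: str) -> str:
        # HMAC-SHA256 of the URL; the plaintext URL is never saved.
        return hmac.new(self._key, url.encode("utf-8"),
                        hashlib.sha256).hexdigest()

    def record(self, url: str, weight: float = 1.0) -> None:
        digest = self._digest(url)
        self._weights[digest] = self._weights.get(digest, 0.0) + weight

    def weight_for(self, url: str) -> float:
        # A candidate URL can still be looked up by hashing it the same way.
        return self._weights.get(self._digest(url), 0.0)
```

The search engine can still ask "have I seen this page before, and how useful was it?" for any candidate URL, but the stored table cannot simply be read back as a list of sites. It is not a complete answer: anyone holding the key can test guessed URLs against the table, so this raises the cost of snooping rather than eliminating it.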
