Tuesday, January 18, 2005

Adam Bosworth on personalization

Adam Bosworth (VP of Engineering at Google) talked to the Gillmor Gang on IT Conversations. Early in the discussion, he seemed to be proposing a personalized feed reader:
    Imagine that one could take any number of RSS feeds coming in every day ... and you could put all that in a database on a very large scale, so 100 million people are posting to each other every day ... I'd get [a] list in sort of a relevance way, so I could wade through the ones that are most likely to be interesting to the ones that are least likely to be interesting to me. That would be very cool. It would help me find out what's going on out there in a richer way...
His emphasis on relevance made me think that Adam was proposing something like Findory. Adam said he wants a personalized relevance rank of RSS feed posts that focuses you in on the most interesting ones for you. That's personalized news.
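
To make the idea of a personalized relevance rank concrete, here is a minimal sketch in Python. It is not how Findory or anything Adam described works; it assumes a toy model where a reader's interests are just counts of the words in posts they have already read, and new posts are scored by how many of those words they share. The function names (build_profile, rank_posts) and the sample posts are made up for illustration.

```python
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic words; a deliberately crude tokenizer.
    return [w for w in text.lower().split() if w.isalpha()]


def build_profile(read_posts):
    # Assumed interest model: word counts over everything the reader has read.
    profile = Counter()
    for post in read_posts:
        profile.update(tokenize(post))
    return profile


def rank_posts(profile, new_posts):
    # Score each post by the profile weight of its distinct words,
    # then list the most likely interesting posts first.
    def score(post):
        return sum(profile[w] for w in set(tokenize(post)))

    return sorted(new_posts, key=score, reverse=True)


# Hypothetical reading history and incoming feed items.
read = ["tsunami science explained", "how tsunami waves form"]
incoming = [
    "celebrity gossip roundup",
    "new tsunami warning system",
    "stock market update",
]
ranked = rank_posts(build_profile(read), incoming)
print(ranked[0])
```

Note that nothing is filtered out here. Every post stays in the list and only the order changes, which is the "most likely to least likely" reading Adam described.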

But, later, I was surprised to hear Adam criticize personalized feed readers:
    One of the things that works pretty well today -- even with Amazon -- is things that are global where the personalization is a global and essentially says here's what I know about you and here's what I know about the world. Based on what I know about the world and how you fit in, here are the recommendations I can make. And that model seems to work, partially because people assume it's not perfect. They understand that this is a pretty imperfect model.

    But if it started filtering -- you know, it's one thing to say, what are the recommendations. It's another thing to say here what are the new posts and it only shows you the ones it thinks you need to see. That would be kind of frightening. And it's hard to be that smart.
It wasn't entirely clear what Adam's concerns were, but, pulling from other comments in his talk, they seemed to center on three issues: loss of control, loss of breadth, and loss of serendipity.

For a power user like Adam, loss of control is a really big deal. Spending hours managing hundreds of subscriptions in Bloglines or configuring a customizable portal is just fine. The more knobs, the better.

But, as Adam said at one point, his mother doesn't agree. Adam's mother doesn't want control. She won't customize. She just wants the right thing to happen. She just wants to read news.

Most current RSS readers are for people like Adam, techies who love to push buttons and get great joy out of programming their VCRs. For RSS to enter the mainstream, it needs to be easy. No effort. No configuration. No hunting down feeds. It needs to just work. That's what personalization does. It makes it so it all just works.

Adam also seemed concerned that personalization might cause loss of breadth and serendipity; he seemed to think it might pigeonhole readers and only show them a small selection of content. To explore this, let's start with what Adam said about why he likes Google News:
    Like I go to Google News and I look at their news .. Most of the time, I find it actually of intriguing because there are stories and I look at who's writing about it and I see all these people like the Times of India or the Australian-whatever that I wouldn't normally see ... with totally different points of view ...
Adam likes Google News because it helps him discover articles he otherwise would have missed. What's so interesting about this comment is that Google News is stunningly bad at discovery. It shows the same front page to everyone. With 100k+ articles available, everyone sees the same thin slice of 20-30 articles. All the depth of information is lost.

Personalization offers a way to show different front pages to different people. It plucks the interesting bits and pieces out of a sea of information. Everyone sees a different slice of the data. Readers see new sources, are exposed to new viewpoints, and discover articles they otherwise would have missed.

For example, if you read Google News over the last couple of weeks, there were hundreds of articles on the tsunami. Buried somewhere in there was what I thought was a fascinating article on the science of the tsunami from National Geographic. Google News had no way of showing this article to me; it shows the same thing to everyone. A personalized news site like Findory could (and did) surface this article for me by learning my interests and pulling the article out of the noise.

In weblogs, it's even better. There are millions of weblogs out there. It's quite hard to find good ones. The signal-to-noise ratio is shockingly poor. A personalized weblog reader can recommend relevant weblogs you've never heard of (like, perhaps, this one). And it can surface the occasional gem on high-traffic but low-value weblogs.

Discovery in vast quantities of data is what personalization is designed to do. The key is to make sure the personalization reaches beyond the obvious and into the surprising. If you do that, personalization reveals the full breadth of the data and enhances serendipity.

Adam certainly seems to recognize the value of personalization for reading news and weblogs:
    If you have participatory client like a blog reader obviously you can do more tracking and obviously that would be useful information to have. I would love to know what are the usage patterns in terms of reading each of my posts and more importantly I think someone else would be interested in looking at that and how that correlates to other things that are read and thinking about how to make suggestions about what people might want to read.
And he talked about the value of personalization for information overload:
    You want to actually -- because you have so much information overload -- find out what other people are reading as a way to filter what you read.
But he seems concerned about how difficult it is to do the personalization right.
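
The idea in those two quotes, using what other people read to suggest what you might want to read, can be sketched with simple co-occurrence counts. This is only an illustration under stated assumptions: the reading histories are made-up data, co_read_counts and suggest are hypothetical names, and real systems have to deal with scale, popularity bias, and the "reaching beyond the obvious" problem discussed above.

```python
from collections import Counter
from itertools import permutations


def co_read_counts(histories):
    # For each post, count how often every other post was read by the same person.
    counts = {}
    for read in histories:
        for a, b in permutations(set(read), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def suggest(counts, already_read, k=3):
    # Add up co-read counts for everything the reader has seen,
    # skipping posts they have already read.
    scores = Counter()
    for post in already_read:
        for other, n in counts.get(post, {}).items():
            if other not in already_read:
                scores[other] += n
    return [post for post, _ in scores.most_common(k)]


# Hypothetical reading histories, one list per reader.
histories = [
    ["post_a", "post_b", "post_c"],
    ["post_a", "post_b"],
    ["post_b", "post_d"],
]
counts = co_read_counts(histories)
suggestions = suggest(counts, {"post_a"})
print(suggestions)
```

A reader who has only seen post_a is pointed first at post_b, which two other post_a readers also read, and then at post_c.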

In addition to personalization, Adam also talked quite a bit about distributed databases. He described wanting a generic virtual database with "data routers" that know where data is stored on a very large cluster of databases and route each request to the right place, much like Google does with replicated shards of its search index distributed across its cluster. Very cool stuff.
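
As a rough illustration of the "data router" idea, here is a small Python sketch. It is an assumption-laden toy, not Adam's design or Google's: the DataRouter class is hypothetical, shards are in-memory dictionaries standing in for database servers, keys are placed by hashing, writes go to every replica of a shard, and reads rotate across replicas.

```python
import hashlib
import itertools


class DataRouter:
    """Hypothetical router that knows which shard holds each key."""

    def __init__(self, num_shards, replicas_per_shard):
        # Each shard is a list of replicas; each replica is a plain dict here.
        self.shards = [
            [{} for _ in range(replicas_per_shard)] for _ in range(num_shards)
        ]
        self._next_replica = itertools.cycle(range(replicas_per_shard))

    def shard_for(self, key):
        # Stable hash, so every router instance maps a key to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        # Write to all replicas of the owning shard.
        for replica in self.shards[self.shard_for(key)]:
            replica[key] = value

    def get(self, key):
        # Read from one replica, rotating to spread the load.
        replicas = self.shards[self.shard_for(key)]
        return replicas[next(self._next_replica)].get(key)


router = DataRouter(num_shards=4, replicas_per_shard=2)
router.put("feed:example", "latest post")
value = router.get("feed:example")
print(value)
```

The caller never needs to know where the data lives. It asks the router, and the router forwards the request to the shard that owns the key.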

All in all, a very interesting talk. It's long and time-consuming -- and there's no transcript available, unfortunately -- but it's worth a listen.

[Bosworth talk via Scoble]
