Sandra - in learning with the world: Are Recommendation Engines a Threat to the Long Tail?

Are Recommendation Engines a Threat to the Long Tail?
Recommendation systems: a threat to the long tail?

Written by Marshall Kirkpatrick / October 8, 2007 / 10 comments

Two Wharton academics released an interesting paper last week that asks whether online recommendation services are a threat to the aggregate diversity of items discovered by their users. The study is titled "Blockbuster Culture's Next Rise or Fall: The Impact of Recommender Systems on Sales Diversity" and I found it via a good summary article at PaidContent this weekend.

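The threat: the diversity that recommendation engines surface falls short of the diversity that comes from users themselves.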

All indications point to the growing importance of recommendation engines, so this argument deserves examination. From eBay's acquisition of StumbleUpon to the CBS acquisition of Last.fm to this weekend's MSNBC acquisition of Newsvine, recommendation engines are big money. We've covered quite a few startups in this space and I'm sure it will continue to grow in prominence.

Judging from the recent string of acquisitions, recommendation engines are big business, and they will keep growing quickly.

Perhaps more importantly, the "Long Tail" of diverse discovery is an important part of the meritocratic and democratic promise of the new web.

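Perhaps more importantly: the diversity that comes from the long tail is part of the meritocratic and democratic promise of the new generation of the web.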

Good recommendation engines are also just plain fun.

After just a little consideration, the Wharton study seems more meaningful as a cautionary tale than as a critique of the inherent nature of recommendation engines. In discussing this with others I've found that most people quickly decide the study is either obviously wrong or obviously correct. It's a more complex question than it might seem.

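The Wharton researchers' study reads more like a warning than a criticism of recommendation engines themselves. Most people see it as either obviously wrong or obviously right; in other words, the question is far more complex than it looks.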

Recommendation engines should strive to be smarter than simply finding that "there is a high correlation between people who liked X and people who liked Y." I would argue, for example, that recommending other users of a system and highlighting their less popular discoveries could be a good way to solve the problem. Getting it right is probably easier said than done, but it seems there's still plenty of potential for recommendation engines to expand the long tail. The study's arguments are important to consider, though.

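Recommendation engines should be smarter than simply detecting the correlation that people who like X also like Y. For example, recommending other users of the system, along with their less popular picks, may be a good solution. It is far easier said than done, but it means recommendation engines still have a lot of room to expand the long tail. Of course, the researchers' arguments deserve careful thought.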
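
To make the "people who liked X also liked Y" baseline concrete, here is a minimal sketch of a co-occurrence recommender in Python. The data and the function names (co_occurrence, also_liked) are made up for illustration and do not come from the study or from any real service. Note how every niche item ends up pointing back at the one popular item.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set holds the items one user liked.
likes = [
    {"hit", "indie_a"},
    {"hit", "indie_b"},
    {"hit", "indie_a"},
    {"hit", "indie_c"},
]

def co_occurrence(likes):
    """Count how often each ordered pair of items is liked by the same user."""
    counts = Counter()
    for items in likes:
        for x, y in combinations(sorted(items), 2):
            counts[(x, y)] += 1
            counts[(y, x)] += 1
    return counts

def also_liked(item, likes, k=1):
    """Return the k items most often liked together with `item`."""
    counts = co_occurrence(likes)
    scored = [(n, other) for (x, other), n in counts.items() if x == item]
    # Highest count first; ties broken by name so the result is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]

# Every niche item recommends the same popular item.
for item in ("indie_a", "indie_b", "indie_c"):
    print(item, "->", also_liked(item, likes))
```

In this toy data set, an item that nobody has liked yet would never appear in any recommendation, which is the limited-history problem the study describes.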

What the Study Says

A Wharton summary of the paper excerpts the following to explain the study's conclusion: "Because common recommenders recommend products based on sales and [consumer] ratings, they cannot recommend products with limited historical data, even if they would be rated favorably," the authors write. "This can create rich-get-richer effects for popular products and vice-versa for unpopular ones, which results in less diversity."

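Common recommenders based on sales and consumer ratings tend to produce a Matthew effect that widens the gap between popular and unpopular products. That reduces the diversity of what users choose.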

There's also some discussion of the Facebook app landscape, arguably an environment where the long tail doesn't hold up. See also this related discussion at TechCrunch.

The authors argue that individual users may consistently be exposed to items that are new to them, but we're all exposed to the same new items - resulting in greater individual diversity but less aggregate diversity.

When everyone faces the same recommendations and choices, the result is more individual diversity and less aggregate diversity.
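
The difference between individual and aggregate diversity is easy to see in a toy example. The sketch below compares two hypothetical worlds using a deliberately simple stand-in measure, the count of distinct items. This is an assumption made for illustration and not necessarily the measure used in the paper; all names and data are invented.

```python
# Hypothetical world 1: no recommender, each user finds their own niche items.
without_recs = {
    "ann": {"a1", "a2"},
    "bob": {"b1", "b2"},
    "cat": {"c1", "c2"},
}

# Hypothetical world 2: everyone is shown, and consumes, the same three items.
with_recs = {
    "ann": {"x", "y", "z"},
    "bob": {"x", "y", "z"},
    "cat": {"x", "y", "z"},
}

def individual_diversity(consumption):
    """Average number of distinct items consumed per user."""
    return sum(len(items) for items in consumption.values()) / len(consumption)

def aggregate_diversity(consumption):
    """Number of distinct items consumed across all users combined."""
    return len(set().union(*consumption.values()))

for name, world in (("without", without_recs), ("with", with_recs)):
    print(name, individual_diversity(world), aggregate_diversity(world))
```

Each user in the second world sees more items than before (3 instead of 2), yet the population as a whole touches fewer distinct items (3 instead of 6).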

Counter Arguments

The study includes a counter argument from Greg Linden, who helped develop Amazon's recommendation engine. Linden says "recommendation algorithms easily can be tuned to favor the back catalog -- the long tail -- as Netflix does."

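A counterargument from one of the developers of Amazon's recommendation algorithm: a recommendation engine can easily be tuned to surface the long-tail items hidden in the back catalog, as Netflix already does.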
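
Linden's quote does not say how such tuning is done. One possible approach, shown here purely as an assumption and not as Amazon's or Netflix's actual method, is to discount each candidate's relevance score by its popularity. The function name rerank, the alpha parameter, and all numbers are hypothetical.

```python
# Hypothetical relevance scores for one user (higher is a better match).
candidates = {"blockbuster": 0.9, "midlist": 0.8, "obscure": 0.7}

# Hypothetical historical sales counts; must be positive for the division below.
popularity = {"blockbuster": 10000, "midlist": 500, "obscure": 20}

def rerank(candidates, popularity, alpha=0.0):
    """Order items by relevance divided by popularity ** alpha.

    alpha = 0 ignores popularity entirely; a larger alpha pushes
    back-catalog items up the list.
    """
    def score(item):
        return candidates[item] / (popularity[item] ** alpha)
    return sorted(candidates, key=score, reverse=True)

print(rerank(candidates, popularity, alpha=0.0))
print(rerank(candidates, popularity, alpha=0.5))
```

With alpha set to 0 the blockbuster comes first; with alpha set to 0.5 the order flips and the obscure title leads. A single knob like this is one way an engine could be "tuned to favor the back catalog".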

The role played by early adopters, "cool hunters", taste makers and advertisers relative to recommendation engines would also be interesting to look at.

My personal fantasy for recommendation engines is this: I want del.icio.us to look at my bookmarks and recommend not just other URLs I might be interested in, but also other users whose tastes are similar to mine. I'd also like to see which of those recommended users tend to find items of interest earliest, so I can prioritize following them.

Repetition, perhaps another way to describe popularity, will probably always drive consumption - but if I can see all of the things that are discovered by people recommended to me then I can use their less popular picks as guidance.
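
The wish described above, recommending similar users and then mining their less popular picks, can be sketched with plain set operations. This is a toy illustration under stated assumptions: the bookmark data is invented, Jaccard similarity is just one reasonable choice of measure, and nothing here uses or reflects the real del.icio.us service. Ranking users by who finds things earliest would need timestamps and is left out.

```python
from collections import Counter

# Hypothetical bookmark sets; the user names and URL labels are made up.
bookmarks = {
    "me":   {"u1", "u2", "u3"},
    "dana": {"u1", "u2", "u3", "rare1"},
    "eli":  {"u1", "u2", "popular", "rare2"},
    "fay":  {"popular", "other"},
    "gus":  {"popular"},
}

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_users(me, bookmarks, k=2):
    """Return the k users whose bookmark sets are most similar to `me`."""
    others = [u for u in bookmarks if u != me]
    others.sort(key=lambda u: (-jaccard(bookmarks[me], bookmarks[u]), u))
    return others[:k]

def less_popular_picks(me, bookmarks, k=2):
    """URLs saved by similar users that `me` lacks, least-bookmarked first."""
    counts = Counter(url for urls in bookmarks.values() for url in urls)
    picks = set()
    for user in similar_users(me, bookmarks, k):
        picks |= bookmarks[user] - bookmarks[me]
    return sorted(picks, key=lambda url: (counts[url], url))

print(similar_users("me", bookmarks))
print(less_popular_picks("me", bookmarks))
```

Sorting the picks by ascending bookmark count puts the rare finds of like-minded users ahead of the item everyone already has.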

If other metrics are considered, and surely they are in any sophisticated recommendation engine, then what's called "Attention Data" can help augment recommendations beyond merely what's most popular among people with similar interests. (Need an intro to Attention Data? Here's one that could work for you.)

Could other dimensions, such as attention data, be brought into a sophisticated recommendation system?

It would be ill-advised to reject recommendation engines as dumb popularity machines based on this study, but it is also important to take its arguments into consideration.

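My thoughts:

The question is: is there any fundamental difference between so-called attention data and sales data?

The question is: does the long tail itself really exist? Or is there value in it that small newcomers can capture? Ours has always been a society dominated by giants. What secret weapon do small, innovative companies have that could easily displace the power of the established giants? A good recommendation system has a high barrier to entry: it is technology-intensive and depends on massive accumulated data. None of that is possible without years of accumulated technology and historical data. What will the new revolutionaries use to overthrow the incumbents?

The question is: a complex, multidimensional recommendation system cannot be built without massive computing power and deep technical expertise. Look at who the major vendors sell their data mining products to, and you will see how high the barrier is for simply applying this technology. Developing a good, multidimensional recommendation engine that fits a specific application scenario and serves a mass audience is harder still.

So what can be done?

Users are unaware of the complexity behind the scenes. They only have an impression of whether the recommendations feel accurate. That impression can be satisfied along many different dimensions, and it does not require several dimensions at once.

People's needs are actually quite flexible and multifaceted. Satisfying a single facet is not too hard.

Simply throwing many dimensions together may not give satisfying results. Designing the relationships among different dimensions well is genuinely difficult. So how about tackling them separately first?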

Tags: Analysis, personalization algorithm

10/11/2007
