Rethinking Recommendation Engines - ReadWriteWeb

Rethinking Recommendation Engines

Written by Alex Iskold / February 25, 2008 1:37 AM

Over a year ago, Netflix announced a recommendation engine contest: anyone who invents an algorithm that does 10% better than its current recommendation system will win $1 million. Many research teams raced to attack the problem, excited by the unprecedented amount of data available. Initially quite a lot of progress was made, but then progress slowly stalled, and teams are now stuck at around the 8.5% improvement mark.

In this post we argue that the improvement in recommendation engines is not an algorithmic problem, but rather a presentation issue. Respinning recommendations as filters and delivering them without setting high expectations is more likely to yield progress than crunching more data faster.

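Personalized recommendation is not an algorithm problem but a problem of presentation. Filtering what users see, without raising their expectations too high, is more useful than processing data faster. (This claim is rather startling. It will draw scorn from many who pride themselves on their technology, and it will catch the eye of those frustrated by a lack of technical breakthroughs. Is there something else we have not thought of that could achieve the same goal in place of technology? Think of how Flickr's simple trick of tagging solved the image content recognition problem that several universities had researched for years without result.)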

Building a recommendation engine is a complex endeavor, which we discussed here a year ago. But in addition to being a technical challenge, there are also fundamental psychological questions: do people want recommendations and if so, then when are they open to them? Perhaps an even bigger question is: what happens when the user receives one or more bad recommendations? How tolerant will they be?

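Building a personalized recommendation engine is no easy task. More important still are the more basic psychological questions:

Do users need recommendations?

If so, when are they open to them?

Perhaps the bigger question is: what happens when they receive a bad recommendation? How tolerant are users?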

Genetics of Recommendation Engines

All recommendation engines are trying to solve the following problem: given a set of ratings for a particular user, along with those of the whole user base, come up with new items that this user will like. There are many algorithms that can be applied to the problem, but all of them focus on three elements: personal, social and fundamental:

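All recommendation engines try to solve the following problem: given a particular user's set of ratings, combined with the characteristics of the whole user base, produce items this user is likely to enjoy. Many algorithms address this problem, and they concentrate on three elements: personal, social, and fundamental recommendation.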

Personalized recommendation - recommend things based on the individual's past behavior
Social recommendation - recommend things based on the past behavior of similar users
Item recommendation - recommend things based on the item itself
A combination of the three approaches above

A social recommendation is also known as collaborative filtering - people who liked X also like Y. For example, people who liked Lord of The Rings are likely to enjoy Eragon and The Chronicles of Narnia. The problem with this approach is that people's tastes do not in reality fall into simple categories. If two people share the same taste in fantasy movies, it does not mean that they will also both like dramas or mysteries. A good way to think about this problem comes from genetics. Many times we meet people who have features that we recognize and have seen in others. For example, the eyes or the lips might look familiar, but it is a totally different person.

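Social recommendation is also known as collaborative filtering: people who like X also like Y.

The problem with this approach is that people's tastes do not follow a simple system of categories.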
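To make "people who liked X also like Y" concrete, here is a minimal sketch of the idea as simple co-occurrence counting. The users and their likes are made up for illustration, and `also_liked` is a hypothetical helper, not the method any particular service uses.

```python
from collections import Counter

# Hypothetical data: which titles each (made-up) user liked.
likes = {
    "ann": {"Lord of the Rings", "Eragon", "The Chronicles of Narnia"},
    "bob": {"Lord of the Rings", "Eragon"},
    "cat": {"Lord of the Rings", "The Chronicles of Narnia"},
    "dan": {"Casablanca"},
}

def also_liked(item, likes):
    """Count what else was liked by users who liked `item`, most common first."""
    counts = Counter()
    for liked in likes.values():
        if item in liked:
            # Credit every other title this user liked.
            counts.update(liked - {item})
    return counts.most_common()

print(also_liked("Lord of the Rings", likes))
```

Note what the sketch cannot do: nobody who liked Casablanca liked anything else, so it has nothing to suggest there, and it knows nothing about why two fantasy fans might disagree on dramas.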

The other kind of recommendation is an item-based recommendation. The best example of this system is the Pandora music recommendation service. It works by rating each musical piece on more than 400 different characteristics - musical genes. It then automatically matches the pieces based on these characteristics. There are challenges with tuning the algorithm to work well, but it is also challenging to apply it to other verticals. For movies, for example, you'd need to rate each movie along many scales, starting from director, cast, and plot, and then moving on to more obscure things like musical score, locations, light, camera work, etc. It certainly can be done, but it is complicated.

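The other kind is recommendation based on the content item itself. The best example is Pandora's music recommendation. More than 400 different characteristics are extracted from each piece of music to form its musical genes, and matching and recommendation are then based on them. Beyond the challenge of the algorithm, there is the further challenge of whether this method can feasibly be applied to other verticals. It can be done, but the cost is enormous.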
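As a rough sketch of matching items by their characteristics, the example below scores each track on three made-up traits and picks the closest match by cosine similarity. The track names, trait values, and the choice of cosine similarity are all assumptions for illustration; the article does not say how Pandora actually compares its 400-plus genes.

```python
from math import sqrt

# Hypothetical "genes": each track scored 0..1 on three made-up traits.
tracks = {
    "track_a": [0.9, 0.1, 0.8],
    "track_b": [0.8, 0.2, 0.9],
    "track_c": [0.1, 0.9, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def most_similar(name, tracks):
    """Return the other track whose feature vector is closest to `name`."""
    target = tracks[name]
    others = (t for t in tracks if t != name)
    return max(others, key=lambda t: cosine(target, tracks[t]))

print(most_similar("track_a", tracks))
```

The matching step is the easy part. The expensive part, as noted above, is producing the trait scores in the first place for every item in a new vertical.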

The Guy In The Garage

The complexity of the recommendation problem is due to its vast space of possibilities. Much like it's hard to figure out which exact gene is responsible for a particular human trait, it is hard to figure out which bits of the movie or music make us rate it as 5 stars. Reverse engineering human thinking is hard. Which is exactly why one of the contestants highlighted in the Wired article is relying on a very different trick to make his algorithm work.

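The complexity of recommendation systems lies in their vast space of possibilities. Just as it is hard to determine exactly which gene decides a human trait, it is hard to determine which element of a movie or a piece of music makes us rate it five stars. Reverse engineering human thinking is hard.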

Nicknamed Guy In The Garage, Gavin Potter from London is relying on human inertia. Apparently, the rating of the movie depends on the ratings of previous movies that we just saw. For example, if you watch three movies in a row and rate them with 4 stars, and then watch the next one which is slightly better, you will rate it 5. Conversely, if you rate three movies in a row with 1 star, then the same movie that you would otherwise rate as 5 would only get 4 stars from you.

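Gavin Potter from London, nicknamed "Guy In The Garage", relies on human inertia. Apparently, the rating people give a movie depends on the movies they saw just before it. If you have watched three 4-star movies in a row and then see one that is slightly better, you will rate it 5 stars. Conversely, if the three movies before it were 1-star movies, you might give that same movie only 4 stars.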
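One way to picture this inertia effect is as a small correction applied on top of whatever a model already predicts. The sketch below is our own illustration, not Gavin Potter's actual algorithm: the function name `adjust_for_inertia`, the linear form of the correction, and the `weight` value are all assumptions.

```python
def adjust_for_inertia(base_prediction, recent_ratings, user_mean, weight=0.3):
    """Nudge a predicted rating toward the user's recent rating level.

    `weight` is a made-up tuning parameter, not a value from the article.
    """
    if not recent_ratings:
        return base_prediction
    recent_mean = sum(recent_ratings) / len(recent_ratings)
    # A run of high ratings pulls the prediction up, a run of low ones pulls it down.
    shifted = base_prediction + weight * (recent_mean - user_mean)
    return min(5.0, max(1.0, shifted))  # clamp to the 1-5 star scale

# Same movie, same user average of 3 stars, different recent history.
print(adjust_for_inertia(4.5, [4, 4, 4], user_mean=3.0))
print(adjust_for_inertia(4.5, [1, 1, 1], user_mean=3.0))
```

After three 4-star ratings the prediction moves up, and after three 1-star ratings the same movie is predicted lower, which is the direction of the effect described above.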

Just when you think that this is not true, you will discover that this algorithm now sits in the 5th place and still is making progress, while other algorithms are spinning. Enhancing formulas with a bit of human psychology is a really good idea and this is where we turn next.

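Just when you think this approach cannot be right, it turns out that the algorithm sits in fifth place and is still making progress. Taking human psychology into account and building it into the formulas is a very good idea here.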

Replacing Recommendations with Filters


How many times has this happened to you: a friend recommended a movie or a restaurant, so you went there all excited - but ended up disappointed? A lot! It is obvious that hype sets the bar high, increasing the chances of a miss. In math speak, this kind of miss is known as a false positive. Consider now what would happen if, instead of recommending a movie, a friend tells you that you are not going to like a certain movie, so do not bother renting it.

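Have you often been through this kind of letdown? A friend recommends a movie or a restaurant, you head there in high spirits, and you come away thoroughly disappointed. Hype raises expectations and, with them, the chance of disappointment. In math terms this is called a false positive. Now imagine a different situation: a friend tells you that you are unlikely to enjoy a certain movie, so you need not bother looking for it.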

What bad can come of that? Not much, because you are probably not going to watch it. But even if you do and you like it, you are not going to experience negative feelings. This example demonstrates the difference between our reaction to a false negative and a false positive. False positives upset us, but false negatives do not. The idea of respinning recommendations as filters is about leveraging this phenomenon.

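What harm does that do? Not much. Even if you watch it and like it, you will not feel anything negative. This example shows the difference between how we react to a false negative and to a false positive. False positives upset us, while false negatives do not. This is the basis for replacing recommendations with filters.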

When Netflix makes recommendations, it sets itself up for certain failure. Sooner rather than later it is going to miss and recommend a movie that you are not going to like. What if, instead of doing that, it showed you new releases along with a button: filter out the ones I am not going to like? The algorithm is the same, but the perception is different.

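By aiming to recommend, Netflix is bound to fail. Suppose that, when it shows you the newest releases, there is a button next to them: I won't like this, hide it. The whole algorithm is the same, but it feels completely different. (This suddenly reminded me of an idea I mentioned to others long ago: showing what you dislike is just as important as showing what you like. Back then I could not answer the follow-up question, "and then what?" Now I know. The "then" lands right here, and it can be extended further.)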
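The point that the algorithm stays the same while the presentation changes can be sketched in a few lines. Both functions below read the same predicted scores; only the question they answer differs. The titles, scores, and thresholds are hypothetical, and neither function reflects how Netflix actually works.

```python
# Hypothetical predicted ratings (1-5 stars) for new releases.
predicted = {"Movie A": 4.6, "Movie B": 3.1, "Movie C": 1.4, "Movie D": 2.0}

def recommend(predicted, min_score=4.0):
    """Recommendation view: surface only titles predicted to be liked."""
    return [title for title, score in predicted.items() if score >= min_score]

def filter_out_dislikes(predicted, max_score=2.0):
    """Filter view: show everything except titles predicted to be disliked."""
    return [title for title, score in predicted.items() if score > max_score]

print(recommend(predicted))            # a short list that promises a lot
print(filter_out_dislikes(predicted))  # a longer list that promises nothing
```

A mistake in the first list is a false positive that the user watches and resents. A mistake in the second is a hidden title that the user will most likely never notice.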

Filters in Real-Time Culture

This idea becomes increasingly important and powerful in the age of real-time news. We are increasingly oriented towards continuously filtering new information. We do this with our RSS readers every day. We think of the world in terms of streams of news, where things of the past are not relevant. We do not need recommendations, because we are already oversubscribed. We need noise filters: an algorithm that says 'hey, you are definitely not going to like that' and hides it.

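In reality, people are already filtering out large amounts of new information. We treat information from the past as no longer relevant. With information already overloaded, we do not need recommendations; we need noise filtering. The algorithm says: you definitely won't like this, so skip it and save some time.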

If the machines can do the work of aggressively throwing information out for us, then we can deal with the rest on our own. Borrowing from the spam box in email, if all the tools around us had a button that said 'filter this for me', and maybe even had a mode where such a filter is on by default, we'd all get more things done.

If machines can throw this noise away for us, we can concentrate on the information that remains, and so get more done.

Conclusion

Building a perfect recommendation engine is a very complex task. Regardless of the method, collaborative filtering or inherent properties of things - recommendations are an unforgiving business, where false positives quickly turn users off. Perhaps applying psychology to the problem can make people appreciate what these complex algorithms are doing. If instead of recommending things, machines would filter things we definitely won't like, we might be more forgiving and understanding.

Now tell us please about your experiences with recommendation engines. Were there ones that worked really well? Would you be open to filtering instead of recommendation? Besides movies and news, where would you like to have these filters?

See also our follow-up post 10 Recommended Recommendation Engines.

Tags: Analysis, personalization algorithm, ResearchMethod