A while back, Netflix announced a contest offering anyone a cool $1 million if they could produce a 10 percent improvement in the company's movie recommendation engine. From time to time, the media has been poking in on the results, and though there are some teams getting close (Netflix has a way of measuring an improvement, though I'm not exactly sure what it is), nobody has yet eclipsed the threshold. In fact, to some the whole thing is looking like one of those curves that gets infinitely closer to the baseline, but will never touch it (what are those called, again?). That would be genius on Netflix's part, but it seems a bit unlikely.
After months of the leaderboard being dominated by a few math-based teams, there's a new face on the leaderboard, and he's actually a psychologist, reports Wired (via Techdirt). There's not one good graph to summarize, so you'll just have to read the whole thing, but the upshot is that when you have mountains of data, you end up extrapolating off of illusory patterns -- noise. However, if you have some correct premises at your disposal, you can home in on some key patterns that might otherwise get lost in the numbers:
A deeper part of (Gavin) Potter's strategy is based on the work of Amos Tversky and Nobel Prize winner Daniel Kahneman, pioneers of the science now called behavioral economics. This new field incorporates into traditional economics those features of human life that are lost when you think of a person as a rational machine, or as a list of numbers representing cinematic taste.
One such phenomenon is the anchoring effect, a problem endemic to any numerical rating scheme. If a customer watches three movies in a row that merit four stars — say, the Star Wars trilogy — and then sees one that's a bit better — say, Blade Runner — they'll likely give the last movie five stars. But if they started the week with one-star stinkers like the Star Wars prequels, Blade Runner might get only a 4 or even a 3. Anchoring suggests that rating systems need to take account of inertia — a user who has recently given a lot of above-average ratings is likely to continue to do so. Potter finds precisely this phenomenon in the Netflix data; and by being aware of it, he's able to account for its biasing effects and thus more accurately pin down users' true tastes.
Couldn't a pure statistician have also observed the inertia in the ratings? Of course. But there are infinitely many biases, patterns, and anomalies to fish for.
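To make the inertia idea a bit more concrete, here's a rough sketch of what "accounting for" it could look like. To be clear, this is not Potter's actual method (the article doesn't spell that out); the function name, the three-rating window, and the 0.5 weight are all made up for illustration. It just nudges each rating back by part of the gap between the user's recent average and their overall average.

```python
from statistics import mean


def anchoring_adjusted(ratings, window=3, weight=0.5):
    """Return one user's ratings with part of the recent-rating drift removed.

    ratings: star ratings in chronological order.
    window:  how many preceding ratings count as "recent" (assumed value).
    weight:  share of the drift to subtract (assumed value).
    """
    if not ratings:
        return []
    overall = mean(ratings)
    adjusted = []
    for i, rating in enumerate(ratings):
        recent = ratings[max(0, i - window):i]
        # With no earlier ratings there is nothing to anchor on.
        drift = mean(recent) - overall if recent else 0.0
        # A run of low recent ratings (negative drift) pushes the rating up,
        # and a run of high recent ratings pushes it down.
        adjusted.append(rating - weight * drift)
    return adjusted


# Three one-star stinkers followed by a movie rated 3 stars.
print(anchoring_adjusted([1, 1, 1, 3]))
```

In this toy case the last movie's 3 stars gets bumped up a little, on the theory that the user was still in a grumpy mood from the stinkers. A real system would have to learn the window and weight from the data rather than pick them by hand.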
He might not win. Like the math-intensive teams, he's stalling out around 8 percent, and knows it'll be a long slog to get to the magic 10 percent.
While the competition goes on, he's keeping a blog under his nom de algorithme Just a guy in a garage, which has obviously gotten way more popular since the Wired piece came out. It's a cool journal of the various ideas he's exploring along the way, looking for patterns, testing out assumptions, etc. And since he's not strictly a math guy, it's readable to the average person.
That Wired article was great. My girlfriend and I use Netflix regularly, and so far every time my gf puts one of the "recommendations" onto her list, I make fun of her. And every time I end up being surprised and liking some movie I'd never heard of.
Posted by: Simon Owens | March 03, 2008 at 09:37 PM
"In fact, to some the whole thing is looking like one of those curves that gets infinitely closer to the baseline, but will never touch it (what are those called, again?)."
Asymptote.
Posted by: Alex | March 03, 2008 at 09:50 PM