The Future of Music Recommendation Engines

Dartboard by flickr user hpk

Hi everybody. I know it's been a while since I rapped at ya, but you know how it goes. Anyway, enough about me - let's talk about my favorite topic, music recommendation engines!

I recently attended the SXSW 2009 Interactive conference, and some of my favorite panels (not surprisingly) focused on the ups, downs, and future of music recommendation engines. My favorite one, titled "Help! My iPod Thinks I'm Emo" (click on that to see the slides from the panel) focused on the current state of recommendation engines and talked about why they don't work - "work" in this case meaning "introduce people to music they would otherwise not have heard of but would probably like." Studies have shown that the majority of music that gets recommended to people by automated recommendation engines actually represents a very small percentage of music that's available to these engines. In other words, music recommendation engines are not helping people dig into the "long tail".

So, if these engines aren't perfect now, are they at least getting better? Not really. In 2009, we basically have the same two options as we did back in early 2006, when CNET's Steve Krause wrote this great piece outlining the differences between the two prevalent forms these engines take: collaborative filtering, and content-based recommendations.

If you don't feel like reading Steve's article (you should, it's enlightening) here's the short version:

  • Collaborative filtering (think last.fm) is where a computer tells you that you will like Coldplay because you like Radiohead. Why? Because other people who like Radiohead like Coldplay. And by "like" the computer means "listen to frequently." (Not included in this calculation is the fact that Coldplay blows and you will probably hate them.)
  • Content-based recommendation engines (think Pandora) will also tell you that you will like Coldplay because you like Radiohead, but in this case, it will be because the computer perceives shared characteristics between the two bands’ music -- such as high-pitched vocal melodies, anthemic, sweeping guitar arrangements, and a generally gloomy outlook on life. (This type of recommendation also fails to take into account the fact that Coldplay blows and you will probably hate them.)
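
For the code-minded, here is a rough sketch of both ideas in Python. Everything in it is made up for illustration: the listener names, the listening data, the three trait scores per band, and the placeholder "Band X". It is not how last.fm or Pandora actually work, just the simplest version of each approach I could write down (co-listening counts for collaborative filtering, cosine similarity over trait scores for content-based).

```python
from collections import Counter
from math import sqrt

# Hypothetical listening data: listener -> artists they play frequently.
LISTENS = {
    "ann": {"Radiohead", "Coldplay", "Elbow"},
    "bob": {"Radiohead", "Coldplay"},
    "cat": {"Radiohead", "Coldplay", "Band X"},
    "dan": {"Coldplay", "Band X"},
}

# Hypothetical hand-assigned trait scores (0 to 1) per artist:
# (high-pitched vocals, anthemic guitars, gloomy outlook)
TRAITS = {
    "Radiohead": (0.9, 0.7, 0.9),
    "Coldplay": (0.9, 0.8, 0.8),
    "Elbow": (0.3, 0.6, 0.7),
    "Band X": (0.8, 0.2, 0.3),
}


def collaborative(seed, listens):
    """Rank artists by how many listeners of `seed` also play them."""
    counts = Counter()
    for artists in listens.values():
        if seed in artists:
            counts.update(artists - {seed})
    return counts.most_common()


def cosine(a, b):
    """Cosine similarity between two equal-length trait vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def content_based(seed, traits):
    """Rank artists by how closely their traits match those of `seed`."""
    scores = [
        (name, cosine(traits[seed], vec))
        for name, vec in traits.items()
        if name != seed
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(collaborative("Radiohead", LISTENS))
    print(content_based("Radiohead", TRAITS))
```

With this toy data, both functions put Coldplay at the top of the list for a Radiohead fan, which is exactly the failure described above: neither one has any way of knowing you will hate them.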

Obviously, each of these models has its drawbacks. Ultimately, no automated solution can rival recommendations coming from an actual human being – a friend, a music critic, a blogger, or even someone you connect with on social media sites like imeem and meemix. The more personal the recommendation, the more likely it is to respond to your individual music tastes – to know, for example, that you will never like Coldplay, but you might like Elbow, because they're awesome and they share some characteristics with Radiohead without being a blatant ripoff. But the problem with that personal touch is that it doesn't scale: your friend who knows all the cool bands can't give personalized recommendations to millions of people every day. He or she would get tired.

So where does that leave us? If, like me, you were hoping for a beautiful future where our robot overlords would tell us what to like and how to think, you're probably out of luck. It turns out that the most promising systems are probably going to use a hybrid of all the methods discussed above, in which human intuition and cold, hard algorithms share the spotlight. For example, imagine the Hype Machine (which automatically crawls a curated selection of music blogs and generates a playlist based on what music is being blogged about) combined with Slacker.com (where professionals have decided which bands sound like each other) and user-generated tags from Last.fm.
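
To make the hybrid idea a little more concrete, here is a minimal sketch of one way such a blend could work: give each candidate band a score from each source, then combine them with a weighted average. The scores, the weights, and the placeholder "Band X" are all invented for illustration. None of this comes from the Hype Machine, Slacker, or Last.fm, and a real system would need to figure out how to get and normalize those signals in the first place.

```python
# Hypothetical per-source scores (0 to 1) for each candidate artist.
BLOG_BUZZ = {"Elbow": 0.8, "Coldplay": 0.3, "Band X": 0.5}       # how much blogs are writing about them
EDITOR_SIMILARITY = {"Elbow": 0.7, "Coldplay": 0.9, "Band X": 0.6}  # how alike human editors say they are
TAG_OVERLAP = {"Elbow": 0.6, "Coldplay": 0.5, "Band X": 0.7}     # how many listener tags they share

# Hypothetical weights; they sum to 1 so the blended score stays in 0 to 1.
WEIGHTS = {"blogs": 0.4, "editors": 0.3, "tags": 0.3}


def hybrid_rank(blogs, editors, tags, weights):
    """Blend three score sources into one ranking, best candidate first.

    An artist missing from a source simply scores 0.0 for that source.
    """
    candidates = set(blogs) | set(editors) | set(tags)
    blended = {}
    for artist in candidates:
        blended[artist] = (
            weights["blogs"] * blogs.get(artist, 0.0)
            + weights["editors"] * editors.get(artist, 0.0)
            + weights["tags"] * tags.get(artist, 0.0)
        )
    return sorted(blended.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for artist, score in hybrid_rank(BLOG_BUZZ, EDITOR_SIMILARITY, TAG_OVERLAP, WEIGHTS):
        print(f"{artist}: {score:.2f}")
```

With these invented numbers, Elbow comes out ahead of Coldplay because the blog buzz outweighs the editors' similarity score. Change the weights and the order changes, which is the whole point: someone still has to decide how much to trust the humans versus the machines.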

Of course, even that sort of hybrid system won't be perfect. Nothing will, obviously. But at least there's hope for the future. In another session I attended, "Music 2.0 = Music Discovery Chaos?", it became apparent that even those people I would consider to be "high-level users" of both music and technology primarily relied on their friends and human tastemakers to point them to new music. There's a nice writeup of the panel over at Music Machinery.

So, how are you finding new music these days? Have you ever actually heard a band on one of these sites that you A) had never heard of before, and B) went on to become a fan of? Let me know in the comments.

EDIT: I just realized that the author of the awesome "Music Machinery" blog I link to above is none other than Paul Lamere, one of the co-hosts of the first SXSW panel I reference above. Cool! Expect to hear more about him and some of the work his new company is doing in my next post.


axelrod said...

How do you think this compares to engines for other content like Netflix's logic for recommending movies? It seems like the approach wouldn't need to be so different, but somehow Netflix can predict with some accuracy what I would like while music engines do only slightly better than if they chose bands at random.

Lightnin' No Last Name said...

great post.

A good hybrid model would probably address my major beef with the collaborative filter: problem is you have some listeners who are very genre-oriented (let's say, one particular strain of post-punk recorded after 1977 and commonly referred to as "indie") and those who are pantagruelic and listen to a number of artists across a different range of genres with little regard for genre labels (ie, the people who don't say, "Oh, I listen to pretty much everything but rap and country.")

The latter set - who arguably are more "rounded" music fans - is probably useless to the collaborative filter, because there wouldn't be much of a discernible pattern there (other than the fact that this set of listeners is probably better at separating the wheat from the chaff in any given genre - but they'd be toxic if you have listeners who just want to drill deep within one genre.)

For example, most "indie" kids are drawn to experimental or free jazz rather than the more formal stuff. I like indie, and I like jazz, but I have no interest in listening to a recording of Albert Ayler strangling someone's child or whatever he does. I wouldn't trust the collaborative filter to figure that out, but then again, the content-based model wouldn't even point me to jazz if I fed it indie in the first place, so I'm screwed no matter what. And I'm ok with that.

I fully support the use of Coldplay as a litmus test though. I like how they're the example of epic fail on the part of automated systems!