Many personalized recommendation engines boil down to “here are some items that you might like”. In other words: here’s a list, take it or leave it. The output tends to be rather blunt. Recommendations can be served more delicately (depending on how they’re presented), but often a recommender model runs no deeper than an unconnected string of potentially interesting objects.
The formula: gather user browsing and/or item data, learn from it, and make some item suggestions. Creating this sort of output is more or less what we are taught when building a basic recommendation engine. Be it collaborative or content-based, the basic recipe ends in an ordered list of things the user might like. This approach is fine for many applications, and is certainly flexible enough to work for a wide range of problems. But there’s not much creativity in the groupings, which I’ll get to in a minute.
Let’s use movies as a concrete example because they’re historically interesting (as a recommendation problem) and I am just familiar enough with movies to recognize interesting suggestions. I’ll also pick on Netflix because I have an account and it’s what got me thinking about the problem in the first place. Skipping over the details, let’s see Netflix’s self-described algorithmic output:
… when you look at your Netflix homepage, our systems have ranked titles in a way that is designed to present the best possible ordering of titles that you may enjoy. In each row there are three layers of personalization:
- The choice of row (e.g. Continue Watching, Trending Now, Award-Winning Comedies, etc.),
- Which titles appear in the row, and
- The ranking of those titles.
In practice this looks something like:
Above we have groups containing:
- What is popular
- What I’m currently watching
- What is trending (recently popular)
- Top 10 (what is, um, extra popular…)
- ‘Categories’ for me
- A general personalized recommendation list
What got me thinking more on this topic is how incredibly unimaginative Netflix’s groupings are. The “Categories for You” labels were especially bland. They include: Drama, Comedy, Documentary, and my least-favourite: “International” (not because I dislike international movies, but because “Not American” is no more insightful than “Movies without Adam Sandler”). These categories are not an impressive means of telegraphing that Netflix has any special knowledge about your tastes. Label-wise, these are the same categorizations you’d find in a Blockbuster video store.
I want to suggest a path forward, at least conceptually. Here is my suggestion: aim for more novel, spontaneous movie groupings. The naive way to go about this is to pre-label movies into finer and finer clusters by cross-referencing genres. Some examples:
- Sci-Fi Romance
- Young-adult comedy
- Nature Documentaries
- Spanish Dramas
- Award-winning Foreign movies
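To make the naive approach concrete, here is a minimal sketch of cross-referencing genre tags. Everything in it is an assumption for illustration: the `CATALOG` titles and tags are placeholders, and `cross_reference` is a hypothetical helper, not anything Netflix is known to use.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical catalog: title -> set of genre tags (placeholder data).
CATALOG = {
    "Title A": {"Sci-Fi", "Romance"},
    "Title B": {"Sci-Fi", "Romance", "Drama"},
    "Title C": {"Documentary", "Nature"},
    "Title D": {"Drama", "Spanish"},
}

def cross_reference(catalog, min_size=1):
    """Group titles under every pair of tags they carry."""
    groups = defaultdict(list)
    for title, tags in catalog.items():
        # Sort the tags so ("Romance", "Sci-Fi") and ("Sci-Fi", "Romance")
        # always land under the same key.
        for pair in combinations(sorted(tags), 2):
            groups[pair].append(title)
    # Drop pairs that are too small to be worth showing as a row.
    return {pair: titles for pair, titles in groups.items()
            if len(titles) >= min_size}

groups = cross_reference(CATALOG, min_size=2)
for pair, titles in groups.items():
    print(" + ".join(pair), "->", titles)
```

Note how mechanical this is: every label is just two existing labels glued together, which is exactly why the results feel narrow rather than novel.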
Personally, I like to jump between genres and styles, so narrowing my tastes into more selective fields simply misses the point. Recommendation engines should know that I want to see many genres. In a rec_2.0 world, I would expect to see more creativity focused on the movie groupings themselves: clusters that could arise spontaneously and evolve over time. Here are some I would challenge an imaginary recommendation engine to generate:
- Minimalist productions (Dogville, My Dinner with Andre)
- Movies remade every 10–15 years (A Star Is Born, Godzilla, Hamlet)
- Someone falls out a window? (Robin Hood, RoboCop)
- Stories told out of order (Memento, Rashomon, Primer)
- From books said to be “unfilmable” (Life of Pi, Dune, The English Patient)
- Divisive films (Perfume, Napoleon Dynamite, Happy Gilmore)
My inspiration comes from the smaller movie stores I used to know. Using the same content as a larger chain, a single store owner would have the ability to arrange the movies by director, actor, time period, or in whatever arrangement would suit them best. Nowhere would I find the label “foreign language movies”. I want to be surprised by what the engine serves me, but I also want context. Focusing on interesting groupings will yield more guidance about mood and style. Give the same ingredients to two chefs and you might end up with different recipes. I want more focus to be placed on the assemblage over the individual titles.
Where to go from here
How can these clusters be generated? Clearly more external content is necessary. Scraping IMDb, Wikipedia, and movie reviews could be useful. One could take inspiration from interesting themes generated by people; The Atlantic has a piece called “An Ode to Running in the Movies”. At least one group, “Divisive films”, could actually be learned during collaborative filtering training by collecting the films whose ratings the model predicts worst. The Napoleon Dynamite Bug could be transformed into a feature!
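As a rough sketch of that last idea, assume we already have held-out ratings alongside the predictions of some collaborative model. The per-film prediction error can then be ranked, and the films at the top are candidates for a “divisive” row. The `HELD_OUT` numbers below are invented, and `per_movie_rmse` and `most_divisive` are hypothetical helper names; root-mean-square error is just one reasonable way to define “worst prediction score”.

```python
from collections import defaultdict
from math import sqrt

# (movie, actual_rating, predicted_rating) triples from a held-out set.
# Invented numbers; the predictions could come from any collaborative model.
HELD_OUT = [
    ("Movie X", 5.0, 3.0),
    ("Movie X", 1.0, 3.0),
    ("Movie X", 5.0, 3.5),
    ("Movie Y", 4.0, 4.0),
    ("Movie Y", 4.5, 4.0),
    ("Movie Y", 3.5, 4.0),
]

def per_movie_rmse(rows, min_ratings=3):
    """Root-mean-square prediction error per movie."""
    squared_errors = defaultdict(list)
    for movie, actual, predicted in rows:
        squared_errors[movie].append((actual - predicted) ** 2)
    # Skip movies with too few ratings; their error estimate is mostly noise.
    return {movie: sqrt(sum(errs) / len(errs))
            for movie, errs in squared_errors.items()
            if len(errs) >= min_ratings}

def most_divisive(rows, top_n=1, min_ratings=3):
    """Movies the model predicts worst, highest error first."""
    rmse = per_movie_rmse(rows, min_ratings)
    return sorted(rmse, key=rmse.get, reverse=True)[:top_n]

print(most_divisive(HELD_OUT))
```

The `min_ratings` cutoff matters in practice: a film with two ratings can show a huge error by chance, which is obscurity rather than divisiveness.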
Obviously it’s quite possible to have actual people generating these clusters, as is done already (Spotify gives space for people to build their own sets). No matter the source, it would remain the algorithm’s job to choose which of these groups to serve to each person. Even with no original content, a recommender system could still surprise you by grouping familiar material in unfamiliar ways. That would make the whole browsing experience a lot more enjoyable.
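Choosing which groups to serve could be sketched as follows, assuming the groups already exist (from people or from a model) and that some recommender supplies a per-title affinity score for one user. The group contents, the `AFFINITY` scores, and the `rank_groups` helper are all made up for illustration; averaging the best few titles is only one possible way to score a row.

```python
# Hypothetical groups (name -> titles) and one user's predicted affinities.
GROUPS = {
    "Stories told out of order": ["T1", "T2", "T3"],
    "Minimalist productions": ["T4", "T5"],
    "Divisive films": ["T6"],
}
AFFINITY = {"T1": 0.9, "T2": 0.2, "T3": 0.7, "T4": 0.4, "T5": 0.5, "T6": 0.95}

def rank_groups(groups, affinity, top_rows=2, row_length=2):
    """Pick the rows to show a user, and the titles within each row."""
    rows = []
    for name, titles in groups.items():
        # Within a group, keep the titles this user is predicted to like most.
        best = sorted(titles, key=lambda t: affinity.get(t, 0.0),
                      reverse=True)[:row_length]
        if not best:
            continue
        # Score the row by the mean affinity of the titles it would display.
        score = sum(affinity.get(t, 0.0) for t in best) / len(best)
        rows.append((score, name, best))
    rows.sort(key=lambda row: row[0], reverse=True)
    return [(name, best) for _, name, best in rows[:top_rows]]

for name, titles in rank_groups(GROUPS, AFFINITY):
    print(name, "->", titles)
```

This keeps the three layers from the Netflix description (which rows, which titles, in what order) and only swaps in more interesting row labels.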
How to generate more aesthetically interesting topics is a hard problem, but perhaps clues could be derived from some combination of adversarial networks (to challenge the originality), human-AI collaborations, or inspiration from IBM’s Chef Watson from a few years back (where did that go?). It would be easy enough to know if you got it right: most of us can spot an original movie theme. And maybe we can even automate the process better!