One Giant or a Thousand Ants

We have recently been thinking a lot about how to build a good model for movie recommendations. It does not need to be perfect, but it should at least be useful to the user, and not the kind of model that suggests every Meg Ryan movie just because you recently watched a romantic comedy starring her.

This problem has been haunting us for quite a while, so we are very skeptical that a single new idea or technology, such as Deep Learning, will solve all our problems. Maybe we should start again from scratch and take inspiration from the human brain itself. A child learns step by step rather than all at once, and so should our model.

Stated differently: is it more useful to have several weak models that are combined somehow, or one big model? Since we have data from very different domains, such as ratings, reviews, features, taxonomies and summaries, it seems more straightforward to learn a simpler model for each domain and combine them later.

The ability to integrate a new domain ad hoc seems particularly beneficial for a personalized TV system. For instance, if a new movie comes out and there is no metadata yet and only a handful of ratings and reviews exist, data from an external source such as a social network, especially from friends, could be very helpful to enrich the representation of the movie.

One of the bigger problems is sparsity, which especially affects lesser-known movies for which very little data is available. For such movies, we need every bit of information we can get to make useful predictions, and it is extremely unlikely that all of that information comes from a single domain.
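To make the idea a bit more concrete, here is a minimal sketch of what "combining the ants" could look like. It is only an illustration under our own assumptions, not a finished design: the class name AntEnsemble, the convention that a domain model returns a score in [0, 1] or None when it has no data, the plain weighted average, and all numbers in the toy data are made up for this example.

```python
from typing import Callable, Dict, Optional

# Hypothetical interface: a domain model scores a (user, movie) pair in [0, 1],
# or returns None when its domain has no data for that pair.
DomainModel = Callable[[str, str], Optional[float]]


class AntEnsemble:
    """Weighted average over whichever domain models can score the pair."""

    def __init__(self) -> None:
        self._models: Dict[str, DomainModel] = {}
        self._weights: Dict[str, float] = {}

    def add_domain(self, name: str, model: DomainModel, weight: float = 1.0) -> None:
        # Domains can be registered at any time; existing models are untouched.
        self._models[name] = model
        self._weights[name] = weight

    def predict(self, user: str, movie: str) -> Optional[float]:
        total, weight_sum = 0.0, 0.0
        for name, model in self._models.items():
            score = model(user, movie)
            if score is None:
                continue  # sparse domain: skip it instead of guessing
            total += self._weights[name] * score
            weight_sum += self._weights[name]
        # None means that no domain knows anything about this pair.
        return total / weight_sum if weight_sum > 0 else None


# Toy stand-ins for real per-domain models (made-up numbers).
RATINGS = {("ann", "old_classic"): 0.6}
REVIEWS = {("ann", "old_classic"): 0.9}
FRIENDS = {("ann", "new_release"): 0.8}

ensemble = AntEnsemble()
ensemble.add_domain("ratings", lambda u, m: RATINGS.get((u, m)), weight=2.0)
ensemble.add_domain("reviews", lambda u, m: REVIEWS.get((u, m)))

before = ensemble.predict("ann", "new_release")  # no domain has data yet
ensemble.add_domain("friends", lambda u, m: FRIENDS.get((u, m)))
after = ensemble.predict("ann", "new_release")   # only the friends domain answers
```

In this toy setup the new release gets no prediction at all until the friends domain is plugged in, and plugging it in does not require retraining the other models. A real system would probably learn the weights rather than fix them by hand, but the structure stays the same.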

So in the end, we choose the ants over the giant, in the hope that an army of ants is much more useful than a single, mighty leviathan.


