It is very unlikely that we will find the Holy Grail for embedding movies any time soon, but during our long journey we uncovered a lot of useful information and ideas that could help to build a system that assists users in finding their needle in the movie haystack, or that at least helps them explore alternative regions of the movie space that they might find interesting.
Of course, we are aware that there is a huge difference between an idea that seems scalable and useful and its actual implementation, not to mention the politics behind such a project. However, if we visit a movie database to study the details of a hard-boiled detective movie and a recommender suggests that we should also give a nerdy comedy a try, because it is ‘similar’ or because other people also watched it, we are a little confused about the intention behind such a suggestion and its utility for users.
Maybe some people watched both the detective film and the comedy, but suggesting pairs of movies just because some people watched them both does not seem very beneficial in most cases. Maybe the suggestion is the result of some latent connection derived from a collaborative filtering approach, but without a component that explains the reason behind it, the suggestion does not seem beneficial either. We do not expect a system to capture all concepts of a movie and to suggest similar ones, but we would expect movies from the same or nearby genres, or at least movies with an obvious connection or shared theme with the movie we are currently studying.
And this is not limited to the domain of movies. A lot of web shops these days are equipped with a lightweight component that infers purchase decisions or product pairings for customers, but especially for complex domains like beverages, we doubt that the mostly trivial rules inferred by such a system really help customers find what they want. In the worst case, users might even be confused if such suggestions contradict common sense.
So, what is our point? Big Data might be a buzzword or not, but crunching data to get valuable information is beneficial for customers. However, too often it seems that we are drowning in information but starving for knowledge. Stated differently, regardless of the business, knowledge extraction is usually challenging and often there is not enough budget for it. On the other hand, even if money is no problem, too often there is still no efficient, continual process to extract knowledge from data.
In the end, the situation is similar in most cases. There is lots of carbon (the information) but not enough pressure to make diamonds out of it. To rephrase our argument, we would like to see fewer standard components for movie recommendations that, frankly, do a pretty mediocre job, and more focus on systems that really help users find what they are searching for. And since happy customers come back, it is a win-win situation. To be fair, there are some systems that do a pretty good job, but those systems are often hard to use for non-experts, or restricted to a single domain or website. We know that good systems are hard to get right, but of what use is a fancy website if we have to visit 1,000 movies before we find the one we might enjoy?