Even if we succeeded in building a good semantic movie model, the question would remain whether users agree on the semantics, and thus whether the outcome is really useful for the person in front of the screen.
For instance, let us assume that we like science fiction movies. Now, there are a couple of problems: First, sci-fi means different things to different people. Some might think of it as the genre of ideas and experiments, while others think of huge battles in space and the good old space opera. There is no right and wrong in this case, because personalized TV means that the system should find shows a user is likely to enjoy.
The situation is further complicated by the fact that, especially in the sci-fi genre, there are lots of low-budget movies and copycats. In other words, and this is nothing new, recommendations need to consider not only the concept of a movie but also external factors like the language, the countries or the budget. Acquiring all this information is a hell of a job, and it is very unlikely that we will find a single reliable source that covers it all.
A model would surely benefit from this information, but again, it is very unlikely that the preferences of users fit into such pre-defined categories. That is why tagging is so popular. If a user thinks that a movie should be annotated with “cannon-pizza”, they just do it. Stated differently, a tag expresses a user's feeling towards a movie, and it is very likely that similar feelings will lead to similar tags.
What a tag really means, not literally but for a movie, is also very subjective. We plan to investigate tags as some form of micro genres. For instance, if a user tags a group of movies with ‘cat’, it can mean a lot of things, depending on the user. It might be that the movies are in fact about cats, but this is not the only possibility. However, it is indisputable that these movies share a theme, and that this theme was the reason the user chose the tag.
Our approach to modeling a similarity space for movies would be more efficient if we did not have to rely on the genre as the label, but could instead use tags to express the similarity between pairs of movies. This can easily be demonstrated with an example: if the ‘cat’ movies do not share the same genre, the label for each pair would always be set to ‘false’, but with the tag, the features of the movies would be related and the relation would be encoded in the model. Furthermore, the resulting model would directly reflect the preferences of the user instead of a “global” statement that is expressed in terms of genres.
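To make the ‘cat’ example concrete, here is a minimal sketch of how pairwise labels could be derived from either genres or tags. The movie names, genres, tags and the helpers `genre_label` and `tag_label` are all made up for illustration; this is not our actual pipeline, just the idea in a few lines.

```python
from itertools import combinations

# Hypothetical toy catalogue: titles, genres and tags are invented for this sketch.
movies = {
    "A": {"genre": "comedy", "tags": {"cat"}},
    "B": {"genre": "horror", "tags": {"cat"}},
    "C": {"genre": "comedy", "tags": set()},
}


def genre_label(a, b):
    """Pair is 'similar' only if both movies have the same genre."""
    return movies[a]["genre"] == movies[b]["genre"]


def tag_label(a, b):
    """Pair is 'similar' if the user gave both movies at least one common tag."""
    return bool(movies[a]["tags"] & movies[b]["tags"])


# All unordered pairs of movies, in a stable order.
pairs = list(combinations(sorted(movies), 2))

genre_labels = {pair: genre_label(*pair) for pair in pairs}
tag_labels = {pair: tag_label(*pair) for pair in pairs}

for pair in pairs:
    print(pair, "genre:", genre_labels[pair], "tag:", tag_labels[pair])
```

In this toy setup the two ‘cat’ movies A and B get a ‘false’ label under genres but a ‘true’ label under tags, which is exactly the relation a genre-based label would miss.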
The bad news is that this cannot be done unsupervised, and that it will take a lot of time until a sufficient number of annotations is available.