Currently, it feels a little like we are in a drought. The reason might be that we need more data, better features, or both. Our results so far are nevertheless useful, but we are still missing a Eureka moment.
Therefore, we decided to take a step back and look at what lies in front of us from a greater distance. More precisely, we were interested in how well our data could explain a sophisticated genre like science fiction. For that, we trained a good old supervised neural network with WTA (winner-take-all) neurons in a one-vs-rest mode.
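To make the setup a bit more concrete, here is a minimal sketch of what a one-vs-rest classifier with a winner-take-all hidden layer might look like. Everything in it is an assumption for illustration: we read "WTA" as "keep the k strongest hidden activations per movie and zero the rest", and the toy features, layer sizes, learning rate, and the helper names `wta`, `one_vs_rest_labels`, `train` and `predict` are hypothetical, not the actual model.

```python
import numpy as np

def wta(h, k):
    """Keep the k largest activations in each row and zero out the rest."""
    # Indices of the k largest entries per row (their order does not matter).
    top = np.argpartition(h, -k, axis=1)[:, -k:]
    mask = np.zeros_like(h)
    np.put_along_axis(mask, top, 1.0, axis=1)
    return h * mask, mask

def one_vs_rest_labels(genres, target="sci-fi"):
    """1.0 if the target genre is listed for a movie, 0.0 for everything else."""
    return np.array([1.0 if target in g else 0.0 for g in genres])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, n_hidden=16, k=4, lr=0.3, epochs=200, seed=0):
    """Plain gradient descent on binary cross-entropy with a ReLU + WTA hidden layer."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, n_hidden)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        pre = X @ W1 + b1
        h, mask = wta(np.maximum(pre, 0.0), k)
        p = sigmoid(h @ w2 + b2)
        losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
        # Backpropagation: gradients only flow through the winning, active units.
        d_out = (p - y) / len(y)
        d_h = np.outer(d_out, w2) * mask * (pre > 0)
        w2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return (W1, b1, w2, b2), losses

def predict(params, X, k=4):
    """Sci-fi score in [0, 1] for each row of X."""
    W1, b1, w2, b2 = params
    h, _ = wta(np.maximum(X @ W1 + b1, 0.0), k)
    return sigmoid(h @ w2 + b2)

# Toy data: four made-up movies described by four hypothetical binary keyword features.
genres = [["sci-fi", "action"], ["drama"], ["sci-fi"], ["comedy", "romance"]]
X = np.array([[1, 1, 0, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)
y = one_vs_rest_labels(genres)
params, losses = train(X, y)
scores = predict(params, X)
```

The one-vs-rest part is just the label construction: sci-fi is the positive class and every other genre is lumped into the negative class, so the network outputs a single sci-fi score per movie.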
This has been done many times and is nothing new. However, we are more interested in the ability of the model to predict latent sci-fi movies, in other words, movies that have a strong sci-fi theme although sci-fi is not listed among their actual genres.
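Finding such latent movies boils down to a simple filter: keep everything the model scores highly that does not carry the sci-fi label. A small sketch of that step, with placeholder titles and made-up scores (not actual model output), and with the function name `latent_candidates` and the 0.5 threshold being assumptions of ours:

```python
def latent_candidates(titles, genres, scores, target="sci-fi", threshold=0.5):
    """Return (title, score) pairs that score high but lack the target genre, best first."""
    hits = [
        (title, score)
        for title, movie_genres, score in zip(titles, genres, scores)
        if target not in movie_genres and score >= threshold
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Placeholder data for illustration only; the scores are invented.
titles = ["movie_a", "movie_b", "movie_c", "movie_d"]
genres = [["sci-fi"], ["action"], ["action", "thriller"], ["drama"]]
scores = [0.95, 0.80, 0.62, 0.10]

candidates = latent_candidates(titles, genres, scores)
# movie_a is skipped because it already carries the label,
# movie_d is skipped because its score is below the threshold.
```

Where to put the threshold is a judgment call; in practice it is probably more useful to look at the ranked list than to commit to a hard cut-off.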
As we expected, the performance of the model is very good. There are some movies, like “Super 8” or “Cloverfield”, that got a lower score, but to be fair, sci-fi is definitely present in both movies without being a key aspect. On the other hand, the model found a lot of movies with a stronger sci-fi focus that cannot be inferred from the given genres. Here are some examples: “Iron Man”, “Mad Max” and “Next”.
But since nobody knows what sci-fi really means, the results are highly subjective. Thus, our goal was not to draw a line between what is sci-fi and what is not, but to see whether the features at hand could explain concepts beyond genre labels. And since consistent genre tagging for movies is wishful thinking, it makes even more sense not to rely on a single attribute to build a semantic clustering.