In the last post, we talked about neural language models and how they can be used to predict the next word in a given context. Now one might ask how we could use this idea for our feature learning model.
Most people would agree that a single word, without its context, is often not sufficient to grasp a situation. Our case is very similar. Take a movie about zombies, for instance. The keyword ‘zombie’ could mean different things: that there is a fight against them, that some scientists want to find a cure for them, or that somebody turned into a zombie. In other words, the global context is determined while the details are not.
And it gets worse. Let us assume that further keywords are “fight” and “gang”. Does this mean there is a fight against the zombies or against a gang? Is it a zombie gang? Or is it a sub-plot where humans fight against humans and zombies suddenly appear? All of these are possible.
Determining the context from keywords alone is not possible because they are neither grouped nor ordered. With a summary, there would be an explicit context in the phrases and thus for the words. For instance, a summary like “The protagonists enter a city and are immediately attacked by a zombie gang” would allow us to determine a local context for the words. Of course, this is not trivial, but at least we have an ordered sequence of words.
With a large number of movie summaries, we could train a neural language model. Besides the ability to predict the next word, these models also allow us to embed words into a low-dimensional space which can be used for clustering. The key point is that even if two words are different on the surface, like “ghoul” and “zombie”, but ‘often’ appear in similar contexts, such models treat them as semantically related.
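To make the "similar contexts" intuition concrete, here is a minimal sketch. It does not train a neural language model; it uses plain context counts as a much simpler stand-in for the same distributional idea. The toy summaries, the window size, and all function names are assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy summaries, invented for illustration only.
SUMMARIES = [
    "survivors fight a zombie in the city",
    "survivors fight a ghoul in the city",
    "a detective meets a lawyer in court",
]

def context_vectors(sentences, window=2):
    """For every word, count the words that occur within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = context_vectors(SUMMARIES)
sim_zombie_ghoul = cosine(vectors["zombie"], vectors["ghoul"])
sim_zombie_lawyer = cosine(vectors["zombie"], vectors["lawyer"])
print(sim_zombie_ghoul, sim_zombie_lawyer)
```

In this toy corpus, “zombie” and “ghoul” never co-occur, yet they end up closer to each other than “zombie” and “lawyer” because their surrounding words overlap. A learned embedding would capture the same effect in a dense, low-dimensional vector.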
The idea is to use a large corpus of movie descriptions and summaries to learn a language model that allows us to embed a pre-defined vocabulary, one that is descriptive for movies, into a single semantic space. Then, the feature representation of each word could be used to estimate the similarity of movies, which in turn allows ranking or clustering.
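As a rough sketch of how word representations could be turned into a movie similarity, one simple option is to average the embeddings of a movie's keywords and compare the averages with cosine similarity. Averaging is an assumption here, not necessarily the best aggregation, and the embedding values, movie titles, and function names below are hypothetical placeholders rather than learned results.

```python
import math

# Hypothetical 3-dimensional embeddings; the values are made up for illustration.
EMBEDDINGS = {
    "zombie": [0.9, 0.1, 0.0],
    "ghoul": [0.8, 0.2, 0.0],
    "plague": [0.7, 0.0, 0.3],
    "epidemic": [0.6, 0.1, 0.3],
    "wedding": [0.0, 0.9, 0.1],
    "romance": [0.1, 0.8, 0.2],
}

# Hypothetical movies, each described by unordered keywords.
MOVIES = {
    "movie_a": ["zombie", "plague"],
    "movie_b": ["ghoul", "epidemic"],
    "movie_c": ["wedding", "romance"],
}

def movie_vector(keywords, embeddings):
    """Average the embeddings of all keywords that are in the vocabulary."""
    known = [embeddings[k] for k in keywords if k in embeddings]
    if not known:
        raise ValueError("no keyword has an embedding")
    dim = len(known[0])
    return [sum(vec[d] for vec in known) / len(known) for d in range(dim)]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_by_similarity(query, movies, embeddings):
    """Rank all other movies by their similarity to the query movie."""
    q = movie_vector(movies[query], embeddings)
    scores = [
        (title, cosine(q, movie_vector(keywords, embeddings)))
        for title, keywords in movies.items()
        if title != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = rank_by_similarity("movie_a", MOVIES, EMBEDDINGS)
print(ranking)
```

Note that movie_a and movie_b share no keyword at all, so a plain keyword overlap would score them as unrelated, while the averaged embeddings rank movie_b first.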
In the case of our zombie example, the new representation could help a lot to compare movies that are semantically very similar but whose keywords are not equal, for example: (ghoul, zombie), (epidemic, plague), (survivor, apocalypse).