Sometimes a problem can be better solved with a model that makes fewer assumptions and therefore has lower complexity. Stated differently, a better theoretical understanding of the problem can lead to a more elegant solution that is faster and easier to implement. We have written quite a lot about embedding movies into some kind of feature space, and most of those models suffered from the same problem: a pair-wise similarity function was required to train them. In the case of movies, even a pair-wise similarity is hard to get right, so it is very likely to become the bottleneck of the model.
The idea we have in mind is to start with a very simple model and then incorporate additional knowledge to overcome its most obvious limitations. We began with a linear embedding model, which we then enhanced with a non-linear affinity function, an idea borrowed from a recently published paper. Our first attempt was an affinity function G(x, y) that uses the covariance matrix of the keywords to account for the importance of pairs of words. For instance, two words (a, b) might frequently occur together, but for a particular pair of movies (x, y), word 'a' is only present in x and word 'b' only in y. With the additional knowledge that (a, b) are nevertheless strongly related, (x, y) will be pushed closer together in the feature space because of the semantic connection between the words. The drawback of this model is that the covariance matrix is huge, so we decided to factor it into matrices of lower dimension to reduce the computational complexity.
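To make the factorization step concrete, here is a minimal NumPy sketch of the idea as we understand it. All names and sizes (n_words, the rank k, the random matrix U) are hypothetical illustrations, not the actual model: we assume a bilinear affinity G(x, y) = xᵀCy over bag-of-keyword vectors, and we replace the full n×n covariance matrix C with a symmetric low-rank approximation C ≈ UUᵀ so the affinity can be computed without ever materializing C.

```python
import numpy as np

rng = np.random.default_rng(0)

n_words = 1000   # vocabulary size (hypothetical)
k = 32           # rank of the factorization, k << n_words

# Full model: G(x, y) = x^T C y with C an n_words x n_words covariance
# matrix. Storing C is O(n^2); instead we keep only U with C ~= U @ U.T.
U = rng.normal(scale=0.1, size=(n_words, k))

def affinity(x, y, U):
    """Low-rank affinity x^T (U U^T) y, computed as (x^T U)(U^T y).

    Cost is O(n_words * k) per movie vector instead of O(n_words^2),
    and the n x n matrix is never built.
    """
    return (x @ U) @ (U.T @ y)

# Two movies as keyword indicator vectors: word 'a' appears only in x,
# word 'b' only in y. If the rows U[a] and U[b] are correlated, the
# affinity still couples the two movies through that (a, b) relation.
x = np.zeros(n_words); x[10] = 1.0   # movie x contains word 'a'
y = np.zeros(n_words); y[20] = 1.0   # movie y contains word 'b'

g_full = x @ (U @ U.T) @ y           # explicit C, for comparison only
g_low = affinity(x, y, U)            # factored form
assert np.allclose(g_full, g_low)
```

Note that for indicator vectors the affinity reduces to the dot product of the two word rows, U[10] @ U[20], which is exactly the "semantic connection between the words" described above.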
The results look promising, but more work needs to be done to exploit the sparsity of the input data, and we probably need more training data to fit the model.