After we sketched most of our ideas and concepts of the prototype, it is time to combine these modules into a classifier. We start with a high-level description of the system.
Our classifier consists of multiple layers. The first layer consists of the data, the next layer is the RBM topic layer we previously described and the last layer is a logistic regression layer that converts the features of a movie into a single probability. The probability can be understood as the chance of an item being liked by some user. Even though a gradual interpretation of the value is possible, we usually use a very coarse partition of the values: 0.0 means no preference, 0.5 means indifferent and a value higher than 0.8 is interpreted as a sufficient preference towards a movie.
We use the following notation. X is the set of all training movies and x is a single movie. A single movie x is represented by a sequence of topic keywords where each keyword might be mapped to a fixed position in a high-dimensional feature space.
The training starts with an unsupervised model to capture generic characteristics of the model. We use RBMs to train a model for each of the most frequent movie genres. As described before, the outcome will be a set of neurons that describe some (abstract) concept found in the training data. After we trained all genre models, we combine all these neurons into a single larger model. Again, each neuron is sparsely encoded since it only depends on a small subset of the input data.
Let us assume that we trained 10 neurons for each genre and that we have 50 genres that we consider frequent. The result is a model with 500 neurons where each neuron depends on a subset of the input data X. To convert a single movie item x into the feature space, we calculate the encoding of x for each neuron n. Each neuron returns a value within the range of 0..1 as an indicator how well the input data matches the concept. Thus, the output of a movie item is a feature vector with 500 values, one dimension for each neuron, which we call f. The vector f is then the input for the logistic regression layer.
The last layer is trained with output of the previous RBM layer. Stated differently, the classifier tries to model a combination of learned concepts to predict the preference of the user regarding a specific movie. Here, it would be possible, and probably beneficial, to introduce a new RBM layer that is used to learn even higher-level concepts, given the concept of the last layer we already learned. However, since we want to start with a simple model, we will not pursue this idea for the moment.
It should be noted that our actual setup was rather simple. We trained our model with about 400 labeled movies and 10 movie genres. The outcome is not mind-blowing, but an accuracy of about 82% is a sufficient start for further experiments.
To analyze the model, we calculated the feature representation of all movies that we have meta data for (RBM layer) and stored it in a database back-end. We also saved the model parameters of the logistic regression. This allows a very simple implementation into our TV recommender by looking up the features of a movie x and applying the parameters of logistic regression model then. The output is then the probability of x as a preference indicator of a movie. We ignore movies that do not have any keywords in common with any neuron.
Since the RBM layer does not depend on the labels, we use on-line learning for the logistic regression model to continually adjust the parameters according to the observed user behavior.
In following posts, we plan to discuss the results of the analysis and how to further improve the results.