With the recent adjustments to our preference model, which is nothing more than a clever way to capture as many pairwise correlations as possible, the precision of our thumbs-up/down approach reached a level where we decided not to tune it any further. However, even with a list of movies/series sorted by preference score, it is often very hard to group items in a conceptual way.
For example, the list might contain about 20 films that are a good match for the user's preferences, but the score does not explain which particular aspect of a movie is responsible for the match. The user might be in the mood for a specific theme or sub-genre, and it is usually not possible to group the top-k results into such concepts, at least not with the score from the preference model alone.
In the case of factorization machines, the model also includes an embedding space V: |X| x N_FACTORS, but the shape of this space is driven solely by the supervision signal. Nevertheless, the space is responsible for encoding semantically similar input data close together. Stated differently, even if two inputs a, b have no overlapping features, a * b.T = 0, the embedding space has learned to group related features together. Thus, if a and b both contain variables from the same topic, the dot product of the embedded inputs, f(a, V) * f(b, V).T, is likely to be positive, indicating a positive correlation.
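To make this concrete, here is a minimal NumPy sketch of the idea. The embedding matrix V is constructed by hand so that two groups of features form two "topics"; in a real factorization machine these rows would be learned from the supervision signal, so the sizes, the `embed` helper, and the topic structure are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES, N_FACTORS = 6, 3  # hypothetical sizes

# Toy embedding matrix V: one row per input feature. We *construct* it so
# that features 0-2 and features 3-5 cluster around two "topics"; a real
# factorization machine would learn these rows from the training signal.
topic_1 = rng.normal(size=N_FACTORS)
topic_2 = rng.normal(size=N_FACTORS)
V = np.vstack(
    [topic_1 + 0.1 * rng.normal(size=N_FACTORS) for _ in range(3)]
    + [topic_2 + 0.1 * rng.normal(size=N_FACTORS) for _ in range(3)]
)

def embed(x, V):
    """Map a binary input vector into the factor space (sum of active rows)."""
    return x @ V

# Two inputs with no overlapping features: their raw dot product is zero.
a = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # features from topic 1
b = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])  # another topic-1 feature
print(a @ b)                        # 0.0 -- disjoint in input space

# In the factor space both land near topic_1, so the dot product of the
# embedded inputs is positive, signalling a positive correlation.
print(embed(a, V) @ embed(b, V))    # > 0
```

Swapping `b` for a vector that only activates features 3-5 would make the embedded dot product hover around zero or go negative, since the two inputs then come from different topics.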
But it turns out that this space is not sufficient to perform a coarse clustering, mainly because the number of factors is small (<10) and the objective of factorization machines does not encourage disjoint factors or clustering of any kind.
Still, the question remains: how can we perform a preference-based latent clustering that both models the preferences of a user and explains them, by learning a topic model that disentangles some of the explanatory factors?