We have written a lot about the need for prior knowledge when working with movie metadata. For instance, the keyword ‘creature’ can mean very different things depending on the genre. For a family movie, we might think of a benign, cute little alien, while for a horror movie the creature is very likely a terrifying, flesh-eating beast from space.
One model that can take such context into account is the gated auto-encoder. In a traditional auto-encoder (AE), the objective is to reconstruct the input through some bottleneck. A gated AE, in contrast, has two inputs x and y, and the objective is to learn the relation between those inputs. This is a schematic of the procedure:
(1) First, project x, y into a new space.
hx = dot(Wx, x)
hy = dot(Wy, y)
(2) Determine the hidden representation H.
h_x_y = F(dot(Wh, hx * hy) + bias_h)
(3) Reconstruct y given x and H.
y_tmp = hx * dot(Wh.T, h_x_y)
y_hat = F(dot(Wy.T, y_tmp) + bias_y)
The encoding H (h_x_y in the schematic) captures the relationship between x and y through the point-wise multiplication of the projected features hx and hy.
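The three steps above can be sketched in NumPy as follows. This is only a minimal sketch under stated assumptions: F is taken to be the logistic sigmoid, and the function name gated_ae_forward, the layer sizes, and the random weights are hypothetical choices for illustration that do not come from the schematic itself.

```python
import numpy as np

def sigmoid(z):
    # Assumed choice for the non-linearity F.
    return 1.0 / (1.0 + np.exp(-z))

def gated_ae_forward(x, y, Wx, Wy, Wh, bias_h, bias_y, F=sigmoid):
    # (1) Project x and y into the factor space.
    hx = Wx @ x
    hy = Wy @ y
    # (2) Hidden representation from the point-wise product of the factors.
    h_x_y = F(Wh @ (hx * hy) + bias_h)
    # (3) Reconstruct y given x and the hidden representation.
    y_tmp = hx * (Wh.T @ h_x_y)
    y_hat = F(Wy.T @ y_tmp + bias_y)
    return h_x_y, y_hat

# Hypothetical sizes and randomly initialised (untrained) weights.
rng = np.random.default_rng(0)
n_x, n_y, n_factors, n_hidden = 6, 5, 4, 3
Wx = rng.normal(scale=0.1, size=(n_factors, n_x))
Wy = rng.normal(scale=0.1, size=(n_factors, n_y))
Wh = rng.normal(scale=0.1, size=(n_hidden, n_factors))
bias_h = np.zeros(n_hidden)
bias_y = np.zeros(n_y)

# Binary inputs, as in bag-of-words keyword data.
x = rng.integers(0, 2, size=n_x).astype(float)
y = rng.integers(0, 2, size=n_y).astype(float)

h, y_hat = gated_ae_forward(x, y, Wx, Wy, Wh, bias_h, bias_y)
```

Note that the hidden layer never sees x or y on their own, only the product hx * hy, which is what makes the representation depend on the pair of inputs.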
The next question is how to use gated AEs to generate good features that describe movie content. What we want to learn is the correlation of features, very similar to a factor model, but with the ability to extract features for unseen movies as well. If we set x=y, the gated AE turns into a covariance auto-encoder (cAE); stated differently, we model the relation of the movie features with themselves. Once training is done, we can encode new movies with steps (1) and (2) by setting y=x.
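Encoding new movies with y=x could look roughly like the sketch below. It assumes a sigmoid non-linearity, keeps Wx and Wy as separate matrices (whether they should be tied is not settled here), and uses random weights as stand-ins for trained ones. The name encode_movies and all sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Assumed choice for the non-linearity F.
    return 1.0 / (1.0 + np.exp(-z))

def encode_movies(X, Wx, Wy, Wh, bias_h):
    """Encode a batch of movies, one bag-of-words row per movie.

    X has shape (n_movies, n_keywords); the result has shape
    (n_movies, n_hidden).
    """
    hx = X @ Wx.T  # step (1) for x
    hy = X @ Wy.T  # step (1) for y, with y = x
    return sigmoid((hx * hy) @ Wh.T + bias_h)  # step (2)

# Stand-ins for trained parameters (hypothetical sizes).
rng = np.random.default_rng(1)
n_keywords, n_factors, n_hidden = 8, 4, 3
Wx = rng.normal(scale=0.1, size=(n_factors, n_keywords))
Wy = rng.normal(scale=0.1, size=(n_factors, n_keywords))
Wh = rng.normal(scale=0.1, size=(n_hidden, n_factors))
bias_h = np.zeros(n_hidden)

# Two unseen movies described by binary keyword indicators.
X_new = np.array([
    [1, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0, 0, 1],
], dtype=float)

H_new = encode_movies(X_new, Wx, Wy, Wh, bias_h)
```

Because only the weight matrices are needed at this point, no retraining is required for a movie that was not part of the training set.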
To address the problem mentioned at the beginning, we could regularize the auto-encoder with a discriminative objective. For instance, we could train a softmax classifier on genres on top of H to pull the hidden representations of different genres farther apart, at least when the keywords do not overlap semantically. In other words, we need a trade-off between the cAE objective and the discrimination:
L = L_cae + alpha * L_softmax
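To make the trade-off concrete, here is a small sketch of how the combined objective might be evaluated for a single movie. The post does not fix the exact form of either term, so the sketch assumes squared reconstruction error for L_cae and softmax cross-entropy for L_softmax. The classifier parameters W_cls and b_cls, the function names, and the toy values are all hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    # Numerically stable log-softmax, then the negative log-likelihood
    # of the correct genre.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def combined_loss(x, x_hat, h, W_cls, b_cls, genre, alpha):
    # Assumed reconstruction term: squared error between input and output.
    l_cae = np.sum((x - x_hat) ** 2)
    # Discriminative term: genre classifier on top of the hidden code h.
    l_softmax = softmax_cross_entropy(W_cls @ h + b_cls, genre)
    return l_cae + alpha * l_softmax

# Toy values: 3 keywords, 2 hidden units, 4 genres.
x = np.array([1.0, 0.0, 1.0])
x_hat = np.array([0.5, 0.5, 0.5])
h = np.array([0.2, 0.8])
W_cls = np.zeros((4, 2))  # untrained classifier, so all genres are equally likely
b_cls = np.zeros(4)

loss = combined_loss(x, x_hat, h, W_cls, b_cls, genre=2, alpha=0.5)
```

With alpha set to 0 the model reduces to the plain cAE, while larger values put more weight on separating the genres at the expense of reconstruction quality.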
In a nutshell, gated AEs have so far been used mainly in the image domain, but they are not limited to it. To assess their utility for bag-of-words data, however, more tests are needed. In any case, gated AEs are a very powerful and interesting idea.