Fast Fading is Everywhere

The way to a good model can be a very long journey, especially with handcrafted features, because you never know whether your encoding is sufficient to achieve the goal, which is to describe your data efficiently. Recently, we decided to start again with very simple models to gather new insights and come up with new methods. For no particular reason, we stumbled upon models based on entropy, and especially for the domain of movie features, such a direction makes sense.

Let us start with an easy example: genres. For instance, if 45% of the movies are marked as ‘drama’, we could almost flip a coin to decide whether a movie carries that label. Stated differently, the label carries little information, and a model would not gain much by optimizing its ability to reconstruct ‘drama’, because almost half of the movies are ‘drama’ anyway. That means it is much more important to learn to correctly reconstruct genres that are rare but more descriptive, like sci-fi, western, or horror.
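To make the intuition concrete, here is a minimal sketch that measures how many bits of "surprise" a label carries, using the self-information -log2(p) of a genre with relative frequency p. Only the 45% for drama comes from the example above; the other frequencies and the helper name `self_information` are made up for illustration.

```python
import math

def self_information(p):
    """Bits of surprise when a label with relative frequency p is present."""
    return -math.log2(p)

# Illustrative frequencies: 'drama' is from the text, the rest are assumed.
genre_freq = {"drama": 0.45, "horror": 0.05, "western": 0.02}

for genre, p in genre_freq.items():
    print(f"{genre:8s} p={p:.2f}  information={self_information(p):.2f} bits")
```

Under these assumed numbers, seeing ‘western’ on a movie tells us several times more than seeing ‘drama’, which is the sense in which rare genres are more descriptive.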

To incorporate such knowledge into a model, we need to penalize reconstruction errors for rarer genres more heavily than errors for common genres. Of course, we could also go a step further and use a similar scheme for the input layer; in that case, the binary feature values would be weighted according to their inverse frequencies.
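One possible way to express this is a per-genre weight inside a binary cross-entropy reconstruction loss. The sketch below is an assumption, not a fixed recipe: it uses -log(frequency) as the weight (1/frequency would work similarly), and the toy label matrix and the helper names `genre_weights` and `weighted_bce` are purely illustrative.

```python
import math

def genre_weights(labels, eps=1e-6):
    """Weight per genre as -log(frequency), so rare genres weigh more (assumed scheme)."""
    n_movies = len(labels)
    n_genres = len(labels[0])
    weights = []
    for g in range(n_genres):
        freq = sum(row[g] for row in labels) / n_movies
        freq = min(max(freq, eps), 1.0)  # avoid log(0) for unseen genres
        weights.append(-math.log(freq))
    return weights

def weighted_bce(targets, preds, weights, eps=1e-7):
    """Binary cross-entropy for one movie, each genre term scaled by its weight."""
    loss = 0.0
    for y, p, w in zip(targets, preds, weights):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        loss += w * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss

# Toy label matrix (made up): columns are [drama, western].
labels = [[1, 0], [1, 0], [1, 1], [0, 0]]
weights = genre_weights(labels)

# One movie that is both drama and western, reconstructed in two ways.
movie = [1, 1]
miss_western = weighted_bce(movie, [0.9, 0.1], weights)  # western badly reconstructed
miss_drama = weighted_bce(movie, [0.1, 0.9], weights)    # drama badly reconstructed
print(weights, miss_western, miss_drama)

# The same weights could scale the binary inputs for the input-layer variant.
weighted_input = [x * w for x, w in zip(movie, weights)]
```

With these weights, missing the rare ‘western’ label costs more than missing the common ‘drama’ label, even though the raw prediction error is the same size in both cases.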

