Ratings – What The Mean Does Not Tell

The question is how to decide whether a movie suggested by a recommender is worth watching without actually watching it. Most users probably do not care much about this, but if the system suggests two movies whose start times overlap, a decision has to be made.

The first step is probably to read the plot summary, but if that does not help, more data has to be gathered. In such a case, ratings are a good way to get a condensed view of the crowd's opinion. However, not all ratings are created equal, and the mean in particular can be highly biased, for example when only people who liked the movie rated it.

Let us consider a simple example. Movie A has a mean rating of 5.0 based on about 70,000 users. The value tells us that the crowd thinks it is pretty average, but it does not tell us how diverse the user opinions are. For instance, it is possible that one group of users rated it 2 and an equally large group rated it 8. The mean is still 5.0, but the variance is much higher than if everybody had rated it 5.
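To make the arithmetic concrete, here is a minimal sketch using Python's standard `statistics` module. Both rating lists are invented for illustration and are much smaller than the 70,000 ratings mentioned above. The names `consensus` and `split` are our own.

```python
from statistics import fmean, pvariance

# Hypothetical ratings on a 1-10 scale, made up for illustration.
consensus = [5] * 10           # everybody rates the movie 5
split = [2] * 5 + [8] * 5      # half the users rate it 2, the other half 8

for name, ratings in [("consensus", consensus), ("split", split)]:
    # pvariance is the population variance: the mean squared distance to the mean
    print(f"{name}: mean={fmean(ratings):.1f}, variance={pvariance(ratings):.1f}")
```

Both lists have a mean of 5.0, but the variance is 0 for the first and 9 for the second, since every rating in the split list is exactly 3 points away from the mean.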

It is reasonable to assume that most movie ratings roughly follow a Gaussian distribution, which means a peak around the mean and fewer and fewer ratings the further we move away from it. The spread around this peak is measured by the variance, and in our opinion it is very important for judging how diverse the opinions are. Of course real data is more complex, but this approximation is usually sufficient for basic experiments.

For our example, this could mean that a movie has a low mean value but a very high variance. A closer look at the distribution might reveal a large “hating it” group and a large “loving it” group that gave very extreme ratings, like 1 and 10, plus of course all the other users who do not fit into either group. From the mean value alone, this information would be impossible to infer.
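Such a "closer look" can be as simple as counting how often each rating occurs. The following sketch prints a text histogram for a made-up set of ratings. The group sizes are assumptions chosen only to produce a polarized picture, not real data.

```python
from collections import Counter
from statistics import fmean, pstdev

# Hypothetical ratings (1-10 scale), invented for illustration:
# a "hating it" group, a "loving it" group, and some users in between.
ratings = [1] * 40 + [10] * 30 + [4, 5, 5, 6, 6, 7] * 5

counts = Counter(ratings)
print(f"mean={fmean(ratings):.2f}, std={pstdev(ratings):.2f}")
for score in range(1, 11):
    # One '#' per rating; a missing score simply yields an empty bar.
    print(f"{score:2d} | {'#' * counts[score]}")
```

The mean lands near the middle of the scale, while the histogram shows two tall bars at the extremes, which is exactly the pattern a single number hides.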

It is still difficult to decide which movie to choose, but if ratings are diverse, which happens especially for independent movies or ones with unusual topics, it is often a good idea to ask yourself how much you can trust the mean rating. Stated differently, how much do we disagree with the crowd in this case, but also in general? If the crowd mostly watches popcorn movies, but we mainly watch non-mainstream ones, it is safe to assume that the tastes do not overlap much and that the ratings should therefore count less in the final judgment.

To sum it up, the mean value is definitely helpful, but only as an indicator. In our opinion, it would make sense to also show the variance to give a quick impression of how diverse the ratings are.
