It has been a while since we started working on RBM models, and we are still amazed at how versatile these machines are. They can be used in many domains, for instance speech processing, computer vision, text mining and even collaborative filtering. However, the training data differs greatly between these domains: it can be discrete in the case of (movie) ratings, binary in the case of simple text features, or continuous for images and speech. The basic RBM model only provides binary units to model the data.
There are tricks to convert continuous data into a binary representation, but it should be clear that such a kludge is not guaranteed to lead to proper models. That is why people started to work on RBMs that can handle continuous data directly. Such a machine uses Gaussian units in the visible layer to model the data, for instance the gray-level intensities of images. Besides binary and Gaussian units, there are also other unit types, like the softmax, softplus or ReLU units.
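To make the differences concrete, here is a minimal NumPy sketch of the sampling rules commonly associated with these unit types (binary, Gaussian with unit variance, softplus, and the noisy ReLU). The function names and the toy setup are ours, not part of any particular RBM library:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # logistic activation used by binary units
    return 1.0 / (1.0 + np.exp(-x))

def sample_binary(pre):
    # binary unit: fires with probability sigmoid(pre-activation)
    p = sigmoid(pre)
    return (rng.random(p.shape) < p).astype(float)

def sample_gaussian(pre):
    # Gaussian unit with unit variance: the mean is the pre-activation
    return pre + rng.standard_normal(pre.shape)

def softplus(x):
    # smooth approximation of the rectifier: log(1 + e^x)
    return np.log1p(np.exp(x))

def sample_relu(pre):
    # noisy rectified linear unit: max(0, x + noise), where the noise
    # variance follows the sigmoid of the pre-activation
    noise = rng.standard_normal(pre.shape) * np.sqrt(sigmoid(pre))
    return np.maximum(0.0, pre + noise)
```

The point is that the conditional distribution of a unit given the other layer is all that changes; the bipartite structure of the RBM stays the same.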
Some unit types may be limited to a specific domain; for instance, the replicated softmax is used to model word counts in a document. With such a variety of units, it is possible to model a huge number of problems with RBMs, which might be another reason why RBMs are so popular right now.
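For the replicated softmax case, the visible layer of a document with N words can be thought of as N draws from a single softmax over the vocabulary, so the visible state is a vector of word counts. A small sketch of that sampling step (the helper name and toy inputs are our own):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_replicated_softmax(pre, doc_length):
    # Replicated softmax visible layer: a document of doc_length words is
    # modelled as doc_length draws from one softmax over the vocabulary,
    # so the sampled visible state is a count vector summing to doc_length.
    p = np.exp(pre - pre.max())   # numerically stable softmax
    p /= p.sum()
    return rng.multinomial(doc_length, p)
```

Note that the same softmax is "replicated" once per word, which is why documents of different lengths can share one set of weights.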
Preliminary results of some experiments we conducted indicate that we might improve our models by using non-binary units. To measure the precision of the resulting model, we tracked how many high-frequency keywords a model captured for the different unit types. Additionally, we monitored the reconstruction error, well aware that it should be used with caution. The results varied considerably, which means we need more time to analyze the data before we can draw any real conclusions.
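For reference, the reconstruction error we monitor is the usual one: pass the data through one deterministic visible-hidden-visible step and compare. A generic sketch for a binary RBM (our experiment code is not shown here; weights and data below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruction_error(v, W, b_vis, b_hid):
    # One deterministic Gibbs half-step in each direction: data -> hidden
    # means -> visible means, then the mean squared difference between
    # the input batch and its reconstruction.
    h = sigmoid(v @ W + b_hid)
    v_recon = sigmoid(h @ W.T + b_vis)
    return float(np.mean((v - v_recon) ** 2))

# toy example with random weights and a small binary batch
n_vis, n_hid = 6, 4
W = rng.standard_normal((n_vis, n_hid)) * 0.1
v = (rng.random((3, n_vis)) < 0.5).astype(float)
err = reconstruction_error(v, W, np.zeros(n_vis), np.zeros(n_hid))
```

The caution mentioned above applies because this quantity is not the objective being optimized; it can keep falling while the model gets worse as a generative model.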
Again, we noted that a slight adjustment of a single parameter can lead to a completely different result. But since we kept all parameters fixed and only switched to a different unit type, we can at least be fairly sure that the change in results was mostly caused by the new activation function.