Today, we continued our tests with the non-negative version of the RBM, and while fine-tuning the model we were painfully reminded that choosing proper hyper-parameters can drive you insane. Even for the vanilla RBM, we have to choose the learning rate, a value for the momentum (or decide not to use momentum at all), the initial weights, and maybe additional constraints for the model (sparsity, etc.). Furthermore, we have to select a size for the mini-batches, monitor the progress, and decide when to stop the training.
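To make the list of knobs a bit more concrete, here is a minimal sketch that collects them in one place and shows how the learning rate and the momentum interact in a single weight update. All names (`RBMHyperParams`, `momentum_update`, `should_stop`) and all default values are assumptions chosen for illustration, not our actual code, and the patience-based stopping rule is just one common heuristic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RBMHyperParams:
    """Hypothetical container for the choices listed above (defaults are placeholders)."""
    learning_rate: float = 0.01
    momentum: float = 0.0                    # 0.0 means "no momentum at all"
    init_weight_std: float = 0.01            # scale of the random initial weights
    sparsity_target: Optional[float] = None  # None means "no sparsity constraint"
    batch_size: int = 32
    max_epochs: int = 50
    patience: int = 5                        # epochs without improvement before stopping


def momentum_update(weights: List[float], velocity: List[float],
                    gradient: List[float], params: RBMHyperParams
                    ) -> Tuple[List[float], List[float]]:
    """Classical momentum step on a flat list of weights.

    `gradient` is assumed to point in the direction that improves the model
    (for example a contrastive divergence estimate), so it is added.
    """
    new_velocity = [params.momentum * v + params.learning_rate * g
                    for v, g in zip(velocity, gradient)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def should_stop(errors: List[float], params: RBMHyperParams) -> bool:
    """Stop at max_epochs, or when the last `patience` epochs brought no improvement.

    `errors` holds one monitored value per finished epoch (lower is better).
    """
    if len(errors) >= params.max_epochs:
        return True
    if len(errors) <= params.patience:
        return False
    best_before = min(errors[:-params.patience])
    return min(errors[-params.patience:]) >= best_before
```

With `momentum=0.0` the update reduces to a plain gradient step, which is the "no momentum at all" option mentioned above.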
And since we are learning without a teacher, so-called unsupervised learning, we have no easy way to assess the quality of the final model. Nevertheless, the situation is not hopeless: lots of other machine learning enthusiasts have the same problems, so there are a lot of tips available on how to improve your model.
We are still amazed that a slight variation of one parameter can lead to a significant improvement of a model or, on the other hand, result in a much worse one, which is one reason why model tuning can take a lot of time and experience. However, we have at least one advantage: in contrast to other domains, the amount of our data allows us to re-train a model very quickly, so we can try a lot of different combinations of parameters.
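Trying many combinations can be as simple as an exhaustive grid search when re-training is cheap. The sketch below assumes a hypothetical `train_and_score` callable that trains one model and returns a number where lower is better. The `toy_score` function is only a stand-in so the example runs; it is not an RBM and its values mean nothing.

```python
from itertools import product


def grid_search(train_and_score, grid):
    """Evaluate every combination in `grid` and return (score, params) pairs, best first.

    `grid` maps a parameter name to the list of values to try;
    `train_and_score` is called with those names as keyword arguments.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((train_and_score(**params), params))
    # Sort on the score only, so the parameter dicts are never compared.
    results.sort(key=lambda pair: pair[0])
    return results


def toy_score(learning_rate, momentum, batch_size):
    """Placeholder objective (NOT a real model): distance to an arbitrary point."""
    return (abs(learning_rate - 0.01)
            + abs(momentum - 0.5)
            + abs(batch_size - 32) / 100)


grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "momentum": [0.0, 0.5, 0.9],
    "batch_size": [16, 32, 64],
}
ranked = grid_search(toy_score, grid)
best_score, best_params = ranked[0]
print(len(ranked), "combinations tried; best:", best_params)
```

The number of runs grows multiplicatively with every parameter added to the grid, so this only stays practical because a single training run is fast.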