aka RBM
definition. An RBM is an undirected graphical model that defines a distribution over an input vector $\mathbf{x}$. It models this distribution using a hidden layer of binary units (latent variables, $\mathbf{h}$) and an energy function.
We first assume that the input is composed of binary variables.
Energy function
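For reference, the standard energy of a binary RBM can be written as follows. The notation is an assumption, not taken from these notes: $W$ is the weight matrix with rows indexing hidden units, $\mathbf{b}$ holds the hidden biases and $\mathbf{c}$ the visible biases.

```latex
\[
\begin{aligned}
E(\mathbf{x}, \mathbf{h})
  &= -\mathbf{h}^\top W \mathbf{x} - \mathbf{c}^\top \mathbf{x} - \mathbf{b}^\top \mathbf{h} \\
  &= -\sum_j \sum_k W_{jk}\, h_j x_k - \sum_k c_k x_k - \sum_j b_j h_j
\end{aligned}
\]
```

Low energy corresponds to high probability, so a positive $W_{jk}$ favours configurations where $h_j$ and $x_k$ are both on.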
Probability distribution
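It's a Boltzmann distribution (hence the name): $p(\mathbf{x}, \mathbf{h}) = \exp(-E(\mathbf{x}, \mathbf{h})) / Z$, where the partition function $Z = \sum_{\mathbf{x}, \mathbf{h}} \exp(-E(\mathbf{x}, \mathbf{h}))$ sums over all joint configurations.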
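For a very small RBM the joint distribution can be computed exactly by enumerating every binary configuration. The sketch below assumes the usual energy $E(\mathbf{x}, \mathbf{h}) = -\mathbf{h}^\top W \mathbf{x} - \mathbf{c}^\top \mathbf{x} - \mathbf{b}^\top \mathbf{h}$; the function names and the toy parameter values are hypothetical, chosen only for illustration.

```python
import itertools
import math


def energy(x, h, W, b, c):
    """E(x, h) = -h^T W x - c^T x - b^T h, with W[j][k] linking hidden j to visible k."""
    interaction = sum(W[j][k] * h[j] * x[k]
                      for j in range(len(h)) for k in range(len(x)))
    visible_bias = sum(ck * xk for ck, xk in zip(c, x))
    hidden_bias = sum(bj * hj for bj, hj in zip(b, h))
    return -interaction - visible_bias - hidden_bias


def joint_distribution(W, b, c):
    """Return {(x, h): p(x, h)} by brute-force enumeration (exponential cost)."""
    n_hidden, n_visible = len(W), len(W[0])
    weights = {}
    for x in itertools.product((0, 1), repeat=n_visible):
        for h in itertools.product((0, 1), repeat=n_hidden):
            weights[(x, h)] = math.exp(-energy(x, h, W, b, c))
    Z = sum(weights.values())  # partition function
    return {state: w / Z for state, w in weights.items()}


# Toy parameters (assumed values): 2 hidden units, 3 visible units.
W = [[0.5, -1.0, 0.2], [1.5, 0.3, -0.7]]
b = [0.1, -0.2]
c = [0.0, 0.4, -0.3]
p = joint_distribution(W, b, c)
```

The number of terms in `Z` doubles with every added unit, which is why this only works for toy models.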
Markov network representation
Conditional inference
derivation of $p(\mathbf{h} \mid \mathbf{x})$, although the fact that it factorizes over the hidden units also follows directly from the local Markov property.
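Because there are no hidden-hidden or visible-visible connections, both conditionals factorize into independent Bernoulli distributions: $p(h_j = 1 \mid \mathbf{x}) = \sigma(b_j + W_{j\cdot}\mathbf{x})$ and $p(x_k = 1 \mid \mathbf{h}) = \sigma(c_k + \mathbf{h}^\top W_{\cdot k})$. A minimal sketch, assuming $W$ is stored as a list of rows (one per hidden unit), $\mathbf{b}$ is the hidden bias and $\mathbf{c}$ the visible bias; the function names are hypothetical.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def p_h_given_x(x, W, b):
    """Return [p(h_j = 1 | x)] for every hidden unit j."""
    return [sigmoid(b[j] + sum(W[j][k] * x[k] for k in range(len(x))))
            for j in range(len(b))]


def p_x_given_h(h, W, c):
    """Return [p(x_k = 1 | h)] for every visible unit k."""
    return [sigmoid(c[k] + sum(W[j][k] * h[j] for j in range(len(h))))
            for k in range(len(c))]


# Toy parameters (assumed values): 2 hidden units, 3 visible units.
W = [[0.5, -1.0, 0.2], [1.5, 0.3, -0.7]]
b = [0.1, -0.2]
c = [0.0, 0.4, -0.3]
hidden_probabilities = p_h_given_x([1, 0, 1], W, b)
visible_probabilities = p_x_given_h([1, 0], W, c)
```

Block Gibbs sampling alternates between these two conditionals, sampling all hidden units at once and then all visible units at once.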
The free energy is used to marginalize out $\mathbf{h}$, giving the distribution $p(\mathbf{x})$, which is the one we are most interested in.
derivation (video). The function inside the sum is known as softplus, $\operatorname{softplus}(a) = \log(1 + e^{a})$.
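The marginalization can be written out step by step. This is a sketch in an assumed notation ($W$ with rows $W_{j\cdot}$ indexing the $H$ hidden units, hidden bias $\mathbf{b}$, visible bias $\mathbf{c}$, and energy $E(\mathbf{x}, \mathbf{h}) = -\mathbf{h}^\top W \mathbf{x} - \mathbf{c}^\top \mathbf{x} - \mathbf{b}^\top \mathbf{h}$); the key step is that the sum over $\mathbf{h}$ factorizes into a product of sums over each $h_j$.

```latex
\[
\begin{aligned}
p(\mathbf{x})
  &= \sum_{\mathbf{h} \in \{0,1\}^H} \frac{\exp(-E(\mathbf{x}, \mathbf{h}))}{Z} \\
  &= \frac{\exp(\mathbf{c}^\top \mathbf{x})}{Z}
     \prod_{j=1}^{H} \sum_{h_j \in \{0,1\}} \exp\big(h_j (b_j + W_{j\cdot}\mathbf{x})\big) \\
  &= \frac{\exp(\mathbf{c}^\top \mathbf{x})}{Z}
     \prod_{j=1}^{H} \big(1 + \exp(b_j + W_{j\cdot}\mathbf{x})\big) \\
  &= \frac{1}{Z} \exp\Big(\mathbf{c}^\top \mathbf{x}
     + \sum_{j=1}^{H} \operatorname{softplus}(b_j + W_{j\cdot}\mathbf{x})\Big)
   = \frac{\exp(-F(\mathbf{x}))}{Z}, \\
F(\mathbf{x})
  &= -\mathbf{c}^\top \mathbf{x} - \sum_{j=1}^{H} \operatorname{softplus}(b_j + W_{j\cdot}\mathbf{x}).
\end{aligned}
\]
```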
The softplus function can often be approximated by the ReLU, $\max(0, a)$; the two agree closely when $|a|$ is large and differ most near $a = 0$.
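A quick numerical comparison makes the quality of the approximation concrete. The sketch below uses a numerically stable form of softplus; the gap to ReLU equals $\log(1 + e^{-|a|})$, so it is largest at $a = 0$ (where it is $\log 2$) and shrinks exponentially as $|a|$ grows.

```python
import math


def softplus(a):
    # Stable form of log(1 + exp(a)) that avoids overflow for large a.
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def relu(a):
    return max(a, 0.0)


for a in (-10.0, -1.0, 0.0, 1.0, 10.0):
    gap = softplus(a) - relu(a)
    print(f"a={a:6.1f}  softplus={softplus(a):.5f}  relu={relu(a):.5f}  gap={gap:.5f}")
```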
We often can't compute $p(\mathbf{x})$ directly because the partition function $Z$ is intractable. However, the expression helps us understand which $\mathbf{x}$s the model makes more or less likely.
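Although $Z$ is intractable, it cancels when comparing two inputs: $\log p(\mathbf{x}_1) - \log p(\mathbf{x}_2) = F(\mathbf{x}_2) - F(\mathbf{x}_1)$, where $F(\mathbf{x}) = -\mathbf{c}^\top \mathbf{x} - \sum_j \operatorname{softplus}(b_j + W_{j\cdot}\mathbf{x})$ is the free energy. A minimal sketch, assuming $W$ is a list of rows (one per hidden unit), with hidden bias `b` and visible bias `c`; the names and toy values are hypothetical.

```python
import math


def softplus(a):
    # Stable form of log(1 + exp(a)).
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def free_energy(x, W, b, c):
    """F(x) = -c^T x - sum_j softplus(b_j + W_j . x)."""
    visible_term = sum(ck * xk for ck, xk in zip(c, x))
    hidden_term = sum(
        softplus(b[j] + sum(W[j][k] * x[k] for k in range(len(x))))
        for j in range(len(b))
    )
    return -visible_term - hidden_term


def log_probability_ratio(x1, x2, W, b, c):
    """log p(x1) - log p(x2); the partition function cancels."""
    return free_energy(x2, W, b, c) - free_energy(x1, W, b, c)


# Toy parameters (assumed values): 2 hidden units, 3 visible units.
W = [[0.5, -1.0, 0.2], [1.5, 0.3, -0.7]]
b = [0.1, -0.2]
c = [0.0, 0.4, -0.3]
ratio = log_probability_ratio([1, 0, 0], [0, 0, 1], W, b, c)
```

A positive `ratio` means the model assigns more probability to the first input than to the second. Each softplus term grows when the input aligns with the corresponding row of `W`, which is how hidden units raise the probability of inputs containing their feature.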
The hidden units basically represent features that we expect to observe in $\mathbf{x}$.
A Practical Guide to Training Restricted Boltzmann Machines
The coupling parameters between the visible and hidden layers are chosen using a variational procedure that minimizes the Kullback-Leibler divergence (i.e. relative entropy) between the “true” probability distribution of the data and the variational distribution obtained by marginalizing over the hidden units.
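Minimizing this KL divergence is equivalent to maximizing the likelihood of the data, whose exact gradient needs samples from the model. In practice it is commonly approximated with contrastive divergence (CD), the method discussed in the practical guide cited above. The sketch below shows a single-example CD-1 update; it is an illustration under assumptions (pure-Python lists, a hypothetical toy data set and learning rate), not a tuned implementation.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def hidden_probs(x, W, b):
    return [sigmoid(b[j] + sum(W[j][k] * x[k] for k in range(len(x))))
            for j in range(len(b))]


def visible_probs(h, W, c):
    return [sigmoid(c[k] + sum(W[j][k] * h[j] for j in range(len(h))))
            for k in range(len(c))]


def sample(probs, rng):
    return [1 if rng.random() < q else 0 for q in probs]


def cd1_update(x, W, b, c, lr, rng):
    """One CD-1 step on a binary example x; updates W, b, c in place."""
    ph_data = hidden_probs(x, W, b)                       # positive phase
    h_sample = sample(ph_data, rng)
    x_model = sample(visible_probs(h_sample, W, c), rng)  # one Gibbs step
    ph_model = hidden_probs(x_model, W, b)                # negative phase
    for j in range(len(b)):
        for k in range(len(c)):
            W[j][k] += lr * (ph_data[j] * x[k] - ph_model[j] * x_model[k])
        b[j] += lr * (ph_data[j] - ph_model[j])
    for k in range(len(c)):
        c[k] += lr * (x[k] - x_model[k])


# Toy setup (assumed values): 2 hidden units, 3 visible units.
rng = random.Random(0)
W = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b = [0.0, 0.0]
c = [0.0, 0.0, 0.0]
data = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
for _ in range(200):
    for x in data:
        cd1_update(x, W, b, c, lr=0.1, rng=rng)
```

The positive phase pulls the parameters toward statistics measured on the data, and the negative phase pushes them away from statistics measured on the model's one-step reconstruction.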
Neural networks [5.7] : Restricted Boltzmann machine - example. Example on MNIST data set.
Neural networks [5.8] : Restricted Boltzmann machine - extensions
The Gaussian-Bernoulli RBM allows for real-valued inputs.
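One common formulation, sketched here under the assumption of unit-variance visible units and the notation $W$ (weights), $\mathbf{b}$ (hidden bias), $\mathbf{c}$ (visible bias), adds a quadratic term in $\mathbf{x}$ to the binary RBM energy:

```latex
\[
E(\mathbf{x}, \mathbf{h})
  = -\mathbf{h}^\top W \mathbf{x} - \mathbf{c}^\top \mathbf{x} - \mathbf{b}^\top \mathbf{h}
    + \tfrac{1}{2}\,\mathbf{x}^\top \mathbf{x},
\qquad
p(\mathbf{x} \mid \mathbf{h})
  = \mathcal{N}\big(\mathbf{x};\ \mathbf{c} + W^\top \mathbf{h},\ I\big)
\]
```

With this energy $p(\mathbf{h} \mid \mathbf{x})$ keeps the same sigmoid form as in the binary case, while $p(\mathbf{x} \mid \mathbf{h})$ becomes a Gaussian whose mean is shifted by the active hidden units. Under the unit-variance assumption the inputs would typically be standardized first.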