Restricted Boltzmann machine

cosmos 30th May 2018 at 1:09am
Boltzmann machine

aka RBM

definition. An RBM is an undirected graphical model that defines a distribution over an input vector $\mathbf{x}$. It models this distribution using a hidden layer of binary units (latent variables, $\mathbf{h}$) and an Energy function.

We first assume that the input $\mathbf{x}$ is composed of binary variables.

Energy function

$$E(\mathbf{x}, \mathbf{h}) = -\mathbf{h}^T \mathbf{W} \mathbf{x} - \mathbf{c}^T \mathbf{x} - \mathbf{b}^T \mathbf{h} = -\sum\limits_{j,i} W_{ji} h_j x_i - \sum\limits_i c_i x_i - \sum\limits_j b_j h_j$$
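
As a quick illustration, here is a minimal sketch of the energy computation in plain Python. It assumes the convention that `W[j][i]` couples hidden unit `j` to visible unit `i`, with `b` the hidden biases and `c` the visible biases. The function name `energy` and the toy parameter values are made up for this example.

```python
def energy(x, h, W, b, c):
    """E(x, h) = -h^T W x - c^T x - b^T h for binary sequences x (visible) and h (hidden).

    W[j][i] couples hidden unit j to visible unit i (assumed convention);
    b holds the hidden biases, c the visible biases.
    """
    interaction = sum(W[j][i] * h[j] * x[i]
                      for j in range(len(h)) for i in range(len(x)))
    visible_bias = sum(c_i * x_i for c_i, x_i in zip(c, x))
    hidden_bias = sum(b_j * h_j for b_j, h_j in zip(b, h))
    return -interaction - visible_bias - hidden_bias


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[1.0, -2.0, 0.5],
     [0.0, 1.5, -1.0]]
b = [0.5, -0.5]
c = [0.1, 0.2, -0.3]

print(energy([1, 0, 1], [1, 0], W, b, c))
```

Lower energy means higher probability, so configurations where active hidden and visible units share large positive weights are favoured.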

Probability distribution

It's a Boltzmann distribution (hence the name):

$$p(\mathbf{x}, \mathbf{h}) = \exp(-E(\mathbf{x}, \mathbf{h}))/Z$$

where $Z = \sum_{\mathbf{x}, \mathbf{h}} \exp(-E(\mathbf{x}, \mathbf{h}))$ is the partition function.
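
For a model small enough to enumerate, the normalizer $Z$ can be computed by brute force. The sketch below does that for a toy RBM and checks that the joint probabilities sum to one. It assumes `W[j][i]` couples hidden unit `j` to visible unit `i`; all function names and parameter values are illustrative only. The cost grows as $2^{\text{visible} + \text{hidden}}$, so this is only feasible for tiny models.

```python
import math
from itertools import product


def energy(x, h, W, b, c):
    """E(x, h) = -h^T W x - c^T x - b^T h, with W[j][i] linking hidden j to visible i."""
    interaction = sum(W[j][i] * h[j] * x[i]
                      for j in range(len(h)) for i in range(len(x)))
    return (-interaction
            - sum(c_i * x_i for c_i, x_i in zip(c, x))
            - sum(b_j * h_j for b_j, h_j in zip(b, h)))


def partition_function(W, b, c):
    """Sum exp(-E) over every binary configuration of visible and hidden units."""
    n_hidden, n_visible = len(b), len(c)
    return sum(math.exp(-energy(x, h, W, b, c))
               for x in product((0, 1), repeat=n_visible)
               for h in product((0, 1), repeat=n_hidden))


def joint_probability(x, h, W, b, c, Z):
    """p(x, h) = exp(-E(x, h)) / Z."""
    return math.exp(-energy(x, h, W, b, c)) / Z


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[1.0, -2.0, 0.5],
     [0.0, 1.5, -1.0]]
b = [0.5, -0.5]
c = [0.1, 0.2, -0.3]

Z = partition_function(W, b, c)
total = sum(joint_probability(x, h, W, b, c, Z)
            for x in product((0, 1), repeat=len(c))
            for h in product((0, 1), repeat=len(b)))
print(Z, total)
```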

Markov network representation

Video

Factor graph

video

Inference on RBMs

video

Conditional inference

video

derivation of $p(\mathbf{h}|\mathbf{x})$, although the result also follows from the Local Markov property
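
The result of that derivation is that the conditionals factorize over units: $p(h_j = 1 | \mathbf{x}) = \mathrm{sigm}(b_j + \sum_i W_{ji} x_i)$ and, by symmetry, $p(x_i = 1 | \mathbf{h}) = \mathrm{sigm}(c_i + \sum_j W_{ji} h_j)$. A minimal sketch in plain Python, assuming `W[j][i]` couples hidden unit `j` to visible unit `i` (function names and toy values are made up for this example):

```python
import math


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def p_h_given_x(x, W, b):
    """Return [p(h_j = 1 | x) for each hidden unit j]."""
    return [sigmoid(b[j] + sum(W[j][i] * x[i] for i in range(len(x))))
            for j in range(len(b))]


def p_x_given_h(h, W, c):
    """Return [p(x_i = 1 | h) for each visible unit i]."""
    return [sigmoid(c[i] + sum(W[j][i] * h[j] for j in range(len(h))))
            for i in range(len(c))]


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[1.0, -2.0, 0.5],
     [0.0, 1.5, -1.0]]
b = [0.5, -0.5]
c = [0.1, 0.2, -0.3]

print(p_h_given_x([1, 0, 1], W, b))
print(p_x_given_h([1, 0], W, c))
```

Because each unit's conditional depends only on the other layer, a whole layer can be sampled in one step, which is what makes block Gibbs sampling in an RBM cheap.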

Free energy in an RBM

The free energy arises when we marginalize out $\mathbf{h}$ to get the distribution $p(\mathbf{x})$, which is the one we are most interested in.

derivation. The function in the sum is known as softplus, vid
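
For reference, a sketch of how that derivation goes, assuming the energy $E(\mathbf{x}, \mathbf{h}) = -\mathbf{h}^T \mathbf{W} \mathbf{x} - \mathbf{c}^T \mathbf{x} - \mathbf{b}^T \mathbf{h}$ and writing $\mathbf{W}_{j\cdot}$ for row $j$ of $\mathbf{W}$ (notation introduced here). Since the energy is linear in each $h_j$, the sum over $\mathbf{h}$ factorizes into a product over hidden units:

$$
\begin{aligned}
p(\mathbf{x}) &= \sum_{\mathbf{h}} \exp(-E(\mathbf{x}, \mathbf{h}))/Z \\
&= \exp(\mathbf{c}^T \mathbf{x}) \prod_j \sum_{h_j \in \{0,1\}} \exp\big(h_j (b_j + \mathbf{W}_{j\cdot} \mathbf{x})\big) / Z \\
&= \exp\Big(\mathbf{c}^T \mathbf{x} + \sum_j \log\big(1 + \exp(b_j + \mathbf{W}_{j\cdot} \mathbf{x})\big)\Big) / Z \\
&= \exp(-F(\mathbf{x}))/Z
\end{aligned}
$$

so the free energy is $F(\mathbf{x}) = -\mathbf{c}^T \mathbf{x} - \sum_j \mathrm{softplus}(b_j + \mathbf{W}_{j\cdot} \mathbf{x})$, with $\mathrm{softplus}(a) = \log(1 + e^a)$.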

The softplus function can often be approximated by the ReLU
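
To get a feel for how close the two functions are, the sketch below compares them numerically. The gap $\mathrm{softplus}(a) - \max(0, a)$ equals $\log(1 + e^{-|a|})$, so it is largest at $a = 0$ (where it is $\log 2$) and decays exponentially away from zero. The function names are chosen for this example.

```python
import math


def softplus(a):
    """log(1 + exp(a)), written in a form that does not overflow for large |a|."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def relu(a):
    """Rectified linear unit, max(0, a)."""
    return max(a, 0.0)


for a in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(a, softplus(a), relu(a), softplus(a) - relu(a))
```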

We often can't compute $p(\mathbf{x})$ directly because computing the partition function is intractable. However, the expression helps us understand which $\mathbf{x}$ the model makes more or less likely.
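
Even without $Z$, the free energy lets us rank inputs, since $p(\mathbf{x}) \propto \exp(-F(\mathbf{x}))$ with $F(\mathbf{x}) = -\mathbf{c}^T \mathbf{x} - \sum_j \mathrm{softplus}(b_j + \sum_i W_{ji} x_i)$. A minimal sketch, assuming `W[j][i]` couples hidden unit `j` to visible unit `i`, with made-up toy parameters:

```python
import math
from itertools import product


def softplus(a):
    """log(1 + exp(a)) in an overflow-safe form."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def free_energy(x, W, b, c):
    """F(x) = -c^T x - sum_j softplus(b_j + sum_i W[j][i] * x[i])."""
    visible_term = sum(c_i * x_i for c_i, x_i in zip(c, x))
    hidden_term = sum(
        softplus(b[j] + sum(W[j][i] * x[i] for i in range(len(x))))
        for j in range(len(b)))
    return -visible_term - hidden_term


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[1.0, -2.0, 0.5],
     [0.0, 1.5, -1.0]]
b = [0.5, -0.5]
c = [0.1, 0.2, -0.3]

# Sort all visible configurations from most to least likely (lowest F first).
ranked = sorted(product((0, 1), repeat=len(c)),
                key=lambda x: free_energy(x, W, b, c))
for x in ranked:
    print(x, free_energy(x, W, b, c))
```

Each softplus term is large when the input strongly activates hidden unit `j`, which is one way to read the hidden units as feature detectors.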

The hidden units basically represent features that we expect to observe in $\mathbf{x}$.

Training an RBM

A Practical Guide to Training Restricted Boltzmann Machines

Contrastive divergence
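
A minimal sketch of one common variant, CD-1 (a single Gibbs step per update), for a binary RBM on one training example at a time. It assumes `W[j][i]` couples hidden unit `j` to visible unit `i`; the function names, learning rate, initialization, and toy data are all illustrative choices, not taken from the linked material. Plain lists are used to keep the example dependency-free; a practical implementation would use array operations and mini-batches.

```python
import math
import random


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def hidden_probs(x, W, b):
    """p(h_j = 1 | x) for every hidden unit j."""
    return [sigmoid(b[j] + sum(W[j][i] * x[i] for i in range(len(x))))
            for j in range(len(b))]


def visible_probs(h, W, c):
    """p(x_i = 1 | h) for every visible unit i."""
    return [sigmoid(c[i] + sum(W[j][i] * h[j] for j in range(len(h))))
            for i in range(len(c))]


def sample(probs, rng):
    """Draw independent Bernoulli samples with the given probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_update(x, W, b, c, lr, rng):
    """One CD-1 step on a single binary example x; updates W, b, c in place."""
    ph_data = hidden_probs(x, W, b)                 # positive phase
    h_sample = sample(ph_data, rng)
    x_neg = sample(visible_probs(h_sample, W, c), rng)  # one Gibbs step
    ph_neg = hidden_probs(x_neg, W, b)              # negative phase
    for j in range(len(b)):
        for i in range(len(c)):
            W[j][i] += lr * (ph_data[j] * x[i] - ph_neg[j] * x_neg[i])
        b[j] += lr * (ph_data[j] - ph_neg[j])
    for i in range(len(c)):
        c[i] += lr * (x[i] - x_neg[i])
    return x_neg


# Hypothetical toy data: two complementary 4-bit patterns.
rng = random.Random(0)
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
n_visible, n_hidden = 4, 2
W = [[rng.gauss(0.0, 0.01) for _ in range(n_visible)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
c = [0.0] * n_visible

for epoch in range(200):
    for x in data:
        cd1_update(x, W, b, c, lr=0.1, rng=rng)

print(W)
```

The update moves the parameters toward statistics measured on the data and away from statistics measured on the one-step reconstruction `x_neg`, which stands in for a sample from the model.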

Minimizing Relative entropy

The coupling parameters between the visible and hidden layers are chosen using a variational procedure that minimizes the Kullback-Leibler divergence (i.e. relative entropy) between the “true” probability distribution of the data and the variational distribution obtained by marginalizing over the hidden units.
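
Written out, with $p_{\text{data}}$ for the data distribution and $p_\theta(\mathbf{x})$ for the RBM marginal over the hidden units (both symbols are notation introduced here), the objective is:

$$
\mathrm{KL}(p_{\text{data}} \,\|\, p_\theta) = \sum_{\mathbf{x}} p_{\text{data}}(\mathbf{x}) \log \frac{p_{\text{data}}(\mathbf{x})}{p_\theta(\mathbf{x})} = -H(p_{\text{data}}) - \mathbb{E}_{p_{\text{data}}}\big[\log p_\theta(\mathbf{x})\big]
$$

The entropy term $H(p_{\text{data}})$ does not depend on the parameters, so minimizing this divergence is equivalent to maximizing the expected log-likelihood of the data under the model.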


Neural networks [5.7] : Restricted Boltzmann machine - example. Example on MNIST data set.

Debugging RBMs

Extensions

Neural networks [5.8] : Restricted Boltzmann machine - extensions

Gaussian-Bernoulli RBM allows for real inputs
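
One common formulation, assuming the inputs have been standardized to unit variance, adds a quadratic term in $\mathbf{x}$ to the binary-RBM energy:

$$
E(\mathbf{x}, \mathbf{h}) = -\mathbf{h}^T \mathbf{W} \mathbf{x} - \mathbf{c}^T \mathbf{x} - \mathbf{b}^T \mathbf{h} + \tfrac{1}{2} \mathbf{x}^T \mathbf{x}
$$

Under this energy, $p(\mathbf{x} | \mathbf{h})$ becomes a Gaussian with mean $\mathbf{c} + \mathbf{W}^T \mathbf{h}$ and identity covariance, while $p(\mathbf{h} | \mathbf{x})$ keeps the same sigmoid form as in the binary case. Other parameterizations (for example with learned per-unit variances) exist, so treat this as a sketch rather than the only definition.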

Deep belief network

Extended Mean Field RBM

Temperature based RBM

Transductive Boltzmann machine