Restricted Boltzmann machine: Cosmos — All that is, or was, or ever will be

Restricted Boltzmann machine

cosmos 30th May 2018 at 1:09am

aka RBM

definition. An RBM is an undirected graphical model that defines a distribution over an input vector $\mathbf{x}$. It models this distribution using a hidden layer of binary units (latent variables, $\mathbf{h}$) and an energy function.

We first assume that the input $\mathbf{x}$ is composed of binary variables.

Energy function

$E(\mathbf{x}, \mathbf{h}) = -\mathbf{h}^T \mathbf{W} \mathbf{x} - \mathbf{c}^T \mathbf{x} - \mathbf{b}^T \mathbf{h}$ $= -\sum\limits_{j,k} W_{jk} h_j x_k -\sum\limits_k c_k x_k -\sum\limits_j b_j h_j$
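As a minimal sketch, the energy can be evaluated directly from its summation form in plain Python. This assumes $\mathbf{W}$ has one row per hidden unit and one column per visible unit (so that $\mathbf{h}^T \mathbf{W} \mathbf{x}$ is well defined), with $j$ indexing hidden units and $k$ indexing visible units. The parameter values are arbitrary illustrations, not taken from a trained model.

```python
def energy(x, h, W, b, c):
    """E(x, h) = -h^T W x - c^T x - b^T h for binary lists x and h.

    W[j][k] couples hidden unit j to visible unit k;
    b holds the hidden biases and c the visible biases.
    """
    interaction = sum(W[j][k] * h[j] * x[k]
                      for j in range(len(h))
                      for k in range(len(x)))
    visible_bias = sum(ck * xk for ck, xk in zip(c, x))
    hidden_bias = sum(bj * hj for bj, hj in zip(b, h))
    return -interaction - visible_bias - hidden_bias


# Toy model (assumed values): 3 visible units, 2 hidden units.
W = [[0.5, -1.0, 0.0],
     [1.5, 0.0, 2.0]]
b = [0.1, -0.2]
c = [0.0, 0.3, -0.4]

print(energy([1, 0, 1], [1, 1], W, b, c))
```

Lower energy means higher probability, so configurations in which active hidden units have large positive weights to active visible units are favoured.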

Probability distribution

It's a Boltzmann distribution (hence the name):

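$p(\mathbf{x}, \mathbf{h}) = \exp(-E(\mathbf{x}, \mathbf{h}))/Z$, where $Z = \sum\limits_{\mathbf{x}, \mathbf{h}} \exp(-E(\mathbf{x}, \mathbf{h}))$ is the partition function.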
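For a model small enough to enumerate, the joint distribution can be computed by brute force. The sketch below sums $\exp(-E)$ over all $2^{D+H}$ binary configurations to get the normalizer $Z$, which is only feasible for toy sizes. The helper names (`all_binary`, `partition_function`, `joint_prob`) and the parameter values are illustrative assumptions.

```python
import itertools
import math


def energy(x, h, W, b, c):
    """E(x, h) = -h^T W x - c^T x - b^T h, with W[j][k] linking h_j and x_k."""
    interaction = sum(W[j][k] * h[j] * x[k]
                      for j in range(len(h))
                      for k in range(len(x)))
    return (-interaction
            - sum(ck * xk for ck, xk in zip(c, x))
            - sum(bj * hj for bj, hj in zip(b, h)))


def all_binary(n):
    """Iterate over every binary tuple of length n."""
    return itertools.product([0, 1], repeat=n)


def partition_function(W, b, c):
    """Z = sum over all (x, h) of exp(-E(x, h)); exponential cost."""
    return sum(math.exp(-energy(x, h, W, b, c))
               for x in all_binary(len(c))
               for h in all_binary(len(b)))


def joint_prob(x, h, W, b, c, Z):
    """p(x, h) = exp(-E(x, h)) / Z."""
    return math.exp(-energy(x, h, W, b, c)) / Z


# Toy model (assumed values): 3 visible units, 2 hidden units.
W = [[0.5, -1.0, 0.0],
     [1.5, 0.0, 2.0]]
b = [0.1, -0.2]
c = [0.0, 0.3, -0.4]

Z = partition_function(W, b, c)
print(joint_prob((1, 0, 1), (1, 1), W, b, c, Z))
```

The cost of `partition_function` doubles with every added unit, which is why $Z$ is treated as intractable for realistic models.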

Markov network representation

Video

Factor graph

video

Inference on RBMs

video

Conditional inference

video

Derivation of $p(\mathbf{h} \mid \mathbf{x})$. The fact that it factorizes over the hidden units also follows directly from the Local Markov property.
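Because the energy contains no hidden-hidden or visible-visible terms, both conditionals factorize into independent Bernoulli variables: $p(h_j = 1 \mid \mathbf{x}) = \sigma(b_j + \mathbf{W}_{j\cdot} \mathbf{x})$ and $p(x_k = 1 \mid \mathbf{h}) = \sigma(c_k + \mathbf{h}^T \mathbf{W}_{\cdot k})$, where $\sigma$ is the logistic sigmoid. A minimal sketch, assuming $\mathbf{W}$ is stored as one row per hidden unit and using made-up parameter values:

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def p_h_given_x(x, W, b):
    """Return [p(h_j = 1 | x)] for every hidden unit j."""
    return [sigmoid(b[j] + sum(W[j][k] * x[k] for k in range(len(x))))
            for j in range(len(b))]


def p_x_given_h(h, W, c):
    """Return [p(x_k = 1 | h)] for every visible unit k."""
    return [sigmoid(c[k] + sum(W[j][k] * h[j] for j in range(len(h))))
            for k in range(len(c))]


# Toy model (assumed values): 3 visible units, 2 hidden units.
W = [[0.5, -1.0, 0.0],
     [1.5, 0.0, 2.0]]
b = [0.1, -0.2]
c = [0.0, 0.3, -0.4]

print(p_h_given_x([1, 0, 1], W, b))
print(p_x_given_h([1, 1], W, c))
```

Each conditional costs one matrix-vector product, which is what makes block Gibbs sampling in an RBM cheap.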

Free energy in an RBM

The free energy arises when we marginalize out the hidden units $\mathbf{h}$ to get the distribution $p(\mathbf{x})$, which is the one we are most interested in.

Derivation. The function that appears in the sum over hidden units is known as softplus, $\operatorname{softplus}(z) = \log(1 + e^z)$.
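A sketch of the marginalization, using the energy function defined earlier and writing $H$ for the number of hidden units and $\mathbf{W}_{j\cdot}$ for the $j$-th row of $\mathbf{W}$ (both notational assumptions). The sum over $\mathbf{h}$ splits into a product because each $h_j$ appears in its own factor:

$$
\begin{aligned}
p(\mathbf{x}) &= \sum_{\mathbf{h} \in \{0,1\}^H} \frac{\exp(-E(\mathbf{x}, \mathbf{h}))}{Z} \\
&= \frac{\exp(\mathbf{c}^T \mathbf{x})}{Z} \prod_{j=1}^{H} \sum_{h_j \in \{0,1\}} \exp\big(h_j (b_j + \mathbf{W}_{j\cdot} \mathbf{x})\big) \\
&= \frac{\exp(\mathbf{c}^T \mathbf{x})}{Z} \prod_{j=1}^{H} \big(1 + \exp(b_j + \mathbf{W}_{j\cdot} \mathbf{x})\big) \\
&= \frac{\exp(-F(\mathbf{x}))}{Z}, \qquad F(\mathbf{x}) = -\mathbf{c}^T \mathbf{x} - \sum_{j=1}^{H} \operatorname{softplus}(b_j + \mathbf{W}_{j\cdot} \mathbf{x})
\end{aligned}
$$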

The softplus function can often be approximated by the ReLU, $\max(0, z)$.
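To see how close the approximation is, the sketch below compares the two functions at a few points. The gap equals $\log(1 + e^{-|z|})$, so it is largest at $z = 0$ (where it is $\log 2$) and shrinks quickly as $|z|$ grows. The function names are illustrative.

```python
import math


def softplus(z):
    """log(1 + exp(z)), rewritten as max(z, 0) + log(1 + exp(-|z|)) for stability."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(z, 0.0)


for z in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    gap = softplus(z) - relu(z)
    print(f"z={z:6.1f}  softplus={softplus(z):.5f}  relu={relu(z):.5f}  gap={gap:.5f}")
```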

We often can't compute $p(\mathbf{x})$ directly because computing the partition function is intractable. However, the free-energy expression helps us understand which inputs $\mathbf{x}$ the model makes more likely and which it makes less likely.
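Even without $Z$, the free energy $F(\mathbf{x}) = -\mathbf{c}^T \mathbf{x} - \sum_j \operatorname{softplus}(b_j + \mathbf{W}_{j\cdot} \mathbf{x})$ can be used to rank inputs, since $p(\mathbf{x}) \propto \exp(-F(\mathbf{x}))$. The following sketch does this for a toy model, with assumed parameter values and $\mathbf{W}$ stored as one row per hidden unit:

```python
import itertools
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def free_energy(x, W, b, c):
    """F(x) = -c^T x - sum_j softplus(b_j + W_j . x)."""
    visible_term = sum(ck * xk for ck, xk in zip(c, x))
    hidden_term = sum(
        softplus(b[j] + sum(W[j][k] * x[k] for k in range(len(x))))
        for j in range(len(b)))
    return -visible_term - hidden_term


# Toy model (assumed values): 3 visible units, 2 hidden units.
W = [[0.5, -1.0, 0.0],
     [1.5, 0.0, 2.0]]
b = [0.1, -0.2]
c = [0.0, 0.3, -0.4]

# Lowest free energy first, i.e. most probable input first.
ranked = sorted(itertools.product([0, 1], repeat=3),
                key=lambda x: free_energy(x, W, b, c))
for x in ranked:
    print(x, round(free_energy(x, W, b, c), 3))
```

Differences in free energy between two inputs give their log probability ratio exactly, because $Z$ cancels.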

The hidden units basically represent features that we expect to observe in $x$

Training an RBM

A Practical Guide to Training Restricted Boltzmann Machines

Contrastive divergence
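A minimal sketch of one common variant, CD-1, for a binary RBM: start a Gibbs chain at a training example, take a single step $\mathbf{x} \to \mathbf{h} \to \tilde{\mathbf{x}}$, and move the parameters towards the data statistics and away from the reconstruction statistics. The learning rate, seed, initialization, and toy data below are assumptions chosen for illustration, and `cd1_update` is a hypothetical helper name.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hidden_probs(x, W, b):
    """[p(h_j = 1 | x)] for every hidden unit j."""
    return [sigmoid(b[j] + sum(W[j][k] * x[k] for k in range(len(x))))
            for j in range(len(b))]


def visible_probs(h, W, c):
    """[p(x_k = 1 | h)] for every visible unit k."""
    return [sigmoid(c[k] + sum(W[j][k] * h[j] for j in range(len(h))))
            for k in range(len(c))]


def sample(probs, rng):
    """Draw independent Bernoulli samples with the given probabilities."""
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_update(x, W, b, c, lr, rng):
    """One CD-1 step on a single binary example; updates W, b, c in place."""
    # Positive phase: hidden probabilities driven by the data.
    ph_data = hidden_probs(x, W, b)
    h_sample = sample(ph_data, rng)
    # Negative phase: one Gibbs step produces a reconstruction x_neg.
    x_neg = sample(visible_probs(h_sample, W, c), rng)
    ph_neg = hidden_probs(x_neg, W, b)
    # Approximate gradient: data statistics minus reconstruction statistics.
    for j in range(len(b)):
        for k in range(len(c)):
            W[j][k] += lr * (ph_data[j] * x[k] - ph_neg[j] * x_neg[k])
        b[j] += lr * (ph_data[j] - ph_neg[j])
    for k in range(len(c)):
        c[k] += lr * (x[k] - x_neg[k])
    return x_neg


rng = random.Random(0)
D, H = 4, 2
# Small random weights break the symmetry between hidden units.
W = [[rng.gauss(0.0, 0.01) for _ in range(D)] for _ in range(H)]
b = [0.0] * H
c = [0.0] * D
data = [[1, 1, 0, 0], [0, 0, 1, 1]]  # two toy patterns

for epoch in range(200):
    for x in data:
        cd1_update(x, W, b, c, lr=0.1, rng=rng)

print([round(p, 2) for p in hidden_probs(data[0], W, b)])
print([round(p, 2) for p in hidden_probs(data[1], W, b)])
```

Using hidden probabilities rather than sampled states in the update statistics reduces sampling noise. Running more Gibbs steps before collecting the negative statistics gives CD-k.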

Minimizing Relative entropy

The coupling parameters between the visible and hidden layers are chosen using a variational procedure that minimizes the Kullback-Leibler divergence (i.e. relative entropy) between the “true” probability distribution of the data and the variational distribution obtained by marginalizing over the hidden units.
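Written out, with $p_{\text{data}}$ denoting the data distribution and $p$ the model's marginal over $\mathbf{x}$ (notation assumed here), the divergence splits into a term that does not depend on the parameters and the negative expected log-likelihood:

$$
\mathrm{KL}(p_{\text{data}} \,\|\, p) = \sum_{\mathbf{x}} p_{\text{data}}(\mathbf{x}) \log p_{\text{data}}(\mathbf{x}) - \sum_{\mathbf{x}} p_{\text{data}}(\mathbf{x}) \log p(\mathbf{x})
$$

Minimizing the divergence is therefore equivalent to maximizing the expected log-likelihood. For the energy function defined earlier, the gradient for a single weight is

$$
\frac{\partial \log p(\mathbf{x})}{\partial W_{jk}} = p(h_j = 1 \mid \mathbf{x})\, x_k - \mathbb{E}_{\mathbf{x}', \mathbf{h}' \sim p}\big[ h'_j x'_k \big]
$$

The first term can be computed exactly from the data. The second is an expectation under the model, which is intractable in general and is the part that sampling-based methods approximate.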

Neural networks [5.7] : Restricted Boltzmann machine - example. Example on MNIST data set.

Debugging RBMs

Extensions

Neural networks [5.8] : Restricted Boltzmann machine - extensions

Gaussian-Bernoulli RBM allows for real inputs

Deep belief network

Extended Mean Field RBM

Temperature based RBM

Transductive Boltzmann machine