Graphical model

Machine learning Probabilistic model

A probabilistic graphical model is a Model that represents a Joint probability distribution (joint PD) over a set of Random variables in a way that takes into account the dependencies (and, in some variants, causal relations) among them. The models are called graphical because these dependencies are represented using Graphs, which allow for sparsely parametrized representations of the joint PD, and enable many useful Algorithms for inference and learning.
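To see the sparsity gain concretely (a standard counting argument; the chain structure here is just one example of a graph): a full joint table over $n$ binary variables needs $2^n - 1$ parameters, but if the graph factorizes the joint PD as a chain,

$$P(x_1, \dots, x_n) = P(x_1) \prod_{i=2}^{n} P(x_i \mid x_{i-1}),$$

then $1 + 2(n-1)$ parameters suffice (one for $P(x_1)$, two for each conditional).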

Factors are functions of subsets of the random variables, from which the joint PD is built (typically as a product, normalized if necessary). One can apply conditioning/reduction and marginalization to these factors. The reduction operation is like currying (partial application) in Functional programming.
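A minimal sketch of these two operations on a discrete factor stored as a table (the class and variable names here are illustrative, not from any particular library):

```python
class Factor:
    """A discrete factor: a table mapping assignments of `variables` to reals."""

    def __init__(self, variables, table):
        self.variables = list(variables)   # e.g. ["A", "B"]
        self.table = dict(table)           # e.g. {(0, 0): 0.3, (0, 1): 0.7, ...}

    def reduce(self, var, value):
        """Condition on var = value: keep consistent rows, drop the variable.
        Like currying: a function of (A, B) becomes a function of B alone."""
        i = self.variables.index(var)
        rest = self.variables[:i] + self.variables[i + 1:]
        table = {k[:i] + k[i + 1:]: v
                 for k, v in self.table.items() if k[i] == value}
        return Factor(rest, table)

    def marginalize(self, var):
        """Sum var out, leaving a factor over the remaining variables."""
        i = self.variables.index(var)
        rest = self.variables[:i] + self.variables[i + 1:]
        table = {}
        for k, v in self.table.items():
            key = k[:i] + k[i + 1:]
            table[key] = table.get(key, 0.0) + v
        return Factor(rest, table)
```

For example, `Factor(["A", "B"], {(0, 0): 0.3, (0, 1): 0.7, (1, 0): 0.2, (1, 1): 0.8}).reduce("A", 1)` yields a factor over B alone with table `{(0,): 0.2, (1,): 0.8}`.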

http://cs.brown.edu/courses/cs242/lectures/

Representation

Coursera course
Knowledge engineering

Graphs:

See here for the distinction between directed and undirected graphical models. The difference is that in a directed graphical model the factors corresponding to the edges are normalized, because they are Conditional probabilities (of each node given its parents), so their product is automatically a valid joint PD; an undirected model uses arbitrary nonnegative factors and needs a global normalization constant.
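Concretely, for two variables (a standard textbook example): the directed model writes $P(A, B) = P(A)\,P(B \mid A)$, where each factor sums to 1 over its child variable, so the product is already normalized; the undirected model writes $P(A, B) = \frac{1}{Z}\,\phi(A, B)$ with an arbitrary nonnegative factor $\phi$ and partition function $Z = \sum_{a, b} \phi(a, b)$.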

Directed graphical models (Bayesian nets)

Template models

Ways of representing graphical models that have a lot of internal shared structure (repeated variables and topologies), like events that occur over time, or relation types found over and over in a graph. See vid.
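For instance, a Hidden Markov model is a template model: one transition CPD and one prior are reused at every time step. A minimal sketch of unrolling such a template into a ground network (the function name and parameter values are made up for illustration):

```python
import numpy as np

# Template: a prior P(X_0) and one shared transition CPD P(X_t | X_{t-1}),
# reused at every time step when the model is unrolled.
prior = np.array([0.6, 0.4])
transition = np.array([[0.7, 0.3],    # row i: P(X_t = j | X_{t-1} = i)
                       [0.2, 0.8]])

def unroll_marginals(prior, transition, T):
    """Ground the template over T time steps and return P(X_t) for each t."""
    marginals = [prior]
    for _ in range(1, T):
        marginals.append(marginals[-1] @ transition)
    return marginals

for t, m in enumerate(unroll_marginals(prior, transition, 4)):
    print(f"P(X_{t}) = {m}")
```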

An important class is those with Structured CPDs
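A classic example of a structured CPD is the noisy-OR, which specifies $P(Y = 1 \mid \text{parents})$ with one parameter per parent instead of a full exponential table (sketch; the parameter values are made up):

```python
def noisy_or(parent_values, leak=0.01, strengths=(0.8, 0.6, 0.9)):
    """P(Y = 1 | parents): each active parent i independently fails to turn
    Y on with probability 1 - strengths[i]; `leak` covers other causes.
    Needs one parameter per parent, not a 2**n-row table."""
    p_off = 1.0 - leak
    for x, s in zip(parent_values, strengths):
        if x:
            p_off *= 1.0 - s
    return 1.0 - p_off

print(noisy_or((1, 0, 1)))   # P(Y=1 | X1=1, X2=0, X3=1) = 0.9802
```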

Undirected graphical models (Markov nets)

Independencies

I-maps and perfect maps.

An I-map (independence map) for a probability distribution $P$ is any graphical model $G$ such that the set of independencies implied by the network, $I(G)$, is a subset of the set of independencies of $P$, $I(P)$ (see here), i.e. $I(G) \subseteq I(P)$. For example, a fully connected graph implies no independencies ($I(G) = \emptyset$), so it is a trivial I-map for any distribution over its variables.

A perfect (independence) map is one such that $I(G) = I(P)$

Sum-product networks

Inference

Conditional Probability Queries

Exact inference, and even approximate inference, is NP-hard in general. This comes about because the sum-product calculation in Marginalization sums over all joint assignments of the summed-out variables, which is exponentially many terms. However, this is a worst-case result; for many models arising in practice there are effective inference algorithms!
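To make the exponential blow-up concrete, here is a brute-force conditional probability query that enumerates all $2^n$ joint assignments (illustrative only; algorithms like variable elimination avoid this by pushing sums inside the product):

```python
import itertools

def naive_marginal(factors, n_vars, query_var):
    """Brute-force P(query_var) from a product of factors over binary
    variables 0..n_vars-1: sums the product over ALL 2**n_vars assignments."""
    totals = {0: 0.0, 1: 0.0}
    for assignment in itertools.product([0, 1], repeat=n_vars):  # 2**n terms
        p = 1.0
        for scope, table in factors:   # each factor: (variable indices, table)
            p *= table[tuple(assignment[i] for i in scope)]
        totals[assignment[query_var]] += p
    z = totals[0] + totals[1]          # normalize at the end
    return {v: t / z for v, t in totals.items()}

# A 3-variable chain A - B - C given by pairwise potentials (made-up numbers):
phi_ab = ((0, 1), {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0})
phi_bc = ((1, 2), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0})
print(naive_marginal([phi_ab, phi_bc], n_vars=3, query_var=0))
```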

Maximum a posteriori inference

video. What about MAP over only a subset of the unobserved variables, i.e. with the rest marginalized out (marginal MAP)? In any case, this is also NP-hard.
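In symbols, with evidence $e$ and the unobserved variables split into query variables $y$ and the rest $z$: full MAP is $\arg\max_{y, z} P(y, z \mid e)$, while the variant asked about above (marginal MAP) is $\arg\max_{y} \sum_{z} P(y, z \mid e)$.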

Algorithms

Probability query algorithms

Maximum a posteriori algorithms

Other Optimization algorithms

Viterbi algorithm
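A sketch of the Viterbi algorithm for MAP inference in a chain (HMM), via max-product dynamic programming in log space (parameter values are made up; function and variable names are illustrative):

```python
import numpy as np

def viterbi(prior, transition, emission, observations):
    """MAP state sequence argmax_x P(x | obs) for an HMM."""
    log_prior = np.log(prior)
    log_trans = np.log(transition)       # log P(x_t = j | x_{t-1} = i)
    log_emit = np.log(emission)          # log P(obs | state)
    T, S = len(observations), len(prior)
    delta = np.zeros((T, S))             # best log-prob of a path ending here
    back = np.zeros((T, S), dtype=int)   # backpointers
    delta[0] = log_prior + log_emit[:, observations[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans   # S x S: prev state x next
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emit[:, observations[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

prior = np.array([0.6, 0.4])
transition = np.array([[0.7, 0.3], [0.2, 0.8]])
emission = np.array([[0.9, 0.1], [0.3, 0.7]])    # rows: states, cols: obs
print(viterbi(prior, transition, emission, [0, 0, 1, 1]))
```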

Learning


1.0 - Welcome-Probabilistic Graphical Models - Professor Daphne Koller

Jeffrey A. Bilmes

Graphical models

They can often be represented as kinds of Artificial neural networks

Energy minimization

Composing graphical models with neural networks

https://www.vicarious.com/2017/10/26/common-sense-cortex-and-captcha/