Surprisal as a Lyapunov function in stochastic dynamics

cosmos 23rd November 2016 at 11:00pm

See Free energy principle for context.

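We can describe the stochastic dynamics using the Fokker-Planck equation for the probability density $p$, with flow $f$ and diffusion tensor $\Gamma$. Its steady-state solution satisfies: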

$$\dot{p}=0=\nabla \cdot(\Gamma \nabla - f) p$$

$$\nabla \cdot \Gamma \nabla p = p \nabla \cdot f + f \cdot \nabla p$$

$$\frac{\nabla \cdot \Gamma \nabla p}{p} = \nabla \cdot f + f \cdot \frac{\nabla p}{p}$$

$$\nabla \cdot \Gamma \nabla \ln p - \Gamma \nabla p \cdot \nabla \left(\frac{1}{p}\right) = \nabla \cdot f + f \cdot \nabla \ln p$$

$$\nabla \cdot \Gamma \nabla \ln p +\frac{1}{p^2} \Gamma \nabla p \cdot \nabla p = \nabla \cdot f + f \cdot \nabla \ln p$$

$$\nabla \cdot \Gamma \nabla \ln p +\Gamma \nabla \ln p \cdot \nabla \ln p = \nabla \cdot f + f \cdot \nabla \ln p$$
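As a quick sanity check of the last line, here is a worked one-dimensional example. It is an illustration that is not part of the original derivation: an Ornstein-Uhlenbeck process with assumed constants $a > 0$ and $\gamma > 0$, flow $f = -a x$, diffusion $\Gamma = \gamma$, and the Gaussian stationary density $p \propto \exp(-a x^2 / 2\gamma)$.

$$\ln p = -\frac{a x^2}{2\gamma} + \text{const}, \qquad \nabla \ln p = -\frac{a x}{\gamma}$$

$$\nabla \cdot \Gamma \nabla \ln p + \Gamma \nabla \ln p \cdot \nabla \ln p = -a + \frac{a^2 x^2}{\gamma}$$

$$\nabla \cdot f + f \cdot \nabla \ln p = -a + (-a x)\left(-\frac{a x}{\gamma}\right) = -a + \frac{a^2 x^2}{\gamma}$$

Both sides agree for every $x$, so this density satisfies the stationarity condition in its logarithmic form.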

We can decompose any vector field into an irrotational (curl-free) component and a solenoidal (divergence-free) component (Helmholtz decomposition). This can be expressed in the so-called standard form:

$$f=(\Gamma + Q) \nabla V$$

where $Q$ is antisymmetric and $\Gamma$ is symmetric. Substituting into the equation above, and assuming $Q$ is constant (so that $\nabla \cdot Q \nabla V = 0$ by antisymmetry):

$$\nabla \cdot \Gamma \nabla \ln p +\Gamma \nabla \ln p \cdot \nabla \ln p = \nabla \cdot \Gamma \nabla V + (\Gamma + Q) \nabla V\cdot \nabla \ln p$$

Notice that $(\Gamma + Q) \nabla V\cdot \nabla \ln p = \nabla \ln p^T (\Gamma + Q) \nabla V$. If we assume that $V=\ln p$, the $Q$ term becomes $\nabla \ln p^T Q \nabla \ln p$, which vanishes because $Q$ is antisymmetric, leaving $\nabla \ln p^T \Gamma \nabla \ln p$. Both sides of the above equation are then identical, so the condition for stationarity is satisfied.
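The stationarity claim can also be checked numerically. The sketch below is only an illustration under assumed values: the matrices `GAMMA`, `Q` and `A` are made up for the example, and $p$ is taken to be an unnormalised two-dimensional Gaussian with precision `A`. It builds the flow in standard form with $V = \ln p$ and estimates $\nabla \cdot (\Gamma \nabla p - f p)$ by central differences, which should be zero up to discretisation error.

```python
import math

# Illustrative values (assumptions, not from the source).
GAMMA = [[1.0, 0.2], [0.2, 0.5]]   # symmetric, positive definite
Q = [[0.0, 0.7], [-0.7, 0.0]]      # constant antisymmetric matrix
A = [[2.0, 0.3], [0.3, 1.0]]       # precision matrix of the Gaussian p


def matvec(m, v):
    """Matrix-vector product for small nested-list matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def p(x):
    """Unnormalised Gaussian density exp(-x^T A x / 2)."""
    return math.exp(-0.5 * dot(x, matvec(A, x)))


def grad_log_p(x):
    """Gradient of V = ln p, which equals -A x for this density."""
    return [-g for g in matvec(A, x)]


def flow(x):
    """Standard-form flow f = (Gamma + Q) grad V with V = ln p."""
    g = grad_log_p(x)
    return [a + b for a, b in zip(matvec(GAMMA, g), matvec(Q, g))]


def current(x):
    """Gamma grad p - f p, using grad p = p grad ln p."""
    px = p(x)
    diffusive = matvec(GAMMA, [px * g for g in grad_log_p(x)])
    return [d - fi * px for d, fi in zip(diffusive, flow(x))]


def stationarity_residual(x, h=1e-4):
    """Central-difference estimate of div(Gamma grad p - f p) at x."""
    total = 0.0
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        total += (current(x_plus)[i] - current(x_minus)[i]) / (2 * h)
    return total


for point in ([0.5, -0.3], [-1.0, 0.8], [1.5, 1.0]):
    print(point, stationarity_residual(point))
```

The normalisation constant of $p$ is omitted because the stationarity condition is linear in $p$. The printed residuals should be tiny compared with the size of the individual terms.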

Proof from paper: see Free_energy_principle_lemmaD1a.png and Free_energy_principle_lemmaD1b.png

This straightforward but fundamental result means that the flow of any ergodic random dynamical system can be expressed in terms of orthogonal curl- and divergence-free components, where the (dissipative) curl-free part increases value while the (conservative) divergence-free part follows isoprobability contours and does not change value. Crucially, under this decomposition value is simply negative surprise: $\ln p(x|m)= V(x) = -L(x|m)$. It is easy to show that surprise (or value) is a Lyapunov function for the policy:

$$\dot{V} = \nabla V \cdot f = \nabla V \cdot \Gamma \cdot \nabla V + \nabla V \cdot \nabla \times W = \nabla V \cdot \Gamma \cdot \nabla V \geq 0$$
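The Lyapunov property can be illustrated with a small simulation. This is a minimal sketch under assumed values: `GAMMA`, `Q`, `A`, the step size and the starting point are all made up for the example, the value is taken as $V(x) = -x^T A x / 2$ (the log of an unnormalised Gaussian), and the solenoidal part is written as $Q \nabla V$. It integrates the noise-free flow $\dot{x} = f(x)$ with forward Euler steps and records $V$ along the trajectory.

```python
# Illustrative values (assumptions, not from the source).
GAMMA = [[1.0, 0.2], [0.2, 0.5]]   # symmetric, positive definite
Q = [[0.0, 0.7], [-0.7, 0.0]]      # constant antisymmetric matrix
A = [[2.0, 0.3], [0.3, 1.0]]       # V(x) = -x^T A x / 2


def matvec(m, v):
    """Matrix-vector product for small nested-list matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def value(x):
    """Value V = ln p up to an additive constant."""
    return -0.5 * dot(x, matvec(A, x))


def grad_value(x):
    return [-g for g in matvec(A, x)]


def flow(x):
    """f = (Gamma + Q) grad V: dissipative plus solenoidal part."""
    g = grad_value(x)
    return [a + b for a, b in zip(matvec(GAMMA, g), matvec(Q, g))]


def value_rate(x):
    """dV/dt along the flow, computed as grad V . f."""
    return dot(grad_value(x), flow(x))


def dissipative_rate(x):
    """grad V . Gamma . grad V, the part that survives."""
    g = grad_value(x)
    return dot(g, matvec(GAMMA, g))


def simulate(x0, dt=0.01, steps=500):
    """Forward Euler integration of dx/dt = f(x); returns V at each step."""
    x = list(x0)
    values = [value(x)]
    for _ in range(steps):
        fx = flow(x)
        x = [xi + dt * fi for xi, fi in zip(x, fx)]
        values.append(value(x))
    return values


values = simulate([1.5, -1.0])
print("V at start:", values[0], "V at end:", values[-1])
print("monotone:", all(b >= a for a, b in zip(values, values[1:])))
print("rates:", value_rate([1.5, -1.0]), dissipative_rate([1.5, -1.0]))
```

The two printed rates should agree up to rounding, since the $Q$ contribution $\nabla V \cdot Q \nabla V$ cancels. The Euler scheme only approximates the continuous flow, so the monotone increase of $V$ relies on the step size being small.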