A partially observable Markov decision process (POMDP) is a Markov decision process in which the actor cannot observe the state directly, so the policy can only depend on a function of the state, which loses some of the state's information
A type of reinforcement learning where we don't observe the state explicitly!
We want to estimate the actual state, given noisy and incomplete measurements of it. One option is marginalization, as used in Factor analysis models, but it is very computationally expensive. Instead we use a Kalman filter model, which turns out to be a Hidden Markov model with continuous states.
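A minimal sketch of such a filter for a scalar state (all parameters and the observation sequence here are illustrative assumptions, not from the notes): the filter alternates a predict step through the assumed dynamics with an update step that folds in each noisy measurement.

```python
def kalman_filter(ys, a=1.0, q=0.01, r=0.25, mu0=0.0, p0=1.0):
    """Return filtered state means for scalar observations ys.

    Assumed model (a sketch, not a general implementation):
      state:       x_{t+1} = a * x_t + process noise (variance q)
      observation: y_t     = x_t + measurement noise (variance r)
    """
    mu, p = mu0, p0
    means = []
    for y in ys:
        # Predict: propagate the mean and variance through the dynamics.
        mu_pred = a * mu
        p_pred = a * p * a + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p_pred / (p_pred + r)
        mu = mu_pred + k * (y - mu_pred)
        p = (1 - k) * p_pred
        means.append(mu)
    return means

# Noisy observations of a (hypothetical) constant true state of 1.0.
est = kalman_filter([1.2, 0.8, 1.1, 0.9, 1.0] * 4)
```

The estimate starts at the prior mean 0.0 and is pulled toward the true state as measurements accumulate; the gain `k` shrinks as the filter becomes more confident.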
Intuition: I think this can be seen through the lens of Sufficient statistics
Kalman filter + LQR = LQG control (video: how to solve)
Separation principle of LQG control: the optimal estimator (Kalman filter) and the optimal controller (LQR) can be designed independently and then combined
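The separation principle can be sketched for a scalar system (every number below is an illustrative assumption): solve the control Riccati equation for the LQR gain, solve the filtering Riccati equation for the Kalman gain, then close the loop by feeding the state *estimate*, not the true state, to the controller.

```python
# Plant (assumed): x' = a*x + b*u + w,  observation: y = x + v.
a, b = 1.2, 1.0            # unstable open-loop dynamics
q_cost, r_cost = 1.0, 1.0  # LQR cost weights on state and control
q_w, r_v = 0.01, 0.01      # process / measurement noise variances

# 1) LQR gain from the control Riccati equation (fixed-point iteration).
s = 1.0
for _ in range(200):
    s = q_cost + a * a * s - (a * b * s) ** 2 / (r_cost + b * b * s)
k_lqr = a * b * s / (r_cost + b * b * s)

# 2) Kalman gain from the filtering Riccati equation.
p = 1.0
for _ in range(200):
    p_pred = a * a * p + q_w
    l_kf = p_pred / (p_pred + r_v)
    p = (1 - l_kf) * p_pred

# 3) Separation: the controller only ever sees the estimate x_hat.
x, x_hat = 1.0, 0.0  # true state vs. initial estimate (noise zeroed for the demo)
for _ in range(30):
    x_hat = x_hat + l_kf * (x - x_hat)  # measurement update (y = x, noiseless)
    u = -k_lqr * x_hat                  # LQR acts on the estimate
    x = a * x + b * u                   # true state evolves
    x_hat = a * x_hat + b * u           # prediction mirrors the known dynamics
```

Even though the controller never sees `x` directly, the combined loop drives the unstable plant to the origin, which is exactly what the separation principle promises: designing the two pieces independently still yields a stabilizing (and, with Gaussian noise, optimal) closed loop.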