Tabular method

cosmos 16th July 2017 at 7:35pm

Reinforcement learning method which tries to computes the exact Value function. The name refers to the fact that in general, this requires storing the value for each state (or state-action pair). Therefore, this can only be applied to finite-MDPs.

The other alternative, which can be applied to larger and more difficult problems is to approximate the value function, for instance using a neural network