A learnable component of a model-based reinforcement learning algorithm.
Recurrent Environment Simulators
The Predictron: End-To-End Learning and Planning