Double learning

cosmos 15th July 2017 at 8:48pm
Off-policy learning

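A family of methods that avoid maximization bias: the upward bias that arises when the same noisy value estimates are used both to choose the maximizing action and to evaluate it. Double learning keeps two independent estimates, using one to select the action and the other to estimate its value.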
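
As a rough illustration, here is a minimal sketch of a tabular double Q-learning update, one well-known instance of double learning. Two value tables are kept; on each step one of them picks the greedy next action and the other supplies that action's value. The function name `double_q_update`, the table layout (dicts keyed by `(state, action)`), and the parameters `alpha`, `gamma`, `done`, and `rng` are assumptions made for this example, not part of the note.

```python
import random
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Apply one tabular double Q-learning update in place (illustrative sketch)."""
    # Flip a coin to decide which table is updated on this step.
    if rng.random() < 0.5:
        q_sel, q_eval = q_a, q_b
    else:
        q_sel, q_eval = q_b, q_a

    if done:
        # No bootstrapping from a terminal next state.
        target = reward
    else:
        # The table being updated selects the greedy next action...
        best = max(actions, key=lambda a: q_sel[(next_state, a)])
        # ...but the other table supplies that action's value.
        target = reward + gamma * q_eval[(next_state, best)]

    q_sel[(state, action)] += alpha * (target - q_sel[(state, action)])


# Hypothetical usage: two zero-initialised tables and a two-action problem.
q_a = defaultdict(float)
q_b = defaultdict(float)
double_q_update(q_a, q_b, state=0, action="left", reward=1.0,
                next_state=1, actions=["left", "right"])
```

Because selection and evaluation use different tables, noise that inflates one table's estimate of an action is not automatically reflected in the value used for the target. When acting, a common choice is to behave greedily (or epsilon-greedily) with respect to the sum or average of the two tables.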