ReLU

cosmos 23rd February 2018 at 12:30am
Activation function

ReLUs don't really suffer from the vanishing gradient problem (at least not in its standard form): since ReLU(x) = max(0, x), its gradient is either 0 (inactive unit) or 1 (active unit), so the backward pass through active units multiplies by 1 rather than by a factor less than 1, as happens with saturating activations like the sigmoid. (The 0 case has its own failure mode: units stuck in the negative regime receive no gradient and can "die".) See discussion here: https://stats.stackexchange.com/questions/176794/how-does-rectilinear-activation-function-solve-the-vanishing-gradient-problem-in
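
A minimal numpy sketch of why the 0-or-1 gradient matters (a toy setup of my own, not from the linked discussion: the pre-activation is held fixed at every layer and weights are ignored, so only the activation derivatives multiply through the backward pass):

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x)
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is 1 where the unit is active (x > 0), 0 where inactive
    return np.where(x > 0, 1.0, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # at most 0.25, so deep products shrink fast

# Multiply each activation's derivative through 20 layers, holding the
# pre-activation fixed at x = 1.5 (an active unit) to isolate the
# activation's contribution to the backward pass.
x, depth = 1.5, 20
print("relu:   ", np.prod([relu_grad(x) for _ in range(depth)]))     # 1.0
print("sigmoid:", np.prod([sigmoid_grad(x) for _ in range(depth)]))  # ~3e-17
```

Through active ReLU units the product stays exactly 1 no matter the depth, while the sigmoid's derivative (bounded by 0.25) drives the product toward zero exponentially fast.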