backpropagation. An algorithm for computing the derivatives (gradients) needed by Gradient descent in Artificial neural networks.
Backpropagation. It efficiently applies the chain rule: the gradient w.r.t. the parameters at one layer is computed by reusing the gradient already computed at the layer above (deeper).
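A minimal sketch of this in NumPy, assuming a tiny two-layer network with a scalar output and squared-error loss (all names and shapes here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)        # input
W1 = rng.standard_normal((4, 3))  # first-layer weights
w2 = rng.standard_normal(4)       # second-layer weights (scalar output)
y = 1.0                           # target

# Forward pass
h = np.tanh(W1 @ x)               # hidden activations
y_hat = w2 @ h                    # scalar prediction
loss = 0.5 * (y_hat - y) ** 2

# Backward pass: apply the chain rule layer by layer, reusing the
# gradient from the layer above (deeper) at each step.
d_yhat = y_hat - y                # dL/dy_hat
dw2 = d_yhat * h                  # dL/dw2, built from dL/dy_hat
dh = d_yhat * w2                  # gradient pushed down to the hidden layer
dpre = dh * (1 - h**2)            # through tanh: tanh'(u) = 1 - tanh(u)^2
dW1 = np.outer(dpre, x)           # dL/dW1, built from the gradient above
```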
Why backprop is more efficient than the naive approach: numerically perturbing each parameter (finite differences) costs a couple of forward passes per parameter, O(n) passes for n parameters, whereas backprop yields every gradient in a single backward pass that costs about as much as one forward pass. See the sketch below.
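Continuing the sketch above, a hypothetical finite-difference comparison; note the double loop over parameters versus the one backward pass:

```python
def loss_fn(W1, w2, x, y):
    """One forward pass through the same tiny network."""
    h = np.tanh(W1 @ x)
    return 0.5 * (w2 @ h - y) ** 2

# Naive numerical gradient: two forward passes per parameter.
eps = 1e-5
dW1_num = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        Wp, Wm = W1.copy(), W1.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        dW1_num[i, j] = (loss_fn(Wp, w2, x, y) - loss_fn(Wm, w2, x, y)) / (2 * eps)

assert np.allclose(dW1_num, dW1, atol=1e-6)  # agrees with the single backward pass
```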
Derivatives w.r.t. the input show which parts of the input determine the classification, e.g. where the cat is in the image (a saliency map).
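In the sketch above, this is just one more chain-rule step past the first layer (again illustrative, not a full saliency method):

```python
# Gradient of the loss w.r.t. the input itself: large-magnitude entries
# mark the input components that most influence the output.
dx = W1.T @ dpre
```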
Backprop in the brain (see the Geoff Hinton video) and this paper: "STDP-Compatible Approximation of Backpropagation in an Energy-Based Model", http://biorxiv.org/content/early/2016/12/23/035451