Highway Networks – a new architecture designed to ease gradient-based training of very deep networks. It uses gating units that learn to regulate the flow of information through the network.
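As a rough illustration of the gating idea, here is a minimal sketch of one highway layer. It assumes the commonly cited formulation y = H(x) * T(x) + x * (1 - T(x)), with a tanh transform H and a sigmoid transform gate T. The helper names (`sigmoid`, `affine`, `highway_layer`) and the plain-list vectors are illustrative choices, not taken from the summary above.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W, b, x):
    """Compute W @ x + b for a list-of-rows matrix W and list vectors b, x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """One highway layer (assumed form): y = H(x) * T(x) + x * (1 - T(x)).

    H is the candidate transform (tanh here, an assumption) and T is the
    transform gate. Input and output must have the same dimension so the
    carry path x * (1 - T(x)) is well defined.
    """
    h = [math.tanh(z) for z in affine(W_h, b_h, x)]
    t = [sigmoid(z) for z in affine(W_t, b_t, x)]
    return [hi * ti + xi * (1.0 - ti) for hi, ti, xi in zip(h, t, x)]


# Usage: a strongly negative gate bias closes the gate, so the layer
# approximately copies its input (the "carry" behaviour).
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [0.5, -1.0]
y_carry = highway_layer(x, zeros, [0.0, 0.0], zeros, [-30.0, -30.0])
y_transform = highway_layer(x, zeros, [0.0, 0.0], zeros, [30.0, 30.0])
```

With the gate nearly closed, `y_carry` stays close to `x`. With the gate nearly open, `y_transform` is close to the transform output, which is tanh(0) = 0 here. Initializing the gate bias to a negative value is one way to make a deep stack start out close to the identity map.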
Recurrent Highway Networks – a novel theoretical analysis of recurrent networks based on Geršgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis, we propose Recurrent Highway Networks, which extend the LSTM architecture to allow step-to-step transition depths larger than one.
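To make "step-to-step transition depth larger than one" concrete, here is a hedged sketch of one recurrent time step that stacks several highway micro-layers. It assumes the coupled-gate variant (carry gate = 1 - transform gate) and feeds the input x only into the first micro-layer. The parameter layout (`W_h`, `W_t`, `R_h`, `R_t`, `b_h`, `b_t`) and the function name `rhn_step` are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix M."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def rhn_step(x, s_prev, layers):
    """One time step with transition depth len(layers) (assumed form).

    Each micro-layer l computes
        h = tanh(R_h s + b_h [+ W_h x if l == 0])
        t = sigmoid(R_t s + b_t [+ W_t x if l == 0])
        s = h * t + s * (1 - t)      # coupled carry gate, an assumption
    and the final s becomes the state handed to the next time step.
    """
    s = list(s_prev)
    for l, p in enumerate(layers):
        pre_h = [r + b for r, b in zip(matvec(p["R_h"], s), p["b_h"])]
        pre_t = [r + b for r, b in zip(matvec(p["R_t"], s), p["b_t"])]
        if l == 0:  # the external input enters only at the first micro-layer
            pre_h = [a + w for a, w in zip(pre_h, matvec(p["W_h"], x))]
            pre_t = [a + w for a, w in zip(pre_t, matvec(p["W_t"], x))]
        h = [math.tanh(z) for z in pre_h]
        t = [sigmoid(z) for z in pre_t]
        s = [hi * ti + si * (1.0 - ti) for hi, ti, si in zip(h, t, s)]
    return s


def make_layer(gate_bias, n=2):
    """Hypothetical helper: all-zero weights with a constant gate bias."""
    Z = [[0.0] * n for _ in range(n)]
    return {"W_h": Z, "W_t": Z, "R_h": Z, "R_t": Z,
            "b_h": [0.0] * n, "b_t": [gate_bias] * n}


# Usage: transition depth 3 with closed gates leaves the state unchanged.
s0 = [0.3, -0.7]
s1 = rhn_step([1.0, 2.0], s0, [make_layer(-30.0) for _ in range(3)])
```

In this sketch, a depth of one reduces to a single gated update per time step, and each additional micro-layer adds another nonlinear transformation between consecutive time steps.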
Highway and Residual Networks Learn Unrolled Iterative Estimation – Skip-connections