Toward an Integration of Deep Learning and Neuroscience
See notes in Spiking neural networks.
There may or may not be a separate circuit to compute and impose a cost function on the network, depending on the optimization mechanism.
Optimization by self-organization (without explicit supervision) is useful for unsupervised learning and minimizes implicit costs; such mechanisms can also give rise to implicit approximations of backprop, or to other algorithms that use the gradient.
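As a toy illustration of self-organization minimizing an implicit cost (my example, not from the paper): Oja's Hebbian rule uses only locally available quantities, with no separate cost-computing circuit, yet the weight vector converges to the first principal component of the input, implicitly minimizing reconstruction error.

```python
import numpy as np

# Oja's rule: a purely local Hebbian update with no explicit cost circuit.
# It nonetheless converges to the first principal component of the inputs,
# i.e., it implicitly minimizes the reconstruction error ||x - y*w||^2.
rng = np.random.default_rng(0)

# Toy data: 2-D inputs whose main variance lies along the (1, 1) direction.
X = rng.normal(size=(5000, 2)) * np.array([2.0, 0.3])
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
X = X @ np.array([[c, -s], [s, c]])

w = rng.normal(size=2)
eta = 0.01
for x in X:
    y = w @ x                      # post-synaptic activity (local)
    w += eta * y * (x - y * w)     # Hebbian growth + local normalization

print("learned direction:", w / np.linalg.norm(w))  # ~ +/-(0.707, 0.707)
```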
“Biologically plausible” mechanisms can efficiently make use of the gradient (and there are very good efficiency reasons to use the gradient whenever a supervision signal is available). One possible mechanism by which biological neural networks could approximate backpropagation is “feedback alignment” (Lillicrap et al., 2014; Liao et al., 2015). Other methods using special network organizations have been proposed; see Hinton's talk. There are also diverse mechanisms that go beyond conventional conceptions of backpropagation, like Neuromodulation by molecules.
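A minimal sketch of feedback alignment (my illustration of the mechanism in Lillicrap et al., 2014, not the authors' code): the backward pass replaces the transpose of the forward weights with a fixed random matrix B, and learning still succeeds because the forward weights come to align with B over training.

```python
import numpy as np

# Feedback alignment: the error is sent backward through a FIXED random
# matrix B instead of W2.T; the forward weights gradually align with B,
# so gradient-like learning proceeds without "weight transport".
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 10, 32, 1
W1 = rng.normal(0, 0.1, (n_hid, n_in))
W2 = rng.normal(0, 0.1, (n_out, n_hid))
B = rng.normal(0, 0.1, (n_hid, n_out))       # fixed random feedback weights

W_true = rng.normal(0, 0.3, (n_out, n_in))   # toy regression target
X = rng.normal(size=(1000, n_in))
Y = X @ W_true.T

eta = 0.05
for epoch in range(2000):
    H = np.tanh(X @ W1.T)                    # hidden activity
    e = H @ W2.T - Y                         # output error
    dH = (e @ B.T) * (1 - H ** 2)            # key step: B, not W2.T
    W2 -= eta * e.T @ H / len(X)
    W1 -= eta * dH.T @ X / len(X)

print("final MSE:", np.mean((np.tanh(X @ W1.T) @ W2.T - Y) ** 2))
```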
Temporal credit assignment is a difficult problem. Can generic recurrent networks perform temporal credit assignment in a way that is more biologically plausible than "backpropagation through time" (BPTT)? Indeed, new discoveries are being made about the capacity for supervised learning in continuous-time recurrent networks with more realistic synapses and neural integration properties. BPTT can be achieved by learning to predict the backward-through-time gradient signal (the costate) in a manner analogous to the prediction of value functions in reinforcement learning.
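To make the costate concrete, here is a minimal BPTT sketch (my illustration) for a tiny tanh RNN: the quantity lam[t] = dL/dh[t] is exactly the backward-through-time signal that a more biologically plausible learner might instead learn to predict forward in time, analogously to a value function.

```python
import numpy as np

# BPTT on a tiny tanh RNN, with the costate made explicit:
# lam[t] = dL/dh[t] obeys the backward recursion
# lam[t] = W_hh.T @ (lam[t+1] * (1 - h[t+1]**2)) plus local output gradients.
rng = np.random.default_rng(0)
n_in, n_h, T = 3, 8, 20
W_xh = rng.normal(0, 0.3, (n_h, n_in))
W_hh = rng.normal(0, 0.3, (n_h, n_h))
w_out = rng.normal(0, 0.3, n_h)

x = rng.normal(size=(T, n_in))
target = np.sin(np.arange(T) / 3.0)           # toy target sequence

# Forward pass
h = np.zeros((T + 1, n_h))
y = np.zeros(T)
for t in range(T):
    h[t + 1] = np.tanh(W_xh @ x[t] + W_hh @ h[t])
    y[t] = w_out @ h[t + 1]
err = y - target                              # dL/dy for squared loss

# Backward pass: the costate flows backward through time
lam = np.zeros((T + 1, n_h))                  # lam[t] = dL/dh[t]
dW_xh, dW_hh = np.zeros_like(W_xh), np.zeros_like(W_hh)
for t in reversed(range(T)):
    lam[t + 1] += err[t] * w_out              # local contribution from y[t]
    g = lam[t + 1] * (1 - h[t + 1] ** 2)      # through the tanh nonlinearity
    dW_xh += np.outer(g, x[t])
    dW_hh += np.outer(g, h[t])
    lam[t] = W_hh.T @ g                       # costate recursion

print("gradient norms:", np.linalg.norm(dW_xh), np.linalg.norm(dW_hh))
```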
Fast connections maintain the network in a state where slow connections have local access to a global error signal.
Spiking recurrent networks using realistic population coding schemes can, with an appropriate choice of connection weights, compute complicated, cognitively relevant functions. The question is how the developing brain efficiently learns such complex functions.
Individual neurons should not be regarded as single “nodes” but as multi-component sub-networks.
The primary function of the cortex is probably some form of unsupervised learning via prediction. Some cortical learning models are explicit attempts to map cortical structure onto the framework of message-passing algorithms for Bayesian inference.
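A minimal sketch of unsupervised learning via prediction (illustrative, in the spirit of Rao & Ballard's predictive coding, not a specific cortical model): both inference and learning are driven by the prediction error, with no labels anywhere.

```python
import numpy as np

# One-layer predictive coding: a layer predicts its input via generative
# weights W; inference settles the latents r, and learning updates W, both
# driven by the prediction error e = x - W r.
rng = np.random.default_rng(0)
n_x, n_r = 16, 4
W = rng.normal(0, 0.1, (n_x, n_r))            # top-down generative weights
W_true = rng.normal(0, 0.5, (n_x, n_r))       # hidden structure of the "world"

def infer(x, W, steps=100, eta_r=0.05):
    """Settle latent causes r by descending the error energy ||x - W r||^2."""
    r = np.zeros(n_r)
    for _ in range(steps):
        e = x - W @ r                         # prediction error
        r += eta_r * (W.T @ e)                # error-driven inference
    return r

eta_w = 0.01
for _ in range(3000):
    x = W_true @ rng.normal(size=n_r)         # sensory input with latent causes
    r = infer(x, W)
    W += eta_w * np.outer(x - W @ r, r)       # Hebbian-like learning on the error

x = W_true @ rng.normal(size=n_r)
print("relative residual:", np.linalg.norm(x - W @ infer(x, W)) / np.linalg.norm(x))
```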
As we will discuss below, some form of deep reinforcement learning may be used by the brain for purposes beyond optimizing global rewards, including the training of local networks based on diverse internally generated cost functions.
Matching the Statistics of the Input Data Using Generative Models. Message-passing implementations of probabilistic inference have also been proposed as an explanation and generalization of deep convolutional networks (Chen et al., 2014; Patel et al., 2015). Various mappings of such processes onto neural circuitry have been attempted.
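As a toy example of matching input statistics with a generative model (my illustration, unrelated to the specific message-passing proposals cited above): EM for a two-component Gaussian mixture, where the only training signal is the data's own distribution.

```python
import numpy as np

# Fit a two-component 1-D Gaussian mixture by EM. Maximizing likelihood
# pulls the model's statistics toward the statistics of the input data;
# no labels or external supervision are involved.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(3, 1.0, 500)])

mu, sigma, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(100):
    # E-step: responsibility of each component for each data point
    dens = pi * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2) / sigma
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the weighted data statistics
    Nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / Nk)
    pi = Nk / len(data)

print(mu, sigma, pi)   # ~ [-2, 3], [0.5, 1.0], [0.5, 0.5]
```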
Cost Functions That Approximate Properties of the World. A perceiving system should exploit statistical regularities in the world that are not present in an arbitrary dataset or input distribution.
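One such regularity, also invoked later in these notes, is temporal continuity: objects in the world persist over time. A minimal sketch (my construction) of a cost that exploits it, in the style of Slow Feature Analysis: find the unit-variance linear feature that changes most slowly, which here recovers the slowly drifting latent cause rather than the sensor noise.

```python
import numpy as np

# Slowness objective: minimize the average squared temporal difference of a
# linear feature, subject to unit variance. This is a generalized eigenvalue
# problem A w = lam C w, solved here by Cholesky whitening.
rng = np.random.default_rng(0)
T = 2000
slow = np.cumsum(rng.normal(0, 0.05, T))      # slowly drifting latent cause
fast = rng.normal(0, 1.0, T)                  # fast sensor noise
X = np.stack([slow + 0.1 * fast, fast], axis=1)

X = X - X.mean(axis=0)
C = X.T @ X / T                               # variance (constraint matrix)
D = np.diff(X, axis=0)
A = D.T @ D / (T - 1)                         # "slowness" of each direction

L = np.linalg.cholesky(C)
Li = np.linalg.inv(L)
vals, vecs = np.linalg.eigh(Li @ A @ Li.T)    # ascending eigenvalues
w = Li.T @ vecs[:, 0]                         # slowest unit-variance feature

print("slow feature weights:", w / np.linalg.norm(w))  # ~ +/-(1, -0.1) direction
```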
Cost Functions for Supervised Learning. Some possible uses of supervised learning include the following: deliberative procedures could be compiled down to more rapid and automatic functions by using supervised learning to train a network to mimic the overall input-output behavior of the original multi-step process. Such a process is assumed to occur in cognitive models like ACT-R (Servan-Schreiber and Anderson, 1990).
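A toy sketch of this "compilation" (my illustration; the choice of Newton's method as the deliberative procedure is arbitrary): a slow multi-step routine serves as teacher, and a small MLP is trained by supervised learning to mimic its input-output behavior in a single fast forward pass.

```python
import numpy as np

def deliberative_sqrt(a, steps=20):
    """Slow multi-step 'deliberative' procedure: Newton iterations for sqrt(a)."""
    x = a
    for _ in range(steps):
        x = 0.5 * (x + a / x)
    return x

rng = np.random.default_rng(0)
A = rng.uniform(0.5, 4.0, size=(512, 1))
Y = deliberative_sqrt(A)                      # teacher targets

# One-hidden-layer MLP trained to mimic the procedure's input-output map
W1 = rng.normal(0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 1)); b2 = np.zeros(1)
eta = 0.05
for _ in range(5000):
    H = np.tanh(A @ W1 + b1)
    e = H @ W2 + b2 - Y
    dH = (e @ W2.T) * (1 - H ** 2)
    W2 -= eta * H.T @ e / len(A); b2 -= eta * e.mean(axis=0)
    W1 -= eta * A.T @ dH / len(A); b1 -= eta * dH.mean(axis=0)

test = np.array([[2.0]])
print(np.tanh(test @ W1 + b1) @ W2 + b2, np.sqrt(2.0))  # fast vs. slow answer
```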
Repurposing Reinforcement Learning for Diverse Internal Cost Functions. Reward-driven reinforcement learning plays a role here: the same machinery could be trained against internally generated cost signals, not only external rewards.
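A minimal sketch of the repurposing idea (entirely my construction, with an assumed internal reward equal to negative prediction error of a simple world model): the REINFORCE update is indifferent to whether the scalar reward is external or internally generated.

```python
import numpy as np

# REINFORCE on a two-armed bandit where the "reward" is internally generated:
# negative squared prediction error of a running-mean world model. The policy
# gradient update is identical to the external-reward case.
rng = np.random.default_rng(0)
logits = np.zeros(2)
pred = np.zeros(2)                    # per-arm world model (running mean)
eta_pi, eta_m = 0.1, 0.1

for _ in range(3000):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    arm = rng.choice(2, p=p)
    # Arm 0 yields a predictable observation; arm 1 yields pure noise.
    obs = 1.0 if arm == 0 else rng.normal()
    internal_reward = -(obs - pred[arm]) ** 2   # internally generated signal
    pred[arm] += eta_m * (obs - pred[arm])      # update the world model
    grad_logp = -p
    grad_logp[arm] += 1.0                       # d log pi(arm) / d logits
    logits += eta_pi * internal_reward * grad_logp

print("final policy:", p)   # prefers the predictable arm 0
```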
Special, internally generated signals are needed specifically for learning problems where standard unsupervised methods (based purely on matching the statistics of the world, or on optimizing simple mathematical objectives like temporal continuity or sparsity) will fail to discover properties of the world that are statistically weak in an objective sense but nevertheless have special significance to the organism (Ullman et al., 2012). [...] How could we hack together cost functions, built on simple genetically specifiable mechanisms, to make it easier for a learning system to discover such behaviorally relevant variables? Ullman refers to such primitive, inbuilt detectors as innate “proto-concepts” (Ullman et al., 2012). The broader claim is that such pre-specification of mutual supervision signals can make learning the relevant features of the world far easier, by giving an otherwise unsupervised learner the right kinds of hints or heuristic biases at the right times. Here we call these approximate, heuristic cost functions “bootstrap cost functions.” The purpose of a bootstrap cost function is to reduce the amount of data required to learn a specific feature or task, while at the same time avoiding the need for fully unsupervised learning.
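A toy version of a bootstrap cost function (my construction): a crude, genetically specifiable detector provides noisy labels from an independent cue, and a learner trained only against those labels ends up tracking the behaviorally relevant variable better than the innate detector itself.

```python
import numpy as np

# A weak innate "proto-concept" detector bootstraps a better learned feature:
# the student pools many weak channels and so outperforms its noisy teacher.
rng = np.random.default_rng(0)
n = 5000
relevant = rng.integers(0, 2, n).astype(float)        # hidden relevant variable
X = relevant[:, None] * 0.8 + rng.normal(0, 1.0, (n, 10))  # weak traces in 10 channels

# Innate proto-detector: an independent, crude cue (e.g., a motion heuristic)
boot_labels = (relevant + rng.normal(0, 1.0, n) > 0.5).astype(float)

# Student: logistic regression trained only against the bootstrap labels
w, b, eta = np.zeros(10), 0.0, 0.5
for _ in range(2000):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    g = p - boot_labels
    w -= eta * X.T @ g / n
    b -= eta * g.mean()

truth = relevant.astype(bool)
print("innate detector accuracy:", (boot_labels.astype(bool) == truth).mean())
print("bootstrapped learner accuracy:", (((X @ w + b) > 0) == truth).mean())
```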
Evolution: it seems likely that the brain makes extensive use of such bootstrap cost functions to ensure that developing animals learn the precise patterns of perception and behavior needed for their later survival and reproduction.
Cost Functions for Learning by Imitation and through Social Feedback. Babies and children learn about cause and effect through models based on goals, outcomes and agents, not just pure statistical inference.
Cost functions and optimization are not the whole story. To achieve more complex forms of optimization, e.g., for learning to understand complex patterns of cause and effect over long timescales, to plan and reason prospectively, or to effectively coordinate many widely distributed brain resources, the brain seems to invoke specialized, pre-constructed data structures, algorithms, and communication systems, which in turn facilitate specific kinds of optimization. Moreover, optimization occurs in a tightly orchestrated multi-stage process, and specialized, pre-structured brain systems need to be invoked to account for this meta-level of control over when, where, and how each optimization problem is set up.
Cost Functions for Story Generation and Understanding
Pre-structured architectures are needed to allow the brain to find efficient solutions to certain types of problems. The brain may need pre-specialized systems for planning and executing sequential multi-step processes, for accessing memories, and for forming and manipulating compositional and recursive structures.
Second, the training of optimization modules may need to be coordinated in a complex and dynamic fashion, including delivering the right training signals and activating the right learning rules in the right places and at the right times. To allow this, the brain may need specialized systems for storing and routing data, and for flexibly routing training signals such as target patterns, training data, reinforcement signals, attention signals, and modulatory signals. These mechanisms may need to be at least partially in place in advance of learning.
Specialized structures include the thalamus, hippocampus, basal ganglia, and cerebellum (Solari and Stoner, 2011). These structures evolutionarily pre-date the cortex (Lee et al., 2015), and hence the cortex may have evolved to work in the context of such specialized mechanisms. For example, the cortex may have evolved as a trainable module for which the training is orchestrated by these older structures.