The binding problem is expressed by different authors in rather different ways. However, it generally boils down to the question of how the visual system represents which features are bound together as part of the same object.
Tackling this problem requires appears to require recurrent Spiking neural networks.
Our laboratory is now exploring the effects of extending the VisNet model to include top-down connections between layers and lateral connections within layers, and spiking dynamics with spike time dependent plasticity, and using this more biologically accurate cortical model to investigate the emergence of polychronization and its role in the development of binding neurons.
A key limitation of the current purely bottom-up VisNet architecture is that the firing rates of the neurons are not able to specify which features are part of which objects. For example, consider an image consisting of the two letters L and T. On the V1 input layer, each of these two letters is represented by one horizontal bar and one vertical bar. So when the two letters are presented together, there will be separate representations of two horizontal bars and two vertical bars. In the output layer, there would be separate representations of the letters L and T. However, it is not possible to read off from the neuronal firing rates within the network which horizontal and vertical bars are part of which letters. The correct binding between the bars and the letters could be determined if it was possible for neurons in the higher layers of the network to interrogate not only which bar and letter representations were active, but also which synapses between them were active thus specifying which bar representations were driving which letter representations. However, it is not possible for higher level neurons in the network to read off the activity of synapses in such a direct manner.
In this case, during visually-guided competitive learning, some random subset of neurons in every layer of the network, which we shall call ‘binding neurons’, will become tuned to respond when a particular low-level feature such as a bar is driving the representation of a specific high-level feature or object such as a letter T or L.
This process could occur through polychronization within a spiking network by activation of a temporal attractor loop for each type of binding neuron similar to that described above. ?
the binding mechanism proposed here is potentially so rich that it would be impossible to describe the process at a high psychological level; it requires a description at the neuronal level.
Although backpropagation networks might be efficient at learning arbitrary mappings, and solving narrow tasks like face recognition, they would not be able to represent the essential binding information needed to semantically analyse visual scenes in the same way as the primate brain.
It is hypothesised that a large spiking neural network model with a distribution of randomized axonal delays will develop many such binding neurons if the model incorporates the following elements:
There are other way in which Equation 1 is satisfied other than neuron 1 causing 2 right?
Extensions to the theory: binding neurons can learn to represent many different kinds of relationship between visual features
The top-down connections guide competitive learning in lower layers, thus driving the formation of lower level visual representations that are modulated by higher level representations. How does this work precisely?
‘polychronization’ [12]. This phenomenon involves the network learning temporal memory patterns, each of which takes the form of a repeating loop of neuronal firings. These temporal memory loops self-organise automatically when STDP is used to modify the strengths of synapses in a recurrently connected spiking network with randomised distributions of axonal conduction delays between neurons.
[12]: Polychronization: computation with spikes. – pdf
aka variable-binding
the assignment of a particular datum to a particular slot in a data structure.
Example
In language, variable-binding is ubiquitous; for example, when one produces or interprets a sentence of the form, “Mary spoke to John,” one has assigned “Mary” the role of subject, “John” the role of object, and “spoke to” the role of the transitive verb.