
The visual recognition network (VRN) is an attempt to integrate software to perform fundamental visual recognition operations based on spiking neural network simulation. The portion of the network described here combines software from a variety of sources to create classifiers for key image features.



Overview

We are creating a library of spiking operators. We will use these operators as the foundation of increasingly complex visual recognition tasks.

Inputs

The spiking inputs to the network are created by taking a scene (or image) and passing it through the ISET simulation to generate a spatial array of cone absorptions. The inputs are then passed through the retinal ganglion cell simulator to create an array of spiking outputs. The VRN begins with these as inputs and then attempts to classify the inputs as reflecting the presence of edges, corners, and other increasingly complex stimulus features.
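The data flow above can be sketched in a few lines. This is a hypothetical illustration only: the real pipeline uses the ISET simulation and the retinal ganglion cell simulator, while the functions below merely stand in for those stages to show the shapes of the data passed between them.

```python
import numpy as np

# Hypothetical sketch of the input pipeline: image -> cone absorptions ->
# retinal ganglion cell (RGC) spikes. The function names and parameters are
# illustrative stand-ins, not the actual ISET or RGC simulator interfaces.

def cone_absorptions(image, gain=10.0, rng=None):
    """Poisson photon absorptions per cone, proportional to image intensity."""
    rng = rng or np.random.default_rng(0)
    return rng.poisson(gain * image)

def rgc_spikes(absorptions, threshold=8):
    """Binary spike raster: a cell fires when absorptions exceed a threshold."""
    return (absorptions > threshold).astype(np.uint8)

image = np.clip(np.random.default_rng(1).random((16, 16)), 0, 1)
spikes = rgc_spikes(cone_absorptions(image))
print(spikes.shape)  # (16, 16) -- the spatial array of spiking outputs
```

The VRN's classifiers then take such spatial spike arrays as their inputs.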

Data Fountain

We are working with the IBM environments team to create an endless flow of input data that supports multiple perception tasks, scales easily in difficulty, and requires minimal human oversight to prepare. For example, we would like to ask the virtual environment for 100 images, of a desired complexity, for training a network to discriminate horizontal from vertical edges. Two criteria are essential for a data fountain: (1) any number of images can be requested, such that the returned data represent an unbiased sample of the images we would normally expect, and (2) the images contain labels pertinent to the desired perceptual task; in the example above, the vertical and horizontal edges would be labeled in each image.
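No such interface exists yet, but the two criteria can be made concrete with a toy sketch. The `request_images` function below is entirely hypothetical: it synthesizes simple edge images rather than querying the virtual environment, and serves only to show the intended shape of the interface (arbitrary sample sizes, per-image task labels).

```python
import numpy as np

# Hypothetical "data fountain" interface: request any number of labeled
# images for a named perception task. This sketch synthesizes toy edge
# images; the real fountain would draw from the virtual environment.

def request_images(task, n, size=8, rng=None):
    rng = rng or np.random.default_rng(0)
    images, labels = [], []
    for _ in range(n):
        img = np.zeros((size, size))
        if task == "horizontal_vs_vertical":
            label = rng.choice(["horizontal", "vertical"])
            if label == "horizontal":
                img[size // 2, :] = 1.0   # horizontal edge
            else:
                img[:, size // 2] = 1.0   # vertical edge
        else:
            raise ValueError("unknown task")
        images.append(img)
        labels.append(label)
    return np.stack(images), labels

images, labels = request_images("horizontal_vs_vertical", 100)
print(images.shape, len(labels))  # (100, 8, 8) 100
```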

Other methods for feature labeling

Because the final stage of our visual recognition network uses a linear classifier to identify patterns in an image, we require feature labels for any image set used to train the network or to evaluate a trained network's performance. As described in the data fountain section, the ideal situation would be for the virtual environment to provide such feature labels along with the images it produces. Currently, no labels are exported from the virtual environment, so labels must be produced after the images are retrieved from the VE server. So far, the only perception task the VRN has been applied to is the discrimination of vertical edges, horizontal edges, and corners. Fortunately, there is a rich image-processing literature on this task. We have therefore created a procedure, based on techniques and code provided by Peter Kovesi, for labeling small image patches as one of three types (corner, vertical edge, or horizontal edge).
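A simplified version of this kind of labeling can be written with gradient energies. The sketch below is in the spirit of, but not identical to, Kovesi's code: a patch whose gradient energy lies mostly along x is labeled a vertical edge, mostly along y a horizontal edge, and a patch with substantial energy in both directions a corner. The threshold ratio is an illustrative choice.

```python
import numpy as np

# Simplified patch labeling via gradient energies (illustrative only; the
# actual procedure uses Peter Kovesi's corner/edge detection code).

def label_patch(patch, corner_ratio=0.5):
    gy, gx = np.gradient(patch.astype(float))  # gradients along y and x
    exx, eyy = np.sum(gx * gx), np.sum(gy * gy)
    # energy in both directions -> corner; otherwise the dominant direction
    if min(exx, eyy) > corner_ratio * max(exx, eyy):
        return "corner"
    return "vertical edge" if exx > eyy else "horizontal edge"

v = np.zeros((9, 9)); v[:, 4] = 1.0     # vertical line
h = np.zeros((9, 9)); h[4, :] = 1.0     # horizontal line
c = np.zeros((9, 9)); c[4:, 4:] = 1.0   # corner of a bright square
print(label_patch(v), label_patch(h), label_patch(c))
# vertical edge horizontal edge corner
```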

Spike to spike transformations

In some simple cases, the feature to be detected can be adequately determined by a linear function of the spiking patterns produced by the environment sensors of the VRN. In these cases, a linear classifier is sufficient to achieve good results. However, in more complex cases, the feature to be detected cannot be determined by a linear function of the sensor's spikes. In these cases, some nonlinearity must be incorporated into the perception stages of the VRN in order for good results to be achieved.

One approach is to apply a nonlinear classifier to the original inputs; however, nonlinear classifiers are complicated to train, require large amounts of data to fit well, and are in general not biologically realistic. An alternative approach, which we have adopted, is to apply one or more nonlinear spike-to-spike transformations to the inputs before sending them to a linear classifier. This approach is more biologically realistic; it is, of course, an open question what the appropriate spike-to-spike transformations should be.
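The classic illustration of why such transformations help is XOR: the exclusive-or of two input "spike" bits is not a linear function of those bits, but becomes linearly separable after a single nonlinear spike-to-spike transformation. The coincidence-detector unit below is a toy example, not the VRN's actual transformation.

```python
import numpy as np

# XOR is not linearly separable in the raw inputs, but a single nonlinear
# spike-to-spike transformation (a coincidence/AND unit that fires only
# when both inputs fire) makes a linear readout sufficient.

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])  # XOR target

def transform(X):
    # append the coincidence unit as a third feature
    return np.hstack([X, X[:, :1] * X[:, 1:2]])

# a fixed linear readout w.x + b on the transformed inputs solves XOR
w, b = np.array([1.0, 1.0, -2.0]), -0.5
pred = (transform(X) @ w + b > 0).astype(int)
print(pred.tolist())  # [0, 1, 1, 0] -- matches the XOR targets
```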

One approach is to create general-purpose learning algorithms that assume as little as possible about the spiking patterns emanating from the sensors or about the classification/perception tasks to be performed on those spikes. Alternatively, the approach that we adopt assumes that regularity exists both within the spiking patterns to be classified and within the tasks to be performed on them. For example, edges are important for detecting the presence of objects, which in turn matters for many object identification tasks. The fact that various components of biological vision have been preserved across many species supports the position that a set of common transformations may exist.

Classifiers

The primary goal for the VRN is to learn to classify incoming inputs as belonging to one of several discrete classes (e.g., vertical edge, horizontal edge). This is a supervised learning problem, in which the correct class labels for a fixed set of training data are provided. A spiking circuit that achieves linear classification has been described by Fusi; please see Brader, Senn, Fusi, Neural Computation, 2007. We have attempted to implement this circuit and have applied it to simulated data; for more information, see Fusi classifier.
