Fusi classifier

Summary

The Fusi classifier is a spiking neural network that can undergo supervised learning in order to solve classification problems. Greg implemented the Fusi classifier in MATLAB, to the best of his ability, based on Fusi's Neural Computation paper, and tested it on a particular toy problem, obtaining successful results. Recently, I took Greg's code and explored whether the Fusi classifier could be adapted for practical use as part of the Synapse pipeline. That is, can a Fusi classifier take spikes from the retinal model (or some nonlinear transformation of those spikes) and successfully learn to classify those spikes according to pre-specified labels (such as 'horizontal edge is present' or 'vertical edge is present')?

As a first step, I set out to explore whether we can gradually expand the class of problems that the classifier succeeds on, beyond the problem tested by Greg. In general, I found it difficult to get the classifier to perform consistently well across various classification problems --- there are a number of parameters that control the classifier's behavior, and it appears that careful tuning of these parameters is necessary to achieve good performance. It would be nice if there were a specific set of parameter values that would guarantee reasonably good classification performance across a wide variety of problems; however, I am not sure if such a parameter set exists.

Getting data into the Fusi classifier

Let's suppose we have 64 features (i.e. dimensions) and we are trying to classify points in this feature space as coming from 2 distinct classes. Let's suppose these features take on only non-negative values, as in a rate-based coding scheme. Finally, let's suppose we have 100 samples of each class. At this point, we can represent the data as a matrix of dimensions [100*2 samples x 64 features].

Now, because we have a spiking network, we need to encode the non-negative values as a binary stream. First, we expand out the feature values over time (say, 1000 ticks where 1 tick is a millisecond) and randomly generate spikes according to a Poisson rate-based coding scheme: [100*2 samples x 64 features x 1000 ticks]. Second, we expand the feature representation by assigning 10 neurons to each feature: [100*2 samples x 64*10 features x 1000 ticks]. This expansion is merely intended to help the network by averaging out variability in individual neurons.
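
A minimal MATLAB sketch of this encoding step follows. The variable names, the uniform stand-in data, and the 50 Hz maximum rate are my own illustrative assumptions, not taken from Greg's code:

 nSamples  = 100*2;   % samples across both classes
 nFeatures = 64;      % feature dimensions
 nDup      = 10;      % neurons assigned to each feature
 nTicks    = 1000;    % 1 tick = 1 ms
 maxRate   = 50;      % assumed maximum firing rate in Hz
 
 rates = rand(nSamples, nFeatures);    % stand-in for real non-negative feature values in [0,1]
 rates = repmat(rates, [1 nDup]);      % assign 10 neurons to each feature
 p = rates * maxRate/1000;             % per-tick spike probability for each neuron
 % [100*2 samples x 64*10 features x 1000 ticks] binary spike array
 % (for a problem this size, you may want to generate spikes one tick at a time to save memory)
 spikes = rand(nSamples, nFeatures*nDup, nTicks) < repmat(p, [1 1 nTicks]);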

At this point, the data are spike patterns and are ready to be fed to the Fusi classifier. That is, at each tick, we feed a [1 x 64*10 x 1] binary pattern of spikes to the Fusi classifier along with an associated class label. The Fusi classifier then updates its weights as necessary and outputs a pattern of spikes across the output neurons in the classifier. Assuming that there are 10 output neurons per class, the Fusi classifier would output a [10*2 units x 1] binary spike pattern.

The classification predicted by the Fusi classifier is implicit in the pattern of spikes across the output units. That is, to determine what the predicted class is, you sum the spikes across the pool of output units dedicated to each class, and then choose the pool with the most spikes. Moreover, in order to deal with the spike-based coding scheme, it is necessary to integrate over some period of time (e.g. 1000 ticks) to achieve robust classification performance.
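
A minimal sketch of this readout, assuming the classifier's output spikes for a single sample have been collected into a [10*2 units x 1000 ticks] binary matrix outSpikes (a name I made up), with the first 10 rows belonging to class 1 and the next 10 to class 2:

 nPerPool = 10;                           % output neurons per class
 nClasses = 2;
 counts = sum(outSpikes, 2);              % spikes per output unit, integrated over the window
 poolCounts = sum(reshape(counts, nPerPool, nClasses), 1);
 [~, predictedClass] = max(poolCounts);   % the pool with the most spikes wins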

Architecture of the Fusi classifier

Each input neuron is connected to each output neuron. All weights are binary (0 or 1); there are no negative weights. Besides the input neurons, there are two other groups of neurons that also impinge upon the output neurons. One group is the teacher neurons, which provide strong excitatory drive selectively to the pool of output neurons that should be active (based on the class label). The other group is the inhibitory neurons, which provide global, non-specific inhibition to all output units.
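
As a rough sketch of how these three sources of drive might combine at the output units on a single tick, here is one possible integrate-and-fire update. The constants (gInh, leak, Vthresh, Vreset) and the exact dynamics are my assumptions; Fusi's paper and Greg's code may differ in detail:

 % W is the [nOut x nIn] binary weight matrix, x is the [nIn x 1] input spike
 % vector for this tick, and teach is a [nOut x 1] drive that is nonzero only
 % for the pool matching the current class label (and only during training).
 inhCurrent = gInh;                          % global, non-specific inhibition (constant here)
 V = V + W*x + teach - inhCurrent - leak*V;  % membrane update for all output units at once
 out = V > Vthresh;                          % units crossing threshold emit a spike
 V(out) = Vreset;                            % spiking units reset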

How does the Fusi classifier work?

The intuition seems to be as follows: Without the teacher neurons, the output units are not driven very much and no learning happens. During training, the teacher neurons selectively drive the correct pool of output units. Driving the output neurons causes Hebbian-like strengthening of the synaptic weights between the output neurons and the input neurons that happen to be spiking at that time. In a sense, then, the synaptic weights "memorize" which input patterns are associated with a given pool of output neurons. After the weights are strengthened enough, the hope is that the input patterns by themselves will be sufficient to drive the correct output neurons (in the absence of signals from the teacher neurons).
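
Continuing the notation from the sketch above, here is one way such a Hebbian-like update on binary weights might look. The stochastic transition probabilities qPot and qDep are assumed parameters; the actual rule in Fusi's paper gates transitions on the postsynaptic depolarization rather than on output spikes alone:

 pre  = double(x)';                 % [1 x nIn]  input spikes on this tick
 post = double(out);                % [nOut x 1] output spikes on this tick
 coactive = post * pre;             % 1 where pre and post both fired
 preOnly  = (1 - post) * pre;       % 1 where pre fired but post stayed silent
 W(coactive & (rand(size(W)) < qPot)) = 1;   % stochastic potentiation
 W(preOnly  & (rand(size(W)) < qDep)) = 0;   % stochastic depression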

Although I think the above intuition is accurate at the gross level, there are intricate details of the classifier and it is not clear how the details map onto the intuition. In other words, it is not clear to me how to set all of the various parameters of the classifier such that the intuition described above is achieved for arbitrary classification problems.

The artificial neural network (ANN) style "abstract" rule that Fusi presents in his paper does have some reasonably clear principles, and does succeed on linearly separable problems, but not much more. The trouble is that this isn't a spiking model, and so you might as well have used tried-and-true ANN perceptron learning rules to build your linear classifier. The relationship between Fusi's ANN-style learning rule and the spiking version of the learning rule is loose. Intuitions from the abstract rule don't seem to translate well to the spiking model, and the spiking model has a good number of parameters that seem to drastically alter its behavior.

What are the limitations of the Fusi classifier?

The primary limitation of the Fusi classifier is that it appears necessary to hand-tune a number of parameters in order to achieve good performance. These parameters include the number and strength of the teacher neurons and the number and strength of the inhibitory neurons. (The teacher neurons have to be strong enough to induce learning but not too strong, since there is a regime beyond which learning is not allowed to happen. The inhibitory neurons have to be strong enough that the network does not saturate with activity but not too strong, since without spiking of the output neurons, no learning can occur.) It would be nice if there were a principled way to set all of the parameters once and for all, but I am not sure how to do this. Furthermore, even trial-and-error setting of the parameters is tricky, since the parameters interact with one another as well as with the statistics of the data (see below).

The tuning of the inhibitory neurons is especially tricky because it is dependent on the gain of the input signals. If the input patterns associated with a given class are weak (i.e. only a few input neurons fire), then inhibition must also be weak (otherwise, no output neurons will fire and learning cannot occur). On the other hand, if the input patterns associated with a given class are strong (i.e. many input neurons fire), then inhibition must also be strong (otherwise, all pools of output neurons will fire indiscriminately). Perhaps setting the inhibition level could itself be implemented as part of the spiking network; in fact, it might be as simple as making the inhibition proportional to the total sum of spikes in the incoming spike train, but this remains to be tested.
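
In code, this untested idea amounts to a one-line change to the membrane update sketched earlier: replace the constant inhibitory current with one proportional to the total input activity on each tick (kInh is an assumed proportionality constant):

 inhCurrent = kInh * sum(x);                 % inhibition scales with total input activity
 V = V + W*x + teach - inhCurrent - leak*V;  % rest of the update is unchanged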

From a practical standpoint, it is highly desirable to have a classifier perform robustly under all circumstances. This is relatively easy to achieve in the artificial neural network (ANN) world. For example, a perceptron trained with the standard perceptron learning rule can perform linear classification reliably and does not require any special setting of parameters. But Fusi's classifier, and perhaps spiking neural networks in general, are more finicky than ANNs (e.g. because of all of those hand-tuned parameters), and it is hard to guarantee performance reliability.

Another limitation of the Fusi classifier (but a relatively mild one) is that the very architecture of the classifier imposes certain restrictions on what kinds of classification problems can be solved. For example, since class membership is coded by high firing, it appears that the "contrast" of an input pattern is a dimension that the Fusi classifier cannot be sensitive to. (If input pattern P is associated with category C, it must be the case that input pattern 2*P is also associated with category C.) As another example along the same lines, the Fusi classifier cannot learn to classify input patterns that consist of low firing. (If firing of the input units is too low, it will be insufficient to drive any of the output units.)
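
A toy numerical illustration of the contrast point: with binary, non-negative weights, scaling the input pattern scales every pool's summed drive by the same factor, so the winning pool cannot change. The numbers are made up, and thresholds and inhibition are ignored:

 W = [1 1 0 0;       % pool 1 listens to inputs 1 and 2
      0 0 1 1];      % pool 2 listens to inputs 3 and 4
 P = [3; 1; 0; 2];   % some input rate pattern
 W * P               % drives for P:   [4; 2]  -> pool 1 wins
 W * (2*P)           % drives for 2*P: [8; 4]  -> pool 1 still wins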

Additional thoughts

I'm not sure that the Fusi classifier should necessarily be considered a *linear* classifier. To illustrate, consider the case where there is no variability in any given class; that is, each class is just a single point in feature space. Then, the Fusi classifier can in principle classify an arbitrarily large number of classes, since it essentially memorizes which input patterns are associated with each class.

Now, let's consider the less trivial case where there is indeed variability within a class. It seems to me that what the Fusi classifier does is to find the centroid of the training examples for each class and to set the synaptic weights from the input neurons such that these weights essentially project the input pattern in the direction of the centroid. Strangely enough, this is very similar to the idea that I had been developing in natural image statistics (i.e. perform k-means on image patches to find "good" directions in image space and then point neurons in these directions). Furthermore, notice that this kind of architecture is very unlike standard concepts of separating hyperplanes and so forth that one finds in the land of statistics (SVM, LDA, logistic regression, etc.). This raises the interesting question: does linearly separable in the ANN sense imply separable in the Fusi sense?

Finally, it is interesting that the output neurons in the Fusi classifier operate essentially independently of one another. Perhaps this helps the Fusi classifier scale to situations where there are many classes.
