Rgc perception


We are using the RGC simulator to understand certain aspects of human behavior. Specifically, there are some human performance limits that are governed by the lens, cone photopigments, and spatial resolution of the retina. By using the RGC network we can evaluate when the human performance matches the information available in the retinal signals. This page documents experiments and computations that are designed to compare simulations of the retinal signals and human performance.

 Experimental Plans

1. If we fix the spatial frequency of the target at, say, 1 cpd and sweep out a set of receptive field sizes, we should see the same pattern of results as if we fix the spatial receptive field and sweep out the spatial frequency in cpd. Let's calibrate the receptive field sizes by sweeping them out and seeing which one provides a peak at 1 cpd, which at 2 cpd, and which at 5 cpd.

When the center of the RF is about 150 microns, we should have peak sensitivity at 1 cpd.

When the RF center is 30 microns, we should have a peak at 5 cpd.
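The two numbers above are consistent with a simple rule of thumb, sketched here: the RF center spans roughly half a period of the preferred grating. This is only an illustrative calculation. The conversion factor of 300 microns per degree of visual angle is an assumption (an approximate value for the human retina), the half-period rule is our reading of the numbers rather than a property of the simulator, and the function names are hypothetical.

```python
# Assumption: roughly 300 microns of retina per degree of visual angle.
MICRONS_PER_DEGREE = 300.0


def rf_center_degrees(center_microns, microns_per_degree=MICRONS_PER_DEGREE):
    """Convert an RF center width on the retina (microns) to degrees."""
    return center_microns / microns_per_degree


def expected_peak_cpd(center_microns, microns_per_degree=MICRONS_PER_DEGREE):
    """Rule of thumb (assumption): the center covers half a period at the peak."""
    width_deg = rf_center_degrees(center_microns, microns_per_degree)
    return 1.0 / (2.0 * width_deg)


def center_microns_for_peak(peak_cpd, microns_per_degree=MICRONS_PER_DEGREE):
    """Inverse of the rule of thumb: RF center width for a desired peak."""
    return microns_per_degree / (2.0 * peak_cpd)


for size in (150.0, 30.0):
    print(size, "microns ->", expected_peak_cpd(size), "cpd")
```

Under the same assumptions, the 2 cpd calibration point would correspond to a center of about 75 microns; the sweep should tell us whether the simulator agrees.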

To debug this we need to look at three things in the same real spatial units: the spatial pattern of mean absorptions, the size of the spatial receptive field, and the mean linear response of the RF array. The linear response part (prior to spiking) should be easy to understand.

Then we want to have a way to look at the dynamics of the center and surround outputs. It could be that some of the classification takes place because of properties of the center and surround dynamics. We should probably have a way to use only the spiking pattern at times that are in the steady-state.

In these experiments, what should we do for the temporal characteristics of the target? Should we implement a routine that creates a stimulus with a smooth (ramped, Gaussian, or otherwise slow) onset and offset? In the psychophysics literature, people typically turn on the harmonic function using a Gaussian envelope, $\mathrm{Gaussian}(t)\,c\,\sin(2\pi f x)$, or a raised cosine, $(\cos(t)+1)\,c\,\sin(2\pi f x)$. So maybe we should have some ISET functions that create the absorption pattern over time like this. In general we need a way to create stimuli with a separable temporal modulation, $F(t)\cdot\mathrm{Space}(x)$.
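A minimal sketch of such a separable stimulus, written in plain Python rather than as an ISET function. All names here are hypothetical, and the raised cosine is scaled to run from 0 to 1 over a fixed duration, which is one common convention and not necessarily the one we will settle on. The output is the contrast modulation only; the mean level would still have to be added before computing absorptions.

```python
import math


def gaussian_envelope(t, t_center, sigma):
    """Gaussian temporal window with peak value 1 at t_center."""
    return math.exp(-0.5 * ((t - t_center) / sigma) ** 2)


def raised_cosine_envelope(t, duration):
    """Raised cosine: 0 at t=0 and t=duration, 1 at the midpoint."""
    if t < 0.0 or t > duration:
        return 0.0
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * t / duration))


def harmonic(x_deg, freq_cpd, contrast):
    """Spatial term c*sin(2*pi*f*x), with x in degrees and f in cpd."""
    return contrast * math.sin(2.0 * math.pi * freq_cpd * x_deg)


def separable_stimulus(times, positions, envelope, spatial):
    """Return frames[t][x] = envelope(t) * spatial(x)."""
    return [[envelope(t) * spatial(x) for x in positions] for t in times]


duration = 0.5  # seconds (hypothetical value)
times = [duration * i / 10 for i in range(11)]
positions = [i / 100 for i in range(100)]  # 1 degree sampled at 100 points
frames = separable_stimulus(
    times,
    positions,
    envelope=lambda t: raised_cosine_envelope(t, duration),
    spatial=lambda x: harmonic(x, freq_cpd=1.0, contrast=0.5),
)
print(len(frames), "frames of", len(frames[0]), "samples")
```

Passing the envelope and the spatial pattern as separate functions keeps the temporal and spatial choices independent, so the Gaussian window can be swapped in without touching the spatial code.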

2. We can start setting up experiments for hyper-acuity. You can read about hyper-acuity at the end of chapter 6 (the pattern vision chapter) of the Foundations of Vision book.

In these experiments we set up a pair of lines, one above the other. They are either aligned or slightly displaced. We classify between perfectly aligned and not perfectly aligned. We could set up a classification between lines to the left and lines to the right.

3. Can we add material to the classify wiki page, and clean up or break up the spikeClassify function (check with Tony)?

4. Should we try some experiments in which we change the spectral power distribution of the target? A simple one would be to use short-wavelength harmonic functions and long-wavelength harmonic functions and compare the two results. We also need to start considering the sampling density of the different cone types.

5. As we write these scripts, we will be setting parameters a lot. From the scripts we will learn what kinds of GUI tools we want for setting stimulus parameters and for setting RF parameters and for setting classification parameters.

 Contrast sensitivity functions (CSF)

We are first predicting the contrast sensitivity function (sensitivity vs. spatial frequency). For the purposes of these calculations we define sensitivity as 1/threshold, where threshold is the contrast level at which the classifier is correct 81% of the time. This threshold is the contrast value, x, that is equal to the parameter $\lambda$ in the fitted Weibull function (below).

To study the contrast sensitivity at a given frequency, we use the rgcContrastSensitivityISET function. For each of several contrast values, it generates two images: a grating with the given contrast and frequency, and a blank image. For each image, cone absorptions are simulated twice (once for training and once for testing), and a set of spikes is then computed for each set of absorptions. These spikes are integrated over a given time window (we use 50 ms). An SVM classifier with a linear kernel is then trained on one set of spikes from the blank stimulus and one set of spikes from the grating stimulus. It is then tested on the second set of spikes for each image. Classification is only done for the frames on which we could integrate for 50 ms over the same type of image.
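The integration step and the restriction to frames with a full window over a single image type can be sketched as follows. This is an illustration only, not the code inside rgcContrastSensitivityISET: the function name, the per-frame labels, and the representation of spikes as one count per frame are all assumptions. With 1 ms frames, a 50 ms window would be `window=50`; the example uses a tiny window so the result is easy to check by hand.

```python
def valid_integrated_frames(spike_train, labels, window):
    """Sum spikes over `window` consecutive frames.

    A window is kept only if every frame in it has the same stimulus label,
    so no integrated value mixes the blank and the grating stimulus.
    Returns a list of (label, integrated_spike_count) pairs.
    """
    if len(spike_train) != len(labels):
        raise ValueError("spike_train and labels must have the same length")
    kept = []
    for end in range(window - 1, len(spike_train)):
        start = end - window + 1
        window_labels = labels[start:end + 1]
        if all(lab == window_labels[0] for lab in window_labels):
            kept.append((window_labels[0], sum(spike_train[start:end + 1])))
    return kept


# Toy example: three blank frames followed by three grating frames.
spikes = [1, 0, 0, 1, 1, 1]
labels = ["blank"] * 3 + ["grating"] * 3
print(valid_integrated_frames(spikes, labels, window=2))
```

The window that straddles the blank-to-grating transition is dropped, which is the behavior described above.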

This gives us a curve of classification accuracy vs. contrast; an example is shown below:

 Weibull fitting

Because of the variance in the absorptions, the spikes, and the classification, these results are quite noisy. To get a less noisy approximation of this function, we assume that the data follow the cumulative distribution function of a Weibull distribution. This assumption appears reasonable in all of our experiments. To fit the curve, we find the two parameters ($k,\lambda$) that minimize $F$, given the contrasts used ($x_{i}$) and the computed accuracies ($y_{i}$):

$W(x,k,\lambda) = 1-\exp\left(-\left(\frac{x}{\lambda}\right)^{k}\right)$

$F(k,\lambda) = \sum_{i}N(y_{i}-W(x_{i},k,\lambda))$

where $N()$ can be the absolute value ($L_{1}$ minimization) or the square function ($L_{2}$ minimization). In our experiments, the two minimizations give extremely similar results.
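To make the objective concrete, here is a minimal sketch of the fit using a brute-force grid search over the two parameters. This is not the rgcFitWeibull implementation; the grid search, the function names, and the synthetic data are all assumptions chosen to keep the example self-contained. A real fit would use a proper optimizer.

```python
import math


def weibull_cdf(x, k, lam):
    """W(x) = 1 - exp(-(x / lam)^k), with shape k and scale lam."""
    return 1.0 - math.exp(-((x / lam) ** k))


def fit_error(k, lam, contrasts, accuracies, norm="l2"):
    """Objective F: sum of |residual| (norm='l1') or residual^2 (norm='l2')."""
    residuals = [y - weibull_cdf(x, k, lam) for x, y in zip(contrasts, accuracies)]
    if norm == "l1":
        return sum(abs(r) for r in residuals)
    return sum(r * r for r in residuals)


def fit_weibull_grid(contrasts, accuracies, k_values, lam_values, norm="l2"):
    """Return (k, lam, error) for the grid point with the smallest error."""
    best = None
    for k in k_values:
        for lam in lam_values:
            err = fit_error(k, lam, contrasts, accuracies, norm)
            if best is None or err < best[0]:
                best = (err, k, lam)
    return best[1], best[2], best[0]


# Synthetic, noise-free data generated from known parameters (hypothetical).
true_k, true_lam = 2.0, 0.1
contrasts = [0.02 * i for i in range(1, 16)]
accuracies = [weibull_cdf(x, true_k, true_lam) for x in contrasts]

k_grid = [0.5 * i for i in range(1, 9)]      # 0.5 .. 4.0
lam_grid = [0.01 * i for i in range(1, 31)]  # 0.01 .. 0.30
for norm in ("l1", "l2"):
    k_fit, lam_fit, err = fit_weibull_grid(contrasts, accuracies, k_grid, lam_grid, norm)
    print(norm, k_fit, lam_fit, err)
```

On noise-free data both norms recover the generating parameters; the interesting comparison is on the noisy classifier output, where $L_{1}$ is less sensitive to single outlying accuracies.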

The function that does this is rgcFitWeibull. You can see the result of this curve fitting in the image below:

 Frequency sensitivity

As we have seen previously, fitting a Weibull cdf to our contrast sensitivity experiments seems to work well. We use that to study the frequency sensitivity of the network, with the following method. For a given frequency, we compute contrast sensitivity data as described in the contrast sensitivity section. As these results are quite noisy, we fit a Weibull cdf to the data. Because we are fitting only 2 parameters, the problem is highly constrained, which greatly reduces the noise (assuming the Weibull cdf is a good model). We can now determine a contrast $c_{80}$ such that $W(c_{80},k^{*},\lambda^{*}) = 0.80$, where $k^{*}$ and $\lambda^{*}$ are the fitted parameters; this is the contrast at which 80% of classifications are correct. This value is easy to compute because the formula for $W$ can be inverted analytically. We use $1/c_{80}$ as a measure of sensitivity for the frequency considered. By repeating the process over a range of frequencies, we compute a frequency sensitivity curve.
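Solving $W(c,k,\lambda) = p$ for $c$ gives $c = \lambda\,(-\ln(1-p))^{1/k}$. A short sketch of this inversion and of the resulting sensitivity follows; the function names and the example parameter values are hypothetical and are not taken from rgcFrequencySensitivityContrastScale.

```python
import math


def weibull_cdf(x, k, lam):
    """W(x) = 1 - exp(-(x / lam)^k)."""
    return 1.0 - math.exp(-((x / lam) ** k))


def weibull_inverse(p, k, lam):
    """Contrast c such that W(c) = p, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return lam * (-math.log(1.0 - p)) ** (1.0 / k)


def sensitivity(k, lam, criterion=0.80):
    """Sensitivity defined as 1 / c_criterion."""
    return 1.0 / weibull_inverse(criterion, k, lam)


# Hypothetical fitted parameters for one spatial frequency.
k_fit, lam_fit = 2.0, 0.1
c80 = weibull_inverse(0.80, k_fit, lam_fit)
print("c80 =", c80, " check W(c80) =", weibull_cdf(c80, k_fit, lam_fit))
print("sensitivity =", sensitivity(k_fit, lam_fit))
```

Repeating this for the fitted parameters at each spatial frequency gives the points of the frequency sensitivity curve.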

The function for this is rgcFrequencySensitivityContrastScale. You can see such a curve in the figure below:

 Comparison for different RF sizes

Below: comparison of frequency sensitivity for a layer whose RF has a small variance and a layer whose RF has a large variance. Two different experiments, covering different cpd values.

 Prediction and comparison using a DFT analysis

The problem can then be formulated as: are the spikes obtained with the grating stimulus statistically different from those obtained with a blank stimulus? This is a very complex problem to analyze directly, so we look at a related problem that is easier to study: classifying the signal obtained by convolving the receptive field with the stimulus, to distinguish between a blank signal and a grating.

Here we assume that the blur from the optics does not have much impact on the frequency content of the signal at the level of the cones. This is true at low frequencies, but not at high frequencies.

Let's start with a simple case: the coefficient of the center is 1 and the coefficient of the surround is -1. This means that the response of the RF to a uniform signal is zero. We can now compute the DFT of the RF filter. As a first approximation, it should give us an idea of the shape of the frequency sensitivity curve: the blank stimulus gives a blank signal, whereas a grating at a frequency where the filter response is high produces a high response at some places in the image, making it easy to distinguish from the blank stimulus.
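A one-dimensional sketch of this analysis, assuming the center and surround are Gaussians that each sum to 1 (a difference of Gaussians). The kernel size, the standard deviations, and the function names are hypothetical, and the DFT is computed by direct summation to avoid any dependency.

```python
import cmath
import math


def gaussian_kernel(n, sigma):
    """Sampled 1-D Gaussian on n points, centered, normalized to sum to 1."""
    center = n // 2
    values = [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]
    total = sum(values)
    return [v / total for v in values]


def dog_kernel(n, sigma_center, sigma_surround, w_center=1.0, w_surround=-1.0):
    """Center-surround RF: w_center * center + w_surround * surround."""
    center = gaussian_kernel(n, sigma_center)
    surround = gaussian_kernel(n, sigma_surround)
    return [w_center * c + w_surround * s for c, s in zip(center, surround)]


def dft_magnitude(signal):
    """Magnitude of the DFT for frequency bins 0 .. n/2 (direct summation)."""
    n = len(signal)
    mags = []
    for f in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * f * i / n)
                  for i, v in enumerate(signal))
        mags.append(abs(acc))
    return mags


def peak_frequency_bin(mags):
    """Index of the frequency bin with the largest magnitude."""
    return max(range(len(mags)), key=mags.__getitem__)


n = 256
large_rf = dft_magnitude(dog_kernel(n, sigma_center=2.0, sigma_surround=6.0))
small_rf = dft_magnitude(dog_kernel(n, sigma_center=1.0, sigma_surround=3.0))
print("DC response (large RF):", large_rf[0])
print("peak, large RF:", peak_frequency_bin(large_rf) / n, "cycles per sample")
print("peak, small RF:", peak_frequency_bin(small_rf) / n, "cycles per sample")
```

The DC term vanishes because the two weights cancel, so the filter is band-pass, and shrinking both Gaussians moves the peak to a higher frequency. This is the qualitative behavior we expect the simulated sensitivity curves to follow.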

One can see in the curves below that the experiments behave as we would expect for these coefficients when we change only the covariance matrix of the center and surround.

 Vernier acuity functions

We are now computing a Vernier acuity function for our system. To be able to represent a minimal difference of 0.0001 degree between the lines, we first define the following parameters: a field of view of 0.2 degree and an image width of 2000 pixels. This means that 1 pixel represents 0.2/2000 = 0.0001 degree. We define the reference stimulus as a black background with a 1-pixel-wide vertical white line, placed at the middle of the image. To build the other stimulus, we slide the upper half of the line by n pixels to represent a delta of n*0.0001 degree. We compute two sets of absorptions and spikes for each stimulus (one for training and one for testing). The spikes are then temporally integrated by summing them over an intgrationTime duration (we usually use 50 ms). Finally, we test the detection capacity with an SVM classifier using a linear kernel.
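The stimulus geometry can be sketched as follows. This is an illustration with hypothetical function names, not the simulator's stimulus code; the image is a plain list of rows with 0 for black and 1 for white, row 0 is taken to be the top of the image, and the shift is taken to be to the right. A small image is used for the demonstration so that it is easy to inspect.

```python
FOV_DEG = 0.2
WIDTH_PX = 2000
DEG_PER_PIXEL = FOV_DEG / WIDTH_PX  # 0.0001 degree per pixel


def offset_arcsec(n_pixels, deg_per_pixel=DEG_PER_PIXEL):
    """Vernier offset of n pixels expressed in seconds of arc."""
    return n_pixels * deg_per_pixel * 3600.0


def vernier_image(width, height, offset_px):
    """Black image with a 1-pixel-wide white vertical line.

    The lower half of the line sits in the middle column; the upper half
    is shifted right by offset_px pixels (offset_px=0 gives the reference).
    """
    center = width // 2
    if not 0 <= center + offset_px < width:
        raise ValueError("offset moves the line outside the image")
    image = [[0] * width for _ in range(height)]
    for row in range(height):
        col = center + offset_px if row < height // 2 else center
        image[row][col] = 1
    return image


reference = vernier_image(width=21, height=8, offset_px=0)
displaced = vernier_image(width=21, height=8, offset_px=5)
for row in displaced:
    print("".join("#" if v else "." for v in row))
print("5 pixels =", offset_arcsec(5), "arcsec")
```

With these parameters one pixel is 0.36 second of arc, so a delta of 5 pixels corresponds to 1.8 seconds of arc.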

With a 0.2 degree fov, we compute cone absorptions over a 40x40 array of cones. This gives us an RGC network of 27x27 cells whose support represents a 0.1993 degree fov.

Changing the layer RF seems to have little impact on these results as we can see with these 4 different layers:

We can see in the following images that for a delta of 1.8 seconds of arc, the blur from the optics spreads the difference over a few cones, and this is then reinforced by the layer RF. (To bring out the difference between the right and left sides, these images show the difference between the absorptions (or spikes) and their mirror image):