Learning from Cone Samples

From VISTA LAB WIKI


This is the wiki page for the cone samples classification project. In this project, we computed cone-sampled images for both ClearType and non-ClearType rendered letters and used an SVM to test whether or not they are classifiable. The effects of viewing distance and display dpi (ppi) were also investigated.

Most of the project is coded in MATLAB.

Developed by Haomiao Jiang, Joyce Farrell and Brian Wandell


Prerequisites

This project is based on the results of several previous pdc projects.

To run the code correctly, the following toolboxes must be installed and accessible on the MATLAB path:

1. ctToolbox (scienstanford)

2. isetbio (vistalab)

3. liblinear (or perhaps libsvm from the same group; the latter has Windows MEX files (mexw64) for SVM classification)
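A minimal sketch of the path setup is shown below; the install directories are placeholders (assumptions) and should be changed to wherever the toolboxes actually live.

<pre>
% Add the required toolboxes to the MATLAB path.
% The directories below are placeholders -- adjust them to your setup.
addpath(genpath('~/toolboxes/ctToolbox'));
addpath(genpath('~/toolboxes/isetbio'));
addpath('~/toolboxes/liblinear/matlab');  % folder with the train/predict MEX files
</pre>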

General Process

From Letter to Cone Samples

The general process is as follows:

1. Init the font and letter parameters (letter, font size, font family, etc.)

2. Generate the font bitmap (or load a cached bitmap) and filter it (ctToolbox)

3. Load the display model (vDisplay) and generate the display image

4. Compute the output image from the display and convert it to a scene (ISETBio)

5. Init the sensor for the human eye and generate eye-movement positions

6. Compute 50*nSamples cone absorption images and average each group of 50 to generate the nSamples samples (see the sketch below)

7. See the scripts for examples (s_ctConeSampleForDistance / s_ctConeSampleForDisplays)
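A minimal sketch of steps 3-6 is shown below. It uses ISET/ISETBio-style calls (exact signatures may differ across versions); the letter bitmap letterRGB, the display calibration file, and all parameter values are illustrative assumptions, not the project's exact settings.

<pre>
% Sketch of steps 3-6 (ISET/ISETBio-style calls; letterRGB, the display
% file and all parameter values are illustrative assumptions).
d      = displayCreate('LCD-Apple');              % step 3: display model
scene  = sceneFromFile(letterRGB, 'rgb', [], d);  % step 4: display image -> scene
scene  = sceneSet(scene, 'distance', 1.2);        %   viewing distance (meters)
oi     = oiCompute(oiCreate('human'), scene);     %   optical image through human optics
sensor = sensorCreate('human');                   % step 5: human cone mosaic
sensor = sensorSet(sensor, 'exp time', 0.05);     %   50 ms exposure
sensor = sensorCompute(sensor, oi);               % step 6: cone absorptions
img    = sensorGet(sensor, 'photons');            %   one absorption image

% Step 6 averaging: given a stack absorp of size [rows cols 50*nSamples],
% average every group of 50 absorption images into one sample.
[r, c, total] = size(absorp);
samples = squeeze(mean(reshape(absorp, r, c, 50, total/50), 3));
</pre>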

The flow chart is shown below.

Flow chart: from letter to cone sample

Acuity Experiment

Manually Aligned 'c' and 'o' Bitmap

The general process is as follows:

1. Load / create the ClearType-rendered letters 'c' and 'o'.

2. Manually align the two letters to obtain c' and o'.

3. Compute the cone-sampled images for c' and o' as I_c and I_o.

4. Use an SVM to classify the union set of I_c and I_o and get the classification accuracy (see the sketch below).

The manually aligned 'c' and 'o' bitmaps are shown on the right.
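The classification step (step 4) can be sketched with liblinear's MATLAB interface (train / predict). The feature layout below, where I_c and I_o are nSamples-by-nPixels matrices of vectorized cone-sampled images, is an assumption.

<pre>
% Sketch of the SVM step using liblinear's MATLAB interface.
% I_c and I_o are assumed to be nSamples x nPixels matrices of
% vectorized cone-sampled images.
X = sparse([I_c; I_o]);                            % liblinear expects sparse features
y = [ones(size(I_c,1),1); -ones(size(I_o,1),1)];   % class labels

% One fold of the 10-fold test (training:testing = 9:1).
idx    = randperm(numel(y));
nTest  = round(numel(y) / 10);
testI  = idx(1:nTest);
trainI = idx(nTest+1:end);

model = train(y(trainI), X(trainI,:), '-q');       % linear SVM
[~, acc, ~] = predict(y(testI), X(testI,:), model);
fprintf('Classification accuracy: %.1f%%\n', acc(1));
</pre>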

ClearType vs Non-ClearType

ClearType Rendered 'v' vs Binary 'v'

The general process is as follows:

1. Load / create a ClearType-rendered letter; take v as an example.

2. Compute the cone-sampled images for v as I_c.

3. Convert the ClearType-rendered letter to a binary-rendered letter, v'. For each pixel, if at least two subpixels are on, the pixel is set to on; otherwise it is set to off (see the sketch after this list).

4. Compute the cone-sampled images for v' as I_n.

5. Use an SVM to classify the union set of I_c and I_n and get the classification accuracy.
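A sketch of the binarization rule in step 3 is shown below; the threshold for calling a subpixel "on" is an assumption, and ctToolbox may implement the conversion differently.

<pre>
% Sketch of step 3: binarize a ClearType bitmap ctBitmap ([m n 3]).
% A pixel is set to 'on' when at least two of its subpixels are on.
% The subpixel 'on' threshold thr is an assumption.
thr   = 0.5;
subOn = ctBitmap > thr;                     % [m n 3] logical subpixel map
pxOn  = sum(subOn, 3) >= 2;                 % pixel on iff >= 2 subpixels on
binBitmap = repmat(double(pxOn), [1 1 3]);  % binary bitmap, equal channels
</pre>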

Sample images of v and v' on the display are shown on the right.

ClearType Color Edges

The general process is as follows:

1. Load / create a ClearType-rendered letter; take v as an example.

2. Compute the cone-sampled images for v as I_c.

3. Convert the ClearType-rendered letter ([m n 3]) to a binary-rendered letter on a 3x-dpi display, v'. The converting process is (see the sketch after this list):

(a) Merge the RGB planes into one plane ([m n 3] -> [m n*3])

(b) Copy each row 3 times ([m n*3] -> [m*3 n*3])

(c) Copy the image to three planes ([m*3 n*3] -> [m*3 n*3 3])

4. Reset the display dpi to 3x the original dpi.

5. Compute the cone-sampled images for v' as I_n.

6. Use an SVM to classify the union set of I_c and I_n and get the classification accuracy.
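A sketch of the conversion in step 3 is shown below, assuming ctBitmap holds the [m n 3] ClearType bitmap with the subpixel values in the third dimension.

<pre>
% Sketch of step 3: unfold a ClearType bitmap into a 3x-dpi bitmap.
[m, n, ~] = size(ctBitmap);
% (a) interleave the R, G, B subpixels into one plane: [m n 3] -> [m n*3]
flat = reshape(permute(ctBitmap, [3 2 1]), n*3, m)';
% (b) copy each row 3 times: [m n*3] -> [m*3 n*3]
tall = kron(flat, ones(3, 1));
% (c) copy the plane to three channels: [m*3 n*3] -> [m*3 n*3 3]
v3x = repmat(tall, [1 1 3]);
</pre>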

Download, Usage, Instructions

All source code and data files used can be found in the GitHub repository.

Experiment Results

Acuity Experiment


Acuity Test for Eye-Movement model

The main purpose of this experiment is to determine a proper eye-movement model for text reading. The result of this experiment not only tells how our eyes move but also helps us pick proper parameters for related experiments.

This experiment is based on the known result that people can tell the difference between the letters 'c' and 'o' at 9 pt from a viewing distance of 1.2 meters \[??Ref\]. We translate this JND statement into 80% accuracy in SVM classification. In the experiment, we tested eye movement under Gaussian distributions with std (0.02, 0.02), (0.02, 0.03), (0.03, 0.03), and (0.03, 0.04). Among them, (0.02, 0.03) corresponds to the average std for a 25 ms window length and (0.03, 0.04) corresponds to the average std for a 50 ms window length.
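A sketch of one such eye-movement model is shown below: zero-mean Gaussian positions with independent per-axis standard deviations. Units follow the text, and the number of positions per sample is an illustrative choice.

<pre>
% Sketch of the Gaussian eye-movement model: zero-mean positions with
% independent per-axis std. Units follow the text; nPos is illustrative.
sigma = [0.02 0.03];                             % the 25 ms window model
nPos  = 50;                                      % positions per averaged sample
emPos = randn(nPos, 2) .* repmat(sigma, nPos, 1);
</pre>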

A more detailed process for this experiment is described in the corresponding part of the General Process section.

The plot illustrating the SVM classification accuracy under different eye-movement models is shown on the right.

Parameters Used

Letter: c, o
Font Size: 9 pt
Font Family: Georgia (manually aligned)
Exposure Time: 50 ms
Pupil Size: 3 mm
ClearType Filter: (0.2, 0.6, 0.2)
Display DPI: 100
Number of Samples: 600 (300 each)
Test Method: 10-fold testing (training:testing = 9:1)
SVM Kernel: linear


Summary

As can be seen from the plots, the eye movement under a Gaussian distribution with std (0.02, 0.03) is closest to 80% accuracy at 1.2 meters. This suggests that (0.02, 0.03), i.e. the 25 ms window length, is a proper choice for the eye-movement model in a text-reading scenario. Thus, the following experiments use this parameter for the eye-movement model.

ClearType vs Non-ClearType

Viewing distance and display dpi effects on classification accuracy

The main purpose of this experiment is to uncover the effect of viewing distance and display dpi on the performance of ClearType rendering. The result of this experiment shows under which circumstances ClearType gives no perceptible improvement in letter rendering.

The general process for this experiment is described in the corresponding part of the General Process section.

The plot illustrating the effect of viewing distance on the SVM classification accuracy of the cone-sampled images is shown on the right.

Viewing distance varies from 0.4 meters to 1.2 meters, and the display dpi is chosen from [100, 200, 300]; 100 dpi corresponds to most LCD displays, while 200 and 300 dpi correspond to the Retina display of a MacBook and the display of an iPhone, respectively.
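One way to see why viewing distance and dpi trade off is to reduce each (distance, dpi) pair to pixels per degree of visual angle; a sketch of that standard conversion is below.

<pre>
% Pixels per degree of visual angle for a given display dpi and
% viewing distance (standard geometry; 0.0254 m per inch).
dpi = 100;                               % display resolution
d   = 1.2;                               % viewing distance in meters
ppd = dpi * (d / 0.0254) * tan(pi/180);  % ~82 pixels/deg for these values
</pre>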

Parameters Used

Optical images (100 dpi, 0.5 m)
Optical images (300 dpi, 0.8 m)

Letter: v
Font Size: 12 pt
Font Family: Georgia
Exposure Time: 50 ms
Pupil Size: 3 mm
Eye-movement Model: Gaussian random movement
Eye-movement Range: <math>\mu = (0,0), \quad \Sigma = \begin{pmatrix} 0.02 & 0 \\ 0 & 0.03 \end{pmatrix}</math>
ClearType Filter: (0.2, 0.6, 0.2)
Number of Samples: 600 (300 each)
Test Method: 10-fold testing (training:testing = 9:1)
SVM Kernel: linear


Summary

1. Prediction accuracy decreases monotonically with viewing distance over a certain range. When the viewing distance is small, e.g. 0.3 meters, the classification accuracy is relatively high, indicating that the images people actually see are different. When the distance gets larger, say 1.2 meters, the classification accuracy is around 50% (chance), indicating that there is no notable difference between the two types of images.

2. Prediction accuracy also decreases monotonically with display dpi. When the dpi gets as high as 300, ClearType is not as helpful as it is on 100 dpi displays.

ClearType Color Edges

Test result: can we see the color edges on ClearType letters?

The main purpose of this experiment is to test whether or not we can see the color edges on ClearType-rendered letters.

The main disadvantage people raise against ClearType is its color edges. One example of this phenomenon is shown on the right.

This experiment is designed to show at what viewing distance we can see the color edges on a 100 dpi display. The general idea is to compare the cone-sampled images of a ClearType letter to those of a binary letter on a 3x-dpi display.

A more detailed process for this experiment is described in the corresponding part of the General Process section.

The plot illustrating the relationship between classification accuracy and viewing distance for the color-edge experiment is shown on the right.

Parameters Used

Letter: v
Font Size: 12 pt
Font Family: Georgia
Exposure Time: 50 ms
Pupil Size: 3 mm
Eye-movement Model: Gaussian random movement
Eye-movement Range: <math>\mu = (0,0), \quad \Sigma = \begin{pmatrix} 0.02 & 0 \\ 0 & 0.03 \end{pmatrix}</math>
ClearType Filter: (0.2, 0.6, 0.2)
Number of Samples: 600 (300 each)
Test Method: 10-fold testing (training:testing = 9:1)
SVM Kernel: linear


Summary

As can be seen, at 0.3 to 0.4 meters (roughly 12 to 16 inches), the classification accuracy is very low. This means that at a normal reading distance (or even closer), people cannot see the color edges of a ClearType-rendered letter on a 100 dpi display.

Effects of Display Pixel Shape

Chevron-shaped and Dell stripe-shaped subpixels

The main purpose of this experiment is to show whether or not the shape of display pixels directly affects image quality.

The general idea of this experiment is to show the same letter on two different displays (chevron-shaped and Dell stripe-shaped subpixels) and compare the cone-sampled images.

The pixel shapes of the two displays are shown on the right.

The classification result shows that even at 0.2 ~ 0.3 meters, the prediction accuracy is no more than 55%, meaning that there is no perceptible difference between the two types of display.

ClearType Filter Array

Filter Array Effects on ClearType Rendering

The main purpose of this experiment is to show the effect of the filter used in ClearType rendering.

Joyce mentioned in her paper \[Ref XXX\] that filter design is essential in rendering ClearType letters. Letters rendered with different filter arrays can be found in ClearType Experiments.

The general idea of this experiment is to compare the cone-sampled images of two letters rendered with different three-tap filters. A three-tap symmetric filter can be described as (a, b, a); since the taps must sum to one, b = 1 - 2a. We choose (0.2, 0.6, 0.2) as the reference filter and vary the parameter a to see whether there is a perceptible difference.
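A sketch of building and applying such a filter is shown below; ctToolbox's actual filtering code may differ, and bitmap here stands for the subpixel-resolution letter image (an assumption).

<pre>
% Sketch: build and apply a three-tap symmetric filter (a, 1-2a, a)
% along the horizontal (subpixel) direction. 'bitmap' is assumed to be
% the subpixel-resolution letter image; ctToolbox may differ.
a   = 0.2;
f   = [a, 1 - 2*a, a];              % taps sum to 1 -> (0.2, 0.6, 0.2)
out = conv2(bitmap, f, 'same');     % horizontal convolution of each row
</pre>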

The plot illustrating the result is shown on the right.

Parameters Used

Letter: v
Font Size: 12 pt
Font Family: Georgia
Exposure Time: 50 ms
Pupil Size: 3 mm
Eye-movement Model: Gaussian random movement
Eye-movement Range: <math>\mu = (0,0), \quad \Sigma = \begin{pmatrix} 0.02 & 0 \\ 0 & 0.03 \end{pmatrix}</math>
ClearType Filter: (0.2, 0.6, 0.2), (0.3, 0.4, 0.3), (0.4, 0.2, 0.4), (0.5, 0, 0.5), (0.6, -0.2, 0.6)
Number of Samples: 600 (300 each)
Test Method: 10-fold testing (training:testing = 9:1)
SVM Kernel: linear


Summary

As can be seen, different filters result in different letter shapes, and the letters can easily be told apart at reading distance. This result coincides with Joyce's findings in \[Ref XXX\].

Questions and Future Work

1. When using a filter array other than (0, 1, 0), the 3x binary images are not actually binary. Should we change this to make them binary and redo the experiment?

2. Why do we need to manually align 'c' and 'o' in the acuity experiment? Are we assuming the area outside the scene is black, not white?

3. Sometimes prediction accuracy falls below 45%. Should we change the testing method to cross-validation?

4. Use a 2D Gaussian blur to model eye movement.

To Do List

1. Test a bad filter for color edges

2. Flow chart

3. 100% optical images vs 50% optical images

4. Color blindness simulation

5. Astigmatism

6. Different display luminance (5 cd/m^2 vs 200 cd/m^2)

References
