Results pre-2011 Microsoft status report

From VISTA LAB WIKI

Revision as of 15:35, 10 August 2015 by Rjpatruno (Talk | contribs)

Notes

This should be updated to include very early results on power consumption of fonts, and other data mentioned in the MS status report.


Test (Phase 7): What effect does the Thresholded Area Sum (TAS) DeltaE (1 thres) measure have on finding 1D filters? TAS may better account for both the strength of deltaE scores and their amount (area); peak info from TAM may not be enough.

Purpose: Check whether TAS (Thresholded Area Sum) DeltaE has a different effect than mean deltaE for finding 1D filters.

Method: Simulated displays and ran a 1D filter search with TAS (Thresholded Area Sum) DeltaE (1 thres).

Result: From this small sample of three letters, TAS seems to provide more granularity than mean deltaE, and probably more than TAM (compare the filtermaps for TAS g with TAM g (phase 6)). Whether this increased granularity corresponds to less perceptual error is unknown.

Calculations: TAS is calculated by thresholding the error image (deltaE scores) so that only values of 1 or above remain, then summing the remaining deltaE scores.
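The TAS calculation described above can be sketched in a few lines. This is an illustrative Python version, not the lab's MATLAB code: the function name tas_delta_e and the toy error image are hypothetical, and the threshold is assumed to be inclusive (scores of exactly 1 are kept), as the description states.

```python
def tas_delta_e(error_image, threshold=1.0):
    """Thresholded Area Sum: sum of the deltaE scores at or above the threshold."""
    return sum(v for row in error_image for v in row if v >= threshold)


# Hypothetical 3x3 error image of per-pixel deltaE scores.
error_image = [
    [0.2, 0.9, 1.0],
    [0.0, 2.5, 0.4],
    [3.0, 0.1, 0.7],
]

# Only 1.0, 2.5 and 3.0 survive the threshold of 1.
print(tas_delta_e(error_image))
```

Because the sum grows with both the size of each surviving score and the number of surviving pixels, TAS reflects strength and area together.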

Display simulated: RGB

DPI Display: 72

Fonts simulated: georgia

Font sizes: 12

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

ScaleFactor: 0.50

Image Area: New Cropping

TAS DeltaE Errormap for s, rgb, 12 pt, georgia, 72 dpi
TAS DeltaE Filtermap for s, rgb, 12 pt, georgia, 72 dpi
TAS DeltaE Errormap for v, rgb, 12 pt, georgia, 72 dpi
TAS DeltaE Filtermap for v, rgb, 12 pt, georgia, 72 dpi
TAS DeltaE Errormap for g, rgb, 12 pt, georgia, 72 dpi
TAS DeltaE Filtermap for g, rgb, 12 pt, georgia, 72 dpi

Test (Phase 6): What effect does the TAM DeltaE (1 thres) measure have on finding 1D filters, especially compared to mean deltaE?

Purpose: Check whether TAM DeltaE has a different effect than mean deltaE for finding 1D filters.

Method: Simulated displays and ran a 1D filter search with TAM DeltaE (1 thres) and with mean deltaE.

Result: For the letters tested, a few interesting results stand out:


1) different filters were picked (but for TAM deltaE the mean deltaE filter still seems to be acceptable),


2) the granularity of the filter space is much finer for TAM than mean deltaE


3) TAM identifies regions as bad which mean deltaE doesn't (because of the increased granularity). Does this mean that finding the optimal single filter for multiple letters, font styles and sub-pixel layouts would be more accurate with TAM, i.e. does granularity help us know when we're moving out of local minima in filter search space?


Display simulated: RGB

DPI Display: 72

Fonts simulated: georgia

Font sizes: 12

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

ScaleFactor: 0.50

Image Area: New Cropping

Mean DeltaE Errormap for s, rgb, 12 pt, georgia, 72 dpi
Mean DeltaE Filtermap for s, rgb, 12 pt, georgia, 72 dpi
TAM DeltaE Errormap for s, rgb, 12 pt, georgia, 72 dpi
TAM DeltaE Filtermap for s, rgb, 12 pt, georgia, 72 dpi
Mean DeltaE Errormap for v, rgb, 12 pt, georgia, 72 dpi
Mean DeltaE Filtermap for v, rgb, 12 pt, georgia, 72 dpi
TAM DeltaE Errormap for v, rgb, 12 pt, georgia, 72 dpi
TAM DeltaE Filtermap for v, rgb, 12 pt, georgia, 72 dpi


Mean DeltaE Errormap for g, rgb, 12 pt, georgia, 72 dpi
Mean DeltaE Filtermap for g, rgb, 12 pt, georgia, 72 dpi
TAM DeltaE Errormap for g, rgb, 12 pt, georgia, 72 dpi
TAM DeltaE Filtermap for g, rgb, 12 pt, georgia, 72 dpi

Test (Phase 5): Cross compare sub-pixel effects using Thresholded Area Mean deltaE (1 thres)

Cross compare vertical sub-pixel layouts where the primary order varies, with the calculation done using TAM DeltaE.


Test (Phase 4): How does the mean deltaE calc interact with thresholding and peak deltaE scores? i.e. is the mean deltaE calc over the cropped errorImage a good measure (because it ignores peaks)?

Purpose: Test different ways of calculating deltaE scores for a whole letter. In particular, the original mean calc is based on the whole cropped error image, which means the letter shape and the amount of empty space around the letter affect the deltaE score (it acts somewhat like a low-pass filter).
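A small illustration of why empty space matters for the whole-image mean. This sketch assumes that empty space around the letter contributes deltaE scores of 0; the helper names and the 2x2 toy image are hypothetical.

```python
def mean_delta_e(error_image):
    """Mean over every pixel of the error image, empty space included."""
    values = [v for row in error_image for v in row]
    return sum(values) / len(values)


def area_mean_delta_e(error_image, threshold=1.0):
    """Mean over only the pixels at or above the threshold."""
    kept = [v for row in error_image for v in row if v >= threshold]
    return sum(kept) / len(kept)


def pad_with_zeros(error_image, margin):
    """Surround the error image with a border of zero-error pixels."""
    width = len(error_image[0]) + 2 * margin
    padded = [[0.0] * width for _ in range(margin)]
    for row in error_image:
        padded.append([0.0] * margin + list(row) + [0.0] * margin)
    padded.extend([0.0] * width for _ in range(margin))
    return padded


cropped = [[2.0, 0.0], [0.0, 2.0]]          # tightly cropped letter
padded = pad_with_zeros(cropped, margin=1)  # same letter, more empty space

print(mean_delta_e(cropped), mean_delta_e(padded))            # mean is diluted
print(area_mean_delta_e(cropped), area_mean_delta_e(padded))  # unchanged
```

The same two error pixels give a whole-image mean that drops as padding is added, while a mean taken only over above-threshold pixels stays the same.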

Method: Simulated numerous displays, displayed many letters with no filters, measured different styles of deltaE

Result: The way deltaE is calculated does matter. In particular, it matters when thresholding (above 1 deltaE) is combined with restricting the mean divisor to the values that survive the threshold, e.g. sum(thresImg(:)) / length(find(thresImg >= threshold))

DeltaE Calculations:

1) Mean DeltaE (original): mean(thresImg(:))

2) Mean DeltaE (1 thres): threshold for 1 deltaE (remove all deltaE scores below 1), then mean(thresImg(:))

3) Mean DeltaE (2 thres): threshold for 2 deltaE (remove all deltaE scores below 2), then mean(thresImg(:))

4) Thresholded Area Mean DeltaE (1 thres): threshold for 1 deltaE, then sum(thresImg(:)) / length(find(thresImg >= threshold))

5) Thresholded Area Mean DeltaE (2 thres): threshold for 2 deltaE, then sum(thresImg(:)) / length(find(thresImg >= threshold))
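The five measures listed above can be compared side by side on a toy error image. This is a Python sketch of the MATLAB expressions, under two assumptions that the page does not confirm: "removing" a score below the threshold means setting it to 0, and an image with no surviving scores gets a TAM of 0 (the MATLAB expression would produce NaN). All function names are hypothetical.

```python
def flatten(image):
    return [v for row in image for v in row]


def threshold_image(error_image, threshold):
    """Set every deltaE score below the threshold to 0 (assumed behaviour)."""
    return [[v if v >= threshold else 0.0 for v in row] for row in error_image]


def mean_delta_e(error_image, threshold=None):
    """Measures 1-3: mean over all pixels, optionally after thresholding."""
    image = error_image if threshold is None else threshold_image(error_image, threshold)
    values = flatten(image)
    return sum(values) / len(values)


def tam_delta_e(error_image, threshold):
    """Measures 4-5: sum of surviving scores divided by how many survive."""
    kept = [v for v in flatten(error_image) if v >= threshold]
    return sum(kept) / len(kept) if kept else 0.0  # 0.0 is an assumption


error_image = [[0.5, 1.5], [3.0, 0.0]]  # hypothetical deltaE scores

print("1) mean (original):", mean_delta_e(error_image))
print("2) mean (1 thres): ", mean_delta_e(error_image, 1.0))
print("3) mean (2 thres): ", mean_delta_e(error_image, 2.0))
print("4) TAM (1 thres):  ", tam_delta_e(error_image, 1.0))
print("5) TAM (2 thres):  ", tam_delta_e(error_image, 2.0))
```

Measures 2 and 3 still divide by the full pixel count, so they shrink as the threshold rises; measures 4 and 5 divide only by the surviving pixel count, so they track the strength of the remaining peaks.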


Display simulated: RGB, GBR, BRG, RBG, BGR, GRB Vertical

DPI Display: 72

Fonts simulated: arial

Font sizes: 11

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

ScaleFactor: 0.50

Image Area: New Cropping

Script (generate data): MeasureDeltaEByParamSpace_T00000004.m (generates measures of fonts rendered on displays, across a range of letters, DPIs, etc - large parameter space - no filters)

Script (analysis): Analyse_T00000005.m (reads in data file)

Data: data_T00000004_newcrop_*.mat

Results:

RGB display - Effects of different ways of calculating deltaE with varying levels of thresholding
GBR display - Effects of different ways of calculating deltaE with varying levels of thresholding
BRG display - Effects of different ways of calculating deltaE with varying levels of thresholding
RBG display - Effects of different ways of calculating deltaE with varying levels of thresholding
BGR display - Effects of different ways of calculating deltaE with varying levels of thresholding
GRB display - Effects of different ways of calculating deltaE with varying levels of thresholding



Test (Phase 3): Are there peaks in deltaE scores? Do they correspond to letter features?

Purpose: Can we identify peaks in deltaE scores? Do they correspond to letter features?

Result: Yes, there are peaks (which we'd expect). Follow-on questions: Is mean deltaE a good measure of fitness for finding ideal filters? What do 3 tap and 5 tap 1D filters do to peak deltaEs? Does the current use of mean deltaE lead to values below 1 deltaE getting reduced (which in theory don't matter because they aren't seen by people) at the expense of values above 1 deltaE?

Method: Simulated multiple RGB displays (with various primary positions). Generated new figures which overlay deltaE contour maps on letter simulations, while also varying the colorfulness of the subpixels drawn in the figures as a function of deltaE score.

Display simulated: RGB, GBR, BRG, RBG, BGR, GRB Vertical

DPI Display: 72

Font letters: a, b

Fonts simulated: arial

Font sizes: 11, 12, 13

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

ScaleFactor: 0.50

Image Area: No cropping, full padding

Script (generate data): MeasureDeltaEByParamSpace_T00000004.m (generates measures of fonts rendered on displays, across a range of letters, DPIs, etc - large parameter space - no filters - and errorImages (deltaEs measured over display))

Script (analysis): 1) ThresholdPlot('between', 8, EXPRES, 2, 3, 'Above Absolute', 1), 2) ThresholdPlot('between', 8, EXPRES, 2, 3, 'Above Absolute', 2)

Data: Generated by MeasureDeltaEByParamSpace_T00000004 with parameters set to match properties above

Results:

11pt, a, threshold deltaE 1. Hotspots show high deltaE scores, thresholded to only show values above 1 deltaE (should correspond to just noticeable artifacts)
11pt, a, threshold deltaE 2. Hotspots show high deltaE scores, thresholded to only show values above 2 deltaE (should correspond to just noticeable artifacts)
12pt, a, threshold deltaE 1. Hotspots show high deltaE scores, thresholded to only show values above 1 deltaE (should correspond to just noticeable artifacts)
12pt, a, threshold deltaE 2. Hotspots show high deltaE scores, thresholded to only show values above 2 deltaE (should correspond to just noticeable artifacts)
13pt, a, threshold deltaE 1. Hotspots show high deltaE scores, thresholded to only show values above 1 deltaE (should correspond to just noticeable artifacts)
13pt, a, threshold deltaE 2. Hotspots show high deltaE scores, thresholded to only show values above 2 deltaE (should correspond to just noticeable artifacts)
11pt, b, threshold deltaE 1. Hotspots show high deltaE scores, thresholded to only show values above 1 deltaE (should correspond to just noticeable artifacts)
11pt, b, threshold deltaE 2. Hotspots show high deltaE scores, thresholded to only show values above 2 deltaE (should correspond to just noticeable artifacts)
12pt, b, threshold deltaE 1. Hotspots show high deltaE scores, thresholded to only show values above 1 deltaE (should correspond to just noticeable artifacts)
12pt, b, threshold deltaE 2. Hotspots show high deltaE scores, thresholded to only show values above 2 deltaE (should correspond to just noticeable artifacts)
13pt, b, threshold deltaE 1. Hotspots show high deltaE scores, thresholded to only show values above 1 deltaE (should correspond to just noticeable artifacts)
13pt, b, threshold deltaE 2. Hotspots show high deltaE scores, thresholded to only show values above 2 deltaE (should correspond to just noticeable artifacts)

Test: Does the starting position for displaying a letter matter? i.e. Subpixel alignment effects on perceptual distortions

Purpose: Test effects of sub-pixel alignment, i.e. start drawing on a red sub-pixel compared to a green or a blue sub-pixel

Method: Simulated numerous displays, displayed many letters with no filters, measured mean delta E 2000 with original cropping, and drew the same letters at different positions on a display

Result: (more below) From an initial analysis, with no filters on a limited range of displays - subpixel position does matter. A single subpixel shift when drawing a letter can significantly change mean deltaE values.
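One way to picture a sub-pixel shift, offered as an assumption rather than a description of how the scripts implement it: on a vertical-stripe panel, starting a letter one sub-pixel later means its first column lands on the next primary, so the letter effectively sees a cyclically rotated primary order. The function name below is hypothetical.

```python
def effective_order(layout, start_offset):
    """Primary order seen by a letter whose first column lands
    start_offset sub-pixels into a pixel of the given layout."""
    k = start_offset % len(layout)
    return layout[k:] + layout[:k]


for offset in range(3):
    print(offset, effective_order("RGB", offset))
```

Under this reading, the three rotations of RGB are RGB, GBR and BRG, which are the three vertical layouts simulated in this test.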

Display simulated: RGB Vertical, GBR Vertical, BRG Vertical

DPI Display: 72

Fonts simulated: arial, georgia

Font sizes: 11

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

ScaleFactor: 0.50

Image Area: Original Cropping

Script (generate data): MeasureDeltaEByParamSpace_T00000003.m (generates measures of fonts rendered on displays, across a range of letters, DPIs, etc - large parameter space - no filters)

Script (analysis): Analyse_T00000003.m (reads in data file)

Data: data_T00000003_originalcrop.mat

Results:

Arial Lower case, RGB vs GBR vs BRG
Arial Upper case, RGB vs GBR vs BRG
Georgia Lower case, RGB vs GBR vs BRG
Georgia Upper case, RGB vs GBR vs BRG

Test (Phase 2): How varied is deltaE score across different displays for the same fonts? Crop test

Purpose: Check how much cropping matters by not cropping the error image before calculating the deltaE mean.

Method: Simulated numerous displays, displayed many letters with no filters, measured for mean delta E 2000, no cropping (testing how much this matters).

Display simulated: RGB Vertical

DPI Display: 72

Fonts simulated: arial, georgia

Font sizes: 11

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

ScaleFactor: 0.50

Image Area: No cropping, full padding

Script (generate data): MeasureDeltaEByParamSpace_T00000003.m (generates measures of fonts rendered on displays, across a range of letters, DPIs, etc - large parameter space - no filters)

Script (analysis): Analyse_T00000001.m (takes in vector from generator script)

Data: data_T00000001.mat (see MeasureDeltaEByParamSpace_T00000001.m to understand the format of the data file)

Results:

How mean deltaE scores vary as a function of letter and font family. This is without cropping the error image before calculating the deltaE mean.



Test (Phase 1): How varied is deltaE score across different displays for the same fonts?

Method: Simulated numerous displays, displayed many letters with no filters, measured for mean delta E 2000.

Display simulated: RGB Vertical

DPI Display: 72

Fonts simulated: arial, georgia

Font sizes: 10

DPI Font: 72

DPI Ideal: 300

Viewing distance: 0.60

Script (generate data): MeasureDeltaEByParamSpace_T00000001.m (generates measures of fonts rendered on displays, across a range of letters, DPIs, etc - large parameter space - no filters)

Script (analysis): Analyse_T00000001.m (takes in vector from generator script)

Data: data_T00000001.mat (see MeasureDeltaEByParamSpace_T00000001.m to understand the format of the data file)

Results:

How mean deltaE scores vary as a function of letter and font family

Todo: Analyse other font sizes, faces and displays


Test: Does a black aperture around sub-pixels have an impact on deltaE scores?

Expectation: Yes, it may change the deltaE score


Result: From a quick test, deltaE scores may not increase significantly when an aperture is added.


Conclusion: Not enough displays and fonts tested for any meaningful conclusion.


Tested:

  • No aperture RGB vertical for georgia letter s, pt 12, dpi 72. Result gives mean deltaE of 0.62658, median of 0.28865, and 1d filter of a = 0, b = 0.1
No aperture RGB vertical errormap
No aperture RGB vertical filters


  • With aperture for RGB vertical (subPixelHalfWidth/Height had 0.025 subtracted in psfGroupCreate). Gives mean deltaE of 0.6273, median of 0.28891, 1d filter of a = 0.0, b = 0.1


  • With aperture for RGB vertical (subPixelHalfWidth/Height had 0.05 subtracted in psfGroupCreate). Gives mean deltaE of 0.62812, median of 0.28893, 1d filter of a = 0.0, b = 0.1
Aperture RGB vertical errormap (-0.05)
Aperture RGB vertical filters (-0.05)
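To put the three reported mean deltaE values on a common scale, the relative change against the no-aperture case can be computed directly. The numbers are the ones listed under "Tested" above; the helper name is hypothetical.

```python
def percent_change(value, baseline):
    """Relative change of value against baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


no_aperture_mean = 0.62658
# Keyed by the amount subtracted from subPixelHalfWidth/Height.
aperture_means = {0.025: 0.6273, 0.05: 0.62812}

changes = {
    shrink: percent_change(mean, no_aperture_mean)
    for shrink, mean in aperture_means.items()
}
for shrink, change in changes.items():
    print(f"aperture -{shrink}: {change:+.2f}% mean deltaE")
```

Both changes come out well under 1%, which fits the result above that the scores do not increase significantly for this one letter.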



Test: Does downsampling for speed impact deltaE scores (speed vs precision trade-off) when a subpixel aperture is present?

Expectation: Yes, it should significantly affect them, though the strength of the effect depends on the scale factor. If the letter rendered on the display is scaled very small then scaling artifacts will be introduced.


Result: From quick test, scale factor at 0.25 does have a significant impact on deltaE scores - making them go up. Scale factor at 0.5 has less impact.


Conclusion: Scalefactor does have a significant impact upon deltaE scores, with .25 having a large impact, and less so for a 0.5 scaling factor (results from limited tests).

Of note, the picked filter is still the same (a = 0, b = 0.1) even though the deltaE values change with scaling (this could easily be a coincidence).

Scaling with no aperture didn't lead to as much change in deltaE as scaling with an aperture.


Tested:

  • ScaleFactor of 0.25 (in ctdgSearchSC.m) - reduce simulated display size to 1/4 (was ~ 1180*1180*3 before scaling).

Result: With aperture of -0.025. Mean deltaE = 2.4927, median = 0.71982, filter a = 0, b = 0.1

Scalefactor 0.25, aperture -0.025 RGB vertical errormap
Scalefactor 0.25, aperture -0.025 RGB vertical filters


  • ScaleFactor of 0.50 - reduce simulated display size by 1/2. With aperture of -0.025.

Result: Mean deltaE = 0.62212, median = 0.28103, filter a = 0, b = 0.1

Scalefactor 0.5, aperture -0.025 RGB vertical errormap
Scalefactor 0.5, aperture -0.025 RGB vertical filters



  • ScaleFactor of 1.0 - no size reduction. With aperture of -0.025.

Result: Mean deltaE = 0.6273, median = 0.28891, filter a = 0, b = 0.1

Scalefactor 1 (no scaling), aperture -0.025 RGB vertical errormap
Scalefactor 1 (no scaling), aperture -0.025 RGB vertical filters



Test: Does downsampling for speed impact deltaE scores (speed vs precision trade-off) when NO subpixel aperture is present?

Expectation: Yes, it should significantly affect them, though the strength of the effect depends on the scale factor. If the letter rendered on the display is scaled very small then scaling artifacts will be introduced.


Result: From quick test, scale factor at 0.25 does have a significant impact on deltaE scores - making them go up. Scale factor at 0.5 has less impact.


Params: Letter s, pt 12, dpi 72, georgia


Conclusion: Scalefactor does have a significant impact upon deltaE scores, with .25 having a large impact, and less so for a 0.5 scaling factor (results from limited tests).

Of note, the picked filter is still the same (a = 0, b = 0.1) even though the deltaE values change with scaling (this could easily be a coincidence).

Scaling with no aperture didn't lead to as much change in deltaE as scaling with an aperture.


Tested:

  • ScaleFactor of 0.25 (in ctdgSearchSC.m) - reduce simulated display size to 1/4 (was ~ 1180*1180*3 before scaling).

Result: Mean deltaE = 0.85276, median = 0.56647, filter a = 0, b = 0.1

Scalefactor 0.25, RGB vertical errormap
Scalefactor 0.25, RGB vertical filters


  • ScaleFactor of 0.50 - reduce simulated display size by 1/2.

Result: Mean deltaE = 0.62625, median = 0.28867, filter a = 0, b = 0.1

Scalefactor 0.5, RGB vertical errormap
Scalefactor 0.5, RGB vertical filters



  • ScaleFactor of 1.0 - no size reduction.

Result: Mean deltaE = 0.62658, median = 0.28865, filter a = 0, b = 0.1

Scalefactor 1 (no scaling), RGB vertical errormap
Scalefactor 1 (no scaling), RGB vertical filters
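The scale-factor results from the two downsampling tests (with an aperture of -0.025 and with no aperture) can be compared as relative changes against the unscaled case. The values are the reported mean deltaE scores; the helper name and dictionary layout are hypothetical.

```python
def percent_change(value, baseline):
    """Relative change of value against baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# Reported mean deltaE, keyed by condition and then by ScaleFactor.
mean_delta_e = {
    "aperture -0.025": {0.25: 2.4927, 0.50: 0.62212, 1.0: 0.6273},
    "no aperture": {0.25: 0.85276, 0.50: 0.62625, 1.0: 0.62658},
}

changes = {
    condition: {
        scale: percent_change(mean, by_scale[1.0])
        for scale, mean in by_scale.items()
        if scale != 1.0
    }
    for condition, by_scale in mean_delta_e.items()
}

for condition, by_scale in changes.items():
    for scale, change in by_scale.items():
        print(f"{condition}, ScaleFactor {scale}: {change:+.2f}% vs unscaled")
```

At ScaleFactor 0.25 the mean rises by roughly 297% with the aperture and roughly 36% without it, while at 0.5 both conditions change by less than 1%. This matches the conclusions above, with the same caveat that only one letter was tested.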