L3 Algorithm

From VISTA LAB WIKI

L3 is a method for automatically creating the demosaicking, denoising, and color transforms for any color filter array (CFA) design.

Developing the image processing pipeline for a CFA design is ordinarily challenging because the designer must account for many aspects of the CFA, including its spatial, chromatic and overall sensitivity properties. L3 is a method to create the pipeline using learning procedures that balance the strengths and weaknesses of a particular CFA design. The learning is performed for specific images and sensor properties.

Because L3 is a computational procedure, it enables camera designers to test CFA ideas by rapidly prototyping the CFA along with a camera model implemented in the ISET framework. This enables rapid testing of CFA designs and CFA optimization for a particular application.

The method is named L3, which stands for Local, Linear, and Learned. The name highlights the core features of the method.

Overview

Local

The word Local has two meanings here. The first is that the algorithm is spatially localized: the values at each pixel in the final image are a function only of the CFA values in a local neighborhood centered at that pixel, which we call a patch. Although restricting the estimation to a local neighborhood ignores patterns or statistics elsewhere in the CFA image that might help the estimation, the restriction significantly simplifies computation. With this approach, each output pixel can be processed independently and in parallel.
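As a minimal sketch of the patch idea, the following Python function gathers the CFA values in a square neighborhood around one pixel. The function name, the `half` parameter, and the edge-clamping border rule are assumptions made for illustration; the page does not say how L3 treats image borders.

```python
def extract_patch(cfa, row, col, half):
    """Return the (2*half+1)^2 CFA values centered at (row, col) as a flat list.

    Border handling is an assumption of this sketch: coordinates that fall
    outside the image are clamped to the nearest edge pixel.
    """
    height, width = len(cfa), len(cfa[0])
    patch = []
    for dr in range(-half, half + 1):
        r = min(max(row + dr, 0), height - 1)
        for dc in range(-half, half + 1):
            c = min(max(col + dc, 0), width - 1)
            patch.append(cfa[r][c])
    return patch


# Toy 4x4 CFA image whose values are 0..15 in raster order.
cfa = [[r * 4 + c for c in range(4)] for r in range(4)]
patch = extract_patch(cfa, 1, 1, 1)  # 3x3 patch centered at row 1, column 1
```

Because each call reads only the input image, patches for different output pixels can be extracted and processed in any order, which is what makes the parallel processing described above possible.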

The second meaning of local is that the processing adapts to each particular patch. This is in contrast to global approaches, where every patch is processed in an identical manner. Such global algorithms fail to sufficiently adapt to all features in the image and result in poor outputs. To be adaptive, each patch is classified into a predefined cluster. Similar patches are placed in the same cluster, and all patches in a cluster are processed in an identical manner, while patches in different clusters are processed differently.
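The classification step might look like the sketch below. The page does not state which features define a cluster, so the two used here (the pixel's position within a repeating CFA block, and a flat-versus-texture contrast test) are assumptions, as are the function name, the 2x2 period, and the threshold value.

```python
def classify_patch(patch, row, col, cfa_period=2, contrast_threshold=10.0):
    """Assign a patch to a cluster label (hypothetical clustering rule).

    The label combines:
      - the CFA phase: where the center pixel sits in the repeating CFA block
      - a texture flag: whether the mean absolute deviation exceeds a threshold
    """
    phase = (row % cfa_period, col % cfa_period)
    mean = sum(patch) / len(patch)
    contrast = sum(abs(v - mean) for v in patch) / len(patch)
    kind = "texture" if contrast > contrast_threshold else "flat"
    return phase, kind


uniform = [5] * 9                                # no variation: a flat patch
striped = [0, 100, 0, 100, 0, 100, 0, 100, 0]    # strong variation: texture
flat_label = classify_patch(uniform, 2, 3)
texture_label = classify_patch(striped, 1, 1)
```

Every patch that receives the same label is then processed identically, so the number of distinct labels sets how finely the processing can adapt.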

Linear

The word Linear emphasizes that almost all of the calculations on each patch are linear. Such calculations are very fast to compute. Computational cost matters when operating on modern images and videos, which contain millions of pixels. Many published demosaicking and denoising algorithms have very high computational requirements that may be too expensive for many applications. Once a patch’s cluster is identified, a pre-computed linear filter is applied to the patch to obtain the output image estimates at the center pixel. Under certain assumptions, we can find the optimal linear filter for each cluster. These filters can also be optimized to be robust to the expected noise level of the patch.
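Applying a pre-computed filter amounts to one dot product per output channel, as the sketch below shows. The filter bank here is hypothetical and hand-written purely to show the data flow; in L3 the weights are learned from training data, and the cluster names and three-channel output are assumptions.

```python
# Hypothetical filter bank: for each cluster, one weight vector per output
# channel. These hand-written weights stand in for learned L3 filters.
CENTER = [0, 0, 0, 0, 1, 0, 0, 0, 0]   # copy the center pixel of a 3x3 patch
AVERAGE = [1 / 9] * 9                  # average all nine values
FILTER_BANK = {
    "texture": [CENTER, CENTER, CENTER],
    "flat": [AVERAGE, AVERAGE, AVERAGE],
}


def apply_cluster_filter(patch, cluster, filter_bank=FILTER_BANK):
    """Estimate the output values at the patch center.

    Looks up the filters for the given cluster and takes one dot product
    between the patch and each channel's weight vector.
    """
    return [sum(w * v for w, v in zip(weights, patch))
            for weights in filter_bank[cluster]]


patch = [1, 2, 3, 4, 5, 6, 7, 8, 9]
texture_estimate = apply_cluster_filter(patch, "texture")
flat_estimate = apply_cluster_filter(patch, "flat")
```

For a patch of n values and k output channels, the per-pixel cost is n times k multiply-adds plus the cluster lookup, which is why the linear step stays cheap even for images with millions of pixels.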

Learned

We use the word Learned to explain that we determine how to successfully process images by extracting image statistics from a training set of images. Instead of relying on heuristics and general knowledge about images as is common in image processing, we use machine learning techniques to optimize the processing on our training set. As a result, the L3 approach allows one to automatically generate the processing pipeline for any CFA design or particular application. For applications where the images differ from typical consumer images, the specialized algorithms may have significantly improved performance compared to general algorithms. Specifically, the training set contains a collection of measurements that contain little or no noise and the corresponding desired output. We learn how to cluster patches from the training set. For each cluster, we use the Wiener filter to derive the optimal linear filters that will achieve the least error over all of the training patches that are in that cluster.
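The filter derivation for one cluster and one output channel can be sketched as a regularized least-squares problem. This is an illustration under stated assumptions, not the L3 implementation: it assumes the noise is white, zero-mean, and independent of the signal with a known variance, so that the Wiener solution reduces to solving (XᵀX + N·σ²·I) w = Xᵀy over the N noise-free training patches X and desired outputs y. The function names and the small linear solver are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting. Assumes a is square and non-singular."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def learn_filter(patches, targets, noise_variance=0.0):
    """Learn one linear filter from the training patches of a single cluster.

    patches: noise-free patches (lists of equal length) from the cluster
    targets: desired output value at the center pixel of each patch
    noise_variance: assumed variance of white noise added at capture time
    """
    n, count = len(patches[0]), len(patches)
    # Gram matrix X^T X, with the assumed noise power added on the diagonal.
    gram = [[sum(p[i] * p[j] for p in patches) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        gram[i][i] += count * noise_variance
    # Cross-correlation X^T y between patch values and desired outputs.
    cross = [sum(p[i] * t for p, t in zip(patches, targets)) for i in range(n)]
    return solve(gram, cross)


# Toy cluster: two-value "patches" whose targets follow 2*p[0] + 3*p[1].
patches = [[1, 0], [0, 1], [1, 1], [2, 1]]
targets = [2, 3, 5, 7]
clean_filter = learn_filter(patches, targets)
noisy_filter = learn_filter(patches, targets, noise_variance=1.0)
```

With zero assumed noise the learned weights recover the generating relationship; with a positive noise variance the weights shrink, which is how a filter trades signal fidelity for noise robustness.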

Software download

Software can be checked out from the PDC Software repository pdcprojects/L3pipeline. To check out the code, use:

svn co https://white.stanford.edu/ee/pdcprojects/L3pipeline

L3 Software overview

The scripts for running L3 are divided into three parts, run in this order:

L3multispectral2images
L3trainpipeline
L3wrapperimage

Then run

L3showresultimages

More explanations will follow in time.

Creating the simulation data

This script generates the images that will be used for training the processing pipeline. The selection of the images - both their relative number and characteristics - influences the pipeline parameters.

These images are created by starting with a calibrated multispectral representation of the scenes. Reconstructing these multispectral originals is the target. We use ISET to build a camera model - either a perfect camera or a noisy one - that produces training images corresponding to the original multispectral images. The algorithm pipeline is developed as a method that maps the simulated camera images into the (known) multispectral originals.

L3multispectral2images.m

Learning the optimal filters

Then we learn the filters using

L3trainpipeline.m

Reconstructing and evaluating

Then we simulate the noise, subsample according to the CFA, and apply the L3 pipeline to generate the reconstructed image. This script also produces some error metrics.
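The noise simulation, CFA subsampling, and error measurement can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the 2x2 pattern, additive Gaussian noise, and mean squared error as the metric. The page does not say which noise model or error metrics L3wrapperimage.m uses.

```python
import random

# Hypothetical 2x2 repeating CFA pattern and channel order (R, G, B).
PATTERN = [["G", "R"], ["B", "G"]]
CHANNEL = {"R": 0, "G": 1, "B": 2}


def mosaic(image, pattern=PATTERN, noise_sigma=0.0, rng=None):
    """Subsample a full-color image according to a CFA and add noise.

    image: rows of (R, G, B) tuples
    noise_sigma: standard deviation of additive Gaussian noise (assumed model)
    Each output pixel keeps only the channel its CFA filter passes. Adding
    independent noise after sampling is equivalent to adding it before.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    rows, cols = len(pattern), len(pattern[0])
    out = []
    for r, row in enumerate(image):
        out_row = []
        for c, pixel in enumerate(row):
            channel = CHANNEL[pattern[r % rows][c % cols]]
            out_row.append(pixel[channel] + rng.gauss(0.0, noise_sigma))
        out.append(out_row)
    return out


def mean_squared_error(estimate, reference):
    """Mean squared error between two single-channel images of equal size."""
    diffs = [(e - t) ** 2
             for est_row, ref_row in zip(estimate, reference)
             for e, t in zip(est_row, ref_row)]
    return sum(diffs) / len(diffs)


image = [[(10, 20, 30), (10, 20, 30)],
         [(10, 20, 30), (10, 20, 30)]]
clean = mosaic(image)                      # noise-free CFA measurements
noisy = mosaic(image, noise_sigma=2.0)     # same scene with simulated noise
```

In a full evaluation, the noisy CFA image would be passed through the learned filters and the metric computed between the reconstruction and the known original.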

L3wrapperimage.m

L3 Projects

Patch luminance (Rahul)

Rahul Aggrawal worked with Steve Lansel to do a project analyzing some of the statistical properties of the algorithm. The results are summarized here.

Illuminant correction (Iretiayo, François)

Iretiayo Adegbola Akinola and François Germain worked with Steve Lansel and Qiyuan Tian to study illuminant correction transforms for L3. The preliminary project results from Winter 2014 are available here. Research conducted in Spring 2014 is summarized here.

References

Link to the OSA presentation Steve made goes here.
