Homework 7b - Ghost

This project is a neat combination of mass data, algorithms, and Python. All parts of HW7 are due Wed March 4th 11:55pm. Download ghost.zip to get started.


We have a few warmups functions to get ready for the Ghost problem. These all have a 1-line solution using lambda.

> Lambda Warmups

Ghost Problems

Suppose we are trying to take a picture of Stanford, but each image has people walking through the scene, like this (this is the "hoover" data set):

hoover tower, but with people walking in front

hoover tower, but with people walking in front

hoover tower, but with people walking in front

We'd like to look through all these images and figure out an image that "ghosts" out all the people and just shows the background. (Thanks to John Nicholson for his "Pesky Tourist" Nifty Assignment which inspired this project, see nifty.stanford.edu).

Ghost Solution

This assignment depends on the "Pillow" module you installed for the Parse-Mystery project. Instructions for installing Pillow are included at the end of this document.

Programming detail: For this assignment, we'll access the color of a pixel as a "pix" - a len-3 tuple, e.g. (red, green, blue) as shown below. The SimpleImage module supports getting and setting pix data on images with get_pix() and set_pix() functions.

Say we have 3 images. For every x,y consider the 3 pixels at that x,y across all the images. Many of those pixels show the ordinary background, and a few pixels have someone walking in front, which we'll call "outlier" pixels. We'll assume that outlier pixels are not the majority for any x,y.

For example, for the 3 images, suppose the 3 pix at x=100 y=50 look like:

(1, 1, 1), (1, 1, 1), (28, 28, 28)

Looking at the pixs, you can guess that (28, 28, 28) is the outlier and (1, 1, 1) is the background since it's the majority. What's an algorithm to do this?

Color Distance

To frame this problem, it's handy to introduce the idea of "color distance" between pix. You can think of each pix color as a point in a 3-dimensional color space, with red-green-blue coordinates instead of the regular x-y-z coordinates.

For reference, the standard formula for distance in x-y-z space is (in Python syntax):

math.sqrt(x_difference ** 2 + y_difference ** 2 + z_difference ** 2)

Reminder: In Python "**" is exponentiation, so (x ** 2) is x squared. The module named "math" contains miscellaneous math functions, and math.sqrt() is the square root function. As a surprising optimization, we can actually omit the math.sqrt() for this algorithm (explained below).

Ghost Algorithm

There are a few ways to solve the Ghost problem, but the following is a relatively simple approach that works well.

Problem: for any particular x,y, look at all the pix across the images at that x,y. We want to pick out the best pix to use, avoiding the outlier pix.

1. Compute the floating point arithmetic average pix across these pixs - average all the red values to get average-red-value, and the same for green and blue. So for the three pix (1, 1, 1), (1, 1, 1), (28, 28, 28), the average pix would be (10.0, 10.0, 10.0).

To think through the algorithm, imagine the pix scattered in a 2D space of colors.

cluster of pix, average pix near, outlier pix relatively far off

All the pix but the outlier will naturally cluster, grouped around the perfect color for that x,y, but displaced by little measurement errors. The outlier pix will be off by itself, a truly different color. The average will be in between the two, but nearer to the cluster, since the cluster has many pix and the outlier has just one.

2. To select the best among the pixs, select the pix with the least distance to the average of all the pixs. Equivalently we could say the the pix closest to the average. This works because the more numerous background pixs are near the average, while any outlier pix are relatively far from the average.

Note: In order to select the closest pix, we can omit the math.sqrt(), which is great because square-root is a relatively slow math operation. Here's why: suppose we have three pix, and their distances from average are 4, 6, and 9. We would select the distance "4" one as the closest. However, that works the same if distances are squared, which would be 16, 36, and 81 - we select the "16" one. Computing the square root was unnecessary to selecting the one with the least distance. Therefore, in the code below, we'll compute the distance squared instead of the distance, and in fact using distance-squared is a standard big-data technique for just this reason.

Ghost Code

For this program, three functions are specified here, and you will need to decompose out 1 or 2 other functions to fill out the whole program.

1. pix_dist2(pix1, pix2)

Write the code for a pix_dist2() function which returns the distance-squared between 2 pix. Write at least 2 Doctests.

def pix_dist2(pix1, pix2):
    Returns the square of the color distance between 2 pix tuples.

This is a low-level function, whose role is to be called other functions. Often in our programs, the low-level functions are first in the file, called by the functions later in the file.

2. best_pix(pixs)

Write the code for a best_pix() function which, given a list of pix, returns the best pix from that list according to the ghost algorithm. For example with the pixs [(1, 1, 1), (1, 1, 1), (28, 28, 28)] this should return (1, 1, 1). Write at least 2 Doctests. (See Decomposition below about supporting functions.)

Design constraint: best_pix() should not itself loop through the pix to compare distances. Instead, it should leverage lambda to pick out the best pix.

def best_pix(pixs):
    Given a list of 3 or more pix, returns the best pix.

If multiple pix qualify as the best, we will leave it unspecified which one precisely best_pix() returns.

Design aside: Looking at the the whole ghost program output for, say, the hoover case, it's hard to tell for sure if the program is working correctly. In contrast, best_pix() has been made testable by stripping it down to the minimum - a list of pixs go in, one pix comes out. That form is so simple, it can be tested and made correct. Once best_pix() is correct, the program routes this river of data through that one, well-tested function.

Cultural aside: the TV program Sesame Street has a series of thinly disguised videos based on the Ghost algorithm.

3. solve(images)

This is a high level function, solving the whole problem, but calling other functions to solve sub-problems. The provided main() does the housekeeping of loading the image objects, and then calls solve() to do the real work. Solve() does not need Doctests.

def solve(images):
    Given a list of image objects, compute and show
    a Ghost solution image based on these images.
    There will be at least 3 images and they will all be
    the same size.

Create a solution image the same size as the first image in the list. The SimleImage.blank() function will create a new blank image, like this:

    solution = SimpleImage.blank(width, height)

SimpleImage Functions

The provided SimpleImage module has 2 new functions that access image data in the "pix" format of (red, green, blue) tuples. The old get_pixel() function works too, but for this project, do everything in "pix" - get_pix() and set_pix().

SimpleImage Features:
Create image:
  image = SimpleImage.blank(400, 200)   # create new image of size
  image = SimpleImage('foo.jpg')        # create from file

Access size
  image.width, image.height
  image.in_bounds(x, y)      # boolean test

Get pix at x,y
  pix = image.get_pix(x, y)
  # pix is RGB tuple like (100, 200, 0)

Set pix at x,y
  image.set_pix(x, y, pix)   # set data by tuple also

Get Pixel object at x,y
  pixel = image.get_pixel(x, y)
  pixel.red = 0
  pixel.blue = 255

Show image on screen


You will need to design 1 or 2 other functions to work with the 3 required functions to build the whole program. There are many ways to do this, and any reasonable design is fine.

Top Down Decomposition

Sketch out the code needed for a high-level function such as best_pix(). Think of a helper function that solves a useful sub-problem to support the code in best_pix(). Write the helper function separately, then complete the code in best_pix(), calling the helper. Any helper functions should have """Pydoc""" sentences explaining what they do as usual, but Doctests are not required beyond the ones specfied for pix_dist2() and best_pix().

Ghost Run

The ghost.py program takes just one command line argument - the name of the directory that contains the jpg files. The folder "hoover" is relatively small, containing the three jpg files (the three example images at the top of this page). Run the program with the "hoover" files like this:

$ python3 ghost.py hoover
(solution image appears)

The provided main() first looks in the folder, loads all the jpg files within the folder, and prints the filename of each image as it goes. Then main() calls your solve() function, passing in the images to do the real work.

Bigger Test Cases

We have bigger and better campus cases to try, each in a folder: clock-tower, math-corner, and monster. For any of these, you can look at the individual images inside the folder to get a feel for the data. The file naming convention here is that the file "foo-500.jpg" is 500 pixels wide.

Clock Tower

Here's a medium sized case by the clock tower.

clock tower ghost problem

$ python3 ghost.py clock-tower

Math Corner

A larger case by math corner. math corner ghost problem

$ python3 ghost.py math-corner


And finally this "monster" case - lots of large images again in the direction of of Hoover tower. alt: large ghost problem pointing at hoover

$ python3 ghost.py monster

The monster images are each 1000 x 750 and there are 8 image files, and each pixel has 3 numbers. How many numbers is that?

>>> 1000 * 750 * 8 * 3

So all told, your code is sifting through 18 million numbers to figure out the ghost image. It's a sign of something in your life if you find yourself running python3 to solve a little arithmetic problem like that.

When your code is cleaned up and works correctly, turn in your ghost.py on Paperless as usual..

Appendix: Installing Pillow

This project will use the "Pillow" library. A library is a body of already written code which you import and use, and in this case the Pillow library contains code to manipulate images. We'll use libraries with more detail later. For this assignment you need to install Pillow on your machine so your code can use it.

Open a "terminal" window - the same type of window where you type "python3 foo.py" to run programs. The easiest way to get a Terminal is to use the terminal tab at the lower-left within PyCharm. Type the following command (shown in bold as usual). Note that "Pillow" starts with an uppercase P. (On windows, "py" or "python" instead of "python3").

$ python3 -m pip install Pillow
..prints stuff...
Successfully installed Pillow-5.4.1

To test that Pillow is working, type the following command to a terminal inside your homework folder. This runs the "simpleimage.py" code included in the folder. When run like this, simpleimage.py creates and displays a big yellow rectangle with red at the right edge.

# (inside your project folder)
$ python3 simpleimage.py
# yellow rectangle appears