Assign2: Mazes

animation of BFS search on maze

A maze is a twisty and convoluted arrangement of paths that challenge the solver to find a route from the entry to the exit. This assignment is about using ADTs to represent, process, and solve mazes.

Mazes

Labyrinths and mazes have fascinated humans since ancient times (remember Theseus and the Minotaur?), but mazes can be more than just recreation. The mathematician Leonhard Euler was one of the first to analyze plane mazes mathematically, and in doing so founded the branch of mathematics known as topology. Many algorithms that operate on mazes are closely related to graph theory and have applications to diverse tasks such as designing circuit boards, routing network traffic, motion planning, and social networking.

This project focuses on a particular type of maze known as a perfect maze. A perfect, or simply-connected, maze has no loops and no inaccessible areas. Any two locations within a perfect maze are connected by exactly one path.

The image below shows a perfect maze and the path from the entry at upper left to exit at lower right:

rectangular maze with dotted path leading from entry to exit

Grid and GridLocation

A maze can be modeled as a two-dimensional array where each element is either a wall or a corridor. The Grid class from the Stanford library is a good fit tool for representing a maze. It provides the abstraction of a two-dimensional array in a safe, encapsulated form with various client conveniences. We use a bool as the element, storing true for a cell that is an open corridor and false for a wall.

Access to grid elements is generally by row and column, e.g. grid[row][col]. Alternatively elements can be selected by GridLocation. GridLocation is a small struct type that pairs the row and column together into one aggregate type. A struct is a C++ user-defined type consisting of a defined set of named fields of heterogeneous type.

 GridLocation chosen;
 chosen.row = 3;      // separate assignment to row
 chosen.col = 4;      // then col
 if (maze[chosen]) // equivalent to maze[3][4]
 ...
   
   // or can set both row, col as part of initialization
 GridLocation exit = { maze.numRows()-1, maze.numCols()-1 }; // last row, last col 
 if (chosen == exit) // can directly compare two GridLocations  
 ...
 if (chosen.row == 0 && chosen.col == 0) // also ok to compare row,col separately
 ...

References

Lecture slides on Grid
Documentation for Grid
Section 6.1 of the textbook introduces struct types
Header files grid.h and gridlocation.h within the starter project subfolder lib/StanfordCPPLIb/collections/
GridLocation has a neighbors operation that is a bit ill-matched to our needs. For maze, neighbors are defined in the four cardinal directions only (N,S,E,W), but GridLocation also considers NE, NW and so on. Rather than try to work around the mismatch, it is cleaner to write your own operation to collect the neighbors.

Reading a maze file

A maze file contains a maze written in text format. Each line of the file corresponds to one row of the maze. Within a row, the character @ is used for walls and - for corridors. Here are the first few lines of a sample maze file:

 -----@-----------
 @@@@-@@@@@@@@@-@@
 ---@---------@---
 -@@@@@@@@@@@-@-@-
 -----@---@---@-@-

The last line of a maze file must either be a blank line or can optionally includes the solution written as a Stack of GridLocation.

A completed version of the function

bool readMazeFile(string filename, Grid<bool>& maze)

is provided in the starter project. The provided function correctly reads a well-formed maze, which is confirmed by the provided unit tests. First read over the function and its unit tests. Ask questions if you find anything unclear.

Although readMazeFile does a great job reading correct files, it does a poor job if asked to read a malformed file. Since it makes no effort to detect problems, it generally blunders through, misinterpreting the data, and possibly crashing on bad inputs.

Add error detection

Your first task is to improve readMazeFile by making it robust to malformed input. There are two specific issues in the file format that you are to detect and report:

A maze row that is longer or shorter than the other rows.
- Each maze row is required to have same length as all the others.
A maze location containing a character that is not @ or -.
- The valid options for a location are open or wall. Anything else is an error.

Use the error function from the header "error.h" to report an error. This function is used to report a fatal error. When you call error it halts the program right here and reports the message you provided:

error("Here is my message about what has gone wrong");

Construct tests and maze files

We’ve provided some unit tests and one badly malformed maze file. You are to add student tests that further verify that your maze reading is now robust. You will need to create additional malformed files. Maze files are just ordinary text files placed into the res/ folder of your project. You can edit these files in Qt Creator. The project resource files are listed under Other files -> res. Your program can open a resource file by specifying the path to open as "res/myfilename". Follow the format of the other maze files: one line for each row in grid, followed by one extra line that is blank or the solution path written as a Stack of GridLocation.

Note the use of the special kind of test case EXPECT_ERROR in our provided test. An EXPECT_ERROR test case evaluates an expression, expected that the operation with raise an error. While the test is running, SimpleTest framework will catch the error, note that it was generated, and then resume. Because the error that was raised was expected, the test case is considered to be passed. If an error was expected but didn't materialize, the test case fails. (Note this is opposite of the regular EXPECT and EXPECT_EQUAL tests which do not expect errors and treat a raised error as a test failure). More information on the different test macros and how they all work can be found in the CS106B Testing Guide

Q6. Describe the malformed maze files you created in order to verify the robustness of your maze reading.

Check solution

The last line of a maze file must either be a blank line or can optionally includes the solution written as a Stack of GridLocation. The readMazeFile function includes the correct code to read the Stack and validate it is syntactically well-formed. To confirm that the path is a valid solution, it calls the function checkSolution.

The function

bool checkSolution(Grid<bool>& maze, Stack<GridLocation> path)

is intended to verify that a path meets all of the necessary criteria to be a correct solution:

The path must start at the entry (upper left corner).
The path must end at the exit (lower right corner).
Each location in the path is within the maze bounds.
Each location in the path is an open corridor (not wall).
Each location is one cardinal step (N,S,E,W) from the next in path.
The path contains no loops, i.e. a location appears at most once in the path.

If checkSolution detects any of the above problems, it should call the error function with an appropriate message. A call to error immediately exits the function and halts the program. If all of the criteria are met, then the function completes normally and returns true to signal success.

This function has a lot of things to check! However, writing this function now is going to pay off big time when you later use it to test your path-finding algorithm. Be sure to very thoroughly test your checkSolution now on a variety of invalid paths so that you can be confident it is the oracle of truth when it comes to validating a solution. Your future self will thank you.

Q7. After you have written your tests, describe your testing strategy to determine that your checkSolution works as intended

Once your maze-reading is bullet-proofed against all manner of problems, you're ready to go on to display the maze and code the algorithm to solve it.

Drawing the maze

The Stanford C++ library has extensive functionality for drawing and interacting with the user, but we generally don't ask you to dig into those features. Instead, we have supplied the graphics routines you need pre-written in a simple, easy-to-use form. For this program, the provided mazegraphics.cpp module has functions to display a maze and highlight a path through the maze. Read the mazegraphics.h header file in the starter project for details of these functions:

MazeGraphics::drawGrid(Grid<bool>& grid)
MazeGraphics::highlightPath(Stack<GridLocation> path, string color)

After reading in a maze (and validating that it is well-formed), you simply call the MazeGraphics::drawGrid to display the maze. The function MazeGraphics::highlightPath is later used to mark the cells along a path as part of animating the search for a solution. Highlighting a path expects to add dots on locations within the already drawn grid. Be sure to only call MazeGraphics::highlightPath after having first called MazeGraphics::drawGrid once.

Note that when calling these functions, you must call them using their full name (including the weird looking MazeGraphics:: prefix). We will talk more about what this notation means a little later on in the quarter!

Solving a maze using breadth-first search (BFS)

Now that we know we have the ability to construct a maze, we can turn our attention to figuring out how we will escape these mazes we've just built. In particular, you are now going to implement the following function in maze.cpp, in which you will implement a maze-solving algorithm (described below).

Stack<GridLocation> solveMaze(Grid<bool>& maze)

There are a wide variety of algorithms for solving a maze. Solving a maze can be seen as a specific instance of a shortest path problem, where the challenge is to find the shortest route from the entrance to the exit. Shortest path problems come up in a variety of situations such as packet routing, robot motion planning, analyzing gene mutations, spell correction, and more. In the case of the perfect maze, the shortest path is also the only path, but the general process is the same regardless.

Breadth-first search (BFS) is a classic and elegant algorithm for finding a shortest path. A breadth-first search reaches outward from the entry location in a radial fashion until it finds the exit. The first paths examined take one hop from the entry. If any of these reach the exit location, you’re done. If not, the search expands to those paths that are two hops long. At each subsequent step, the search expands radially, examining all paths of length three, then of length four, etc.), stopping at the first path that reaches the exit.

Breadth-first search is typically implemented using a queue. The queue stores partial paths that represent possibilities to explore. The paths are processed in order of increasing length. The first paths enqueued are all length one, followed by the length two paths, and so on. Given the FIFO handling of the queue, all shorter paths are dequeued before the longer paths make their way to the front of queue, ready for their turn to be processed.

At each step, the algorithm considers the current path frontmost in the queue. If the current path ends at the exit, it is a complete solution. If not, the algorithm takes the current path and extends it to reach locations that are one hop further away in the maze, and enqueue those extended paths to be examined later.

To represent a path, a Stack of GridLocation is a good choice, as it cleanly supports the needed operations to add/examine the element at the end of the path. Putting these paths into a queue for processing means you'll have a nested ADT, a Queue<Stack<GridLocation>>. A nested container type looks a little scary at first, but it is just the right tool for this job.

Here are steps followed by a breadth-first search:

Create a queue of paths. A path is a stack of grid locations.
Create a length-one path containing just the entry location. Enqueue that path.
- For simplicity, assume entry is always the upper-left corner and exit in the lower-right.
Dequeue path from queue.
If this path ends at exit, this path is the solution!
If path does not end at exit:
- For each viable neighbor of path end, make copy of path, extend by adding neighbor and enqueue it.
- A location has up to four neighbors, one in each of the four cardinal directions. A neighbor location is viable if it is within the maze bounds, the cell is an open corridor (not a wall), and it has not yet been visited.
Repeat steps 3-5 until path reaches exit.

One issue that is a bit subtle is that you must avoid repeatedly revisiting the same location in the maze or creating a path with a cycle, lest the search get stuck in a infinite loop. For example, if the current path leads from location r0c0 to r1c0, you should not extend the path by moving back to location r0c0. A common strategy for tracking where you have already been is to keep a Set of locations to which you add each location as you visit it. Checking whether a location is contained in the visited set lets you know whether to consider or skip this location.

As your algorithm hunts for a solution, you can call the function MazeGraphics::highlightPath function to visually mark the current path on the graphics window so the user can follow along with the algorithm. This animation is not required, but it adds a bit of fun and may even help you trace and debug.

Once you have a working solver, unit-testing is a piece of cake because earlier you wrote the awesome checkSolution function. After solving a maze, use your unit test verify that your path meets all the criteria to be valid. Super neat! Your work here is done, congratulations!👏

Notes on Collection ADTs

For solving the maze, the Stack with its LIFO behavior is ideal for storing a path. The first entry pushed on the stack is the starting point and each subsequent point in the path is pushed on top. The FIFO Queue is just what's needed to track those partial paths under consideration. The paths are enqueued (and thus dequeued) in order of increasing length. A Set is particularly handy if you need to be able to quickly determine membership (is this element contained in the set?).
The assignment operator work as expected for all our ADTs. Assigning from one Stack/Vector/Set/etc to another will create a copy of the ADT with a copy of the elements.
Be on your toes about making sure that your template types are always properly specialized and that all types match. Using Vector without specifying the element type just won't fly, and a Vector<int> is not the same thing as a Vector<double>. The error messages you receive when you have mismatches can be cryptic and hard to interpret. Bring your template woes to the forum and we can help untangle them with you.

References

There are many interesting facets to mazes and much fascinating mathematics underlying them. Mazes will come up again several times this quarter. Chapter 9 of the textbook uses a recursive depth-first search as path-finding algorithm. At the end of the quarter when we talk about graphs, we'll explore the equivalence between mazes and graphs and note how many of the interesting results in mazes are actually based on work done in terms of graph theory.

Walter Pullen, Maze Classification. http://www.astrolog.org/labyrnth/algrithm.htm Website with lots of great info on mazes and maze algorithms
Jamis Buck. Maze Algorithms. https://www.jamisbuck.org/mazes/ Fun animations of maze algorithms. He also wrote the excellent book about Mazes for Programmers: Code Your Own Twisty Little Passages.

Extensions

Instead of reading pre-written mazes from a file, you could instead generate a new random maze on demand. There is an amazing (I could not resist…) variety of algorithms for maze construction, ranging from the simple to the sublime. Here are a few names to get you started: backtracking, depth-first, growing tree, sidewinder, along with algorithms named for their inventor: Aldous-Broder, Eller, Prim, Kruskal, Wilson, and many others.
Try out other maze solving algorithms. How does BFS stack up against the options in terms of solving power or runtime efficiency? Take a look at random mouse, wall following, depth-first search (either manually using a Stack or using recursion once you get the hang of it), Bellman-Ford, or others.
There are many other neat maze-based games out there that make fun extensions. You might gather ideas from Robert Abbott's http://www.logicmazes.com or design a maze game board after inspiration from Chris Smith on mazes.