Final Projects | CS 106X

N-Body Relativity Simulator

Ahmed Abdalla

The N-Body problem refers to the difficulty of predicting the motion of an arbitrary set of interacting particles. A general solution to this problem has eluded physicists since the time of Newton. Leveraging the power of computing, this simulator will attempt to provide users with the intuition necessary to grasp this complexity by recursively solving the associated differential equations. Users will have the option to define a set of initial conditions and see how their system operates under arbitrary timescales. Among these conditions will be an option to arbitrarily set the speed of light, forcing the simulation to also keep track of relativistic effects between the particles. In order to minimize runtime, the Barnes-Hut algorithm will be employed to define a distance scale over which bodies are assumed to be effectively independent of each other.
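As a rough illustration of what one time step of such a simulator might look like, here is a minimal sketch in Python. It assumes purely Newtonian gravity in two dimensions, sums forces directly over all pairs (no Barnes-Hut tree, no relativistic corrections), and uses leapfrog integration. The constant `G`, the `softening` term, and both function names are illustrative assumptions rather than part of the project.

```python
G = 1.0  # gravitational constant in simulation units (assumed)

def accelerations(positions, masses, softening=1e-3):
    """Newtonian acceleration on each body by direct O(n^2) summation."""
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            # Softening avoids a division by zero when two bodies overlap.
            dist_sq = dx * dx + dy * dy + softening ** 2
            inv_dist_cubed = dist_sq ** -1.5
            acc[i][0] += G * masses[j] * dx * inv_dist_cubed
            acc[i][1] += G * masses[j] * dy * inv_dist_cubed
    return acc

def leapfrog_step(positions, velocities, masses, dt):
    """Advance the system by dt using kick-drift-kick leapfrog integration."""
    acc = accelerations(positions, masses)
    half_vel = [[v[0] + 0.5 * dt * a[0], v[1] + 0.5 * dt * a[1]]
                for v, a in zip(velocities, acc)]
    new_pos = [[p[0] + dt * v[0], p[1] + dt * v[1]]
               for p, v in zip(positions, half_vel)]
    new_acc = accelerations(new_pos, masses)
    new_vel = [[v[0] + 0.5 * dt * a[0], v[1] + 0.5 * dt * a[1]]
               for v, a in zip(half_vel, new_acc)]
    return new_pos, new_vel
```

Calling `leapfrog_step` repeatedly traces out the trajectories. A tree-based method such as Barnes-Hut would replace only the `accelerations` function.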

Minimax Approach to Creating Checkers Simulator

William Brannon

As artificial intelligence becomes a standard tool for countless tasks, it is interesting to compare the performance of humans and computers. Checkers, a very popular game in which a player wins by removing all the opponent's pieces from the board, is an interesting test case for this comparison, as it is commonly played by humans and is simple enough for anyone to play well. Using a minimax implementation, and given that human players do not always choose the optimal move, the computer player should theoretically beat the human the vast majority of the time.

Making 5,377,183 Suns

Chapman Caddell

Making 5,377,183 Suns takes an algorithmic approach to contemporary art. Inspired by Penelope Umbrico's piece 5,377,183 Suns from Sunsets from Flickr, a collage of images downloaded and arranged by the artist, I scrape images from Wikimedia Commons, determine the commonest color in each through k-means clustering, order them by hue in a priority queue, and (finally) use merge sort to arrange columns by value. The final product is a program that generates pieces similar to Umbrico's, but with a twist: images are arranged in a grid across a color gradient, and the "artist" has the freedom to think beyond sunsets.
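The hue-ordering step might look roughly like the Python sketch below. It assumes the dominant color of each image has already been extracted as an RGB tuple with channels in 0-255; the k-means and merge sort steps are not shown. The function name `order_by_hue` is hypothetical, and the standard-library `heapq` module stands in for the priority queue.

```python
import colorsys
import heapq

def order_by_hue(colors):
    """Return RGB tuples (0-255 per channel) in ascending hue order."""
    heap = []
    for index, (r, g, b) in enumerate(colors):
        hue, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # The index breaks ties so equal hues keep their input order.
        heapq.heappush(heap, (hue, index, (r, g, b)))
    ordered = []
    while heap:
        ordered.append(heapq.heappop(heap)[2])
    return ordered
```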

CS 106Sudoku

Erin Cohen

CS 106Sudoku is a program that introduces users to Sudoku and helps them to solve a classic game. Internally, the program employs recursive backtracking to generate the computer solution of the game while a custom object class is used to operate the game flow.
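For readers unfamiliar with recursive backtracking, a minimal Sudoku solver in Python is sketched below. It assumes the board is a 9x9 list of integers with 0 marking an empty cell. The representation and function names are illustrative and are not taken from the project.

```python
def candidates(grid, row, col):
    """Digits not yet used in the cell's row, column, or 3x3 box."""
    used = set(grid[row]) | {grid[r][col] for r in range(9)}
    box_row, box_col = 3 * (row // 3), 3 * (col // 3)
    for r in range(box_row, box_row + 3):
        used.update(grid[r][box_col:box_col + 3])
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    """Fill `grid` in place; return True if a full solution was found."""
    for row in range(9):
        for col in range(9):
            if grid[row][col] == 0:
                for digit in candidates(grid, row, col):
                    grid[row][col] = digit   # choose
                    if solve(grid):          # explore
                        return True
                    grid[row][col] = 0       # un-choose
                return False                 # dead end: backtrack
    return True                              # no empty cells remain
```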

Flight Planning: Low Altitude Airways

Chris Connelly

My program will construct flight plans following low altitude airways from point-to-point. In addition to finding the shortest path, the program will incorporate additional capabilities such as: (1) Allowing users to specify an altitude--which determines which airways can be flown; (2) Managing the arrival and departure phases of a flight; and (3) Building in a diversion capability for when weather or equipment outages preclude the shortest route.
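One way the altitude constraint could interact with shortest-path search is sketched below in Python. It uses Dijkstra's algorithm and skips any airway segment whose altitude band does not contain the requested altitude. The graph format, the fix names, and the altitude limits used in the usage note are all made up for illustration, and the arrival, departure, and diversion phases are not modeled.

```python
import heapq

def shortest_route(airways, start, goal, altitude):
    """Dijkstra over segments whose altitude band contains `altitude`.

    airways maps a fix to a list of
    (neighbor, distance_nm, min_alt_ft, max_alt_ft) tuples (assumed format).
    Returns (total_distance, path) or None if no route is flyable.
    """
    frontier = [(0, start, [start])]
    settled = set()
    while frontier:
        dist, fix, path = heapq.heappop(frontier)
        if fix == goal:
            return dist, path
        if fix in settled:
            continue
        settled.add(fix)
        for neighbor, length, min_alt, max_alt in airways.get(fix, []):
            if min_alt <= altitude <= max_alt and neighbor not in settled:
                heapq.heappush(frontier,
                               (dist + length, neighbor, path + [neighbor]))
    return None
```

With a hypothetical network in which the direct segments are only usable at or above 8,000 ft, a request at 5,000 ft falls back to the longer low route, and a request below every band returns `None`.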

Analysis of Healthcare Data

John Dalloul

Healthcare is one of the largest industries out there and won't be going away anytime soon. With the increasing modernization of patient records and diagnostic techniques, the amount of data generated by the healthcare industry is growing at an enormous rate. The application of computer science to the healthcare industry's large amount of data can reveal correlations and other patterns that can inform optimization changes to the industry, both in terms of efficiency of operation and effectiveness of treatment.

Baby Mozart

Roman Dimov

My final project for CS106X is an application that can help users compose harmonic progressions. The application will have knowledge of harmonic theory in the form of a cyclic directed graph, with nodes indicating harmonies, and arcs indicating relations between harmonies. I will also find an API that will allow the application to play the harmonies.

An Image-Based Database Engine for Assessing Nutritional Intake

Hunter Guru

In the nutrition sciences, it is useful to measure the nutritional intake of individuals in both research & industry. However, challenges/inconveniences arise when monitoring dietary intake through invasive data collection.

The USDA has an entire repository of the majority of food items consumed in the United States, with complete nutrition facts and information. Researchers at the University of Nebraska - Lincoln are developing a smartplate which can measure all the information of a meal, based on special mass sensors and camera systems. Unfortunately, this system is unable to utilize the measured data to accurately identify items in the USDA's database.

This project will aim to fill that void. In order to log all the meal information of users, we must create a tool that can search items in the USDA database based on image queries of a plate of food. Then, the application must log food items that are consumed by specific users and how they correlate to the nutritional needs of someone of that given body type & demographic. This project will heavily rely on utilizing custom data structures and graphs in order to model the database and differentiate between similar items, such as "cooked broccoli" and "fresh broccoli". Additionally, it will rely on recursion to parse the information the user inputs into the program.

Predicting Off-Target Genome Editing by CRISPR-Cas9

Robert Hu

CRISPR-Cas9 is a powerful genome editing technology that is able to cleave double-stranded DNA at a specific site that corresponds to a matching single guide RNA (around 20 nucleotides long). However, off-target edits are oftentimes made at other locations in the DNA that possess the same or similar sequences. My plan for this project is to accept a strand of guide RNA and predict sites of off-target edits within the human genome. The program will recursively search through the genome for similar sequences and will also attempt to evaluate the importance of such locations (whether they are potentially a part of an important gene, regulator sequence, etc., or if the affected sequence is just some non-coding DNA).
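The idea of finding "similar sequences" can be made concrete with a small Python sketch. It swaps the recursive search described above for a plain sliding-window scan that counts mismatches (Hamming distance) against the guide. It also assumes the guide is already written in the DNA alphabet, looks at one strand only, and ignores PAM requirements. The function names and the `max_mismatches` parameter are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def candidate_sites(genome, guide, max_mismatches):
    """Return (position, mismatches) for every window of the genome that
    differs from the guide in at most `max_mismatches` positions."""
    hits = []
    for start in range(len(genome) - len(guide) + 1):
        window = genome[start:start + len(guide)]
        mismatches = hamming(window, guide)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits
```

For example, scanning `"ACGTACGAACGT"` for the guide `"ACGT"` with one mismatch allowed reports the exact matches at positions 0 and 8 and a one-mismatch site at position 4.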

Exploring NLP with Movie Scripts

Grant Hugh

This project explores natural language processing on a sizable collection of movie scripts, found at https://osf.io/zytmp/. I'll first try to use a simple text classifier to generate a plot of the sentiment throughout a script, with the ultimate goal of exploring deep learning techniques to potentially generate sentences relating to a specific movie or a specific movie genre. Due to logistical constraints, however, the text classifier aspect alone may be more than enough work for this project.

I'm interested in this project because I built a machine learning program to try to generate classical music last year with a couple of friends using Keras and I'd like to expand on that experience. I will probably be doing this project in Python and I'm hoping to learn how to use native TensorFlow, but if that poses too much of a challenge, I can always go back to using Keras which I'm more familiar with.

Connect Four

Brian Kaether

Connect Four is a simple game in which two players take turns placing circular pieces onto a grid board. Each player has their own color, and the first to successfully place four pieces of their color in a row horizontally, vertically, or diagonally wins. This project will implement the game Connect Four with a human user playing against the computer. The computer player will use recursion to search the existing board and choose the best location to place its piece each turn. Each time either player makes a move, both an internal model and graphical display of the board will be updated.
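Any search over future board states needs a terminal test that recognizes a finished game. A minimal Python sketch of such a win check follows. It assumes the board is a list of rows in which each cell holds a player identifier or 0 for empty. The function name and board encoding are illustrative only.

```python
def has_winner(board, player, needed=4):
    """True if `player` has `needed` pieces in a line on `board`."""
    rows, cols = len(board), len(board[0])
    # Right, down, and the two diagonals cover every line exactly once.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in directions:
                end_r = r + (needed - 1) * dr
                end_c = c + (needed - 1) * dc
                if not (0 <= end_r < rows and 0 <= end_c < cols):
                    continue  # the line would run off the board
                if all(board[r + k * dr][c + k * dc] == player
                       for k in range(needed)):
                    return True
    return False
```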

IdentifAI: Detecting Image Similarity and Verifying Ownership

Aditya Khandelwal

We live in an increasingly data-driven world. As we become more and more adept at using technology, our real-world identity blends with our virtual presence. While we have built sophisticated systems of interaction that billions of people use each day, we have fallen short in protecting people’s identity and privacy online. Reports from Cambridge Analytica, password hacks, etc. have caused us to take notice of some of the ways our data is at the risk of being misused. Furthermore, as we move from a centralized to a decentralized version of the internet -- one where data is immutable and stored forever, verifying ownership of digital content becomes an even more pressing issue.

IdentifAI is an API that exposes endpoints for fast and robust perceptual hashing functions -- a class of hashing functions that produces snippets of images in order to associate similar images back to the original author or creator.
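To give a feel for perceptual hashing, here is a minimal Python sketch of one well-known variant, the average hash. It is not IdentifAI's algorithm. It assumes the image has already been shrunk to a small greyscale grid (a 2D list of intensities), and both function names are hypothetical.

```python
def average_hash(pixels):
    """Bit tuple with a 1 wherever a pixel is brighter than the mean.

    `pixels` is a 2D list of greyscale values, assumed to be an image
    that was already downsampled to a small fixed size.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small distances suggest similar images."""
    return sum(a != b for a, b in zip(hash_a, hash_b))
```

Uniformly brightening an image leaves its average hash unchanged, which is the kind of robustness that an ordinary cryptographic hash does not offer.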

Getting around Seoul

Andrew Lee

This project aims to provide users with a simplified transportation map of Seoul. The project's main focus is on finding the quickest and the cheapest way to get from point A to point B using Seoul's public transportation (subway and bus).

Taskaway

Warren Mercer

Organizing time to study and work on assignments can be a challenge as new events and responsibilities come up. This project is an automated task scheduling calendar that moves flexible events around fixed events according to user-set parameters as new events and tasks are added. It will be implemented using Stanford’s C++ libraries. Recursive backtracking will compute the optimal scheduling of flexible events around fixed events. A parent event class will be used to create schedule pointer arrays of the two event subclasses. A simple week view GUI will display the calendar and make modifications through a pop-up window opened by clicking on a particular day.

Chord Progression Generator

Cameron Most

This program aims to generate customized chord progressions according to user preferences. Four bar progressions will be formed using a model of a decision tree with nested maps. The user will rate successive chord progressions, and the rating will adjust the likelihood of selection for each chord root and quality shift.

Utilizing Decision Trees to Predict Outcomes of ATP Tennis Matches

Pujan Patel

This project implements a tennis match outcome predictor that essentially determines the likelihood that a given ATP player will beat their opponent. The algorithm will take in two ATP tennis players and utilize a decision tree to make a prediction. The project will be utilizing a large raw data set that includes the tennis players, rankings, court surface, and set scores for each match played since 2000. The decision tree algorithm will look into previous outcomes between the same players, past outcomes against similar opponents, and variability of outcome depending on court surface to predict the outcome of a given match.

DJ Mixer

Lucas Pauker

My project is to make an automatic DJ mixer. My program will take two mp3 files, and return (and play) an mp3 file with the songs mixed together. I will accomplish this by matching the bpm and pitch of the songs, then splicing them in certain ways that sound good.

Less Than 20 Questions

Max Pike

Many of us grew up with the handheld game "20 Questions," in which the user thinks of any object or thing and the computer seemingly magically works out what it is. It accomplishes this by asking the user up to 20 yes-or-no questions that get more and more specific until the computer reaches the answer. "Less Than 20 Questions" is a program that recreates the essence of the game by determining what word the user is thinking of within a subset of possibilities. The program will read in a file containing all possible solutions and associated links (questions), which it will process and use for its decision-making process.
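The decision process can be pictured as a walk down a binary tree of questions. The Python sketch below hard-codes a tiny tree instead of reading it from a file. The `Node` class, the `guess` function, and the sample questions are all invented for illustration.

```python
class Node:
    """A question with yes/no subtrees, or an answer when both are None."""

    def __init__(self, text, yes=None, no=None):
        self.text = text
        self.yes = yes
        self.no = no

    def is_answer(self):
        return self.yes is None and self.no is None

def guess(root, answer_fn):
    """Walk the tree, calling answer_fn(question) -> bool at each step.

    Returns (guessed_word, number_of_questions_asked).
    """
    node, asked = root, 0
    while not node.is_answer():
        node = node.yes if answer_fn(node.text) else node.no
        asked += 1
    return node.text, asked

# A tiny hand-built tree (hypothetical data).
tree = Node("Is it alive?",
            yes=Node("Does it bark?", yes=Node("dog"), no=Node("cat")),
            no=Node("rock"))
```

In an interactive program, `answer_fn` would prompt the user. In a test it can be a simple dictionary lookup.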

A Random Forest Stock Picking Machine

Aman Sawhney

The purpose of this project is for me to explore machine learning using the skills I learned in CS106X. I will be building a random forest algorithm using a decision tree model. The model will rely on the skills in C++ I learned in CS106X such as implementing and creating classes, trees, memory management, and recursion. It will be trained on data from the 1970s and used on stock data of that time and then trained on modern data and used to pick modern stocks. The purpose of using the different times is to demonstrate the efficient market hypothesis, and further to demonstrate that this trading strategy has been integrated in modern algorithmic trading.

3D Model Viewer

Akram Sbaih

My plan is to make the most basic 3D viewer possible using what I learned in both CS106X and Math51. This viewer would take files in a specific format that describes a 3D model as a collection of points connected into triangles in R3 space. The program views these models on a 2D screen from a specific point of view. The user can change the rotation of the model around the three axes and can zoom in and out of it. The viewer will take care of changing the shade (grey) of each triangle according to its orientation relative to a static light source.

This program is expected to use custom data structures that simulate matrices and vectors (as matrices) including the necessary operations on them required for the program. The code will use an already available 2D graphing library.
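Two of the operations described above, rotating the model and shading a triangle by its orientation to the light, can be sketched in Python as follows. The sketch uses plain tuples instead of a custom matrix class and assumes simple Lambertian (cosine) shading. The function names are hypothetical, and projection to the 2D screen is not shown.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point about the y-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def unit_normal(a, b, c):
    """Unit normal of triangle abc from the cross product (b-a) x (c-a).

    Assumes the triangle is not degenerate (its area is non-zero).
    """
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def grey_level(triangle, to_light):
    """Shade in [0, 1]; `to_light` is a unit vector toward the light."""
    n = unit_normal(*triangle)
    return max(0.0, sum(n[i] * to_light[i] for i in range(3)))
```

A triangle facing the light directly gets a grey level of 1, and the level falls to 0 as the triangle turns edge-on or away from the light.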

Logical HDL Interpreter

Coleman Smith

Logical hardware simulation often uses visual means for design and testing. This project aims to allow for more precise and inspectable design of logical circuits through the implementation of a Hardware Description Language. The interpreter will allow for the import of other circuit designs, a unit-test framework, and real-time interaction with the circuits. Together, these traits will allow users to design circuits from full adders to CPUs and interact with their creations.
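The kind of composition such a language describes, building larger circuits from smaller ones, can be illustrated in Python with plain functions standing in for circuit definitions. This is not the project's HDL syntax. It assumes every gate is derived from a single NAND primitive, and all names are illustrative.

```python
def nand(a, b):
    """The only primitive gate; inputs and outputs are 0 or 1."""
    return 1 - (a & b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor(a, b):
    return and_(or_(a, b), nand(a, b))

def full_adder(a, b, carry_in):
    """Return (sum_bit, carry_out) for three input bits."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out
```

A unit test for the adder can check every input combination against ordinary integer addition.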

Knowledge Tree

Michael Sun

This project focuses on creating a recursive tree-like data structure to retrieve all prerequisite concepts needed to understand a high-level concept.
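A minimal Python sketch of such a retrieval follows. It assumes the prerequisite relation is stored as a dictionary from each concept to its direct prerequisites and that the relation has no cycles. The function name and the sample data in the usage note are hypothetical.

```python
def all_prerequisites(concept, prerequisites):
    """Return every direct and indirect prerequisite of `concept`,
    ordered so that each concept appears after its own prerequisites."""
    ordered = []
    visited = set()

    def visit(name):
        for prereq in prerequisites.get(name, []):
            if prereq not in visited:
                visited.add(prereq)
                visit(prereq)           # collect deeper prerequisites first
                ordered.append(prereq)

    visit(concept)
    return ordered
```

Given a map in which calculus requires algebra and limits, limits requires algebra and functions, and algebra requires arithmetic, the call for calculus returns the concepts in a learnable order starting from arithmetic.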

Basic 4-Year Plan Generator

Kaili Wang

The 4-Year Plan generator uses information from explorecourses.stanford.edu and user inputs to satisfy constraints such as number of units, general requirements for graduation, pre-requisites, valid number of units for each quarter, etc. It will use elements of recursive backtracking as well as custom data types to represent classes, quarters, and years.

Conscription: A Chess-Inspired Game

Anthony Weng

Conscription is an original, chess-inspired game designed to extend the strategic space of traditional chess. Just like chess, the objective of the game is to checkmate your opponent’s king. However, players will also have the opportunity to exercise pre-gameplay strategy during the game’s unique conscription and placement phases. Before the game begins, each player will be assigned a number of points to conscript pieces with as determined by the size of the human-selected board (4x4 through 6x6). Following the conscription phase, players will take turns placing their custom armies, piece by piece, on the chess board, before traditional gameplay commences. This project will implement Conscription as a human-computer game, with the computer player utilizing recursive backtracking and alpha-beta pruning to efficiently calculate its most effective move during the piece placement and gameplay phases. Conscription’s graphical display will capture the board state before and after human and computer moves.

Procedural 2D Tile Map Generation

Michael Wood

Procedural tile map generation is a strategy often employed in 2D roguelike games to randomly generate a series of rooms and connecting hallways. In this implementation, each generated map is a "floor" making up a larger "dungeon". The next floor is always accessible through a randomly placed staircase somewhere on the current floor. When generating a floor, rooms are first placed in a grid-like setup with connecting hallways placed after, ensuring that no room is isolated. Water is added last with checking to ensure areas are not cut off by it. This is translated to a grid of integers representing whether a particular tile is a ground, wall, or water tile—this is the floor's generational foundation. The types of tiles surrounding said tile determine which image to place in that location. For example, if a tile is a wall and has three wall tiles above it, a wall tile to its left and right, and ground tiles all below, its image will be a continuous wall "facing" down toward the open area.
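The last step, choosing an image from the surrounding tiles, is often done with a neighbor bitmask. The Python sketch below looks only at the four orthogonal neighbors, which is a simplification of the eight-neighbor example above. The integer tile codes, the bit layout, and the choice to treat off-map cells as walls are all assumptions.

```python
GROUND, WALL, WATER = 0, 1, 2  # assumed integer codes for the tile grid

def neighbor_mask(grid, row, col, kind=WALL):
    """4-bit mask of orthogonal neighbors that are `kind`.

    Bits: up = 1, right = 2, down = 4, left = 8. Cells outside the grid
    count as `kind`, so map edges look like solid wall.
    """
    mask = 0
    offsets = [(-1, 0), (0, 1), (1, 0), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        r, c = row + dr, col + dc
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if not inside or grid[r][c] == kind:
            mask |= 1 << bit
    return mask
```

Each of the 16 possible masks can then index a table of wall images. A mask of 11 (up, right, and left set, down clear) corresponds to a wall facing down toward open ground.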

Creating a Chess-Playing Computer

Eric Xu

Chess has forever been the quintessential strategic board game. My project will not only try to create an appealing, functional digital version of the game, but it will also try to create an effective CPU player. I plan to implement some form of minimax algorithm from the textbook, or maybe even utilize the vast amounts of data from professional chess games available online. Creating artificial tactics for the game of chess is particularly difficult because of the many factors that must be considered for every move, such as positioning, combinations, piece-value differentials, sacrifices, and gambits. There are already many computers out there that are able to look at a specific chess board and translate each potential move into a quantifiable value corresponding to how “good” the move is. I will try to create a similar algorithm, which could not only be used for the CPU player but also to help human players learn strategy.

Hold 'Em Simulator

George Younger

My project simulates a one-player game of Texas Hold 'Em against the computer. It produces the best 5-card hand for both players based on the two cards they have and the five center cards. After the two cards are dealt to each player, it computes a probability of the computer winning and makes a bet for the computer based on that (same for flop, turn, and river). It implements the two players as their own classes, uses Python list comprehensions to efficiently compute the best hand for a player, and uses a map of integers to both suits and values to efficiently represent a deck of cards. You can play games as long as you want provided you don’t run out of money. Finally, the project uses the terminal to simulate all of this instead of a GUI.
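One possible way to pick the best 5-card hand out of seven cards is sketched below in Python. It assumes a card is a `(value, suit)` tuple with values 2-14 (ace high). This is a simplification of the integer map described above, and the ranking scheme and function names are illustrative, not the project's.

```python
from collections import Counter
from itertools import combinations

def rank5(hand):
    """Comparable rank for a 5-card hand: (category, tie-break values)."""
    values = [value for value, _ in hand]
    counts = Counter(values)
    # Values ordered by how often they occur, then by size, descending.
    ordered = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({suit for _, suit in hand}) == 1
    unique = sorted(set(values), reverse=True)
    straight = len(unique) == 5 and unique[0] - unique[4] == 4
    if unique == [14, 5, 4, 3, 2]:  # ace-low straight (the "wheel")
        straight, ordered = True, [5, 4, 3, 2, 1]
    if straight and flush:
        category = 8
    elif shape == [4, 1]:
        category = 7
    elif shape == [3, 2]:
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == [3, 1, 1]:
        category = 3
    elif shape == [2, 2, 1]:
        category = 2
    elif shape == [2, 1, 1, 1]:
        category = 1
    else:
        category = 0
    return (category, ordered)

def best_hand(seven_cards):
    """Best 5-card hand among all 21 five-card subsets of seven cards."""
    return max(combinations(seven_cards, 5), key=rank5)
```

Comparing `rank5(best_hand(...))` for the two players decides the winner, with equal tuples meaning a split pot.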

Earthquake Fingerprint Extraction and Clustering of the Stanford Fiber Optics Observatory Recordings via Deep learning and K-means

Siyuan Yuan

I am planning to train a deep neural network autoencoder to extract the main features (fingerprints) of earthquake recordings made by the fiber optic cable laid under the Stanford campus. The recordings can be viewed as greyscale images, with one dimension corresponding to time and the other to space. With the autoencoder extracting the main features from our earthquake recordings, the major coherent waveforms can be expected to be recovered with random noise filtered out. To make better use of computing resources when training the neural net, I used a queue data structure to store the seismic data waiting to be processed. Multiple CPU threads enqueue the data, and the GPU dequeues it and does the computation. Then, I apply the K-means algorithm to cluster the neural net embeddings and investigate the potential relationship between geographic locations and earthquake waveforms.

Five in a Row

Kaylee Zhang

Five in a Row (FIR), or Gomoku, is a logic-based board game popular in China that relies on planning several steps ahead to emerge victorious. Two players alternate placing stones of their color on a board of grid intersections until a player wins by having five stones of their color in a row horizontally, vertically, or diagonally. This project implements FIR as a human-computer interaction game, where each of the computer’s moves is optimized so as to challenge its human opponent. The computer player utilizes recursive backtracking to compute its optimal moves, along with a probabilistic move selection system that limits the breadth and depth of the search tree, which would otherwise be impractically large due to the game's expansive search space. The program’s graphical component captures the user’s moves as clicks and communicates with the model to update the game's state accordingly.

CS106X: Programming Abstractions in C++

Project Titles and Abstracts