Past Final Projects

Fall 2016

  • Model-Free Reinforcement Learning of Blackjack
  • Partially Observable Actions in Solving Markov Decision Processes. The Case for Insulin Dosing Optimization in Diabetic Patients.
  • Using Monte-Carlo Tree Search to Solve the Board Game Hive
  • Blackjack: How to use MDPs to (nearly) beat the house
  • Cancer Metabolism Mapping: Bayesian Networks and Network Learning Techniques to Understand Cancer Metabolic and Regulatory Pathways
  • Gibbs Sampling in BayesNets.jl
  • UAV Collision Avoidance Policy Optimization with Deep Reinforcement Learning
  • Improving Training Efficiency in Deep Q-Learning for Atari Breakout
  • Monitoring Machine Workload Distribution with Kalman Filter
  • Approximating Transition Functions to Cart Track MDPs via Sub-State Sampling
  • Approaching Quantitative Trading with Machine Learning
  • Structure and Parameter Learning in Bayesian Networks with Applications to Predicting Breast Cancer Tumor Malignancy in a Lower Dimension Feature Space
  • Autonomous Racing by Learning from Professionals
  • Bravo Zulu: Optimizing Teammate Selection for Military and Civilian Applications
  • Investigating Transfer Learning in Deep Reinforcement Learning Context
  • Simultaneous Estimation and Control with MCTS
  • Controlling Soft Robots with POMCP
  • Automatic Learning of Computer Users’ Habits
  • Learning to Play Soccer in the OpenAI Gym
  • Playing Ultimate Tic-Tac-Toe with TD Learning and Monte Carlo Tree Search
  • A Bayesian Network Model of Pilot Response to TCAS Resolution Advisories
  • Improving Head Impact Kinematics Measurement Accuracy using Sensor Fusion
  • Drive Decision Making at Intersections
  • Deterministic and Bayesian Techniques for Spaceborne Vision-Based Non-Stellar Object Detection
  • A Two-Phased Deep Reinforcement Learning Algorithm for High-Frequency Trading
  • Implementation and Experimentation of a DQN solver in Julia for POMDPs.jl
  • Landing on the Moon
  • Deserted Island: Cooperative Behavior in Absence of Explicit Delayed Reward
  • DeepGo.py
  • Managing Groundwater under Uncertain Seasonal Recharge
  • Using Reinforcement Learning to Find Flaws in Collision Avoidance Systems
  • Effectiveness of Bayesian Networks in Building a Prediction Model for Movie Success
  • Data Driven Agent based on Aircraft Intent
  • Deep Q-Learning with Natural Gradients
  • A Shot in the Dark: Beating Battleship with POMCP
  • Accelerated Asynchronous Deep Reinforcement Learning Variant of Advantage Actor-Critic
  • Applying Reinforcement Learning and Online Methods on the Inverted Pendulum Problem
  • Predicting Sentiment with Deep Q-Learning
  • A Lookahead Strategy for Super-Level Set Estimation using Gaussian Processes
  • Modeling Breast Cancer Treatment as a Markov Decision Process
  • Learning 31 using Cross-Entropy Methods
  • Improving Haptic Guidance using Reinforcement Learning
  • NLPLab: Actor-Critic Training in Natural Language Processing
  • Deep Reinforcement Learning on Atari Breakout
  • Reinforcement Learning for LunarLander
  • Reinforcement Learning for AI Machine Playing Hearthstone
  • Using Deep Q-Learning to Automate CNN Training
  • Automatic Continuous Variable Encoder in Bayesian Network
  • Side Channel Analysis using Neural Networks and Random Forests
  • A Decision-Making System for Wildfire Management
  • Decentralized Game Theoretic Methods for the Distributed Graph Coverage Problem
  • Autonomous altitude control for high altitude balloons
  • Neural Network Arbitration for Better Time and Accuracy trade-offs
  • Deep Deterministic Policy Gradient with Robot Soccer
  • Towards a Personal Decision Support System
  • Optimal Gerrymandering under Uncertainty
  • The Ambulance Dilemma: Crossing an Intersection with Monte Carlo Tree Search
  • DeepDominion
  • Developmental Policy Design: an MDP approach
  • Training of a craps betting strategy with Reinforcement Learning Techniques
  • Engineering a Better Monkey
  • Decision Making During a Bicycle Race
  • Using Discrete Pressure Measurements to Understand Subsonic Bluff-Body Dynamic Damping
  • Effective Move Selection in Chess Without Lookahead Search
  • Solving Texas Hold’em with Monte-Carlo Planning
  • Reinforcement Learning of High-Speed Autonomous Driving through Unknown Map
  • Implementation and deployment of particle filter for simulated and real-world localization tasks
  • Tree Augmented Naive Bayes and Backward Simulation
  • Transfer of Q values across tasks in Reinforcement Learning
  • Training Regime Modifications for Deep Q-Network Learning Acceleration
  • Reinforce Optimizer
  • Approximating Ligand Docking Using a Markov Decision Process
  • Breaking Down Social Media Filter Bubbles via Reinforcement Learning
  • Performing an N-Sentiment Classification Task on Tinder Profiles Based On Image Feature Extraction
  • Play Blackjack With Monte Carlo Simulation And Q-learning with Linear Regression
  • Observer-Actor Neural Networks for Self-Play in Imperfect Information Games
  • Using Hybrid Bayes Nets to Model Country Prosperity
  • Solving a Pandemic! Various Approaches for Tackling the Board Game
  • Improved Markov Decision Process Model for Resource Allocation in Disaster Scenarios
  • Learning Chess through Reinforcement Learning
  • Deep Reinforcement Learning For Continuous Control: An Investigation of Techniques and Tricks
  • Computer Vision Through Perception: Semantic Understanding of Novel Scenes through Data Programming
  • Path Planning for Insertion of a Bevel-Tip Needle
  • Modeling human biases through reinforcement learning
  • Bootstrapping Neural Network with Auxiliary Tasks
  • Q-Learning Application in Optimizing Pokémon Battle Strategy
  • Model-based exploration in natural language generation
  • Automated Aircraft Touchdown
  • Longitudinal Vehicle Control using a Markov Decision Process and Deep Neural Network
  • MOMDP-based Aerial Target Search Optimization
  • Greedy Thick-Thinning Structure Learning and Bayesian Network Conditional Independence Implementations in BayesNets.jl
  • Multiagent Planning For Aerial Broadband Internet
  • Viral Marketing as an MDP
  • Neural Soccer - Towards Exploration by the Pursuit of Novelty
  • Locally Weighted Value Iteration in Julia
  • Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games
  • Optimal Policy Considerations for Gas Turbine Maintenance
  • Learning Optimal Manipulation of Food Webs
  • Estimating Resource Prospector’s Probability of Failure Using Importance Sampling and Cross Entropy
  • Dynamically Discount Deep Reinforcement Learning
  • Deep Reinforcement Learning: Accelerated Learning with Effective Gradient Ascent Optimization Algorithms
  • Autonomous Human Tracking in Simulated Environment
  • An LQG Library for POMDPs.jl

Fall 2015

  • Mars Hab-Bot: Using MDPs to simulate a robot constructing human-livable habitats on Mars
  • A Value Iteration Study of BlackJack
  • Optimized Store-Stocking via Monte Carlo Tree Search with Stochastic Rewards
  • Trajectory Planning for Map Exploration Using Terrain Features
  • Instruction Following with Deep Reinforcement Learning
  • Using Markov Decision Processes to Minimize Golf Score
  • Reinforcement Learning for Scheduling I/O Requests on Multi-Queue Flash Storage
  • Finding the Perfect ‘Job’ in resource allocation
  • Maximizing Influence in Social Networks
  • A Machine Learning Regression Approach to General Game Playing
  • Modeling GPS Spoofing as a Partially Observable Markov Decision Process
  • Travel Hacking with MDPs
  • Optimal Mission Planning for a Satellite-Based Particle Detector via Online Reinforcement Learning
  • An MDP Approach to Motion Planning for Hopping Rovers on Small Solar System Bodies
  • Sampling Strategies for Deep Reinforcement Learning
  • Descriptive Power of Bayesian Structure Learning in Stock Market
  • Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
  • Simulated Pedestrian-like Navigation with a 1D Kalman Filter with an Accelerometer and the Global Positioning System
  • Search and Track Tradeoff for Multifunction Radars
  • Play Calling in American Football Using Value Iteration
  • Reinforcement learning for commodity trading
  • Learning the Stock Market, a Naive Approach
  • A POMDP Framework for Modelling Robotic Guidance During a Tissue Palpation Task
  • Reinforcement Learning of an Artificially Intelligent Hearts Player
  • Toy Helicopter Control via Deep Reinforcement Learning
  • Gas Refuelling Optimization Modelled as a Markov Decision Process
  • Q-Matrix and Policy Compression via Deep Learning
  • Augmenting Self-Learning In Chess Through Expert Imitation
  • Monte Carlo Tree Search Applied to a Variant of the RockSample Problem
  • Supply Chain Management using POMDPs
  • Online Markov Decision Process Framework for Modeling Efficient Home Robot Cleaners
  • Reinforcement Learning for Path Planning with Soft Robotic Manipulators
  • Exploring POMDPs with Recurrent Neural Networks
  • Tic-tac-toe with reinforcement learning: best strategies and influence of parameters
  • Vehicle Speed Prediction using Long Short-Term Memory Networks
  • Explorations on Learning Bayesian Networks
  • Playing unknown game on a visual world
  • Reinforcement Learning for Atari Games
  • Q-learning in the Game of Mastermind
  • Modeling of a Baseball Inning as MDP
  • Autonomous Driving on a Multi-lane Highway Using POMDPs
  • Solving a Maze Without Location Data
  • Markov Decision Processes and Optimal Policy Determination for Street Parking
  • Solving an opponent-based match-three mobile game
  • Life begins as a POMDP: improving decision making in the IVF clinic
  • Path Planning for Target-Tracking Unmanned Aerial Vehicle
  • Discrete State Filter Implementation for a Battleships Artificial Intelligence
  • POMDP for Search and Rescue with Obstacle Avoidance: Incorporation of Human in the Loop
  • Application performance over cellular networks
  • Solving Dudo: beating Liar’s Dice with a POMDP
  • Reinforcement Learning for Tetris
  • Robot Path Planning using Monte Carlo POMDP
  • Enhancing Computational Efficiency of PILCO Model-based Reinforcement Learning Algorithm
  • Analysis of UCT Exploration Parameter in Sailing Domain Problems
  • Solving a Search and Rescue Planning problem with MOMDPs
  • Robot Motion Planning in Unknown Environments using Monte Carlo Tree Search
  • Delivery optimization of an on-demand delivery service
  • Solving Multi-Agent Decision Making using MDPs
  • Efficient and Modular Inventory Management Framework for Small Businesses
  • Markov Decision Processes in Board Game Playing
  • Automated Model Selection via Gaussian Processes
  • Predictive Hybrid Vehicle Control Policy
  • Optimal Policies for In-Space Satellite Communications
  • Spacecraft Navigation in Cluttered, Dynamic Environments Using 3D Lidar
  • Playing Chess Endgames using Reinforcement Learning
  • Space Debris Removal
  • Relation Extraction from Scratch
  • Lane Merging as a Markov Decision Process
  • Using MDP/POMDP to Help in Search of Survivors of a Plane Crash
  • Applying POMCP to Controlling Partially Observable Diffusion Processes
  • Credit Risk Classification using Bayesian Network

Fall 2014

  • Automating Air Traffic Management for Flight Arrivals
  • Policy Learning for Sokoban
  • Flight Path Optimization Under Constraints Using a Markov Decision Process Approach
  • Visual Localization and POMDP for Autonomous Indoor Navigation
  • Monte Carlo Tree Search for Online Learning in Golf Course Management
  • Pushing on Leaves
  • Beating 2048
  • Improved electrical grid balancing with demand response scheduled by an MDP
  • Multi-Fidelity Model Management in Engineering Design Optimization Using Partially Observable Markov Decision Processes
  • Smarter Generators in Power Markets
  • Beach Paddle Ball
  • Applying POMDP to RockSample problem
  • Targeting Hostile Vehicle Modeled as a Partially Observable Markov Decision Process with State-Dependent Observation Model
  • Reinforcement Learning and Linear Gaussian Dynamics Applied to Multifidelity Optimization of a Supersonic Wedge
  • Approximate POMDP Solutions for Short-Range UAV Traffic Conflict Resolution
  • WorkSmart: The Implementation of a Modified Q-Learning Algorithm for an Intelligent Daily To-Do List Android Application
  • Imminent Obstacle Avoidance with Friction Uncertainty
  • Dynamic Restrictions during Commercial Space Vehicle Launches
  • Autonomous Direct Marketing with Deep Q-Learning
  • Efficient Risk Estimation for Chance-Constrained Robotic Motion Planning Under Uncertainty
  • Probabilistic Aircraft Arrival Rate Prediction
  • Audio Keylogging: Translating Acoustic Signals into Keystrokes
  • Collision Avoidance for Small Multi-Rotor Aircraft using SARSA(\(\lambda\)) and Fourier Basis Functions
  • Reinforcement Learning with Tetris
  • Stock Market Reinforcement Learning
  • Obstacle Avoidance for Automated Vehicle using Markov Decision Processes
  • Control of Epidemics on a Graph
  • Autonomous ATC for non-towered airports
  • Path Planning for Terrain Relative Navigation using POMDPs
  • Vehicle Braking Controller in a Markov Decision Process Framework
  • Multi-Armed Bandit Heuristics for HTTP Denial-of-Service Attacks
  • Structure Learning for Probabilistic Driving Models
  • Casino Blackjack Modeled as a Markov Decision Process
  • Competitive Collision Avoidance
  • Efficient Sampling Of Protein Landscapes Via Markov Decision Processes
  • Flight Deck Interval Management (An MDP Approach)
  • BGT Model for Analysis of Head-On Collisions
  • Collision Avoidance System Parameter Optimization
  • Dynamic Demand Prediction and Routing for Autonomous Mobility-on-Demand Systems
  • Action-Constrained, Multi-Species Task Scheduling: The Kayaker Problem
  • Reinforcement Learning with Low-rank Matrix Factorization
  • Automated Sequencing and Spacing of Arrival Aircraft in Final Vector Approach Airspace
  • Exploring Policy Learning for Blackjack