MS&E351 Tentative Syllabus
Markov decision processes
Total cost
Discounted cost
Average cost
Dynamic programming algorithms
Value iteration
Parallel/asynchronous variants
Policy iteration
Linear programming
Problem-specific ideas
Linear systems with quadratic cost
Inventory control
Portfolio management
Interchange argument
Queueing systems
Multi-armed bandits
Imperfect state information
Reduction to the basic problem
Sufficient statistics
Separation principle
POMDPs
Further directions
Approximate dynamic programming
Reinforcement learning
