Syllabus

Updated 2023062106

If you have any questions after reading this Syllabus, post on our discussion forum, or email us at our mailing list: cs109 @ cs.stanford.edu.

Teaching Team

Lecturer: Yunsung Kim
yunsung @
Gates 100
OH: M 10:30a to 12:30p

Lecturer: Will Song
jsong5 @
240-201
OH: Th 3:00p to 5:00p

CA: Kathleen Cheng
kcheng18 @

CA: Tori Qiu
toriqiu @


I. Course Overview

While the initial foundations of computer science began in the world of discrete mathematics (after all, modern computers are digital in nature), recent years have seen a surge in the use of probability as a tool for the analysis and development of new algorithms and systems. As a result, it is becoming increasingly important for budding computer scientists to understand probability theory, both to provide new perspectives on existing ideas and to help further advance the field in new ways.

CS109: Probability for Computer Scientists starts by providing a fundamental grounding in combinatorics, and then quickly moves into the basics of probability theory. We will then cover many essential concepts in probability theory, including particular probability distributions, properties of probabilities, and mathematical tools for analyzing probabilities. Finally, the last third of the class will focus on data analysis and machine learning as a means for seeing direct applications of probability in this exciting and quickly growing subfield of computer science. This is going to be a great quarter and we are looking forward to the chance to teach you.

Learning Goals

Our goal in CS109 is to build foundational skills and give you experience in the following areas:

  1. Understanding the combinatorial nature of problems: Many real problems are based on understanding the multitude of possible outcomes that may occur, and determining which of those outcomes satisfy some criteria we care about. Such understanding is important both for determining how likely an outcome is and for understanding what factors may affect the outcome (and which of those may be in our control).
  2. Working knowledge of probability theory: Having a solid knowledge of probability theory is essential for computer scientists today. Such knowledge includes theoretical fundamentals as well as an appreciation for how that theory can be successfully applied in practice. We hope to impart both these concepts in this class.
  3. Appreciation for probabilistic statements: In the world around us, probabilistic statements are often made, but are easily misunderstood. For example, when a candidate in an election is said to have a 53% likelihood of winning, does this mean that the candidate is likely to get 53% of the vote, or that if 100 elections were held today, the candidate would win 53% of them? Understanding the difference between these statements requires an understanding of the model in the underlying probabilistic analysis.
  4. Applications: We are not studying probability theory simply for the joy of drawing summation symbols (okay, maybe some people are, but that's not what we're really targeting in this class), but rather because there are a wide variety of applications where probability allows us to solve problems that might otherwise be out of reach (or would be solved more poorly without the tools that probability can bring to bear). We'll look at examples of such applications throughout the class.
  5. An introduction to machine learning: Machine learning is a quickly growing subfield of artificial intelligence which has grown to impact many applications in computing. It focuses on analyzing large quantities of data to build models that can then be harnessed in real problems, such as filtering email, improving web search, understanding computer system performance, predicting financial markets, or analyzing DNA.
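The election example in goal 3 can be made concrete with a small sketch. The model below is purely hypothetical (a normally distributed vote share with an assumed mean and standard deviation; none of these numbers come from a real forecast). It shows that a win probability of roughly 53% can correspond to an expected vote share barely above 50%.

```python
from statistics import NormalDist

# Hypothetical forecast model (an assumption for illustration only):
# the candidate's final vote share is normally distributed.
expected_vote_share = 0.5015  # assumed mean vote share
vote_share_std = 0.02         # assumed uncertainty in the vote share
forecast = NormalDist(mu=expected_vote_share, sigma=vote_share_std)

# The candidate wins when their vote share exceeds 50%.
win_probability = 1.0 - forecast.cdf(0.5)

print(f"Expected vote share: {expected_vote_share:.2%}")
print(f"Probability of winning: {win_probability:.1%}")
```

Under these assumptions the two quantities are clearly different: the expected vote share is about 50%, while the probability of winning is about 53%.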

Course Topics

Here are the broad strokes of the course (in approximate order). More information is available on our Schedule page. We cover a very broad set of topics so that you are equipped with the probability and statistics you will see in your future CS studies!

  • Counting and probability fundamentals
  • Single-dimensional random variables
  • Probabilistic models
  • Uncertainty theory
  • Parameter estimation
  • Introduction to machine learning

Prerequisites

The prerequisites for this course are CS103, CS106B or X, and Math 51 (or equivalent courses). Probability involves a fair bit of mathematics (set theory, calculus, and familiarity with linear algebra), and we'll be considering several applications of probability in CS that require familiarity with algorithms and data structures covered in CS106B/X. Here is a quick rundown of some of the mathematical tools from CS103 and Math 51 that we'll be using in this class: multivariate calculus (integration and differentiation), linear algebra (basic operations on vectors and matrices), an understanding of the basics of set theory (subsets, complements, unions, intersections, cardinality, etc.), and familiarity with basic proof techniques. We'll also do combinatorics in the class, but we'll be covering a fair bit of that material ourselves in the first week. Past students have managed to take CS106B concurrently with CS109 and have done just fine. CS103 is the prerequisite that we rely on the least; students have done well even without having taken it.

II. Course Structure

Units

If you are an undergraduate, you are required to take CS109 for 5 units of credit (this is by department and university policy, no exceptions). If you are a graduate student, you may enroll in CS109 for 3 or 4 units if it is necessary for you to reduce your units for administrative reasons. Taking the course for reduced units does not imply any change in the course requirements.

Lectures

Lectures are MWF from 3:00p until 4:15p. We will be holding live lectures in person in Skilling Auditorium. Come to learn the material and to engage in interesting problems collectively with the class. While lecture attendance isn't mandatory, it is correlated with doing well in the course and mastering the material. Come and enjoy; lecture is a good time.

Lecture Recordings

This class is recorded. Lecture videos will be uploaded to Canvas. The recording team does their best to upload lectures promptly, but it can take up to 24 hours for a lecture to be uploaded.

Sections

Active participation plays an important role in making you adept at combining probability and computer science. It has also been observed over many quarters that keeping up with the material highly correlates with improved class performance.

Each week (starting week 2) for 50 minutes you will meet in a small group with one of our outstanding CAs (section leaders) and work through problems. If you have taken any of the CS106 classes, our sections will be very similar, except with more probability. Sign-ups for sections will go out on Thursday, June 29th and will be open until noon Pacific on Sunday, July 2nd. We will let you know which section you are in by Monday, July 3rd, and you will have your first section that week (during Week 2).

Section attendance and participation are optional, but we offer a section participation bonus of up to 5% of the final grade for students who attend all sections. Your final exam score and section participation bonus will together make up 35% of your final grade. For instance, if you qualify for a 5% participation bonus, your final exam will count for 30% and you will earn 5% as a bonus, whereas if you don't qualify for the bonus, your final exam will count for the full 35%. Participating in section therefore reduces how many points you can lose on the final exam.

The amount of participation bonus will be based on how much you engage in section; a student who engages and helps others will receive a greater bonus than one who shows up late and doesn't participate. To receive the section participation bonus, students are allowed two (2) makeup sections (i.e., attending another TA's section in the same week) and one (1) unexcused section absence in the quarter.

Section logistics. SCPD students are automatically enrolled in an online section, and we will communicate access details and Zoom links via email.

Grading

The grade for the course will be determined according to the following breakdown:

Component                      Final grade
Problem Sets                   40%
Midterm                        25%
Final                          30% to 35%
Section Participation Bonus    Up to 5%
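
As a rough illustration of how this breakdown interacts with the section bonus, here is a minimal sketch in Python. The function name and the choice to treat the bonus as flat percentage points added to the total are assumptions made for this example; the course staff's actual grade computation may differ.

```python
def course_grade(psets, midterm, final, section_bonus=0.0):
    """Estimate a course grade out of 100.

    psets, midterm, final: scores as fractions in [0, 1].
    section_bonus: percentage points earned in section, in [0, 5]
    (assumed here to be added directly to the total).
    """
    if not 0.0 <= section_bonus <= 5.0:
        raise ValueError("section_bonus must be between 0 and 5")
    # The final exam and the bonus together always account for 35 points.
    final_weight = 35.0 - section_bonus
    return 40.0 * psets + 25.0 * midterm + final_weight * final + section_bonus


# Same exam and pset scores, with and without the full section bonus.
print(round(course_grade(0.9, 0.8, 0.7, section_bonus=0.0), 1))  # 80.5
print(round(course_grade(0.9, 0.8, 0.7, section_bonus=5.0), 1))  # 82.0
```

In this sketch, the bonus helps most when the final exam score is low, since the bonus points replace part of the final exam's weight.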

Problem Sets

During the course, there will be six problem sets assigned. We put a lot of love into these problems so that they can help train you to become gifted practitioners of probability and computer science. Use them as practice. Doing well on the problem sets is the best way to prepare for life after CS109 (and the exams). Each student is to submit individual work on the problem sets. The problem sets will often include coding tasks, which will be primarily in Python. We also strongly encourage you to learn LaTeX, a widely used markup language for typesetting math on a computer. We will hold review sessions for both.

After grades are released, you have one week to file a regrade request if you think that points were deducted incorrectly. We reserve the right to regrade your entire pset.

Late Policy

We anticipate that there may be unforeseen circumstances that make it difficult to turn in homework assignments on time. Our philosophy is to treat you as adults. We have a generous late policy to reflect the many different needs that may come up for our diverse set of students. However, the course will end for everyone on the same date. As such, if you are late on one problem set, you will have to work extra hard to catch up. Time management can be hard and we encourage you to give it the full respect it deserves. In practice, falling behind often impacts midterm and final exam scores.

  • Due Date: All problem sets are due at 3:00pm Pacific on the on-time deadline listed on the assignment writeup. Finishing the assignment by the deadline means that you are in sync with the course.
  • Grace Period: All students will be automatically granted a penalty-free "grace period" for submission on all problem sets. The grace period is 24 hours and allows you to submit the assignment after the original deadline, with no impact on the final grade. This grace period is meant to give built-in flexibility for any unexpected snags, such as being 10 minutes late on finishing. The 24-hour period is enforced by a computer clock, so we encourage you not to push it to the limit.
  • Long Extension: CS109 is a fast-paced class, and if you need an extension of this length then you may fall behind on future problem sets (or the midterm / final). Having said that, you might have a medical, personal, or serious time-management situation which requires an "as long as possible" extension. We will give you up until the date when we release solutions. This time is set on a per-pset basis but is generally two class days after the problem set was due. You can grant yourself an extension of this length. If you use more than one of these in the quarter, you will need to contact the TAs (Kathleen and Tori) for special permission. Why must you contact us? Because we care and we want to catch issues early. In the rare occurrence that someone uses more than one of these extensions without getting permission, it will impact their course grade.
  • After the Hard Deadline: We do not accept work after the solutions have been posted and TAs start grading. As such, you should make sure you don't accidentally miss this very hard deadline! Yet there may be a real crisis that means you are not able to do your work before the solutions are released (e.g., an illness that lasts a week, funeral attendance, etc.). First, we hope you are well. Personal life is truly important and we respect you doing what you need to do. In such an extreme case, you need to contact the course staff (Will, Yunsung, Kathleen, and Tori) as well as your Undergraduate Advising Director, and we will work something out. Please do contact us as early as possible.
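
The tiers above can be summarized in a short sketch. The function name, the tier labels, the inclusive boundaries, and the example dates are all hypothetical choices for illustration; the actual deadlines are the ones on each assignment writeup and the actual enforcement is up to the course staff.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(hours=24)  # penalty-free window after the due date


def classify_submission(submitted, due, solutions_released):
    """Return which late-policy tier a submission time falls into.

    solutions_released is the per-pset hard deadline (assumed to be
    later than the end of the grace period).
    """
    if submitted <= due:
        return "on time"
    if submitted <= due + GRACE_PERIOD:
        return "grace period"
    if submitted <= solutions_released:
        return "long extension"
    return "not accepted"


# Hypothetical dates: due Friday at 3:00pm, solutions released the
# following Wednesday at 3:00pm.
due = datetime(2023, 7, 7, 15, 0)
solutions_released = datetime(2023, 7, 12, 15, 0)

print(classify_submission(datetime(2023, 7, 7, 15, 10), due, solutions_released))
print(classify_submission(datetime(2023, 7, 10, 9, 0), due, solutions_released))
```

With these example dates, the first submission is 10 minutes late and lands in the grace period; the second lands in the long-extension tier.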

Exams

In addition to the assignments, there will be a midterm and a final:

  • Midterm: Tuesday, July 25th, 7:00pm to 9:00pm
  • Final: Saturday, August 19th, 3:30pm to 6:30pm

The midterm is a 2-hour exam and the final is a 3-hour exam. A week before the midterm, we will give more information on what you should do if you have an exceptional circumstance and cannot make the regularly scheduled midterm exam.

For a variety of reasons (including university policy), there will be no alternate time for the final exam. Please make sure that you can attend the final exam at the specified time before enrolling in the class. Typically, you will not be able to make the final if you are taking another class at the same time as CS109. If you cannot make the final exam, you should plan on taking CS109 in a different quarter. If you can't take the exam, we will offer you an incomplete and you can take the final with a future offering of CS109.

SCPD Logistics

If you are an SCPD student, this course will be almost identical to the experience of in-person students. The only important difference is that you will need to arrange a proctor to take the Midterm and the Final remotely.

III. Course Resources

Brand New Course Reader

Two years ago, Professor Chris Piech (who teaches CS109 in regular quarters) thought to himself, "you know what CS109 students would probably love? A course reader!" So Chris wrote, and coded, and created a CS109 Course Reader. He is going to keep updating it as we go. Please let Chris know if you are curious about something or if you find a typo. Please don't expect anything perfect. It has gone through a few quarters of iterations, but you might still find some small bugs. Please check out the chapters and the demos: Course Reader. This is also a collectively owned document. Want to add to it? Or fix a typo? You can contribute through the Course Reader GitHub Project.

Optional Textbook

Sheldon Ross, A First Course in Probability (10th Ed.), Pearson Prentice Hall, 2018.

This is an optional textbook, meaning that the text is not required material, but students may find Ross offers a different and useful perspective on the important concepts of the class. Suggested, optional reading assignments from the textbook (10th Ed.) are in the schedule on the course website. The 8th, 9th, and 10th editions of the textbook are all fine for this class.

Borrowing the textbook online: HathiTrust, a library archive of which Stanford is a member, has granted the university online access to the 8th edition (2010) for the duration of the Fall quarter. The "check out" system works similarly to print reserves: A user can check out the book an hour at a time as long as they are actively using it. Access guidelines are on the HathiTrust How To Use It webpage. Once you're logged in, the book is at this link.

All students should retain receipts for books and other course-related expenses, as these may be qualified educational expenses for tax purposes. If you are an undergraduate receiving financial aid, you may be eligible for additional financial aid for required books and course materials if these expenses exceed the aid amount in your award letter. For more information, review your award letter or visit the Student Budget website.

"Working" Office Hours

To help you be more successful in this class, the course staff will hold "working" office hours. The idea is to encourage you to work on your problem sets at these office hours, so you can immediately ask any questions that come up while working on your problem sets. You are certainly not required to attend any of these working office hours; they are simply meant to encourage you to interact with the course staff more often in order to help you better understand the course material. Besides, our job is to help everyone learn the material for this class, and being more accessible to you when you are actually working on your assignments (rather than when you just have a problem) will help the course go more smoothly for you (and it'll be more fun for us).

More information on office hours will be released in the first week of class on this page.

Accommodations

Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty. For students who have disabilities that don't typically change appreciably over time, the letter from the OAE will be for the entire academic year; other letters will be for the current quarter only. Students should contact the OAE as soon as possible since timely notice (for example, at least a week before an exam) is needed to coordinate accommodations. Students should also send their accommodation letter to instructors as soon as possible. If you require additional, or different, accommodations specific to the Summer 2023 learning environment, please contact your disability adviser directly.

IV. Honor Code

Each student is expected to do their own work on the problem sets and exams in CS109. Students may discuss problem sets with each other as well as the course staff. Any discussion of problem set questions with others should be noted on a student's final write-up of the problem set answers. Each student must turn in their own write-up of the problem set solutions. Excessive collaboration (i.e., beyond discussing problem set questions) can result in honor code violations. Questions regarding acceptable collaboration should be directed to the class instructor prior to the collaboration.

It is a violation of the honor code to copy problem set or exam question solutions from others, or to copy or derive them from solutions found online or in textbooks, previous instances of this course, or other courses covering the same topics (e.g., STATS 116 or probability courses at other schools). Copying of solutions from students who previously took this or a similar course is also a violation of the honor code. Finally, a good point to keep in mind is that you must be able to explain and/or re-derive anything that you submit.

Please read our full Honor Code Policy, which specifically prohibits you from soliciting or taking solutions from other students or websites like Stack Overflow and Chegg.

Looking Forward to a Great Quarter

Genuinely, teaching CS109 is a profound joy. Thanks for coming to learn with us. We can't wait 🌱.