We'll hold three Python review sessions throughout the quarter to get you up to speed on what you'll need for the problem sets.
If you want more Python practice, you can also check out the Python tutorial notebook (make sure you are logged in with your Stanford account)!
matplotlib
This handout only goes over probability functions for Python. We'll cover these concepts throughout the quarter. For the basics of Python, there are many good online tutorials.
This installation guide was written by CS109 TA Tim Gianitsos in Spring 2020. You have a few options for running Python programs.
Install Python and PyCharm
You will need to install Python and PyCharm (or have a code editor you are comfortable with) on your computer. If you've installed Python as part of CS106A, you're good to go for CS109 and can stop reading here :-)
Use repl.it
Use the free online service repl.it in your browser. This does not require you to install anything on your computer, but it will require an internet connection during development. You will need to make a repl.it account and a GitHub account.
Use your existing installation of Python 3
If you have your own installation of Python 3.7 or higher, you can stop reading here. :-)
Please read the CS106A Installing PyCharm Guide. This will install Python 3.8 and PyCharm, a Python IDE.
Once you've installed PyCharm, watch this short video to install numpy in PyCharm.
Make sure to follow all of the installation instructions. Remember to always open the folder containing the .py files in PyCharm, not the .py files themselves.
Go to https://repl.it/login and click the GitHub icon to log in through GitHub. The GitHub icon looks like this:
Since you set up your account with the student developer pack, you can make private projects.
print('Hello World!')
Hello World!
If you are interested in speeding up your code on Problem Set 3 and beyond:
If we want to generate 10,000 samples of a Binomial, we can use the optional size parameter of np.random.binomial. The size parameter indicates the number of samples we want (the number of trials, $n$, is the first argument). We want to figure out the average value of $X \sim \text{Bin}(n = 10, p = 0.2)$ over 10,000 samples, so we pass 10,000 to the size parameter. We can check the dimensionality of our array using the .shape property. Finally, if we want to take the mean over all 10,000 samples, we can use np.mean.
import numpy as np

samples = np.random.binomial(10, 0.2, size=10000)
samples.shape     # Get array shape: (10000,)
np.mean(samples)  # Mean over array, e.g. 1.9961 (varies from run to run)
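As a rough sanity check on the sampling code above, the sketch below compares the sample mean against the theoretical mean $E[X] = np = 2$, and contrasts the single vectorized call with an equivalent Python loop. The seed value and the variable names are illustrative assumptions, not part of the course materials.

```python
import numpy as np

n, p = 10, 0.2
num_samples = 10_000

np.random.seed(109)  # hypothetical seed, only so repeated runs match

# One vectorized call draws all 10,000 samples at once.
samples = np.random.binomial(n, p, size=num_samples)

# The same idea written as a Python loop: one draw per iteration.
# This produces an array of the same shape but is typically much slower.
loop_samples = np.array([np.random.binomial(n, p) for _ in range(num_samples)])

sample_mean = np.mean(samples)
theoretical_mean = n * p  # E[X] for a Binomial is n * p

print(samples.shape)
print(sample_mean, theoretical_mean)
```

With this many samples, the sample mean should land close to 2, though the exact value depends on the random draws.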
Compute $n!$ as an integer. This example computes $20!$:
import math

print(math.factorial(20))
Compute $\binom{n}{m}$ as a float. This example computes $\binom{10}{5}$:
from scipy import special

print(special.binom(10, 5))
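Since special.binom returns a float, very large values can lose precision. If you need an exact integer, here are a few options, sketched under the assumption that you are running Python 3.8 or newer (math.comb does not exist in 3.7):

```python
import math
from scipy import special

float_value = special.binom(10, 5)             # float, as in the example above
exact_scipy = special.comb(10, 5, exact=True)  # exact Python int via scipy
exact_math = math.comb(10, 5)                  # exact Python int (Python 3.8+)

# The definition n! / (m! (n - m)!) using integer division
from_factorials = math.factorial(10) // (math.factorial(5) * math.factorial(5))

print(float_value, exact_scipy, exact_math, from_factorials)
```

All four agree on the value 252; only the first is a float.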
Make a Binomial Random variable $X$ and compute its probability mass function (PMF) or cumulative distribution function (CDF). We love the scipy stats library because it defines all the functions you would care about for a random variable, including expectation, variance, and even things we haven't talked about in CS109, like entropy. This example declares $X \sim \text{Bin}(n = 10, p = 0.2)$. It calculates $P(X = 3)$ and $P(X \leq 4)$, then a few statistics on $X$. Finally, it generates a few random samples from $X$:
from scipy import stats

X = stats.binom(10, 0.2)  # Declare X to be a binomial random variable
print(X.pmf(3))           # P(X = 3)
print(X.cdf(4))           # P(X <= 4)
print(X.mean())           # E[X]
print(X.var())            # Var(X)
print(X.std())            # Std(X)
print(X.rvs())            # Get a random sample from X
print(X.rvs(10))          # Get 10 random samples from X
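To see what pmf and cdf are doing, one option is to recompute them from the Binomial formula $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. The helper name binom_pmf below is hypothetical and is only there for illustration:

```python
from scipy import special, stats

n, p = 10, 0.2
X = stats.binom(n, p)

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p), straight from the formula."""
    return special.binom(n, k) * p**k * (1 - p) ** (n - k)

pmf_by_hand = binom_pmf(3, n, p)

# The CDF at 4 is the sum of the PMF over k = 0, 1, 2, 3, 4.
cdf_by_hand = sum(binom_pmf(k, n, p) for k in range(5))

print(pmf_by_hand, X.pmf(3))  # both are approximately 0.2013
print(cdf_by_hand, X.cdf(4))
print(X.mean(), n * p)        # E[X] = n * p
```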
From a terminal you can always use the "help" command to see a full list of methods defined on a variable (or for a package):
from scipy import stats

X = stats.binom(10, 0.2)  # Declare X to be a binomial random variable
help(X)                   # List all methods defined for X
Make a Poisson Random variable $Y$. This example declares $Y \sim \text{Poi}(\lambda = 2)$. It then calculates $P(Y = 3)$:
from scipy import stats

Y = stats.poisson(2)  # Declare Y to be a poisson random variable
print(Y.pmf(3))       # P(Y = 3)
print(Y.rvs())        # Get a random sample from Y
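As a quick check, the Poisson PMF $P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!}$ can be computed by hand and compared with scipy. The function name poisson_pmf is a hypothetical helper for this sketch:

```python
import math
from scipy import stats

lam = 2
Y = stats.poisson(lam)

def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poi(lam), straight from the formula."""
    return lam**k * math.exp(-lam) / math.factorial(k)

pmf_by_hand = poisson_pmf(3, lam)

# P(Y >= 1) via the complement: 1 - P(Y = 0) = 1 - e^(-lam)
p_at_least_one = 1 - Y.pmf(0)

print(pmf_by_hand, Y.pmf(3))
print(p_at_least_one)
```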
Make a Geometric Random variable $X$, the number of trials until a success. This example declares $X \sim \text{Geo}(p = 0.75)$:
from scipy import stats

X = stats.geom(0.75)  # Declare X to be a geometric random variable
print(X.pmf(3))       # P(X = 3)
print(X.rvs())        # Get a random sample from X
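Because $X$ counts trials up to and including the first success, its support starts at 1, and $P(X = k) = (1-p)^{k-1} p$. A small sketch comparing that formula with scipy (variable names are illustrative):

```python
from scipy import stats

p = 0.75
X = stats.geom(p)

# P(X = 3): two failures followed by one success, 0.25**2 * 0.75 = 0.046875
pmf_by_hand = (1 - p) ** 2 * p

# P(X > 3): the first three trials all fail
tail_by_hand = (1 - p) ** 3

print(pmf_by_hand, X.pmf(3))
print(tail_by_hand, X.sf(3))  # sf is the survival function, 1 - cdf
print(X.pmf(0))               # 0 is outside the support
print(X.mean(), 1 / p)        # E[X] = 1 / p
```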
Make a Normal Random variable $A$. This example declares $A \sim N(\mu = 3, \sigma^2 = 16)$. It then calculates $f_A(4)$ and $F_A(2)$. Very Important!!! In class, the second parameter to a normal was the variance ($\sigma^2$). In the scipy library, the second parameter is the standard deviation ($\sigma$):
import math
from scipy import stats

A = stats.norm(3, math.sqrt(16))  # Declare A to be a normal random variable
print(A.pdf(4))                   # f(4), the probability density at 4
print(A.cdf(2))                   # F(2), which is also P(A < 2)
print(A.rvs())                    # Get a random sample from A
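One way to confirm that the standard deviation was passed correctly is to check the variance scipy reports and to compare the CDF with the standardized form $F_A(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$. The sketch below also shows an interval probability; the endpoints 2 and 5 are arbitrary choices for illustration:

```python
import math
from scipy import stats

mu, variance = 3, 16
sigma = math.sqrt(variance)  # scipy wants the standard deviation, not the variance

A = stats.norm(mu, sigma)
Z = stats.norm(0, 1)         # standard normal, so Z.cdf plays the role of Phi

cdf_direct = A.cdf(2)
cdf_standardized = Z.cdf((2 - mu) / sigma)

# P(2 < A < 5) = F(5) - F(2)
p_interval = A.cdf(5) - A.cdf(2)

print(A.var())  # should match the variance used in class notation
print(cdf_direct, cdf_standardized)
print(p_interval)
```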
Make an Exponential Random variable $B$. This example declares $B \sim \text{Exp}(\lambda = 4)$:
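from scipy import stats

# scipy parameterizes the exponential by scale = 1 / lambda.
# The first positional argument is loc (a shift), not the rate.
B = stats.expon(scale=1/4)  # Declare B to be an exponential random variable
print(B.pdf(1))             # f(1), the probability density at 1
print(B.cdf(2))             # F(2) which is also P(B < 2)
print(B.rvs())              # Get a random sample from B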
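The parameterization is easy to get wrong: scipy's expon takes loc and scale, where scale is $1/\lambda$, so a bare first argument is read as a shift rather than a rate. The sketch below checks the scale form against the formulas $f(x) = \lambda e^{-\lambda x}$ and $F(x) = 1 - e^{-\lambda x}$, and shows what a bare positional 4 actually produces. The variable names are illustrative:

```python
import math
from scipy import stats

lam = 4
B = stats.expon(scale=1 / lam)  # Exp(lambda = 4) in class notation

pdf_by_hand = lam * math.exp(-lam * 1)  # f(1)
cdf_by_hand = 1 - math.exp(-lam * 2)    # F(2)

print(B.pdf(1), pdf_by_hand)
print(B.cdf(2), cdf_by_hand)
print(B.mean(), 1 / lam)                # E[B] = 1 / lambda

# A positional 4 is interpreted as loc=4: a rate-1 exponential shifted to start at 4.
shifted = stats.expon(4)
print(shifted.pdf(1))   # 0.0, since 1 is below the shifted support
print(shifted.mean())   # loc + scale = 5.0
```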
Make a Beta Random variable $X$. This example declares $X \sim \text{Beta}(\alpha = 1, \beta = 3)$:
from scipy import stats

X = stats.beta(1, 3)  # Declare X to be a beta random variable
print(X.pdf(0.5))     # f(0.5), the probability density at 0.5
print(X.cdf(0.7))     # F(0.7) which is also P(X < 0.7)
print(X.rvs())        # Get a random sample from X
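For this particular choice of parameters the Beta density has a simple closed form, $f(x) = 3(1 - x)^2$ on $[0, 1]$, so $F(x) = 1 - (1 - x)^3$ and $E[X] = \frac{\alpha}{\alpha + \beta}$. A short sketch comparing those with scipy (the name beta_param is only used to avoid shadowing scipy's beta):

```python
from scipy import stats

alpha, beta_param = 1, 3
X = stats.beta(alpha, beta_param)

pdf_by_hand = 3 * (1 - 0.5) ** 2             # f(0.5) = 0.75
cdf_by_hand = 1 - (1 - 0.7) ** 3             # F(0.7), about 0.973
mean_by_hand = alpha / (alpha + beta_param)  # E[X] = 0.25

print(pdf_by_hand, X.pdf(0.5))
print(cdf_by_hand, X.cdf(0.7))
print(mean_by_hand, X.mean())
```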