{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lecture 22: Estimation\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 2;\" >\n",
    "\n",
    "**Concepts and Learning Goals:**\n",
    "\n",
    "\n",
    "- Estimating an unknown quantity by sampling\n",
    "- Standard deviation of an estimate\n",
    "- Confidence intervals\n",
    "\n",
    "</div>\n",
    "\n",
    "<div style=\"flex: 1;\" >\n",
    "\n",
    "\n",
    "</div>\n",
    "</div>\n",
    "\n",
    "# Recap\n",
    "\n",
    "## Hypothesis testing\n",
    "\n",
    "\n",
    "\n",
    "- A **p-value** is the probability of finding a result *at least* as extreme/surprising, if outcomes happened by random chance alone.\n",
    "- The **null hypothesis** corresponds to \"no effect.\"\n",
    "- The **alternative hypothesis** corresponds to \"an effect.\"\n",
    "- Small p-value \u2192 evidence against the null hypothesis.\n",
    "\n",
    "## Experiments\n",
    "\n",
    "- The best way to determine causality is to run a **randomized experiment**.\n",
    "- This is the gold-standard for inferring a causal relationship between a treatment and an outcome.\n",
    "- Hypothesis testing (potential outcomes, permutation tests) can be used to analyze a randomized experiment.\n",
    "- In **observational studies** the treatment and control groups might not be comparable and this leads to *confounding*.\n",
    "\n",
    "# Estimation\n",
    "\n",
    "## Kissing right\n",
    "\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 2;\" >\n",
    "\n",
    "- A <a href = \"https://www.nature.com/articles/421711a.pdf\">Nature Communication piece</a>  investigates whether couples have a tendency to turn their heads left or right when kissing.\n",
    "- The researchers observed couples in public places and recorded which way they turned their heads when kissing.\n",
    "\n",
    "\n",
    "</div>\n",
    "\n",
    "<div style=\"flex: 1;\" >\n",
    "\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/rodin_kiss.jpeg\" alt=\"\" style=\"width:70%;\" ><figcaption>In Rodin's Kiss the couple is kissing right.</figcaption></figure>\n",
    "\n",
    "\n",
    "</div>\n",
    "</div>\n",
    "\n",
    "## Study results\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 2;\" >\n",
    "\n",
    "- What is the parameter of interest for this study?\n",
    "  - *Answer:* the parameter of interest is the long run proportion of couples who would turn their heads right when kissing.\n",
    "- Out of 124 couples, 80 turned their heads to the right.\n",
    "\n",
    "\n",
    "</div>\n",
    "\n",
    "<div style=\"flex: 1;\" >\n",
    "\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/kissing_results.webp\" alt=\"\" style=\"width:100%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "</div>\n",
    "</div>\n",
    "\n",
    "\n",
    "## Hypothesis test\n",
    "\n",
    "- Using the material from week 6, we can test the hypothesis $H_0 : \\pi = 0.5$.\n",
    "- We can use the <a href = \"https://www.rossmanchance.com/applets/2021/oneprop/OneProp.htm\">one-proportion applet</a>.\n",
    "- The p-value is very small. What can we conclude about the null hypothesis?\n",
    "\n",
    "## Another hypothesis test\n",
    "\n",
    "- By most <a href = \"https://en.wikipedia.org/wiki/Handedness\">estimates</a> about 90% of people are right-handed.\n",
    "- Maybe people tend to turn towards their dominant hand.\n",
    "- How could this be formulated as a hypothesis?\n",
    "  - *Answer:* The null hypothesis would be $H_0 : \\pi = 0.9$.\n",
    "- Lets's use the <a href = \"https://www.rossmanchance.com/applets/2021/oneprop/OneProp.htm\">one-proportion applet</a> again.\n",
    "- The p-value is again very small.\n",
    "\n",
    "## A question\n",
    "\n",
    "- We have strong evidence against both $\\pi = 0.5$ and $\\pi = 0.9$.\n",
    "- But what would be a plausible value of $\\pi$ based on the data?\n",
    "  - *Answer:* $\\frac{80}{124} \\approx 0.66$ which is the proportion of couples who kissed right in the sample.\n",
    "- The sample proportion is called an **estimate** of $\\pi$.\n",
    "\n",
    "# Estimation\n",
    "\n",
    "## Populations and parameters\n",
    "\n",
    "\n",
    "- In general, suppose we want to ask a large group of people a yes/no question.\n",
    "  - Example: ask Stanford undergraduate students if they support the proctoring pilot.\n",
    "- This defines a **population** (current Stanford undergraduates) and a **parameter**  $\\pi$ (the proportion of undergraduates who support the proctoring pilot).\n",
    "\n",
    "## Sampling\n",
    "\n",
    "<div class=\"layout\" style=\"display: flex; justify-content: space-around;\">\n",
    "\n",
    "<div style=\"flex: 1;\" >\n",
    "\n",
    "\n",
    "- If we asked *every* person in the population, then we would know $\\pi$.\n",
    "- But this would be time-consuming and expensive (imagine asking every Stanford undergraduate).\n",
    "\n",
    "\n",
    "</div>\n",
    "\n",
    "<div style=\"flex: 1;\">\n",
    "\n",
    "- Instead, we could take a **sample** from the population.\n",
    "\n",
    "  <figure style=\"text-align:center;\"><img src=\"../figures/wiki_sampling.png\" alt=\"\" style=\"width:100%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "</div>\n",
    "</div>\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "## Estimating from a sample\n",
    " \n",
    "- Suppose we select $n$ people from the population uniformly at random ($n$ random Stanford students).\n",
    "- Then we ask them the yes/no question (do you support the proctoring pilot).\n",
    "- If $m$ out of $n$ answer yes, then our **estimate** of $\\pi$ is\n",
    "\n",
    "  $$\\hat{\\pi}_n = \\frac{m}{n}$$\n",
    "\n",
    "- The \"hat\" on $\\hat{\\pi}_n$ is to emphasize that it is an estimate. The $n$ in $\\hat{\\pi}_n$ is the sample size.\n",
    "\n",
    "\n",
    "## How good is the estimate?\n",
    "\n",
    "- True or false: $\\hat{\\pi}_n = \\pi$?\n",
    "- **Answer:** false, most of the time we will have $\\hat{\\pi}_n \\neq \\pi$.\n",
    "- The estimate $\\hat{\\pi}_n$ is random: if we had sampled different students, then we would get a different value of $\\hat{\\pi}_n$.\n",
    "- We want to know how close $\\hat{\\pi}_n$ is to $\\pi$.\n",
    "\n",
    "\n",
    "\n",
    "## Distribution of $\\hat{\\pi}_n$\n",
    "\n",
    "- The distribution of $\\hat{\\pi}_n$ depends on two things: the sample size $n$ and the parameter $\\pi$. \n",
    "- How do you expect the distribution of $\\hat{\\pi}_n$ to change if the sample size $n$ increased?\n",
    "  - **Answer:** The variability in the distribution should decrease: $\\hat{\\pi}_n$ should get closer to $\\pi$.\n",
    "- How would the distribution of $\\hat{\\pi}_n$ change if $\\pi$ increased?\n",
    "  - **Answer:** the distribution would shift to the right to stay centered at $\\pi$.\n",
    "\n",
    "\n",
    "## Simulation for $\\hat{\\pi}_n$\n",
    "\n",
    "- Suppose that we know $\\pi=0.6$ (60% of Stanford student support the proctoring pilot).\n",
    "- We can then do a simulation to look at the distribution of $\\hat{\\pi}_n$ for different values of $n$. \n",
    "\n",
    "## Simulation for $n=1$\n",
    "\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_1.png\" alt=\"\" style=\"width:75%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "## Simulation for $n=5$\n",
    "\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_5.png\" alt=\"\" style=\"width:75%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "## Simulation for $n=10$\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_10.png\" alt=\"\" style=\"width:75%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "## Simulation for $n=20$\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_20.png\" alt=\"\" style=\"width:75%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "## Simulation for $n=40$\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_40.png\" alt=\"\" style=\"width:75%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "## Simulation for $n=100$\n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_100.png\" alt=\"\" style=\"width:75%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "\n",
    "## Simulation summary\n",
    "\n",
    "\n",
    "- What do you notice about the distribution of $\\hat{\\pi}_n$?\n",
    "- The distribution of the estimate $\\hat{\\pi}_n$ is centered at the parameter $\\pi$.\n",
    "  - The *expected value* of $\\hat{\\pi}_n$ is $\\pi$.\n",
    "- The distribution of $\\hat{\\pi}_n$ is less spread out as $n$ gets bigger.\n",
    "  - The *variability* of $\\hat{\\pi}_n$ decreases as $n$ gets bigger.\n",
    "- When $n$ is large, the distribution of $\\hat{\\pi}_n$ looks \"bell shaped.\"\n",
    "\n",
    "\n",
    "\n",
    "# Standard deviation\n",
    "\n",
    "## Standard deviation recap\n",
    "\n",
    "- In lecture 7, you saw that the standard deviation was one way to measure the variability of a distribution.\n",
    "- How does the standard deviation of $\\hat{\\pi}_n$ change as $n$ increases?\n",
    "- **Answer**: the standard deviation *decreases* as the sample size increases. The distribution becomes less variable/spread-out.\n",
    "- Sample size matters: larger sample size means lower standard deviation.\n",
    "\n",
    "## Standard deviation of $\\hat{\\pi}_n$\n",
    "\n",
    "- There is an exact formula for the standard deviation of $\\hat{\\pi}_n$.\n",
    "\n",
    "  $$\\text{Standard deviation of }\\hat{\\pi}_n = \\sqrt{\\frac{\\pi(1-\\pi)}{n}} $$\n",
    "\n",
    "\n",
    "- $\\sqrt{\\pi(1-\\pi)}$ is the standard deviation of $\\hat{\\pi}_1$ (just asking one person).\n",
    "- In a sample of size $n$, the standard deviation is a factor of $\\frac{1}{\\sqrt{n}}$ times smaller than just asking one person.\n",
    "\n",
    "## Computing the standard deviation\n",
    "\n",
    "- The parameter $\\pi$ appears in the formula for the standard deviation ($\\sqrt{\\frac{\\pi(1-\\pi)}{n}}$). But we don't know $\\pi$!\n",
    "\n",
    "- We can use the estimate $\\hat{\\pi}_n$:\n",
    "  $$\\text{Standard deviation of }\\hat{\\pi}_n \\approx \\sqrt{\\frac{\\hat{\\pi}_n(1-\\hat{\\pi}_n)}{n}} $$\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "## Kissing right\n",
    "\n",
    "- In the kissing right study, $n = 124$ and $\\hat{\\pi}_n = 0.66$\n",
    "- Let's use $\\hat{\\pi}_n$ to compute the standard deviation:\n",
    "\n",
    "  $$\\sqrt{\\frac{\\hat{\\pi}_n(1-\\hat{\\pi}_n)}{n}} = \\sqrt{\\frac{0.66 \\times 0.34}{124}} = 0.042$$\n",
    "\n",
    "- Interpretation: the average distance between $\\hat{\\pi}_n$ and the parameter $\\pi$ is around $0.042$\n",
    "\n",
    "\n",
    "# The normal approximation\n",
    "\n",
    "## Bell shaped distribution\n",
    "\n",
    "- For large samples, the distribution of $\\hat{\\pi}_n$ is \"bell shaped\".\n",
    "\n",
    "  <div class=\"layout\" style=\"display: flex; justify-content: space-around;\">\n",
    "\n",
    "  <div style=\"flex: 2;\" >\n",
    "\n",
    "  <figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_40.png\" alt=\"\" style=\"width:100%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "  </div>\n",
    "\n",
    "  <div style=\"flex: 2;\" >\n",
    "\n",
    "  <figure style=\"text-align:center;\"><img src=\"../figures/sampling_hist_100.png\" alt=\"\" style=\"width:100%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "  </div>\n",
    "  </div>\n",
    "\n",
    "- This bell shape is described by the \"normal distribution.\"\n",
    "\n",
    "## The normal distribution\n",
    "\n",
    "- When $n$ is large enough, the distribution of $\\hat{\\pi}_n$ is close to the **normal distribution**.\n",
    "- The normal distribution lets us use the 68-95-99 rule for $\\hat{\\pi}_n$.\n",
    "- This helps us be more precise when we talk about how close $\\hat{\\pi}_n$ is $\\pi$.\n",
    "\n",
    "## 68-95-99 rule\n",
    "\n",
    "- The 68-95-99 rule states that:\n",
    "  1. With **68%** probability: $\\hat{\\pi}_n$ is within **one** standard deviation of $\\pi$.\n",
    "  2. With **95%** probability: $\\hat{\\pi}_n$ is within **two** standard deviations of $\\pi$.\n",
    "  3. With **99%** probability: $\\hat{\\pi}_n$ is within **three** standard deviations of $\\pi$.\n",
    "- Remember: the standard deviation of $\\hat{\\pi}_n$ is $\\sqrt{\\frac{\\pi(1-\\pi)}{n}}$\n",
    "\n",
    "## 68-95-99 rule visualized \n",
    "\n",
    "<figure style=\"text-align:center;\"><img src=\"../figures/68-95-99-rule.png\" alt=\"\" style=\"width:100%;\" ><figcaption></figcaption></figure>\n",
    "\n",
    "## 68-95-99 rule: example\n",
    "\n",
    "- Suppose $\\pi=0.6$ and $n=100$\n",
    "- The standard error of $\\hat{\\pi}_n$ is\n",
    "\n",
    "  $$\\sqrt{\\frac{\\pi(1-\\pi)}{n}} = \\sqrt{\\frac{0.6 \\times 0.4}{100}} = 0.049$$\n",
    "\n",
    "- With 95% probability, $\\hat{\\pi}_n$ will be within $2 \\times 0.049=0.098$ of $\\pi=0.6$\n",
    "- With 95% probability $\\hat{\\pi}_n$ will be between $0.6 - 0.098=0.502$ and $0.6 + 0.098 = 0.698$\n",
    "\n",
    "## How large does $n$ need to be?\n",
    "\n",
    "- A rule of thumb, is that there should be at least 10 \"yesses\" and 10 \"nos\" in the sample to apply the normal approximation.\n",
    "- For the kissing right study: there were 80 couples who turned right and 44 couples who turn left. The normal approximation can be used.\n",
    "- In general, the normal approximation gets more accurate for large $n$ and for proportions closer to $0.5$.\n",
    "\n",
    "# Confidence intervals\n",
    "\n",
    "## Kissing right\n",
    "\n",
    "- In the kissing right study, $\\hat{\\pi}_n = 0.66$ and the standard deviation of $\\hat{\\pi}_n$ is $0.042$ \n",
    "- By the 68-95-99 rule: with 95% probability $\\hat{\\pi}_n$ is within $2 \\times 0.042$ of $\\pi$\n",
    "- Based on $\\hat{\\pi}_n$, a plausible range for $\\pi$ would be between \n",
    "\n",
    "  $$0.66 - 2\\times 0.042= 0.576$$\n",
    "and \n",
    "  $$0.66+2\\times 0.042 = 0.744$$\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "## Confidence intervals\n",
    "\n",
    "- A **confidence interval** is a collection of plausible values of the parameter based on the data.\n",
    "- Example: $[0.576, 0.744]$ is a confidence interval for the proportion of couples who turn right when kissing ($[a,b]$ represents the interval of numbers between $a$ and $b$).\n",
    "- The confidence interval $[0.576, 0.744]$ is called a **95% confidence interval** (we used two standard deviations).\n",
    "- We are 95% confident that $\\pi$ is between 0.576 and 0.744.\n",
    "\n",
    "## Confidence intervals questions\n",
    "\n",
    "- **Question:** would a 99% confidence interval be bigger or smaller than a  95% confidence interval? \n",
    "- **Answer:** The 99% confidence interval will be bigger. To have higher confidence, we have to include more values.\n",
    "- **Question:** how could we compute a 99% confidence interval based on the kissing right data?\n",
    "- **Answer:** by the 68-95-99 rule, we are 99% confident that $\\pi$ is between\n",
    "  $$0.66 - 3\\times 0.042= 0.534$$\n",
    "and \n",
    "  $$0.66+3\\times 0.042 = 0.786$$\n",
    "\n",
    "\n",
    "## Confidence intervals and hypothesis tests\n",
    "\n",
    "\n",
    "- The confidence interval $[0.576, 0.744]$ contains the values of $\\pi$ which we would *not reject* in a hypothesis test with threshold 0.05\n",
    "- Example 1: we rejected $H_0: \\pi=0.5$ and $0.5$ is not in  $[0.576, 0.744]$\n",
    "- Example 2: we would not reject $H_0 : \\pi=0.6$ because $0.6$ is in $[0.576, 0.744]$ (<a href = \"https://www.rossmanchance.com/applets/2021/oneprop/OneProp.htm\">one-proportion applet</a>)\n",
    "\n",
    "## Practice\n",
    "\n",
    "- A student wants to assess if her dog Muffin is more likely to chase a red ball or a blue ball when both are rolled.\n",
    "- The student rolls both balls 96 times and Muffin chased the blue ball 52 times.\n",
    "- What is the parameter of interest?\n",
    "  - The parameter $\\pi$ is the long-run probability of Muffin chasing the blue ball.\n",
    "\n",
    "## Practice continued\n",
    "\n",
    "- What is the estimate of $\\pi$?\n",
    "  - The estimate of $\\pi$ is the sample proportion $\\hat{\\pi}_n = \\frac{52}{96} = 0.54$.\n",
    "- What is the standard deviation of $\\hat{\\pi}_n$?\n",
    "  - The standard deviation of $\\hat{\\pi}_n$ is $\\sqrt{\\frac{\\hat{\\pi}_n(1-\\hat{\\pi}_n)}{n}} = 0.056$\n",
    "- How would you compute a 95% confidence interval for $\\pi$?\n",
    "- By the 68-95-99 rule:\n",
    "  $$[0.54 - 2\\times 0.056, 0.54 + 2 \\times 0.056] = [0.428, 0.652]$$\n",
    "\n",
    "## Confidence intervals and sample size\n",
    "\n",
    "- To use the 68-95-99 rule to make confidence intervals, the sample size needs to be large.\n",
    "  - Roughly: there should be at least 10 \"yesses\" and \"nos\".\n",
    "  - In math: you need $n \\hat{\\pi}_n \\ge 10$ and $n(1-\\hat{\\pi}_n) \\ge 10$.\n",
    "- For smaller sample sizes, the normal approximation is not very accurate, but there are other methods that can be used to make confidence intervals.\n",
    "\n",
    "\n",
    "# Summary\n",
    "\n",
    "## Sampling and estimation\n",
    "\n",
    "- **Samples** can be used to estimate parameters.\n",
    "- Population \u2194 parameter \u2194 $\\pi$\n",
    "- Sample \u2194 estimate \u2194 $\\hat{\\pi}_n$\n",
    "- The distribution of $\\hat{\\pi}_n$ is centered at $\\pi$ and has standard deviation\n",
    "\n",
    "  $$\n",
    "  \\sqrt{\\frac{\\pi(1-\\pi)}{n}} \\approx \\sqrt{\\frac{\\hat{\\pi}_n(1-\\hat{\\pi}_n)}{n}}\n",
    "  $$ \n",
    "\n",
    "- Larger sample size \u2194 smaller standard deviation \u2194 $\\hat{\\pi}_n$ is closer to $\\pi$\n",
    "\n",
    "## Confidence intervals and the normal approximation\n",
    "\n",
    "\n",
    "- A **confidence interval** is a collection of plausible values for the parameter.\n",
    "- A confidence interval has a **confidence level** (for example 95%).\n",
    "- We are 95% confident that a 95% confidence interval contains the parameter.\n",
    "- Confidence intervals can be calculated using the **68-95-99 rule** and the **normal approximation**.\n",
    "\n",
    "\n",
    "<!-- \n",
    "\n",
    "# Estimation for quantitative variables\n",
    "\n",
    "## Sampling for microplastics \n",
    "\n",
    "- We want to determine the concentration of microplastics in the Palo Alto tap water.\n",
    "- This concentration is also a parameter. It is a fixed unknown quantity $\\mu$.\n",
    "- Estimating $\\mu$ with a sample:\n",
    "  - Take $n$ water samples and measure the microplastics in each. This produces measurements $x_1,x_2,\\ldots,x_n$.\n",
    "  - Estimate $\\mu$ with the sample mean:\n",
    "\n",
    "    $$ \\hat{\\mu}_n = \\frac{x_1+x_2+\\cdots + x_n}{n} $$\n",
    "\n",
    "## Properties of $\\hat{\\mu}_n$\n",
    "\n",
    "- Again $\\hat{\\mu}_n$ is random. If we took a new sample of size $n$, then we would get a different value of $\\hat{\\mu}_n$.\n",
    "- Most of the time we will have $\\hat{\\mu}_n \\neq \\mu$, but $\\hat{\\mu}_n$ should be close to $\\mu$.\n",
    "\n",
    "## Standard deviation of $\\hat{\\mu}_n$\n",
    "\n",
    "- Let $\\sigma_x$ be the standard deviation of a single sample $x$.\n",
    "- The standard deviation of the estimate $\\hat{\\mu}_n$ is given by:\n",
    "\n",
    "  $$\\text{standard deviation of } \\hat{\\mu}_n = \\frac{\\sigma_x}{\\sqrt{n}}$$\n",
    "\n",
    "- As with proportions, the standard deviation of $\\hat{\\mu}_n$ is smaller by a factor of $\\frac{1}{\\sqrt{n}}$.\n",
    "\n",
    "## Computing the standard deviation\n",
    "\n",
    "- The standard deviation of a single sample $\\sigma_x$ is not known.\n",
    "- So instead we will estimate it with the sample standard deviation $\\hat{\\sigma}_x$.\n",
    "\n",
    "  $$\\text{standard deviation of } \\hat{\\mu}_n \\approx \\frac{\\hat{\\sigma}_x}{\\sqrt{n}}$$\n",
    "\n",
    "- Unlike, $\\sigma_x$, $\\hat{\\sigma}_x$ can be computed from the sample $x_1,\\ldots,x_n$.\n",
    "\n",
    "## Microplastics\n",
    "\n",
    "- Suppose that collected $n=100$ water samples and measured the microplastics for all of them.\n",
    "- Suppose that the estimate $\\hat{\\mu}_n$ is  300 nano grams per serving and $\\hat{\\sigma}_x$ is 50 nano grams per serving (these numbers are made up).\n",
    "- What is the standard deviation of $\\hat{\\mu}_n$?\n",
    "\n",
    "- **Answer:**\n",
    "\n",
    "  $$\\frac{\\hat{\\sigma}_x}{\\sqrt{n}} = \\frac{50}{\\sqrt{100}} = \\frac{50}{10} = 5$$\n",
    "\n",
    "# Population and samples\n",
    "\n",
    "## Population vs sample\n",
    "\n",
    "## Polling\n",
    "\n",
    "## Microplastics\n",
    "\n",
    "## Example\n",
    "\n",
    "\n",
    "\n",
    "# (Sample) size matters\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "## Similarities and differences with hypothesis test\n",
    "\n",
    "- Hypothesis tests and p-values can answer the question:\n",
    "  - Based on the data, is $\\pi$ equal to particular value such as $0.5$?\n",
    "- We want to ask:\n",
    "  - Based on the data, what is the likely value of $\\pi$? \n",
    "- Intuitively:\n",
    "  - The parameter $\\pi$ should be \"close\" to $\\hat{\\pi}_n$.\n",
    "  - We can quantify \"close\" in terms of the *distribution* of $\\hat{\\pi}_n$. -->"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Phython (JB)",
   "language": "python",
   "name": "jb-python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
