{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lecture 19: Randomized Experiments\n",
    "\n",
    "STATS 60 / STATS 160 / PSYCH 10\n",
    "\n",
    "\n",
    "\n",
    "**Concepts and Learning Goals:**\n",
    "\n",
    "- Experimental Design\n",
    "    - Correlation vs. causation\n",
    "    - Treatment and Control groups\n",
    "    - Confounding variables\n",
    "    - Observational studies\n",
    "    - Randomized controlled trials (RCTs)\n",
    "    - The Marshmallow Test\n",
    "\n",
    "\n",
    "<div style=\"display: flex; justify-content: \"right\"; flex-direction: column; align-items: \"right\";\">\n",
    "  <div>\n",
    "    <p style=\"font-size: smaller; text-align: \"right\"; margin-top: 4px;\"></p>\n",
    "  </div>\n",
    "</div>\n",
    "\n",
    "\n",
    "## Retrieval Practice\n",
    "\n",
    "- In today's lesson, we will examine a strategy that may help you learn, \n",
    "called **retrieval practice**.\n",
    "- Retrieval practice is the strategy of recalling facts or concepts \n",
    "from memory. \n",
    "    - The act of retrieving something from your memory\n",
    "    strengthens the connections in your brain, making it more likely that \n",
    "    you'll be able to recall it in the future.\n",
    "    - This is why quizzes and tests help you learn.\n",
    "- How do we know that retrieval practice works?\n",
    "\n",
    "\n",
    "## One Possible Study \n",
    "\n",
    "- Suppose we do lots of retrieval practice in STATS 60 and \n",
    "the average on the exam is 90%.\n",
    "- Is this convincing evidence that retrieval practice works?\n",
    "\n",
    "**NO!** \n",
    "\n",
    "- Maybe the students are good and would have done just as well on the exam\n",
    "without the retrieval practice?\n",
    "- Maybe the exam was just easy?\n",
    "\n",
    "We need to compare the **treatment group** that did retrieval practice \n",
    "to a **control group** that didn't. \n",
    "\n",
    "\n",
    "## A Controlled Study \n",
    "\n",
    "- Suppose we use the students in STATS 60 last year, who didn't do retrieval\n",
    "practice, as a control group.\n",
    "- They took the same exam, and their average on the exam was 75%.\n",
    "- So the group that did retrieval practice scored 15 percentage points higher\n",
    "than the group that didn't. Are you convinced now that retrieval practice\n",
    "causes more learning?\n",
    "\n",
    "**NO!** \"Correlation does not imply causation.\"\n",
    "\n",
    "- Maybe the students this year are stronger than the students last year.\n",
    "- Maybe the instructors this year are better than the instructors last year.\n",
    "\n",
    "The problem is that the two groups are not _comparable_ (in ways that affect\n",
    "the outcome).\n",
    "\n",
    "\n",
    "## Comparable Groups \n",
    "\n",
    "In summary, to determine if retrieval practice _causes_ students to learn more,\n",
    "we need:\n",
    "\n",
    "- two groups, one that does retrieval practice and another that doesn't,\n",
    "- that are comparable with respect to all other variables that affect the outcome.\n",
    "\n",
    "If two groups differ in another variable that affects\n",
    "the outcome, that variable is called a **confounding variable**.\n",
    "\n",
    "We can conclude causality if all \n",
    "**confounding variables** have been eliminated from the comparison.\n",
    "\n",
    "\n",
    "## Designing Comparable Groups\n",
    "\n",
    "A study that compares groups that already exist is called an \n",
    "**observational study**.\n",
    "\n",
    "\n",
    "In general, the treatment and control groups in an observational study \n",
    "are not comparable, so it is difficult to infer causality.\n",
    "\n",
    "\n",
    "The simplest way to ensure that groups are comparable (and \n",
    "eliminate confounding variables) is to _design_ them that way.\n",
    "\n",
    "\n",
    "A study that assigns subjects to groups is called an **experiment**.\n",
    "\n",
    "\n",
    "## Experiments \n",
    "\n",
    "How should we assign subjects to groups so that the groups are comparable?\n",
    "\n",
    "\n",
    "**Idea 1**\n",
    "\n",
    "- Record variables for each subject, like year, sex, major, etc. \n",
    "- Then, manually divide the subjects into two groups so that there are exactly \n",
    "the same proportion of students in each year, of each sex, in each major, \n",
    "etc. in the two groups.\n",
    "\n",
    "\n",
    "<p style=\"color:red;\">\n",
    "Unfortunately, this does not guarantee that the two groups will be balanced\n",
    "with respect to variables we did not record.\n",
    "</p>\n",
    "\n",
    "\n",
    "## Experiments\n",
    "\n",
    "How should we assign subjects to groups so that the groups are comparable?\n",
    "\n",
    "**Idea 2**\n",
    "\n",
    "Randomly assign subjects to the two groups.\n",
    "\n",
    "\n",
    "Now, the two groups are expected to be balanced with respect to _all_ \n",
    "variables, even the ones we did not record.\n",
    "\n",
    "\n",
    "If the number $N$ of participants is large enough, the groups are very likely to *actually* be balanced.\n",
    "\n",
    "<font color=\"gray\">\n",
    "Remember from our airport lecture: a random quantity is not necessarily close to its expectation. But if we sample many times, it is very likely to be close on average. This is the **same phenomenon**.\n",
    "</font>\n",
    "\n",
    "\n",
    "This is called a **randomized (controlled) experiment** and is the gold standard \n",
    "for causal inference.\n",
    "\n",
    "\n",
    "## A Randomized Experiment \n",
    "\n",
    "Let's do a randomized experiment to determine whether retrieval practice\n",
    "benefits learning.\n",
    "\n",
    "\n",
    "First, we have to randomize students to the control and treatment groups.\n",
    "How do we do that?\n",
    "\n",
    "- Count off 1, 2, 1, 2, 1, 2, ... until all students are assigned to a group?\n",
    "- Feed in the names of the students in the room to ChatGPT and ask it to divide\n",
    "the names into two groups?\n",
    "\n",
    "\n",
    "<p style=\"color:red;\">\n",
    "No, none of these options guarantees randomness.\n",
    "</p>\n",
    "\n",
    "\n",
    "**Question:** why?\n",
    "\n",
    "\n",
    "- Count off 1, 2, 3, .... Remember your number!\n",
    "- We will use a proper random number to choose half of these numbers at random.\n",
    "These people will be in the _treatment_ group.\n",
    "\n",
    "\n",
    "\n",
    "## Study Protocol \n",
    "\n",
    "1. Take 5 minutes to read the text on the handout.\n",
    "2. Now, depending on which group you are in, take 8 minutes to do\n",
    "the following:\n",
    "    - Control: Take notes on the text as you normally would.\n",
    "    - Treatment: Turn the page over and write down as much as you can remember.\n",
    "3. Now, take another 5 minutes to read the text again.\n",
    "4. Depending on which group you are in, take 8 minutes to do the following:\n",
    "    - Control: Add to your notes.\n",
    "    - Treatment: Turn the page over and write down as much as you can remember.\n",
    "    Try as hard as you can to recall even more information this time.\n",
    "5. On Wednesday, we'll see how much of this text you learned!\n",
    "\n",
    "# The Marshmallow Test: a famous observational experiment\n",
    "\n",
    "## The Marshmallow test\n",
    "\n",
    "<font color=\"gray\">\n",
    "The following example is also in Chapter 4 of your textbook, *Calling Bullshit* by Bergstrom and West.\n",
    "</font>\n",
    "\n",
    "In the 1970's, researchers led by Stanford professor Walter Mischel performed the following \"Marshmallow Test\" study at Bing Preschool:\n",
    "\n",
    "- Each child was placed in a room alone with a marshmallow\n",
    "- They were told that if they didn't eat the marshmallow now, they would get *two* marshmallows later\n",
    "- They timed the children to see how long they could go without eating the marshmallow\n",
    "\n",
    "[Here is a short video about the study from CBS news.](https://www.youtube.com/watch?v=4y6R5boDqh4)\n",
    "\n",
    "[Here is a modern-day version.](https://www.youtube.com/watch?v=A7Ro2WmY_RE)\n",
    "\n",
    "\n",
    "## Recall: the correlation coefficient\n",
    "\n",
    "If for each observational unit we have measured two quantitative variables $x_i,y_i$, \n",
    "\n",
    "The **correlation coefficient** $R$ is the slope of the best-fit line between the *standardized* $x$ and $y$ datapoints.\n",
    "\n",
    "- $R$ is between $-1$ and $1$\n",
    "\n",
    "- $R = 1$ is perfect positive association\n",
    "\n",
    "- $R = -1$ is perfect negative association\n",
    "\n",
    "![](../figures/mass-beak-standard.png)\n",
    "\n",
    "\n",
    "## Marshmallow, Wait, Profit\n",
    "\n",
    "A [follow-up study](https://searchworks.stanford.edu/articles/eric__EJ426151) calculated the correlation between (\\# seconds waited) and other \"cognitive competencies\" later in life:\n",
    "\n",
    "- SAT scores\n",
    "    - Sample size $N = 35$ \n",
    "    - SAT Verbal score: correlation coefficient $R = .42$, $p$-value $<.05$\n",
    "    - SAT Quantitative score: $R = .57$, $p$-value $<.001$\n",
    "- \"Adolescent coping questionnaire\": parents take a questionnaire about their teenagers, answering on a scale of $1-10$:\n",
    "    - Sample size $N = 43$\n",
    "    - \"How capable is your child of exhibiting self-control when frustrated?\": $R = .4$, $p$-value $<.01$\n",
    "    - \"How able is your child to pursue is or her goals when motivated?\" : $R = .38$, $p$-value $<.05$\n",
    "\n",
    "\n",
    "The ability to delay gratification at age 4 appears **correlated** with success later in life.\n",
    "\n",
    "## Responsible scientists\n",
    "\n",
    "The authors of the follow-up study are very careful to say that correlation is not causation!\n",
    "\n",
    "<font color=\"teal\">\n",
    "\"We must emphasize the need for caution in the interpretation of the total findings linking preschool delay to adolescent outcomes ... \"\n",
    "</font>\n",
    "\n",
    "**Question:** Why should we be cautious to conclude that the ability to delay gratification at age 4 is causal for later success? \n",
    "\n",
    "- What could possible **confounding variables** be?\n",
    "\n",
    "The authors again:\n",
    "\n",
    "<font color=\"teal\">\n",
    "\"A difficult question that remains is the mechanism underlying the associations found between the delay behavior of the preschool child and the subsequent outcome measures. One contributing source may be stability in the subjects' family-mediated environments\"\n",
    "</font>\n",
    "\n",
    "## Irresponsible reporting\n",
    "\n",
    "But the story is just too good! \n",
    "\n",
    "The media played up the results.\n",
    "\n",
    "* * *\n",
    "\n",
    "### Bestselling author James Clear\n",
    "\n",
    "<div style=\"display: flex; justify-content: center; flex-direction: column; align-items: center;\">\n",
    "  <div>\n",
    "    <img src=\"../figures/marshmallow-clear.png\" style=\"width:\"90%\";\"/>\n",
    "    <p style=\"font-size: smaller; text-align: center; margin-top: 4px;\"></p>\n",
    "  </div>\n",
    "</div>\n",
    "\n",
    "From Clear's essay [\"40 Years of Stanford Research Found That People With This One Quality Are More Likely to Succeed\"](https://jamesclear.com/delayed-gratification)\n",
    "\n",
    "* * *\n",
    "\n",
    "### Time Magazine\n",
    "\n",
    "<div style=\"display: flex; justify-content: center; flex-direction: column; align-items: center;\">\n",
    "  <div>\n",
    "    <img src=\"../figures/marshmallow-time-headline.png\" style=\"width:\"90%\";\"/>\n",
    "    <p style=\"font-size: smaller; text-align: center; margin-top: 4px;\"></p>\n",
    "  </div>\n",
    "</div>\n",
    "\n",
    "<div style=\"display: flex; justify-content: center; flex-direction: column; align-items: center;\">\n",
    "  <div>\n",
    "    <img src=\"../figures/marshmallow-time-text.png\" style=\"width:\"90%\";\"/>\n",
    "    <p style=\"font-size: smaller; text-align: center; margin-top: 4px;\"></p>\n",
    "  </div>\n",
    "</div>\n",
    "\n",
    "\n",
    "[From Time Magazine](https://time.com/3697991/achieve-better-success-life)\n",
    "\n",
    "* * *\n",
    "\n",
    "### Psychology Today\n",
    "\n",
    "<div style=\"display: flex; justify-content: center; flex-direction: column; align-items: center;\">\n",
    "  <div>\n",
    "    <img src=\"../figures/marshmallow-psych-today.png\" style=\"width:\"90%\";\"/>\n",
    "    <p style=\"font-size: smaller; text-align: center; margin-top: 4px;\"></p>\n",
    "  </div>\n",
    "</div>\n",
    "\n",
    "[From Psychology Today](https://www.psychologytoday.com/us/blog/beyond-school-walls/202304/10-ways-life-is-a-marshmallow-test)\n",
    "\n",
    "## Correlation is not causation!\n",
    "\n",
    "A 2018 study [**failed to replicate**](https://searchworks.stanford.edu/articles/edsjsr__edsjsr.26957469) the results.\n",
    "\n",
    "- The study followed a larger sample\n",
    "    - $n = 918$ children\n",
    "    - From a variety of U.S. cities and socioeconomic backgrounds   \n",
    "- The correlations found were smaller/absent:\n",
    "    - $R = .236$ for time waited vs. academic achievement test at age 15\n",
    "    - $R = -.062$ for time waited vs. behavioral test at age 15\n",
    "- Socioeconomic factors were found to *largely explain* the association:\n",
    "    - when researchers controlled for measures of socioeconomic status:\n",
    "        - $R = 0.081$ for time waited vs. academic achievement test at 15 ($p$-value $<0.05$) \n",
    "    - children of college-graduate mothers: 68\\% waited the full $7$ minutes\n",
    "    - children of non-college-graduate mothers: 45\\% waited the full $7$ minutes\n",
    "\n",
    "Socioeconomic status appears to be a **confounding variable**.\n",
    "\n",
    "## Causality in doubt\n",
    "\n",
    "\n",
    "Which is it??? \n",
    "\n",
    "<div style=\"display: flex; justify-content: center; flex-direction: column; align-items: center;\">\n",
    "  <div>\n",
    "    <img src=\"../figures/delay-grat.png\" />\n",
    "    <p style=\"font-size: smaller; text-align: center; margin-top: 4px;\"></p>\n",
    "  </div>\n",
    "</div>\n",
    "\n",
    "\n",
    "**Question:** Can we even be sure, based on the evidence from *this* follow-up study, that higher socioeconomic status in early childhood causes success later in life?\n",
    "\n",
    "## Discussion\n",
    "\n",
    "**Question**: what lessons can we take from the Marshmallow test study?\n",
    "\n",
    "**Question**: Suppose you want to test whether the ability to delay gratification at age 4 is **causal** for success later in life.\n",
    "\n",
    "\n",
    "- How would you design a randomized control trial?\n",
    "\n",
    "- Remember you need a *treatment* group and *control* group.\n",
    "\n",
    "- What would the **treatment** be?\n",
    "\n",
    "\n",
    "\n",
    "This question is very tough to study! A randomized trial might be impossible.\n",
    "\n",
    "\n",
    "\n",
    "**Question**: What if you want to test whether <font color=\"teal\">high socioeconomic status at age 4</font> is **causal** for success later in life.\n",
    "\n",
    "\n",
    "\n",
    "- How would you design a randomized control trial?\n",
    "\n",
    "- Remember you need a *treatment* group and *control* group.\n",
    "\n",
    "- What would the **treatment** be?\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Phython (JB)",
   "language": "python",
   "name": "jb-python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}