Discussion 7: Randomized Controlled Trials#
STATS 60 / STATS 160 / PSYCH 10
Plan:
Recap
Practice Quiz 1
Activity: analyzing a sleep deprivation study with the potential outcomes model
Recap: Observational Trials#
Observational trials and other natural experiments:
The treatment and control groups are chosen according to criteria other than randomization
Confounding variables and limitations of observational studies:
Unaccounted-for confounding variables could always be present
Causation cannot be inferred
In some cases this is the best option available:
Ethical considerations:
Sometimes treatment is not ethical, but we want to understand its effect (Example: effect of big market shocks on economies)
Sometimes a placebo control is unethical (Example: medication known to save lives, but exact dosage to be determined)
Cost/scale considerations
Recap: Randomized Controlled Trials#
Randomized Controlled Trials:
The gold standard, if we want to infer a causal relationship between treatment and outcome
The treatment and control groups are sampled randomly from the same population
Potential outcomes model
Each individual \(i\) has two “potential outcomes”:
\(Y_i(0)\) is the response to control
\(Y_i(1)\) is the response to treatment
We observe only one of them, depending on whether \(i\) received treatment or control
Simulation allows us to compute \(p\)-values for differences in outcomes
Similar to the permutation test for correlation
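The potential outcomes model above can be illustrated with a small sketch. All numbers below are made up for illustration, and the names `y0`, `y1`, and `treated` are hypothetical. The point is only that, after random assignment, each individual reveals one of their two potential outcomes and the other stays hidden.

```python
import random

# Hypothetical potential outcomes for 6 individuals (made-up numbers).
# In a real study we never get to see both values for the same person.
y0 = [10, 12, 9, 15, 11, 13]   # Y_i(0): response to control
y1 = [12, 12, 8, 18, 11, 16]   # Y_i(1): response to treatment

rng = random.Random(0)          # fixed seed so the sketch is reproducible
n = len(y0)
treated = set(rng.sample(range(n), n // 2))  # randomly choose half for treatment

# Observed outcome: Y_i(1) if i was treated, otherwise Y_i(0)
observed = [y1[i] if i in treated else y0[i] for i in range(n)]

# Print the table as the experimenter sees it: "?" marks the unobserved outcome
for i in range(n):
    shown0 = "?" if i in treated else y0[i]
    shown1 = y1[i] if i in treated else "?"
    print(f"i={i}  Y(0)={shown0}  Y(1)={shown1}")
```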
Practice Quiz 1#
For each of the following experiments, (a) list the strengths and weaknesses of the experimental design, and (b) state whether you think the experiment will be effective at answering the experimenters’ question; if not, suggest an alternative design that would be more effective.
Question 1#
An elementary school wants to decide between two choices of curriculum for teaching reading: A or B. The school has 100 pupils in each grade, each divided into 4 equal classrooms of size 25. The administration decides to run the following experiment: they will randomly assign two kindergarten classes to curriculum A, and the remaining two to curriculum B, and at the end of a 1 year period they will give each student a reading assessment, and use the results to determine if A or B was better.
Answer 1#
Relatively effective.
Strengths: Randomization at the classroom level helps control for confounding variables at that level (e.g., classroom assignment isn’t based on ability). Same assessment is used for all students, ensuring consistent measurement.
Weaknesses: If each classroom has a different teacher, the effect of curriculum is confounded with teacher quality or teaching style. The study is limited to kindergarten, so if the school wants to apply the curriculum across all grades, the results may not extrapolate. With only 2 classrooms per treatment, there could be too much noise from teacher quality to draw a conclusion about the choice of curriculum.
Proposed changes to make more effective: randomize more classrooms per treatment, potentially across multiple grades or schools. Rotate teachers across treatments to remove teacher quality confounding.
Question 2#
A marketer wants to determine whether people are more likely to buy product A or product B. They do a survey, choosing 100 random customers and asking each one of them if they are more likely to buy A or B.
Answer 2#
Not very effective.
Strengths: Random sample of customers.
Weaknesses: Stated preferences can differ from actual behavior. There is no random assignment and no control for confounding variables (e.g., demographics, purchase history).
Proposed changes to make more effective: make this a truly randomized experiment by randomly showing a customer either A or B in a purchase setting, and measuring real purchase rates.
Question 3#
A psychiatrist is trying to determine whether there is a causal relationship between sleep and positive social interactions. The psychiatrist surveys 1000 people, asking each (1) how many hours of sleep they get on a typical night, and (2) how positive are their social interactions on a scale from 1-10. The psychiatrist then measures the correlation coefficient of these variables.
Answer 3#
Not very effective.
Strengths: Large sample size
Weaknesses: Survey responses are subject to bias: people can misreport their hours of sleep, there is no standardized way of measuring the amount of sleep, and social interactions are rated on a subjective scale. There is no random assignment and no control for confounding variables (e.g., mental health, job stress, age).
Proposed changes to make more effective: make this a truly randomized experiment by assigning participants to a sleep intervention group versus a control group, and measuring outcomes over time.
Sleep Deprivation Study#
Does sleep deprivation hinder learning?
How would you design a study to answer this question?
Randomize subjects to two groups.
One group is deprived of sleep, the other sleeps normally.
Compare their performance on some cognitive task.
Do you have any concerns (ethical or otherwise) about this experimental design?
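The first design step, randomizing subjects to two groups, could be carried out along these lines. This is a minimal sketch: the function name `assign_groups`, the subject IDs, the group sizes, and the seed are all hypothetical choices made for illustration.

```python
import random

def assign_groups(subject_ids, n_treatment, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)                      # every ordering is equally likely
    return shuffled[:n_treatment], shuffled[n_treatment:]

# Hypothetical example: 20 subjects, 10 assigned to sleep deprivation
treatment, control = assign_groups(range(1, 21), n_treatment=10, seed=60)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because assignment depends only on the shuffle, no characteristic of a subject (age, baseline ability, sleep habits) can systematically influence which group they land in.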
Study of Sleep Deprivation#
Stickgold, James, and Hobson (2000) conducted such a study.
They randomized 21 subjects, aged 18-25 years, to two groups.
A treatment group of 11 subjects was deprived of sleep for 30 hours.
A control group of 10 subjects was allowed to sleep normally.
They measured them on the following cognitive task.
First, subjects looked at this image.

Then, they replaced the image with a “mask” and asked about the previous image.
Question 1. What letter was in the middle of the screen?
a. T b. L
Question 2. There were three diagonal bars. How were they arranged?
a. horizontally b. vertically
Measuring Outcomes#

Subjects did this task the day before and 3 days after the sleep deprivation.
The outcome for each subject was the improvement in reaction times (in milliseconds/10).
For example, a subject who took 5 ms at the beginning of the study and 2 ms at the end had an improvement of 3 ms.
Data#
Here were the outcomes (in milliseconds/10) for the 21 subjects.
| Control | Sleep Deprivation |
|---|---|
| 252 | -107 |
| 145 | 45 |
| -70 | 22 |
| 126 | 213 |
| 345 | -147 |
| 456 | -107 |
| 116 | 96 |
| 186 | 24 |
| 121 | 218 |
| 305 | 72 |
|  | 100 |
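As a quick summary of the data, the sketch below computes the mean improvement in each group and their difference. It assumes the table is read row by row, with the control value first and the sleep deprivation value second, which gives 10 control and 11 sleep-deprived subjects as described in the study. The variable names are illustrative.

```python
# Outcomes in milliseconds/10, read row by row from the table above
control = [252, 145, -70, 126, 345, 456, 116, 186, 121, 305]
deprived = [-107, 45, 22, 213, -147, -107, 96, 24, 218, 72, 100]

mean_control = sum(control) / len(control)
mean_deprived = sum(deprived) / len(deprived)
diff = mean_deprived - mean_control   # negative means less improvement when deprived

print(f"mean improvement, control:           {mean_control:.1f}")
print(f"mean improvement, sleep deprivation: {mean_deprived:.1f}")
print(f"difference (deprived - control):     {diff:.1f}")
```

Under this reading the group means are 198.2 and 39.0, a difference of -159.2. Whether a gap this large could plausibly arise from the random assignment alone is the question the hypothesis test addresses.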
What are the null and alternative hypotheses?
\(H_0\): Sleep deprivation has no effect on performance.
\(H_A\): Sleep deprivation decreases performance.
On the handout, set up the potential outcomes table under the null hypothesis.
Use the applet to simulate the distribution of the difference in means under the null hypothesis.
potential-outcomes.github.io
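For readers who prefer code to the applet, here is a rough Python sketch of the same idea. It is not the applet's implementation; the number of simulations, the seed, and all variable names are assumptions. Under \(H_0\), each subject's two potential outcomes are equal to their observed outcome, so the potential outcomes table is fully known and we can re-randomize the group labels to see which differences in means chance alone would produce.

```python
import random

# Outcomes in milliseconds/10 (table read row by row: control, then sleep deprivation)
control = [252, 145, -70, 126, 345, 456, 116, 186, 121, 305]
deprived = [-107, 45, 22, 213, -147, -107, 96, 24, 218, 72, 100]

# Potential outcomes table under H0: Y_i(0) = Y_i(1) = observed outcome
outcomes = control + deprived
null_table = [(y, y) for y in outcomes]

def mean(xs):
    return sum(xs) / len(xs)

observed_diff = mean(deprived) - mean(control)

rng = random.Random(0)       # assumed seed, for reproducibility only
n_sims = 10_000              # assumed number of simulated randomizations
n_deprived = len(deprived)

count = 0
for _ in range(n_sims):
    shuffled = outcomes[:]
    rng.shuffle(shuffled)    # re-randomize who is "deprived" vs "control"
    sim_diff = mean(shuffled[:n_deprived]) - mean(shuffled[n_deprived:])
    # One-sided test: H_A says deprivation decreases performance,
    # so count simulated differences at least as negative as the observed one
    if sim_diff <= observed_diff:
        count += 1

p_value = count / n_sims
print(f"observed difference: {observed_diff:.1f}")
print(f"simulated one-sided p-value: {p_value:.4f}")
```

The p-value is the fraction of simulated randomizations whose difference in means is at least as extreme as the one observed. A small value suggests that a gap this large would rarely occur if sleep deprivation had no effect.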