Drawing Statistical Conclusions#

Case Study A: Motivation for creative writers#

Creative writing students were randomly assigned to one of two priming questionnaires, intrinsic or extrinsic; `Score` records each student's creativity score.

require(ggplot2)
creativity = read.csv('https://raw.githubusercontent.com/StanfordStatistics/stats191-data/main/Sleuth3/creativity.csv', header=TRUE)
salaries = read.csv('https://raw.githubusercontent.com/StanfordStatistics/stats191-data/main/Sleuth3/salaries.csv', header=TRUE)
set.seed(0)

Loading required package: ggplot2

head(creativity)

A data.frame: 6 × 2
	Score	Treatment
	<dbl>	<chr>
1	5.0	Extrinsic
2	5.4	Extrinsic
3	6.1	Extrinsic
4	10.9	Extrinsic
5	11.8	Extrinsic
6	12.0	Extrinsic

Summarizing the groups#

Extrinsic Group#

extrinsic = creativity$Score[creativity$Treatment == 'Extrinsic']
summary(extrinsic)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   5.00   12.15   17.20   15.74   18.95   24.00 

Intrinsic Group#

intrinsic = creativity$Score[creativity$Treatment == 'Intrinsic']
summary(intrinsic)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  12.00   17.43   20.40   19.88   22.30   29.70 
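
Both group summaries can also be produced in one call by splitting `Score` on `Treatment`. The following is a minimal sketch that uses a small made-up data frame (`toy` is hypothetical, not the course data) so that it runs without a download:

```r
# Hypothetical toy data with the same two columns as `creativity`
toy = data.frame(Score=c(5.0, 5.4, 6.1, 12.0, 17.4, 20.4),
                 Treatment=rep(c('Extrinsic', 'Intrinsic'), each=3))

# tapply applies a function to Score within each level of Treatment
group_means = tapply(toy$Score, toy$Treatment, mean)
group_means

# the same idea with the full summary() output per group
group_summaries = tapply(toy$Score, toy$Treatment, summary)
group_summaries
```

Replacing `toy` with `creativity` should give the same numbers as the two `summary` calls above.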

Histogram of `Score` stratified by `Treatment`#

# this plot is for visualization only
# students not expected to reproduce
fig <- (ggplot(creativity, aes(x=Score, fill=Treatment)) +
        geom_histogram(aes(y=after_stat(density)),
                       color="#e9ecef",
                       alpha=0.6,
                       position='identity',
                       bins=10) +
        labs(fill=""))
fig

../../_images/4361dbbfd59a1e3357dd7dab3f6087ec903cb93daf61e934bb34d88eccd019f6.png

Case Study B: Difference in salaries between male and female employees#

Starting salaries of employees hired by Harris Trust and Savings Bank between 1969 and 1977.

head(salaries)

A data.frame: 6 × 2
	Salary	Sex
	<int>	<chr>
1	3900	Female
2	4020	Female
3	4290	Female
4	4380	Female
5	4380	Female
6	4380	Female

Females#

female = salaries$Salary[salaries$Sex == 'Female']
summary(female)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   3900    4800    5220    5139    5400    6300 

Males#

male = salaries$Salary[salaries$Sex == 'Male']
summary(male)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   4620    5400    6000    5957    6075    8100 

Histogram of `Salary` stratified by `Sex`#

# this plot is for visualization only
# students not expected to reproduce

fig <- (ggplot(salaries, aes(x=Salary, fill=Sex)) +
        geom_histogram(aes(y=after_stat(density)),
                       color="#e9ecef",
                       alpha=0.6,
                       position='identity',
                       bins=10) +
        labs(fill=""))
fig

../../_images/e2f3a033b311a2e753496a121007e9da593cf3d1a720d6e3b385b5d8c309a7e5.png

Boxplot of `Salary` stratified by `Sex`#

boxplot(Salary ~ Sex, data=salaries, col='orange')

../../_images/29ab0d5bc6e2104f0700a43fe39dd694a420b952309335e625bdadaca951836e.png

Key differences between the studies#

The creative writing study was a randomized experiment.
The salary dataset comes from an observational study.

Implications#

Strength of conclusions: a randomized experiment such as `creativity` can support causal conclusions, whereas an observational study such as `salaries` can only establish an association.
Generalizability: what population are the data from?
1. If we consider Harris a typical bank, then `salaries` can be viewed as a sample of starting salaries at such banks.

Making statistical inferences#

What sort of conclusions are we entitled to make?

Modelling uncertainty in `creativity`#

The observed difference in mean `Score` between the groups (`treatment_effect`) is not 0. Is there a real difference, or could the randomization alone have produced it?
We need a (statistical) model to draw statistical inferences!

Mental model: the world before randomization#

Potential outcomes before randomization

Mental model: the null hypothesis#

\(H_0\): green outcome identical to red

Mental model: the world after randomization#

Observed outcomes after randomization
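
The three mental models above can be mimicked in a few lines. The sketch below invents potential outcomes for six hypothetical students (the values and the names `y_extrinsic` and `y_intrinsic` are made up for illustration), imposes \(H_0\) by making the two outcomes identical, and then randomizes so that only one outcome per student is revealed:

```r
set.seed(1)
n = 6
# Hypothetical potential outcomes: under H0 each student's score is the
# same whichever questionnaire they receive
y_extrinsic = c(12.0, 15.5, 17.2, 19.1, 20.4, 22.3)
y_intrinsic = y_extrinsic

# Randomization: exactly half of the students get each questionnaire
assignment = sample(rep(c('Extrinsic', 'Intrinsic'), each=n/2))

# After randomization only one potential outcome per student is observed
observed = ifelse(assignment == 'Intrinsic', y_intrinsic, y_extrinsic)
data.frame(assignment, observed)

# Under H0 the observed scores do not depend on the assignment at all
all(observed == y_extrinsic)
```

Because the observed scores are unaffected by the labels under \(H_0\), reshuffling the labels is a legitimate way to see what differences chance alone produces.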

Computing difference via `t.test`#

estimates = t.test(Score ~ Treatment, data=creativity)$estimate
estimates

mean in group Extrinsic: 15.7391304347826
mean in group Intrinsic: 19.8833333333333

Effect#

treatment_effect = estimates[2] - estimates[1]
treatment_effect

mean in group Intrinsic: 4.14420289855072

Null hypothesis: no difference in `Score` between the groups#

# shuffle the treatment labels; the scores stay where they are
null_treatment = sample(creativity$Treatment, nrow(creativity), replace=FALSE)
null_data = data.frame(Score=creativity$Score,
                       Treatment=null_treatment)
null_estimates = t.test(Score ~ Treatment, data=null_data)$estimate
null_estimates

mean in group Extrinsic: 17.3782608695652
mean in group Intrinsic: 18.3125

null_treatment_effect = null_estimates[2] - null_estimates[1]
null_treatment_effect

mean in group Intrinsic: 0.934239130434783
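
`t.test` is used here only as a convenient way to obtain the two group means. As a sketch, the same difference can be computed directly; `diff_in_means` is a hypothetical helper and the data are a made-up toy example, not the course data:

```r
# Hypothetical helper: mean of the 'Intrinsic' group minus mean of 'Extrinsic'
diff_in_means = function(score, treatment) {
    mean(score[treatment == 'Intrinsic']) - mean(score[treatment == 'Extrinsic'])
}

# made-up scores, three per group
toy = data.frame(Score=c(10, 12, 14, 15, 19, 23),
                 Treatment=rep(c('Extrinsic', 'Intrinsic'), each=3))

direct = diff_in_means(toy$Score, toy$Treatment)

# agrees with the difference of the estimates reported by t.test
via_t_test = t.test(Score ~ Treatment, data=toy)$estimate
c(direct=direct, t_test=unname(via_t_test[2] - via_t_test[1]))
```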

Repeated 10000 times#

# this code/plot is for visualization only
# students not expected to reproduce

# observed difference in group means (Intrinsic minus Extrinsic);
# unname() drops the label so it can be used as a plain number
estimates = t.test(Score ~ Treatment, data=creativity)$estimate
treatment_effect = unname(estimates[2] - estimates[1])

# reshuffle the treatment labels 10000 times, recomputing the difference
null_treatment_effect = rep(NA, 10000)
for (i in 1:10000) {
    null_treatment = sample(creativity$Treatment, nrow(creativity), replace=FALSE)
    null_data = data.frame(Score=creativity$Score,
                           Treatment=null_treatment)
    null_estimates = t.test(Score ~ Treatment, data=null_data)$estimate
    null_treatment_effect[i] = null_estimates[2] - null_estimates[1]
}

# null distribution, with red lines at +/- the observed difference
fig <- (ggplot(data.frame(null_treatment_effect),
               aes(x=null_treatment_effect)) +
    geom_histogram(aes(y=after_stat(density)),
                   color="#e9ecef",
                   alpha=0.6, bins=30) +
    geom_vline(xintercept=treatment_effect,
               color='red', linewidth=2) +
    geom_vline(xintercept=-treatment_effect,
               color='red', alpha=0.5, linewidth=1) +
    geom_density())
fig

../../_images/b689521fb8e67d4e132298ca0260aa1c05da7a6f6fad9141b7d04cb84a8ba1fd.png

length(null_treatment_effect)
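# two-sided p-value: fraction of shuffled differences larger in magnitude than the observed one
p_value = mean(abs(null_treatment_effect) > abs(treatment_effect))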
p_value

10000

0.006
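
The loop above can be packaged as a reusable function. The following is a sketch under stated assumptions: `permutation_p_value` is a hypothetical name, the data are simulated rather than taken from `creativity`, and the group difference is computed directly instead of through `t.test`. It also uses `>=` rather than `>`, so shuffles that tie with the observed difference are counted as extreme.

```r
# Hypothetical helper: two-sided permutation p-value for a difference in means
permutation_p_value = function(y, group, B=2000) {
    levels_ = sort(unique(group))
    stopifnot(length(levels_) == 2)
    diff_means = function(g) mean(y[g == levels_[2]]) - mean(y[g == levels_[1]])
    observed = diff_means(group)
    # shuffle the labels B times, keeping the responses fixed
    null = replicate(B, diff_means(sample(group)))
    mean(abs(null) >= abs(observed))
}

set.seed(0)
# simulated data: the second group is shifted up by 3 units
y = c(rnorm(20, mean=0), rnorm(20, mean=3))
group = rep(c('A', 'B'), each=20)
p = permutation_p_value(y, group)
p
```

A call such as `permutation_p_value(creativity$Score, creativity$Treatment, B=10000)` should give a value close to the one above, up to Monte Carlo error.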

Modelling uncertainty in `salaries`#

The observed difference in mean `Salary` between male and female employees is not 0. Is the difference real?
We need a model to draw statistical inferences!

Mental model: `Male` and `Female` salaries#

There are two populations of salaries

Mental model: `Male` and `Female` salaries#

\(H_0\): distribution of orange box identical to purple

Computing difference via `t.test`#

sex_estimates = t.test(Salary ~ Sex, data=salaries)$estimate
sex_effect = sex_estimates[2] - sex_estimates[1]
sex_effect

mean in group Male: 818.022540983607

Repeated 10000 times#

# this code/plot is for visualization only
# students not expected to reproduce

# observed difference in group means (Male minus Female);
# unname() drops the label so it can be used as a plain number
sex_estimates = t.test(Salary ~ Sex, data=salaries)$estimate
sex_effect = unname(sex_estimates[2] - sex_estimates[1])

# reshuffle the Sex labels 10000 times, recomputing the difference
null_sex_effect = rep(NA, 10000)
for (i in 1:10000) {
    null_sex = sample(salaries$Sex, length(salaries$Sex), replace=FALSE)
    null_data = data.frame(Salary=salaries$Salary,
                           Sex=null_sex)
    null_estimates = t.test(Salary ~ Sex, data=null_data)$estimate
    null_sex_effect[i] = null_estimates[2] - null_estimates[1]
}

# null distribution, with red lines at +/- the observed difference
fig <- (ggplot(data.frame(null_sex_effect),
               aes(x=null_sex_effect)) +
    geom_histogram(aes(y=after_stat(density)),
                   color="#e9ecef",
                   alpha=0.6, bins=30) +
    geom_vline(xintercept=sex_effect,
               color='red', linewidth=2) +
    geom_vline(xintercept=-sex_effect,
               color='red', alpha=0.5, linewidth=1) +
    geom_density())
fig

../../_images/7b5723b62abd6cf1f4d19e0799bdd7e37b3134f01a01106a04abb037ac2cc03a.png

length(null_sex_effect)
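# two-sided p-value: fraction of shuffled differences larger in magnitude than the observed one
p_value = mean(abs(null_sex_effect) > abs(sex_effect))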
p_value

10000

0
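
A reported value of 0 means that none of the 10000 shuffles produced a difference as large as the observed one, not that the true p-value is exactly zero. One common convention, sketched below with made-up numbers (the null draws are simulated, not the shuffles from `salaries`), counts the observed labelling as one of the permutations, so the estimate can never fall below 1/(B+1):

```r
# Made-up null draws and observed effect, for illustration only
set.seed(0)
B = 10000
null_effect = rnorm(B, mean=0, sd=100)
observed_effect = 818

# plain Monte Carlo estimate: can be exactly 0
p_plain = mean(abs(null_effect) >= abs(observed_effect))

# add-one version: bounded below by 1 / (B + 1)
p_plus_one = (1 + sum(abs(null_effect) >= abs(observed_effect))) / (B + 1)

c(plain=p_plain, plus_one=p_plus_one)
```

Either way, the conclusion for `salaries` is the same: a difference this large is very unlikely to arise from shuffling alone.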

Other issues#

We used the same method for both studies, even though one is a randomized experiment and the other is observational… does this make sense?
Terminology:
1. Parameter: a property of the probability model (often written \(\theta\))
2. Estimate: a function of the sample data (often written \(\hat{\theta}\))
3. Goal of statistical inference: to learn about the parameter \(\theta\) from the estimate \(\hat{\theta}\)
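
The distinction between a parameter and an estimate can be made concrete with a small simulation. This is a sketch under an assumed model (Normal data with mean `theta`; the numbers are arbitrary and not tied to either case study):

```r
set.seed(0)
# Assumed probability model: observations are Normal with mean theta
theta = 20                      # the parameter (unknown in practice)
sample_data = rnorm(50, mean=theta, sd=5)

# The estimate is a function of the sample only
theta_hat = mean(sample_data)
theta_hat

# A different sample gives a different estimate of the same parameter
theta_hat_2 = mean(rnorm(50, mean=theta, sd=5))
c(theta=theta, theta_hat=theta_hat, theta_hat_2=theta_hat_2)
```

The parameter stays fixed while the estimate varies from sample to sample; inference is about quantifying that variation.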

Other issues#

Experimental design:
1. Randomization: individuals were randomly assigned a Treatment in the `creativity` study
2. Simple random sample: a way of sampling \(n\) units from a population such that every subset of \(n\) units is equally likely to be selected.
3. Other sampling mechanisms: systematic sampling, cluster sampling.
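
The sampling mechanisms listed above can be sketched with base R's `sample`. The population of 100 labelled units and its grouping into 10 clusters are hypothetical, chosen only to make the three schemes easy to compare:

```r
set.seed(0)
N = 100   # hypothetical population of units labelled 1..N
n = 10

# Simple random sample: every subset of size n is equally likely
srs = sample(N, n, replace=FALSE)

# Systematic sample: random start, then every (N/n)-th unit
step = N / n
start = sample(step, 1)
systematic = seq(start, N, by=step)

# Cluster sample: units come in 10 clusters of 10; pick 1 cluster at random
cluster_id = rep(1:10, each=10)
chosen_cluster = sample(10, 1)
cluster = which(cluster_id == chosen_cluster)

list(srs=sort(srs), systematic=systematic, cluster=cluster)
```

All three return 10 units, but only the simple random sample makes every subset of 10 equally likely.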
