Course introduction

web.stanford.edu/class/stats364/

Jonathan Taylor

Spring 2020

Replicability crisis in science

Scientists collect data first and ask questions later (Candès): the "data first, questions later" (DFQL) regime. The idea of a scientist struck, as if by lightning, with a question is far from the truth. (Tukey)

Replicability crisis in science

Exploratory data analysis

Confirmatory data analysis

Conflict identified by Candes, Tukey (and certainly others)

Modern science

A (typical?) data scientist \(U\)’s workflow…


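The slides use a design `X` and response `y` without defining them. A minimal simulation consistent with the printed dimensions (n = 500, p = 100) might look like the following; the sparsity pattern and effect sizes are assumptions, not taken from the course, so with a different seed the printed selections below will differ.

```r
# Hypothetical data-generating step (not shown on the slides): a sparse
# linear model with a few weak signals and Gaussian noise.
set.seed(1)
n = 500; p = 100
X = matrix(rnorm(n * p), n, p)
beta = c(rep(0.25, 5), rep(0, p - 5))   # assumed: 5 weak signals, rest null
y = X %*% beta + rnorm(n)
```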
library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-18
print(dim(X))
## [1] 500 100
# Q^U_1
lam = cv.glmnet(X, y)$lambda.min
# Q^U_2
G = glmnet(X, y)
beta.hat = coef(G, s=lam)[-1]
interesting_features = which(beta.hat != 0)             
interesting_features
##  [1]   2   4   5   8  12  13  15  18  20  21  24  26  27  29  33  34  39
## [18]  40  47  48  52  54  56  57  58  60  61  65  66  73  74  75  76  78
## [35]  79  80  83  84  85  87  90  91  92  94  95  96  98  99 100

Modern science

Writing up the paper?

# publish this?
confint(lm(y ~ X[,interesting_features]))[-1,]
##                                    2.5 %        97.5 %
## X[, interesting_features]1  -0.134666925  0.0296184179
## X[, interesting_features]2  -0.005672499  0.1613077599
## X[, interesting_features]3  -0.053930161  0.1109551781
## X[, interesting_features]4   0.044349186  0.2076825410
## X[, interesting_features]5  -0.027926685  0.1365118283
## X[, interesting_features]6  -0.032769667  0.1316021200
## X[, interesting_features]7  -0.158303438  0.0072563316
## X[, interesting_features]8  -0.024253365  0.1418585076
## X[, interesting_features]9  -0.121432787  0.0417744865
## X[, interesting_features]10  0.014321121  0.1792561263
## X[, interesting_features]11 -0.005529815  0.1593495228
## X[, interesting_features]12 -0.013091442  0.1517333395
## X[, interesting_features]13 -0.133795013  0.0305297154
## X[, interesting_features]14 -0.036626589  0.1327903094
## X[, interesting_features]15 -0.004901185  0.1590721641
## X[, interesting_features]16 -0.157122744  0.0047186279
## X[, interesting_features]17 -0.098683460  0.0639697418
## X[, interesting_features]18 -0.138232394  0.0260313420
## X[, interesting_features]19 -0.169516536 -0.0045634327
## X[, interesting_features]20 -0.146869516  0.0173152539
## X[, interesting_features]21 -0.010776887  0.1511154896
## X[, interesting_features]22 -0.028609014  0.1356236522
## X[, interesting_features]23 -0.025354318  0.1406934766
## X[, interesting_features]24 -0.044578207  0.1197330768
## X[, interesting_features]25 -0.045746324  0.1183011239
## X[, interesting_features]26 -0.011813679  0.1541610998
## X[, interesting_features]27 -0.152104784  0.0133121940
## X[, interesting_features]28 -0.130852270  0.0352438034
## X[, interesting_features]29 -0.129883488  0.0369461130
## X[, interesting_features]30 -0.178834263 -0.0124556931
## X[, interesting_features]31 -0.164157792  0.0011051734
## X[, interesting_features]32  0.089396198  0.2541155146
## X[, interesting_features]33 -0.036072714  0.1341247449
## X[, interesting_features]34 -0.059571959  0.1062204072
## X[, interesting_features]35 -0.031589104  0.1345085549
## X[, interesting_features]36 -0.128526666  0.0373178142
## X[, interesting_features]37 -0.231997646 -0.0645460622
## X[, interesting_features]38 -0.165737784 -0.0007462745
## X[, interesting_features]39 -0.112658740  0.0496528162
## X[, interesting_features]40  0.027106618  0.1922657835
## X[, interesting_features]41 -0.126489256  0.0370006396
## X[, interesting_features]42 -0.149240623  0.0152159647
## X[, interesting_features]43 -0.117646630  0.0465191464
## X[, interesting_features]44 -0.025995732  0.1381550821
## X[, interesting_features]45 -0.127133786  0.0369194617
## X[, interesting_features]46 -0.006418778  0.1568876267
## X[, interesting_features]47 -0.011286811  0.1513481766
## X[, interesting_features]48 -0.030729088  0.1356614915
## X[, interesting_features]49 -0.167370469 -0.0030653721

Modern science

Data splitting: a valid method

n = nrow(X)            # split the cases into two disjoint halves
test = 1:(n/2)
train = (n/2+1):n
library(glmnet)
# Q^U_1
lam = cv.glmnet(X[train,], y[train])$lambda.min 
# Q^U_2
G = glmnet(X[train,], y[train])
beta.hat = coef(G, s=lam)[-1]
interesting_features_train = which(beta.hat != 0)             
print(interesting_features_train)
##  [1]   2   4   7   8  11  12  13  18  26  27  29  30  33  34  35  36  37
## [18]  46  47  52  53  54  58  59  61  63  66  69  70  73  75  78  80  83
## [35]  85  87  91  96  98 100
# publish this?
confint(lm(y ~ X[,interesting_features_train], subset=test))[-1,]
##                                          2.5 %      97.5 %
## X[, interesting_features_train]1  -0.176888392 0.095904972
## X[, interesting_features_train]2  -0.067808045 0.206493314
## X[, interesting_features_train]3  -0.129119678 0.157355086
## X[, interesting_features_train]4   0.043409650 0.316791551
## X[, interesting_features_train]5  -0.127656060 0.151734772
## X[, interesting_features_train]6  -0.097833997 0.190320548
## X[, interesting_features_train]7  -0.125115056 0.152009075
## X[, interesting_features_train]8  -0.108613262 0.181516865
## X[, interesting_features_train]9  -0.205486415 0.073064216
## X[, interesting_features_train]10 -0.166316178 0.125287400
## X[, interesting_features_train]11 -0.099492671 0.173449817
## X[, interesting_features_train]12 -0.146830468 0.140499848
## X[, interesting_features_train]13 -0.059709656 0.223870212
## X[, interesting_features_train]14 -0.208018437 0.077280011
## X[, interesting_features_train]15 -0.155336405 0.147398781
## X[, interesting_features_train]16 -0.271185834 0.006001601
## X[, interesting_features_train]17 -0.226568395 0.040258694
## X[, interesting_features_train]18 -0.087402111 0.180298799
## X[, interesting_features_train]19 -0.227014421 0.040702382
## X[, interesting_features_train]20 -0.043512844 0.221811221
## X[, interesting_features_train]21 -0.135436241 0.122398662
## X[, interesting_features_train]22 -0.076496690 0.194795728
## X[, interesting_features_train]23 -0.112629537 0.160590765
## X[, interesting_features_train]24 -0.013667991 0.262754717
## X[, interesting_features_train]25 -0.098221085 0.187082754
## X[, interesting_features_train]26 -0.158577403 0.106630717
## X[, interesting_features_train]27 -0.128189404 0.159698822
## X[, interesting_features_train]28 -0.155836548 0.122465899
## X[, interesting_features_train]29 -0.181898830 0.095007694
## X[, interesting_features_train]30 -0.267139422 0.015324643
## X[, interesting_features_train]31  0.054168268 0.340460622
## X[, interesting_features_train]32 -0.121854709 0.162504021
## X[, interesting_features_train]33 -0.113851202 0.144323265
## X[, interesting_features_train]34 -0.204572534 0.084396327
## X[, interesting_features_train]35 -0.121602240 0.163892733
## X[, interesting_features_train]36 -0.004123234 0.252562041
## X[, interesting_features_train]37 -0.143859313 0.122233435
## X[, interesting_features_train]38 -0.134951838 0.150227919
## X[, interesting_features_train]39 -0.141267417 0.132107945
## X[, interesting_features_train]40 -0.237839828 0.038029446

Multiple comparisons

Large scale inference

p = ncol(X)
Z = rep(NA, p)
# marginal t-statistic for each feature separately
for (i in 1:p) {
    Z[i] = summary(lm(y ~ X[,i]))$coef[2,3]
}
plot(density(Z), ylim=c(0,0.4))
zval = seq(-5,5,length=101)
lines(zval, dnorm(zval), col='red', lwd=2)

Multiple comparisons

Large scale inference

library(locfdr)
locfdrZ = locfdr(Z)
## Warning in locfdr(Z): CM estimation failed, middle of histogram non-normal

BH_select = which(p.adjust(2 * (1 - pnorm(abs(Z))), method='BH') < 0.2)
bonf_select = which(p.adjust(2 * (1 - pnorm(abs(Z))), method='bonferroni') < 0.2)
print(BH_select)
## [1]  8 75 83
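Note that `bonf_select` is computed above but never printed. As a general fact, the BH selection always contains the Bonferroni selection at the same cutoff, because the BH-adjusted p-value is never larger than the Bonferroni-adjusted one. A self-contained illustration on synthetic p-values (not the slide's data):

```r
# BH is less conservative than Bonferroni: at any cutoff, the BH selection
# contains the Bonferroni selection. Synthetic p-values, not the slide's data.
set.seed(3)
pvals = c(runif(95), runif(5, 0, 1e-4))  # 95 nulls, 5 strong signals
bh   = which(p.adjust(pvals, method='BH') < 0.2)
bonf = which(p.adjust(pvals, method='bonferroni') < 0.2)
all(bonf %in% bh)   # TRUE
```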

Course

Outline

Modern science

Simple example: Drop the losers (Sampson and Sill)

Modern science

Drop the losers

T = first_stage = list(D1=rnorm(200), D2=rnorm(200),
                       D3=rnorm(200), D4=rnorm(200),
                       D5=rnorm(200), D6=rnorm(200),
                       D7=rnorm(200), D8=rnorm(200),
                       D9=rnorm(200), D10=rnorm(200))
treatment_means = sapply(first_stage, mean)   # mean response in each arm
treatment_order = order(treatment_means, decreasing=TRUE)
candidates = treatment_order[1:2]             # keep the two best-looking arms
candidates
## [1] 7 4

Modern science

Simple example: Drop the losers (Sampson and Sill)

T_prime = second_stage = list(D7=rnorm(200), D4=rnorm(200))
print(t.test(c(second_stage$D7), conf.level=1-0.05)$conf.int)
## [1] -0.06839504  0.20088615
## attr(,"conf.level")
## [1] 0.95
print(t.test(c(second_stage$D4), conf.level=1-0.05)$conf.int)
## [1] -0.1853057  0.1017236
## attr(,"conf.level")
## [1] 0.95
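The second-stage intervals above are valid because the second-stage data is independent of the selection. Pooling both stages with no correction, by contrast, under-covers: the winning arm's first-stage mean is biased upward by the selection. A simulation sketch of this point, on fresh simulated data rather than the arms above:

```r
# Under the global null, the naive 95% interval that pools both stages for
# the winning arm inherits the selection bias and under-covers zero.
set.seed(2)
covered = replicate(2000, {
  stage1 = matrix(rnorm(200 * 10), 200, 10)   # 10 arms, 200 obs each
  best = which.max(colMeans(stage1))          # pick the best-looking arm
  pooled = c(stage1[, best], rnorm(200))      # add its second-stage data
  ci = t.test(pooled)$conf.int
  ci[1] <= 0 && 0 <= ci[2]
})
mean(covered)   # well below the nominal 0.95
```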

Modern science

Conflict between confirmatory and exploratory

Modern science

Reproducibility and replicability

Modern science

Classical inference

Modern science

Challenge for selective inference

Statistical objects

  1. point estimators

  2. confidence intervals

  3. posterior distributions

Scientists, when struck by anything, are not struck with null hypotheses…

From workflows to statistics

Example: Bonferroni

From workflows to statistics

Simultaneous: Bonferroni

print(t.test(c(first_stage$D7, second_stage$D7), conf.level=1-0.05/10)$conf.int)
## [1] -0.03792903  0.23805049
## attr(,"conf.level")
## [1] 0.995
print(t.test(c(first_stage$D4, second_stage$D4), conf.level=1-0.05/10)$conf.int)
## [1] -0.1444023  0.1505564
## attr(,"conf.level")
## [1] 0.995

From workflows to statistics

Conditional: classical with Bonferroni

print(t.test(c(second_stage$D7), conf.level=1-0.05/2)$conf.int)
## [1] -0.08795882  0.22044994
## attr(,"conf.level")
## [1] 0.975
print(t.test(c(second_stage$D4), conf.level=1-0.05/2)$conf.int)
## [1] -0.2061589  0.1225768
## attr(,"conf.level")
## [1] 0.975

From workflows to statistics

Back of the envelope comparison

print(list(bonferroni=qnorm(1-0.05/(2*10)), classical=qnorm(1-0.05/(2*2))*sqrt(2))) # 10 arms; splitting pays sqrt(2), Bonferroni over the 2 selected
## $bonferroni
## [1] 2.807034
## 
## $classical
## [1] 3.169822
print(list(bonferroni=qnorm(1-0.05/(2*40)), classical=qnorm(1-0.05/(2*2))*sqrt(2))) # with 40 arms, simultaneous is now wider
## $bonferroni
## [1] 3.227218
## 
## $classical
## [1] 3.169822
print(list(bonferroni=qnorm(1-0.05/(2*10)), classical=qnorm(1-0.05/2)*sqrt(2))) # splitting with no correction over the selected
## $bonferroni
## [1] 2.807034
## 
## $classical
## [1] 2.771808
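In symbols, the printed numbers compare confidence-interval half-widths in units of \(\sigma/\sqrt{n}\): simultaneous Bonferroni over all \(K\) arms against data splitting, which pays a \(\sqrt{2}\) factor for discarding half the data but only corrects over the \(k\) selected arms (an interpretive sketch of the computation above, not a formula from the slides):

\[
\underbrace{z_{1-\alpha/(2K)}}_{\text{Bonferroni over all } K \text{ arms}}
\qquad \text{vs} \qquad
\underbrace{\sqrt{2}\, z_{1-\alpha/(2k)}}_{\text{splitting, corrected over the } k \text{ selected}}
\]

With \(K=10\), \(k=2\) the simultaneous interval is narrower (2.81 vs 3.17); with \(K=40\) it is wider (3.23 vs 3.17).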

From workflows to statistics

Regression example

Bonferroni?

From workflows to statistics

Data splitting

Conditional inference

Classical inference

Conditional inference

Drop the losers

Conditional inference

The (classical) scientific method is inadmissible!

Conditional inference

Classical scientific method

Conditional inference

General approach

Conditional inference

What should we condition on?

Conditional inference

Choice of \({\cal M}\)

confint(lm(y ~ X))[-1,][interesting_features,]
##              2.5 %       97.5 %
## X2   -0.1616404680  0.024749262
## X4   -0.0204230882  0.168689551
## X5   -0.0556223871  0.129492952
## X8    0.0355275128  0.220338426
## X12  -0.0423558821  0.141731525
## X13  -0.0426952021  0.143103270
## X15  -0.1706528788  0.015595744
## X18  -0.0378114079  0.145547615
## X20  -0.1370011432  0.044158381
## X21  -0.0007036447  0.186546336
## X24  -0.0221560022  0.170244284
## X26  -0.0179344367  0.165311365
## X27  -0.1506676792  0.031240039
## X29  -0.0434491740  0.150770465
## X33  -0.0095953969  0.173100259
## X34  -0.1892108442 -0.005328268
## X39  -0.1084096858  0.073399864
## X40  -0.1533432574  0.033077796
## X47  -0.1973497473 -0.013104382
## X48  -0.1638190571  0.019842431
## X52  -0.0207974683  0.162661017
## X54  -0.0220168882  0.161524815
## X56  -0.0410425027  0.145646701
## X57  -0.0570684814  0.126227891
## X58  -0.0627383985  0.122373870
## X60  -0.0133046349  0.171560957
## X61  -0.1737119144  0.010669169
## X65  -0.1418024309  0.042776546
## X66  -0.1332307367  0.052068724
## X73  -0.1787070093  0.005354464
## X74  -0.1682989641  0.017257623
## X75   0.0766551805  0.257676063
## X76  -0.0532339871  0.140339787
## X78  -0.0610251725  0.124040924
## X79  -0.0414717948  0.143714331
## X80  -0.1322885226  0.050390804
## X83  -0.2325298441 -0.046958139
## X84  -0.1670509076  0.014208098
## X85  -0.1198502220  0.058091444
## X87   0.0335590952  0.219650453
## X90  -0.1450235342  0.037292465
## X91  -0.1570719312  0.028110668
## X92  -0.1299231174  0.051292739
## X94  -0.0336919902  0.146026743
## X95  -0.1395153360  0.040032891
## X96  -0.0126469605  0.169566314
## X98  -0.0068351396  0.171314613
## X99  -0.0284372199  0.155062343
## X100 -0.1789870458  0.002695492

Conditional inference

Choice of \({\cal M}\)

confint(lm(y ~ X[,interesting_features]))[-1,]
##                                    2.5 %        97.5 %
## X[, interesting_features]1  -0.134666925  0.0296184179
## X[, interesting_features]2  -0.005672499  0.1613077599
## X[, interesting_features]3  -0.053930161  0.1109551781
## X[, interesting_features]4   0.044349186  0.2076825410
## X[, interesting_features]5  -0.027926685  0.1365118283
## X[, interesting_features]6  -0.032769667  0.1316021200
## X[, interesting_features]7  -0.158303438  0.0072563316
## X[, interesting_features]8  -0.024253365  0.1418585076
## X[, interesting_features]9  -0.121432787  0.0417744865
## X[, interesting_features]10  0.014321121  0.1792561263
## X[, interesting_features]11 -0.005529815  0.1593495228
## X[, interesting_features]12 -0.013091442  0.1517333395
## X[, interesting_features]13 -0.133795013  0.0305297154
## X[, interesting_features]14 -0.036626589  0.1327903094
## X[, interesting_features]15 -0.004901185  0.1590721641
## X[, interesting_features]16 -0.157122744  0.0047186279
## X[, interesting_features]17 -0.098683460  0.0639697418
## X[, interesting_features]18 -0.138232394  0.0260313420
## X[, interesting_features]19 -0.169516536 -0.0045634327
## X[, interesting_features]20 -0.146869516  0.0173152539
## X[, interesting_features]21 -0.010776887  0.1511154896
## X[, interesting_features]22 -0.028609014  0.1356236522
## X[, interesting_features]23 -0.025354318  0.1406934766
## X[, interesting_features]24 -0.044578207  0.1197330768
## X[, interesting_features]25 -0.045746324  0.1183011239
## X[, interesting_features]26 -0.011813679  0.1541610998
## X[, interesting_features]27 -0.152104784  0.0133121940
## X[, interesting_features]28 -0.130852270  0.0352438034
## X[, interesting_features]29 -0.129883488  0.0369461130
## X[, interesting_features]30 -0.178834263 -0.0124556931
## X[, interesting_features]31 -0.164157792  0.0011051734
## X[, interesting_features]32  0.089396198  0.2541155146
## X[, interesting_features]33 -0.036072714  0.1341247449
## X[, interesting_features]34 -0.059571959  0.1062204072
## X[, interesting_features]35 -0.031589104  0.1345085549
## X[, interesting_features]36 -0.128526666  0.0373178142
## X[, interesting_features]37 -0.231997646 -0.0645460622
## X[, interesting_features]38 -0.165737784 -0.0007462745
## X[, interesting_features]39 -0.112658740  0.0496528162
## X[, interesting_features]40  0.027106618  0.1922657835
## X[, interesting_features]41 -0.126489256  0.0370006396
## X[, interesting_features]42 -0.149240623  0.0152159647
## X[, interesting_features]43 -0.117646630  0.0465191464
## X[, interesting_features]44 -0.025995732  0.1381550821
## X[, interesting_features]45 -0.127133786  0.0369194617
## X[, interesting_features]46 -0.006418778  0.1568876267
## X[, interesting_features]47 -0.011286811  0.1513481766
## X[, interesting_features]48 -0.030729088  0.1356614915
## X[, interesting_features]49 -0.167370469 -0.0030653721

Combining conditional and simultaneous

Different \(U\) will have different workflows

pval_marginal = rep(NA, p)
for (i in 1:p) {
    pval_marginal[i] = summary(lm(y ~ X[,i]))$coef[2,4]
}
BH_select_marginal = which(p.adjust(pval_marginal, method='BH') < 0.2)
confint(lm(y ~ X[,BH_select_marginal]))[-1,]
##                                2.5 %      97.5 %
## X[, BH_select_marginal]1  0.06954523  0.23726017
## X[, BH_select_marginal]2  0.06638218  0.23408711
## X[, BH_select_marginal]3 -0.23528794 -0.06770316
# OR
confint(lm(y ~ X))[-1,][BH_select_marginal,]
##           2.5 %      97.5 %
## X8   0.03552751  0.22033843
## X75  0.07665518  0.25767606
## X83 -0.23252984 -0.04695814
# OR
BH_select = which(p.adjust(summary(lm(y ~ X))$coef[,4][-1], method='BH') < 0.2)
confint(lm(y ~ X[,BH_select]))[-1,]
##                       2.5 %      97.5 %
## X[, BH_select]1  0.07730475  0.24423593
## X[, BH_select]2  0.06396168  0.23060713
## X[, BH_select]3 -0.24261213 -0.07577966
## X[, BH_select]4  0.03402342  0.20092145
# OR
confint(lm(y ~ X))[-1,][BH_select,]
##           2.5 %      97.5 %
## X8   0.03552751  0.22033843
## X75  0.07665518  0.25767606
## X83 -0.23252984 -0.04695814
## X87  0.03355910  0.21965045

Combining conditional and simultaneous

Valid intervals

pval_marginal_train = rep(NA, p)
for (i in 1:p) {
    pval_marginal_train[i] = summary(lm(y ~ X[,i], subset=train))$coef[2,4]
}
BH_select_marginal_train = which(p.adjust(pval_marginal_train, method='BH') < 0.2)
confint(lm(y ~ X[,BH_select_marginal_train], subset=test))[-1,]
##       2.5 %      97.5 % 
## -0.19747788  0.07067857
# OR
confint(lm(y ~ X, subset=test))[-1,][BH_select_marginal_train,]
##       2.5 %      97.5 % 
## -0.26593609  0.07141491
# OR
BH_select_train = which(p.adjust(summary(lm(y ~ X, subset=train))$coef[,4][-1], method='BH') < 0.3)
confint(lm(y ~ X[,BH_select_train], subset=test))[-1,]
##                              2.5 %     97.5 %
## X[, BH_select_train]1  -0.10757752 0.16077195
## X[, BH_select_train]2  -0.19988675 0.06581076
## X[, BH_select_train]3  -0.10608767 0.16716446
## X[, BH_select_train]4  -0.04930042 0.21902080
## X[, BH_select_train]5  -0.11423483 0.15507340
## X[, BH_select_train]6   0.06607727 0.33684580
## X[, BH_select_train]7  -0.11567498 0.15233420
## X[, BH_select_train]8  -0.18005117 0.09019904
## X[, BH_select_train]9  -0.11305847 0.16053405
## X[, BH_select_train]10 -0.12887128 0.13209600
# OR
confint(lm(y ~ X, subset=test))[-1,][BH_select_train,]
##           2.5 %     97.5 %
## X18 -0.18944837 0.15316565
## X26 -0.15107276 0.16607232
## X27 -0.21681786 0.10889059
## X33 -0.13059936 0.20194491
## X61 -0.10510713 0.21216939
## X75  0.02024787 0.34420962
## X78 -0.11952105 0.20939033
## X83 -0.26593609 0.07141491
## X96 -0.17421196 0.16571101
## X98 -0.14426862 0.16039904

Course

Final takeaway

Conditional and simultaneous approaches are different, but can provide similar guarantees in the DFQL regime.

Goals (conditional inference)

Goals (simultaneous inference)