"Scientists collect data first and ask questions later." (Candès)
"The idea of a scientist, struck, as if by lightning with a question, is far from the truth." (Tukey)
DFQL: Data First, Questions Later.
\[ \newcommand{\data}{{\cal T}} \]
Query \(Q^U_1\) might be the choice of a tuning parameter
Query \(Q^U_2\) might be a feature selection step
Data \(T'\) might be validation data
Suppose we are modeling phenotype \(y\) as a function of genotype (or other -omic features) \(X\).
Data: \(T=(X,y)\) and \(T'\) might be some holdout set.
\(U\) will build a predictive model for \(y|X\).
## Loaded glmnet 2.0-18
## [1] 500 100
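The data here are simulated; a minimal sketch consistent with the dimensions above (\(n=500\), \(p=100\)) might look as follows. The seed, sparsity, and signal strength are assumptions for illustration, not taken from the original analysis.

```r
# Hypothetical simulation matching dim(X) = 500 x 100; the true
# coefficients and noise level below are assumptions.
set.seed(0)
n = 500
p = 100
X = matrix(rnorm(n * p), n, p)
beta = c(rep(0.2, 5), rep(0, p - 5))   # a few weak signals (assumed)
y = c(X %*% beta + rnorm(n))
dim(X)
```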
# Q^U_1
lam = cv.glmnet(X, y)$lambda.min
# Q^U_2
G = glmnet(X, y)
beta.hat = coef(G, s=lam)[-1]
interesting_features = which(beta.hat != 0)
interesting_features
## [1] 2 4 5 8 12 13 15 18 20 21 24 26 27 29 33 34 39
## [18] 40 47 48 52 54 56 57 58 60 61 65 66 73 74 75 76 78
## [35] 79 80 83 84 85 87 90 91 92 94 95 96 98 99 100
confint(lm(y ~ X[,interesting_features]))[-1,]
## 2.5 % 97.5 %
## X[, interesting_features]1 -0.134666925 0.0296184179
## X[, interesting_features]2 -0.005672499 0.1613077599
## X[, interesting_features]3 -0.053930161 0.1109551781
## X[, interesting_features]4 0.044349186 0.2076825410
## X[, interesting_features]5 -0.027926685 0.1365118283
## X[, interesting_features]6 -0.032769667 0.1316021200
## X[, interesting_features]7 -0.158303438 0.0072563316
## X[, interesting_features]8 -0.024253365 0.1418585076
## X[, interesting_features]9 -0.121432787 0.0417744865
## X[, interesting_features]10 0.014321121 0.1792561263
## X[, interesting_features]11 -0.005529815 0.1593495228
## X[, interesting_features]12 -0.013091442 0.1517333395
## X[, interesting_features]13 -0.133795013 0.0305297154
## X[, interesting_features]14 -0.036626589 0.1327903094
## X[, interesting_features]15 -0.004901185 0.1590721641
## X[, interesting_features]16 -0.157122744 0.0047186279
## X[, interesting_features]17 -0.098683460 0.0639697418
## X[, interesting_features]18 -0.138232394 0.0260313420
## X[, interesting_features]19 -0.169516536 -0.0045634327
## X[, interesting_features]20 -0.146869516 0.0173152539
## X[, interesting_features]21 -0.010776887 0.1511154896
## X[, interesting_features]22 -0.028609014 0.1356236522
## X[, interesting_features]23 -0.025354318 0.1406934766
## X[, interesting_features]24 -0.044578207 0.1197330768
## X[, interesting_features]25 -0.045746324 0.1183011239
## X[, interesting_features]26 -0.011813679 0.1541610998
## X[, interesting_features]27 -0.152104784 0.0133121940
## X[, interesting_features]28 -0.130852270 0.0352438034
## X[, interesting_features]29 -0.129883488 0.0369461130
## X[, interesting_features]30 -0.178834263 -0.0124556931
## X[, interesting_features]31 -0.164157792 0.0011051734
## X[, interesting_features]32 0.089396198 0.2541155146
## X[, interesting_features]33 -0.036072714 0.1341247449
## X[, interesting_features]34 -0.059571959 0.1062204072
## X[, interesting_features]35 -0.031589104 0.1345085549
## X[, interesting_features]36 -0.128526666 0.0373178142
## X[, interesting_features]37 -0.231997646 -0.0645460622
## X[, interesting_features]38 -0.165737784 -0.0007462745
## X[, interesting_features]39 -0.112658740 0.0496528162
## X[, interesting_features]40 0.027106618 0.1922657835
## X[, interesting_features]41 -0.126489256 0.0370006396
## X[, interesting_features]42 -0.149240623 0.0152159647
## X[, interesting_features]43 -0.117646630 0.0465191464
## X[, interesting_features]44 -0.025995732 0.1381550821
## X[, interesting_features]45 -0.127133786 0.0369194617
## X[, interesting_features]46 -0.006418778 0.1568876267
## X[, interesting_features]47 -0.011286811 0.1513481766
## X[, interesting_features]48 -0.030729088 0.1356614915
## X[, interesting_features]49 -0.167370469 -0.0030653721
Do these confidence intervals mean anything?
Do such intervals have any (desirable) properties?
If they were to cover something, what would they cover?
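The intervals above come from a select-then-refit pattern: fit least squares on the selected columns and call confint as if no selection had happened. A self-contained sketch of that pattern (a marginal screen stands in for the lasso step so this runs without glmnet; all data-generating choices are assumptions):

```r
# Naive post-selection intervals: select, refit, report confint.
# The marginal correlation screen is a stand-in for the lasso selection.
set.seed(1)
n = 500
p = 100
X = matrix(rnorm(n * p), n, p)
y = c(X %*% c(rep(0.2, 5), rep(0, p - 5)) + rnorm(n))
interesting = order(abs(cor(X, y)), decreasing=TRUE)[1:10]  # screened features
naive_fit = lm(y ~ X[,interesting])                          # refit on selected
ci = confint(naive_fit)[-1,]                                 # drop intercept row
head(ci)
```

The selection step and the refit use the same data, which is exactly the DFQL pattern being questioned.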
test = 1:(n/2)
train = (n/2+1):n
library(glmnet)
# Q^U_1
lam = cv.glmnet(X[train,], y[train])$lambda.min
# Q^U_2
G = glmnet(X[train,], y[train])
beta.hat = coef(G, s=lam)[-1]
interesting_features_train = which(beta.hat != 0)
print(interesting_features_train)
## [1] 2 4 7 8 11 12 13 18 26 27 29 30 33 34 35 36 37
## [18] 46 47 52 53 54 58 59 61 63 66 69 70 73 75 78 80 83
## [35] 85 87 91 96 98 100
confint(lm(y ~ X[,interesting_features_train], subset=test))[-1,]
## 2.5 % 97.5 %
## X[, interesting_features_train]1 -0.176888392 0.095904972
## X[, interesting_features_train]2 -0.067808045 0.206493314
## X[, interesting_features_train]3 -0.129119678 0.157355086
## X[, interesting_features_train]4 0.043409650 0.316791551
## X[, interesting_features_train]5 -0.127656060 0.151734772
## X[, interesting_features_train]6 -0.097833997 0.190320548
## X[, interesting_features_train]7 -0.125115056 0.152009075
## X[, interesting_features_train]8 -0.108613262 0.181516865
## X[, interesting_features_train]9 -0.205486415 0.073064216
## X[, interesting_features_train]10 -0.166316178 0.125287400
## X[, interesting_features_train]11 -0.099492671 0.173449817
## X[, interesting_features_train]12 -0.146830468 0.140499848
## X[, interesting_features_train]13 -0.059709656 0.223870212
## X[, interesting_features_train]14 -0.208018437 0.077280011
## X[, interesting_features_train]15 -0.155336405 0.147398781
## X[, interesting_features_train]16 -0.271185834 0.006001601
## X[, interesting_features_train]17 -0.226568395 0.040258694
## X[, interesting_features_train]18 -0.087402111 0.180298799
## X[, interesting_features_train]19 -0.227014421 0.040702382
## X[, interesting_features_train]20 -0.043512844 0.221811221
## X[, interesting_features_train]21 -0.135436241 0.122398662
## X[, interesting_features_train]22 -0.076496690 0.194795728
## X[, interesting_features_train]23 -0.112629537 0.160590765
## X[, interesting_features_train]24 -0.013667991 0.262754717
## X[, interesting_features_train]25 -0.098221085 0.187082754
## X[, interesting_features_train]26 -0.158577403 0.106630717
## X[, interesting_features_train]27 -0.128189404 0.159698822
## X[, interesting_features_train]28 -0.155836548 0.122465899
## X[, interesting_features_train]29 -0.181898830 0.095007694
## X[, interesting_features_train]30 -0.267139422 0.015324643
## X[, interesting_features_train]31 0.054168268 0.340460622
## X[, interesting_features_train]32 -0.121854709 0.162504021
## X[, interesting_features_train]33 -0.113851202 0.144323265
## X[, interesting_features_train]34 -0.204572534 0.084396327
## X[, interesting_features_train]35 -0.121602240 0.163892733
## X[, interesting_features_train]36 -0.004123234 0.252562041
## X[, interesting_features_train]37 -0.143859313 0.122233435
## X[, interesting_features_train]38 -0.134951838 0.150227919
## X[, interesting_features_train]39 -0.141267417 0.132107945
## X[, interesting_features_train]40 -0.237839828 0.038029446
The selected interesting_features differ across the splits; the quality of selection after splitting is likely lower than with no split.
Can we provide better (valid) intervals?
In the early 2000s, much attention focused on large-scale inference
In our current data setup \(U\) will focus on marginal associations
Z = rep(NA, p)
for (i in 1:p) {
  Z[i] = summary(lm(y ~ X[,i]))$coef[2,3]
}
plot(density(Z), ylim=c(0,0.4))
zval = seq(-5,5,length=101)
lines(zval, dnorm(zval), col='red', lwd=2)
## Warning in locfdr(Z): CM estimation failed, middle of histogram non-normal
BH_select = which(p.adjust(2 * (1 - pnorm(abs(Z))), method='BH') < 0.2)
bonf_select = which(p.adjust(2 * (1 - pnorm(abs(Z))), method='bonferroni') < 0.2)
print(BH_select)
## [1] 8 75 83
What about confidence intervals?
What would these confidence intervals cover?
Recognizing DFQL, can we provide valid inference?
Inference here means hypothesis tests, confidence intervals, point estimators, posterior distributions…
What kind of data analyses can we handle (in theory? in practice?)
Two broad approaches: simultaneous (tied to multiple comparisons) and conditional.
The two approaches are not mutually exclusive.
Classical scientific method is a form of conditional inference.
\(T\) denotes data from \(K=10\) different treatments in a clinical trial.
\(Q^U_1\) asks which treatment (apparently) works best.
\(Q^U_2\) asks which is second best (variant of Sampson and Sill).
Data \(T'\) is confirmatory sample: gold standard reports an estimate of best treatment effect based on \(T'\) alone.
T = first_stage = list(D1=rnorm(200), D2=rnorm(200),
D3=rnorm(200), D4=rnorm(200),
D5=rnorm(200), D6=rnorm(200),
D7=rnorm(200), D8=rnorm(200),
D9=rnorm(200), D10=rnorm(200))
treatment_means = c(mean(T$D1), mean(T$D2),
mean(T$D3), mean(T$D4),
mean(T$D5), mean(T$D6),
mean(T$D7), mean(T$D8),
mean(T$D9), mean(T$D10))
treatment_order = order(treatment_means, decreasing=TRUE)
candidates = treatment_order[1:2]
candidates
## [1] 7 4
T_prime = second_stage = list(D7=rnorm(200), D4=rnorm(200))
print(t.test(second_stage$D7, conf.level=1-0.05)$conf.int)
## [1] -0.06839504 0.20088615
## attr(,"conf.level")
## [1] 0.95
print(t.test(second_stage$D4, conf.level=1-0.05)$conf.int)
## [1] -0.1853057 0.1017236
## attr(,"conf.level")
## [1] 0.95
This is the classic scientific method (and equivalent to data splitting).
Using \(T'\) for confirmation allows quite arbitrary \(Q^U_i\).
Wasteful to only use \(T\) for selection? (common criticism of data splitting)
Can we reuse \(T\)?
The unadjusted estimator \((T_k+T'_k)/2\) is biased; the gold-standard estimator \(T'_k\) is unbiased but more variable.
Simple manifestation of DFQL: sometimes referred to as researcher degrees of freedom.
Difference can easily be bigger with larger \(K\), different sample sizes, etc.
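The bias of the unadjusted estimator is easy to see in a small simulation. The settings below are assumptions chosen to mirror the example: \(K=10\) arms of 200 observations each, with all true treatment effects equal to 0.

```r
# Winner's curse: averaging the selected arm's first- and second-stage
# means inherits the selection bias of the first stage.
set.seed(3)
K = 10
n = 200
nsim = 2000
unadjusted = gold = rep(NA, nsim)
for (s in 1:nsim) {
  stage1 = matrix(rnorm(K * n), n, K)   # first stage, all true effects 0
  k = which.max(colMeans(stage1))       # apparently best arm
  stage2 = rnorm(n)                     # confirmatory sample for arm k
  unadjusted[s] = (mean(stage1[,k]) + mean(stage2)) / 2
  gold[s] = mean(stage2)
}
mean(unadjusted)   # noticeably positive: selection bias
mean(gold)         # near 0: unbiased, but based on half the data
```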
Great efforts have been made to make \(U\)’s results reproducible.
For replicability we need to (statistically) understand this collection of random variables.
\(U\)’s exploratory interaction with the data is limited to \(T\).
Earlier data is “wasted”, confirmatory focus on \(T'\).
No selection effect.
Statistical model: e.g. for gold standard \(T'|T \sim F \in {\cal M}\)
Parameter: \(\theta: {\cal M} \rightarrow \mathbb{R}\), e.g. the treatment effect of the "best" treatment.
Language of statistics: for parameter \(\theta\) we have
point estimators
confidence intervals
posterior distributions
In the drop-the-losers example, it is clear a priori what the model is and, correspondingly, what the parameters of interest are.
Not really in the spirit of DFQL.
Model (parametric form): \(T \sim N(\theta, \Sigma), T' \sim N(\theta, \Sigma')\)
Parameters of interest: \(\theta\), with \(\Sigma\) and \(\Sigma'\) diagonal.
Implicitly assumes we could sample from all arms in the second stage, though we only sample the best two arms.
## [1] -0.03792903 0.23805049
## attr(,"conf.level")
## [1] 0.995
## [1] -0.1444023 0.1505564
## attr(,"conf.level")
## [1] 0.995
## [1] -0.08795882 0.22044994
## attr(,"conf.level")
## [1] 0.975
## [1] -0.2061589 0.1225768
## attr(,"conf.level")
## [1] 0.975
Same simultaneity as Bonferroni method.
Only really requires a model for \(T'|T\).
## $bonferroni
## [1] 2.807034
##
## $classical
## [1] 3.169822
## $bonferroni
## [1] 3.227218
##
## $classical
## [1] 3.169822
## $bonferroni
## [1] 2.807034
##
## $classical
## [1] 2.771808
Here \(K=10\) but as soon as \(K > 40\) the classical intervals are shorter (assuming equal size in both \(T\) and \(T'\) and large number of samples in each arm).
When reporting only best treatment, as soon as \(K \approx 10\) the classical intervals are shorter.
There are better simultaneous intervals as well as better conditional intervals. We’ll revisit this example later.
Could work with no split into test, train.
Order \(p 2^p\) contrasts possible in confint(lm(...)).
The Bonferroni threshold \(\approx \sqrt{2 \log m}\), with \(m\) the number of comparisons, yields a threshold of order \(\sqrt{p}\).
Similar to the Scheffé correction for all possible contrasts.
A more precise adjustment is possible via PoSI (Berk et al.), but the worst case is still \(O(\sqrt{p})\).
In terms of interval length, data splitting is certainly attractive here.
Quality of selection is harder to measure but likely better with no splitting.
Closer to DFQL, but this list of contrasts is still pre-specified.
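To put numbers on the \(\sqrt{2\log m}\) claim, take \(p=100\) as in the running example:

```r
# Bonferroni-style threshold over roughly p * 2^p possible contrasts.
p = 100
m = p * 2^p                 # about 1.3e32 comparisons
sqrt(2 * log(m))            # full threshold, approximately 12.2
sqrt(2 * log(2) * p)        # leading term: of order sqrt(p), approximately 11.8
```

The \(\log p\) term is negligible next to \(p \log 2\), which is why the threshold grows like \(\sqrt{p}\) rather than the usual \(\sqrt{2\log p}\) of large-scale inference.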
If interesting_features is selected by \(Q^U_2\) (either on all the data or just train), does that mean that the true model is lm(y ~ X[,interesting_features_train])?
Clearly not, and \(U\) will have some opinions about what a correct model is, knowing that interesting_features_train were chosen.
When using data splitting \(U\) is free to choose a model to publish: supports DFQL setting.
A back-of-the-envelope calculation says we will pay a \(\sqrt{2}\) penalty in interval length (along with a deterioration in selection quality).
Can we mitigate this \(\sqrt{2}\)?
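The \(\sqrt{2}\) comes from standard errors scaling like \(1/\sqrt{n}\): doing inference on half the data multiplies the standard error, and hence the interval length, by \(\sqrt{n/(n/2)} = \sqrt{2}\). A quick check:

```r
# Interval length ratio when inference uses n/2 instead of n observations
# (normal approximation; the noise scale cancels).
n = 500
se_full = 1 / sqrt(n)
se_half = 1 / sqrt(n / 2)
se_half / se_full   # = sqrt(2), about 1.414
```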
Required to collect data \(T'\).
Throwing away \(T\) is formally equivalent to conditioning on \(T\).
Instead of throwing out all of \(T\), condition only on which treatment is apparently best:
Rao–Blackwellization (Cohen and Sackrowitz) \[ \hat{\theta}_{\hat{K}} = E[T'_{k}|(T'+T)_k, (T_j)_{j \neq k}, \hat{K}=k] \]
Tests and confidence intervals also available (Sampson and Sill)
Similar technique can be used when looking at best 2 treatments, rather than just single best treatment.
Specifically allows \(U\)’s intervention – even model \({\cal M}\) is chosen after observing \((Q^U_1, Q^U_2)\)!
\(U\) specifies model \({\cal M}\) for the law \(T'|T\).
\(U\) specifies model \({\cal M}\) for the law \((T',T)\).
Do inference on all data \((T,T')\).
Is this improvement limited to drop the losers? No.
Do we need confirmatory sample \(T'\)? No. We can even have \(T=T'\).
Can we allow arbitrary queries? Probably not.
If the choice of \(\theta:{\cal M} \rightarrow \mathbb{R}\) is affected by \((Q^U_1, Q^U_2)\), it is difficult to interpret \(\theta\) as a parameter unless we fix \((Q^U_1, Q^U_2)\) (or whatever coarser sigma-algebra determined \(\theta\)).
Note this is a choice of both \({\cal M}\) and \(\theta: {\cal M} \rightarrow \mathbb{R}\).
## 2.5 % 97.5 %
## X2 -0.1616404680 0.024749262
## X4 -0.0204230882 0.168689551
## X5 -0.0556223871 0.129492952
## X8 0.0355275128 0.220338426
## X12 -0.0423558821 0.141731525
## X13 -0.0426952021 0.143103270
## X15 -0.1706528788 0.015595744
## X18 -0.0378114079 0.145547615
## X20 -0.1370011432 0.044158381
## X21 -0.0007036447 0.186546336
## X24 -0.0221560022 0.170244284
## X26 -0.0179344367 0.165311365
## X27 -0.1506676792 0.031240039
## X29 -0.0434491740 0.150770465
## X33 -0.0095953969 0.173100259
## X34 -0.1892108442 -0.005328268
## X39 -0.1084096858 0.073399864
## X40 -0.1533432574 0.033077796
## X47 -0.1973497473 -0.013104382
## X48 -0.1638190571 0.019842431
## X52 -0.0207974683 0.162661017
## X54 -0.0220168882 0.161524815
## X56 -0.0410425027 0.145646701
## X57 -0.0570684814 0.126227891
## X58 -0.0627383985 0.122373870
## X60 -0.0133046349 0.171560957
## X61 -0.1737119144 0.010669169
## X65 -0.1418024309 0.042776546
## X66 -0.1332307367 0.052068724
## X73 -0.1787070093 0.005354464
## X74 -0.1682989641 0.017257623
## X75 0.0766551805 0.257676063
## X76 -0.0532339871 0.140339787
## X78 -0.0610251725 0.124040924
## X79 -0.0414717948 0.143714331
## X80 -0.1322885226 0.050390804
## X83 -0.2325298441 -0.046958139
## X84 -0.1670509076 0.014208098
## X85 -0.1198502220 0.058091444
## X87 0.0335590952 0.219650453
## X90 -0.1450235342 0.037292465
## X91 -0.1570719312 0.028110668
## X92 -0.1299231174 0.051292739
## X94 -0.0336919902 0.146026743
## X95 -0.1395153360 0.040032891
## X96 -0.0126469605 0.169566314
## X98 -0.0068351396 0.171314613
## X99 -0.0284372199 0.155062343
## X100 -0.1789870458 0.002695492
Model is full model \[ y | X \sim N(X\beta, \sigma^2 I) \]
Choice is which \(\beta\)’s we should report intervals for.
confint(lm(y ~ X[,interesting_features]))[-1,]
## 2.5 % 97.5 %
## X[, interesting_features]1 -0.134666925 0.0296184179
## X[, interesting_features]2 -0.005672499 0.1613077599
## X[, interesting_features]3 -0.053930161 0.1109551781
## X[, interesting_features]4 0.044349186 0.2076825410
## X[, interesting_features]5 -0.027926685 0.1365118283
## X[, interesting_features]6 -0.032769667 0.1316021200
## X[, interesting_features]7 -0.158303438 0.0072563316
## X[, interesting_features]8 -0.024253365 0.1418585076
## X[, interesting_features]9 -0.121432787 0.0417744865
## X[, interesting_features]10 0.014321121 0.1792561263
## X[, interesting_features]11 -0.005529815 0.1593495228
## X[, interesting_features]12 -0.013091442 0.1517333395
## X[, interesting_features]13 -0.133795013 0.0305297154
## X[, interesting_features]14 -0.036626589 0.1327903094
## X[, interesting_features]15 -0.004901185 0.1590721641
## X[, interesting_features]16 -0.157122744 0.0047186279
## X[, interesting_features]17 -0.098683460 0.0639697418
## X[, interesting_features]18 -0.138232394 0.0260313420
## X[, interesting_features]19 -0.169516536 -0.0045634327
## X[, interesting_features]20 -0.146869516 0.0173152539
## X[, interesting_features]21 -0.010776887 0.1511154896
## X[, interesting_features]22 -0.028609014 0.1356236522
## X[, interesting_features]23 -0.025354318 0.1406934766
## X[, interesting_features]24 -0.044578207 0.1197330768
## X[, interesting_features]25 -0.045746324 0.1183011239
## X[, interesting_features]26 -0.011813679 0.1541610998
## X[, interesting_features]27 -0.152104784 0.0133121940
## X[, interesting_features]28 -0.130852270 0.0352438034
## X[, interesting_features]29 -0.129883488 0.0369461130
## X[, interesting_features]30 -0.178834263 -0.0124556931
## X[, interesting_features]31 -0.164157792 0.0011051734
## X[, interesting_features]32 0.089396198 0.2541155146
## X[, interesting_features]33 -0.036072714 0.1341247449
## X[, interesting_features]34 -0.059571959 0.1062204072
## X[, interesting_features]35 -0.031589104 0.1345085549
## X[, interesting_features]36 -0.128526666 0.0373178142
## X[, interesting_features]37 -0.231997646 -0.0645460622
## X[, interesting_features]38 -0.165737784 -0.0007462745
## X[, interesting_features]39 -0.112658740 0.0496528162
## X[, interesting_features]40 0.027106618 0.1922657835
## X[, interesting_features]41 -0.126489256 0.0370006396
## X[, interesting_features]42 -0.149240623 0.0152159647
## X[, interesting_features]43 -0.117646630 0.0465191464
## X[, interesting_features]44 -0.025995732 0.1381550821
## X[, interesting_features]45 -0.127133786 0.0369194617
## X[, interesting_features]46 -0.006418778 0.1568876267
## X[, interesting_features]47 -0.011286811 0.1513481766
## X[, interesting_features]48 -0.030729088 0.1356614915
## X[, interesting_features]49 -0.167370469 -0.0030653721
What is the model here?
Note: even if interesting_features were fixed a priori, these intervals target different quantities than those on the previous slide…
pval_marginal = rep(NA, p)
for (i in 1:p) {
  pval_marginal[i] = summary(lm(y ~ X[,i]))$coef[2,4]
}
BH_select_marginal = which(p.adjust(pval_marginal, method='BH') < 0.2)
confint(lm(y ~ X[,BH_select_marginal]))[-1,]
## 2.5 % 97.5 %
## X[, BH_select_marginal]1 0.06954523 0.23726017
## X[, BH_select_marginal]2 0.06638218 0.23408711
## X[, BH_select_marginal]3 -0.23528794 -0.06770316
## 2.5 % 97.5 %
## X8 0.03552751 0.22033843
## X75 0.07665518 0.25767606
## X83 -0.23252984 -0.04695814
# OR
BH_select = which(p.adjust(summary(lm(y ~ X))$coef[,4][-1], method='BH') < 0.2)
confint(lm(y ~ X[,BH_select]))[-1,]
## 2.5 % 97.5 %
## X[, BH_select]1 0.07730475 0.24423593
## X[, BH_select]2 0.06396168 0.23060713
## X[, BH_select]3 -0.24261213 -0.07577966
## X[, BH_select]4 0.03402342 0.20092145
## 2.5 % 97.5 %
## X8 0.03552751 0.22033843
## X75 0.07665518 0.25767606
## X83 -0.23252984 -0.04695814
## X87 0.03355910 0.21965045
Note that the targets these intervals are trying to cover are different from the target of the large-scale inference setting.
Each BH correction may enjoy a simultaneous guarantee (subject to satisfying its assumptions), but we may still want to construct confidence intervals.
Are these intervals valid? No (even if the lm models are well specified in the population, e.g. \((y,X)\) jointly Gaussian).
Why? DFQL.
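A small simulation makes the failure concrete (the settings below are assumptions: a global null with \(n=100\), \(p=20\)). Naive 95% intervals for BH-selected coefficients should cover the true value 0 about 95% of the time; they cover it far less often.

```r
# Coverage of naive post-BH-selection intervals under a global null.
set.seed(4)
n = 100
p = 20
nsim = 500
covered = total = 0
for (s in 1:nsim) {
  X = matrix(rnorm(n * p), n, p)
  y = rnorm(n)                        # all true coefficients are 0
  pv = sapply(1:p, function(i) summary(lm(y ~ X[,i]))$coef[2,4])
  sel = which(p.adjust(pv, method='BH') < 0.2)
  if (length(sel) == 0) next          # nothing selected this round
  ci = confint(lm(y ~ X[,sel]))[-1, , drop=FALSE]
  covered = covered + sum(ci[,1] < 0 & ci[,2] > 0)
  total = total + nrow(ci)
}
covered / total   # far below the nominal 0.95
```

Every selected coefficient here is a false discovery, and the refit intervals are centered on exactly the noise that caused the selection.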
pval_marginal_train = rep(NA, p)
for (i in 1:p) {
  pval_marginal_train[i] = summary(lm(y ~ X[,i], subset=train))$coef[2,4]
}
BH_select_marginal_train = which(p.adjust(pval_marginal_train, method='BH') < 0.2)
confint(lm(y ~ X[,BH_select_marginal_train], subset=test))[-1,]
## 2.5 % 97.5 %
## -0.19747788 0.07067857
## 2.5 % 97.5 %
## -0.26593609 0.07141491
# OR
BH_select_train = which(p.adjust(summary(lm(y ~ X, subset=train))$coef[,4][-1], method='BH') < 0.3)
confint(lm(y ~ X[,BH_select_train], subset=test))[-1,]
## 2.5 % 97.5 %
## X[, BH_select_train]1 -0.10757752 0.16077195
## X[, BH_select_train]2 -0.19988675 0.06581076
## X[, BH_select_train]3 -0.10608767 0.16716446
## X[, BH_select_train]4 -0.04930042 0.21902080
## X[, BH_select_train]5 -0.11423483 0.15507340
## X[, BH_select_train]6 0.06607727 0.33684580
## X[, BH_select_train]7 -0.11567498 0.15233420
## X[, BH_select_train]8 -0.18005117 0.09019904
## X[, BH_select_train]9 -0.11305847 0.16053405
## X[, BH_select_train]10 -0.12887128 0.13209600
## 2.5 % 97.5 %
## X18 -0.18944837 0.15316565
## X26 -0.15107276 0.16607232
## X27 -0.21681786 0.10889059
## X33 -0.13059936 0.20194491
## X61 -0.10510713 0.21216939
## X75 0.02024787 0.34420962
## X78 -0.11952105 0.20939033
## X83 -0.26593609 0.07141491
## X96 -0.17421196 0.16571101
## X98 -0.14426862 0.16039904
Note that the targets these intervals are trying to cover are different from the target of the large-scale inference setting.
Each BH correction may enjoy a simultaneous guarantee (subject to satisfying its assumptions), but we may still want to construct confidence intervals.
The conditional and simultaneous approaches are different, but can provide similar guarantees under DFQL.
Provide \(U\) tools for valid inference (e.g. hypothesis tests, confidence intervals, point estimators, posterior distributions) under DFQL.
Admit as many data analysis workflows as we can subject to computational and theoretical constraints.
In the Monday lectures, we will build up from the simple one-sample problem to the selection algorithms presented today.
The BH paper, which introduced the False Discovery Rate (FDR) and the corresponding procedure (and its many descendants), has been hugely influential in the early 21st century.
Much recent work has extended the settings under which FDR control (or something close to it) can be achieved.
There has been less focus on the usual statistical objects (e.g. hypothesis tests, confidence intervals, point estimators, posterior distributions) and on DFQL than on FDR control in the particular multiple-comparisons setting.
Many of these methods will be discussed in our Wednesday discussion sessions.