Brief notes on readings #1

web.stanford.edu/class/stats364/

Jonathan Taylor

Spring 2020

Benjamini (2010)

At the heart of our concern in this example is inference following selection, or as it can be called – selective inference. Once inference is applied only to the selected few, the interpretation of the usual measures of uncertainty does not remain intact, unless properly adjusted.

Benjamini (2010)

On Ioannidis (2005):
To a reader of the paper who is familiar with Multiple Comparisons it is clear that the source of the problems he is discussing is the use of nominal hypothesis testing even though many hypotheses are being tested, both within and across studies, with the effect amplified by publication bias.
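The inflation from nominal testing of many hypotheses can be made concrete with one line of arithmetic: under the global null, with \(m\) independent tests each at level \(\alpha\), the chance of at least one false rejection is \(1 - (1-\alpha)^m\). A minimal sketch (the counts of tests are illustrative, not from the paper):

```python
# Under the global null, with m independent tests each at nominal level alpha,
# P(at least one false rejection) = 1 - (1 - alpha)**m.
alpha = 0.05
for m in (1, 20, 100):
    fwer = 1 - (1 - alpha) ** m
    print(f"m = {m:3d} tests: P(at least one false rejection) = {fwer:.3f}")
```

At \(m = 20\) this is already about 0.64, which is the mechanism behind the "burgeoning literature of false positive claims" quoted below.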

Benjamini (2010)

On the literature:
  • 47/60 papers in NEJM had no multiplicity adjustment but needed it.
  • Lander and Kruglyak (1995): Adopting too lax a standard guarantees a burgeoning literature of false positive linkage claims, each with its own symbol… Scientific disciplines erode their credibility when a substantial proportion of claims cannot be replicated…

Benjamini (2010)

We were always aware that Multiple Comparisons is concerned with the effect of simultaneous and selective inference on the properties of the usual inferences if unadjusted for multiplicities.


The confusion arises because methods that assure simultaneous inference also assure selective inference; within the FWER framework, all methods offer simultaneous inference and therefore answer both concerns.

Benjamini (2010)

On estimation vs testing:

Finally, they searched for the brain region whose correlation with the reported distress was the highest. The reported correlation with activity in anterior cingulate cortex was 0.88.

\(\to\) A poor estimate of the true correlation at that location. What is a better estimate?
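Why the reported 0.88 is a poor estimate can be seen in a short simulation of the winner's curse: even if every region has the same modest true correlation, the maximum of many noisy sample correlations is badly inflated. A hypothetical sketch (region count, sample size, and true correlation are made-up numbers, using the Fisher-z approximation for the sampling distribution of a correlation):

```python
import numpy as np

rng = np.random.default_rng(0)
K, n = 100, 20                 # hypothetical: number of regions, sample size
rho = np.full(K, 0.3)          # same modest true correlation in every region

def sample_corr(rho, n, rng):
    # Fisher z approximation: atanh(r) ~ N(atanh(rho), 1/(n-3))
    z = rng.normal(np.arctanh(rho), 1 / np.sqrt(n - 3))
    return np.tanh(z)

reps = 2000
selected = np.empty(reps)
for i in range(reps):
    r = sample_corr(rho, n, rng)
    selected[i] = r.max()      # report only the winning region's correlation

print(f"true correlation:             {rho[0]:.2f}")
print(f"mean of selected (max) corr.: {selected.mean():.2f}")
```

The selected estimate averages far above 0.3; a better estimate must adjust for having searched over all regions, e.g. by conditioning on the selection.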

Benjamini (2010)

  • Can lead to confusion for users.
  • Choice of error rate may depend on end goal.

Benjamini (2010)

Other issues:
  • Conjunction null \(H_{\mathrm{conj},I} = \bigcup_{i \in I} H_{0,i}\).
  • False Coverage Rate (FCR) for simple selection procedures with independent test statistics.
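The FCR construction for a simple selection rule can be sketched as: select parameters by BH at level \(q\), then report marginal intervals at the adjusted level \(1 - Rq/m\), where \(R\) is the number selected. A toy sketch under assumed Gaussian data (the signal sizes and counts are hypothetical):

```python
import numpy as np
from statistics import NormalDist

N = NormalDist()
rng = np.random.default_rng(1)
m, q = 50, 0.10
theta = np.where(np.arange(m) < 10, 3.0, 0.0)   # hypothetical: 10 real signals
z = rng.normal(theta, 1.0)

# BH step-up selection on two-sided p-values
p = np.array([2 * (1 - N.cdf(abs(v))) for v in z])
order = np.argsort(p)
below = np.nonzero(np.sort(p) <= q * np.arange(1, m + 1) / m)[0]
R = below[-1] + 1 if below.size else 0
selected = order[:R]

# FCR-adjusted marginal intervals at level 1 - R*q/m
if R:
    zcrit = N.inv_cdf(1 - R * q / (2 * m))
    for i in selected:
        print(f"theta_{i}: [{z[i] - zcrit:.2f}, {z[i] + zcrit:.2f}]")
```

The intervals widen as fewer parameters are selected, which is exactly the adjustment that keeps the expected proportion of non-covering intervals among the selected below \(q\).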

Taylor and Tibshirani (2015)

Most statistical analyses involve some kind of “selection”—searching through the data for the strongest associations. Measuring the strength of the resulting associations is a challenging task, because one must account for the effects of the selection.


This statistical problem has become known as selective inference: the assessment of significance and effect sizes from a dataset after mining the same data to find these associations.

Taylor and Tibshirani (2015)

  • Examples are more closely related to statistical learning: assessment after searching through many models.
  • How would you carry out simultaneous inference here?
  • Truncated normal (TN) distribution for polyhedral conditioning events \(\{y : Ay \leq b\}\)
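The simplest polyhedral conditioning event is a one-sided threshold: reporting \(z\) only when \(z > c\) is the event \(\{-z \leq -c\}\), i.e. \(A = [-1]\), \(b = [-c]\). Conditional on selection, the null law of \(z\) is a normal truncated to \((c, \infty)\), so the selective p-value is a ratio of tail probabilities. A minimal sketch (the observed value and threshold are made-up numbers):

```python
from statistics import NormalDist

N = NormalDist()

def selective_pvalue(z, c, mu0=0.0):
    """P-value for H0: mu = mu0, given z was reported only because z > c.

    Under H0 the conditional law of z is N(mu0, 1) truncated to (c, inf),
    so P(Z > z | Z > c) is a ratio of normal tail probabilities.
    """
    num = 1 - N.cdf(z - mu0)
    den = 1 - N.cdf(c - mu0)
    return num / den

z, c = 2.2, 1.96          # hypothetical observed statistic and threshold
print(f"naive p-value:     {1 - N.cdf(z):.3f}")
print(f"selective p-value: {selective_pvalue(z, c):.3f}")
```

The naive p-value is below 0.05, while the selective one is not: conditioning on having crossed the threshold removes the evidence the threshold itself supplied. General polyhedral events yield two-sided truncation limits \([V^-, V^+]\) for \(\eta^\top y\) by the same logic.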

Taylor and Tibshirani (2015)

Why a truncated normal?
  • At first, an accident…
  • Failure of the plugin principle (cf. the impossibility results of Leeb and Pötscher in an upcoming reading).
  • Would bootstrap work?