Brief notes on readings #4

web.stanford.edu/class/stats364/

Jonathan Taylor

Spring 2020

Leeb and Potscher (2006)

Leeb and Potscher (2006)

Main result

Leeb and Potscher (2006)

An implication

Leeb and Potscher (2006)

A simple example

Leeb and Potscher (2006)

Exercise

Consider our selection event \(Z_1 > Z_2\), but now in a pre-asymptotic scenario where the choice is based on \[ (Z_{1,n_1}, Z_{2,n_2}) = \left(n_1^{1/2}\bar{X}_{1,n_1}, n_2^{1/2} \bar{X}_{2,n_2}\right) \] where \(X_1=(X_{1,1}, \dots, X_{1,n_1}) \overset{IID}{\sim} F_1\) and \(X_2 = (X_{2,1}, \dots, X_{2,n_2}) \overset{IID}{\sim} F_2\).

  1. How might you bootstrap this experiment (ignoring selection)?

  2. Suppose we restrict the bootstrap samples so that \(Z_{1,n_1}^* > Z_{2,n_2}^*\) (i.e. the bootstrapped sample mean in arm 1 beats arm 2). What does Leeb and Potscher’s result say about the quantiles of \[ Z_{1,n_1}^* - Z_{1,n_1} | X_1, X_2, Z_{1,n_1}^* > Z_{2,n_2}^*?\]

  3. Compare this to what we could say about the pre-selection quantiles of \[ Z_{1,n_1}^* - Z_{1,n_1} | X_1, X_2.\] Do you think the bootstrap quantile interval will work in the conditional setting?

  4. Describe how you might carry out valid conditional inference in this setting. Try your procedure out using a few choices for \((F_1, F_2)\) with non-normal errors and unknown variance.

  5. Are all bets off if we had paired data? That is, \[ \mathbb{R}^2 \ni W_i \overset{IID}{\sim} F \] and \[ \begin{aligned} Z_{1,n} &= n^{-1/2} \sum_{i=1}^n W_{i,1} \\ Z_{2,n} &= n^{-1/2} \sum_{i=1}^n W_{i,2}. \end{aligned} \]
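
As a starting point for the exercise, here is a minimal NumPy sketch of the resampling schemes it mentions: an independent two-arm bootstrap, the same draws restricted to \(Z_{1,n_1}^* > Z_{2,n_2}^*\), and a paired bootstrap that resamples rows of \(W\) jointly. The function names (`scaled_mean`, `bootstrap_two_arms`, `bootstrap_paired`, `root_quantiles`), the sample sizes, and the choices of \(F_1, F_2\) are illustrative assumptions, not part of the notes. The sketch only computes quantiles of the bootstrap root; it does not settle whether the restricted quantiles are valid, and it is not a procedure for item 4.

```python
import numpy as np


def scaled_mean(x):
    """Return Z = n^{1/2} * (sample mean of x)."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(x.shape[0]) * x.mean()


def bootstrap_two_arms(x1, x2, n_boot=2000, rng=None):
    """Resample each arm independently with replacement.

    Returns two arrays of length n_boot holding Z1* and Z2*.
    """
    rng = np.random.default_rng(rng)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    idx1 = rng.integers(0, n1, size=(n_boot, n1))
    idx2 = rng.integers(0, n2, size=(n_boot, n2))
    z1_star = np.sqrt(n1) * x1[idx1].mean(axis=1)
    z2_star = np.sqrt(n2) * x2[idx2].mean(axis=1)
    return z1_star, z2_star


def bootstrap_paired(w, n_boot=2000, rng=None):
    """Resample rows of an (n, 2) array jointly, keeping each pair intact."""
    rng = np.random.default_rng(rng)
    w = np.asarray(w, dtype=float)
    n = w.shape[0]
    idx = rng.integers(0, n, size=(n_boot, n))
    z_star = np.sqrt(n) * w[idx].mean(axis=1)  # shape (n_boot, 2)
    return z_star[:, 0], z_star[:, 1]


def root_quantiles(z1_star, z2_star, z1_obs, probs=(0.05, 0.95), restrict=False):
    """Quantiles of the root Z1* - Z1.

    With restrict=True, only draws with Z1* > Z2* are kept.
    """
    root = z1_star - z1_obs
    if restrict:
        root = root[z1_star > z2_star]
        if root.size == 0:
            raise ValueError("no bootstrap draws satisfy Z1* > Z2*")
    return np.quantile(root, probs)


# Illustrative non-normal choices for (F1, F2); both have mean zero.
rng = np.random.default_rng(0)
x1 = rng.exponential(size=50) - 1.0
x2 = rng.standard_t(df=5, size=80)

z1s, z2s = bootstrap_two_arms(x1, x2, rng=rng)
print("unrestricted:", root_quantiles(z1s, z2s, scaled_mean(x1)))
print("restricted:  ", root_quantiles(z1s, z2s, scaled_mean(x1), restrict=True))
```

Comparing the two printed quantile pairs across repeated draws of `x1` and `x2` is one way to explore items 2 and 3 empirically. For item 5, `bootstrap_paired` can be swapped in for `bootstrap_two_arms` so that any dependence within a pair carries over to the bootstrap draws.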

Lee et al. (2016)