Models for matched pairs (11.1)

vote_data = matrix(c(175, 16, 54, 188), 2, 2, byrow=TRUE) # Table 11.1
colnames(vote_data) = c('D_2008', 'R_2008')
rownames(vote_data) = c('D_2004', 'R_2004')
vote_data
       D_2008 R_2008
D_2004    175     16
R_2004     54    188

Q: has party affiliation shifted?

  • Null is marginal homogeneity \(H_0: \pi_{1+} = \pi_{+1}\)

  • Difference is \(\delta=\pi_{1+} - \pi_{+1}\)

  • Estimate

\[ \widehat{\delta} = \widehat{\pi}_{1+} - \widehat{\pi}_{+1} \]
  • Standard error (under \(H_0\))

\[ SE(\widehat{\delta}) = \sqrt{\frac{N_{12} + N_{21}}{N_{++}^2}} \]

Cochran-Mantel-Haenszel test (6.4)

  • Each subject produces a 2x2 table with rows being years, columns being political parties.

  • E.g. a Bush supporter in 2004 who switched to Obama in 2008 would be represented as

voter = matrix(c(0,1,1,0), 2, 2)
rownames(voter) = c(2004, 2008)
colnames(voter) = c('D', 'R')
voter
     D R
2004 0 1
2008 1 0
  • We can thus form a \(2 \times 2 \times 433\) table representing each voter’s data.

  • Marginal homogeneity \(\equiv\) conditional independence in the \(2 \times 2 \times 433\) table.


Combining multiple \(2 \times 2\) tables

  • Let \(N\) be a \(2 \times 2 \times K\) table.

  • \(N_{11k} \vert \text{marginals of } N[:,:,k] \sim \text{Hypergeometric}(\dots)\)

  • Moments:

\[\begin{split} \begin{aligned} \mu_{11k} &= E[N_{11k} \vert \text{marginals of } N[:,:,k]] \\ &= \frac{N_{1+k}N_{+1k}}{N_{++k}} \\ \sigma^2_{11k} &= \text{Var}(N_{11k} \vert \text{marginals of } N[:,:,k]) \\ &= \frac{N_{1+k}N_{2+k}N_{+1k}N_{+2k}}{N^2_{++k}(N_{++k}-1)} \end{aligned} \end{split}\]
  • Cochran’s variation has slightly different variance…


Test statistic

\[ CMH = \frac{[\sum_k (N_{11k} - \mu_{11k})]^2}{\sum_k \sigma^2_{11k}} \]

Test of conditional independence

  • Under \(H_0\) of conditional independence within each table

\[ CMH \overset{d}{\to} \chi^2_1 \quad \text{as } K \to \infty \]
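As a sanity check, the statistic can be computed directly from the moments above and compared with `stats::mantelhaen.test`; a minimal sketch on a simulated \(2 \times 2 \times K\) array (all counts below are made up):

```r
# CMH statistic from its definition on a generic 2x2xK array,
# checked against stats::mantelhaen.test (uncorrected).
set.seed(3)
K = 20
N = array(rpois(2 * 2 * K, 10) + 1, c(2, 2, K))   # simulated strata counts
n11  = N[1, 1, ]
row1 = N[1, 1, ] + N[1, 2, ]; row2 = N[2, 1, ] + N[2, 2, ]
col1 = N[1, 1, ] + N[2, 1, ]; col2 = N[1, 2, ] + N[2, 2, ]
tot  = row1 + row2
mu   = row1 * col1 / tot                          # hypergeometric mean
sig2 = row1 * row2 * col1 * col2 / (tot^2 * (tot - 1))  # hypergeometric variance
CMH = sum(n11 - mu)^2 / sum(sig2)
CMH - unname(mantelhaen.test(N, correct=FALSE)$statistic)  # essentially 0
```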

McNemar’s test

  • Cochran-Mantel-Haenszel with one stratum per subject (433 subjects) on the \(2 \times 2 \times 433\) table.

  • \(X\): whether vote is Republican or Democrat

  • \(Y\): year of vote

  • \(Z\): subject label

  • Applying CMH: strata where both votes are R or both are D contribute no variation, so only discordant pairs contribute.
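This equivalence can be checked numerically; a sketch in which the per-voter array construction is my own and only the four counts come from Table 11.1:

```r
# Build the 2x2x433 array: one 2x2 table per voter (rows = year, cols = party).
counts = c(DD=175, DR=16, RD=54, RR=188)  # (2004 vote, 2008 vote), Table 11.1
cell = list(DD=matrix(c(1,1,0,0), 2, 2),  # D in both years
            DR=matrix(c(1,0,0,1), 2, 2),  # D in 2004, R in 2008
            RD=matrix(c(0,1,1,0), 2, 2),  # R in 2004, D in 2008
            RR=matrix(c(0,0,1,1), 2, 2))  # R in both years
strata = array(unlist(rep(cell, counts)), c(2, 2, sum(counts)))
cmh = mantelhaen.test(strata, correct=FALSE)
mcn = mcnemar.test(matrix(c(175, 16, 54, 188), 2, 2, byrow=TRUE), correct=FALSE)
c(cmh$statistic, mcn$statistic)  # both equal (54 - 16)^2 / (54 + 16)
```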

pi_2008 = (175 + 54) / (175 + 16 + 54 + 188) # total is 433
pi_2004 = (175 + 16) / (175 + 16 + 54 + 188)
delta = pi_2004 - pi_2008
SE = sqrt(54 + 16) / (175 + 16 + 54 + 188)
Z = delta / SE
Z
-4.5418687154707

  • We’ve observed 70 discordant pairs (people who changed votes).

  • Under \(H_0\), \(N_{12} | N_{12} + N_{21} \sim \text{Binomial}(N_{12} + N_{21}, 1/2)\)

  • Exact p-value can be computed using e.g. pbinom.

pbinom(16, 54 + 16, 0.5)
Z_mcnemar = (16 - 0.5 * (54 + 16)) / sqrt(0.5 * 0.5 * (54+16))
Z_mcnemar
c(1 - pchisq(Z_mcnemar^2, 1), 2 * pbinom(16, 54 + 16, 0.5))
[1] 2.92697776469405e-06
[1] -4.5418687154707
[1] 5.5757761732167e-06 5.8539555293881e-06

Conditional logistic regression (11.2)

  • Data can be represented as \(Y_i=(Y_{i,1}, Y_{i,2}), 1 \leq i \leq 433\) where \(Y_{i,1}\) are votes in 2004 and \(Y_{i,2}\) are votes in 2008 (with, say, \(Y_{i,1}=1\) if vote is D).

  • Might model these as

\[\begin{split} \text{logit}(P(Y_{i,j}=1)) = \begin{cases} \alpha & j=1 \\ \alpha + \beta & j=2 \end{cases} \end{split}\]
  • This is a marginal model.


Subject specific model

  • Might model these as

\[\begin{split} \text{logit}(P(Y_{i,j}=1)) = \begin{cases} \alpha_i & j=1 \\ \alpha_i + \beta & j=2 \end{cases} \end{split}\]

  • This has 434 parameters!

  • Can’t expect this to be easy to fit, or standard MLE theory to apply.

  • Eliminate \(\alpha_i\)’s by conditioning – this yields a conditional likelihood similar to those we saw when discussing pseudolikelihood.


  • Full likelihood is proportional to

\[\begin{split} \begin{aligned} L(\alpha_1, \dots, \alpha_{433}, \beta) &\propto \exp \biggl(\biggl(\sum_i \alpha_i (Y_{i,1}+Y_{i,2})\biggr) \\ & \qquad + \beta \biggl(\sum_i Y_{i,2}\biggr)\biggr) \end{aligned} \end{split}\]
  • Sufficient statistic for \(\alpha_i\) is \(Y_{i,1}+Y_{i,2}\)

  • Conditioning on \((Y_{i,1}+Y_{i,2})_{1 \leq i \leq n}\) eliminates \(\alpha_i\) from the likelihood.


  • No information when \(Y_{i,1}+Y_{i,2}\) is 0 or 2. Focuses again on discordant pairs…

  • For discordant pairs, let

\[\begin{split} Y_k^* = \begin{cases} 1 & Y_{k,2} = 1 \\ 0 & Y_{k,1} = 1 \end{cases} \end{split}\]
  • Conditional likelihood

\[ \prod_k \left[\frac{e^{\beta}}{1 + e^{\beta}}\right]^{Y^*_k} \left[\frac{1}{1 + e^{\beta}}\right]^{1 - Y^*_k} \]

  • This is an intercept-only logistic likelihood (a one-sample binomial likelihood)

\[ \widehat{\beta} = \text{logit}(\widehat{\pi}^*), \qquad \widehat{\pi}^* = \frac{N_{21}}{N_{21} + N_{12}} \]
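Plugging the discordant counts of Table 11.1 into this formula gives a quick numeric check:

```r
# Conditional MLE of beta from the discordant pairs of Table 11.1
N12 = 16   # D in 2004 -> R in 2008
N21 = 54   # R in 2004 -> D in 2008
pi_star = N21 / (N21 + N12)               # \widehat{\pi}^*
beta_hat = log(pi_star / (1 - pi_star))   # logit; equals log(N21 / N12)
beta_hat  # 1.2164
```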


Including covariates

  • Suppose the pairs have \(p\)-dimensional features \((X_{i,1}, X_{i,2})\).

  • Similar conditioning and calculation yields a product over discordant pairs

\[ \prod_k \left[\frac{e^{X_{k,2}^T\beta}}{e^{X_{k,1}^T\beta} + e^{X_{k,2}^T\beta}}\right]^{Y^*_k} \left[\frac{e^{X_{k,1}^T\beta}}{e^{X_{k,1}^T\beta} + e^{X_{k,2}^T\beta}}\right]^{1-Y^*_k} \]

  • A standard logistic likelihood with features \(X_k^* = X_{k,2}-X_{k,1}\).
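A simulated sketch of this reduction (all data below are made up): fitting an intercept-free logistic regression on the within-pair differences recovers \(\beta\).

```r
# Simulate paired data with subject-specific intercepts, then fit the
# conditional likelihood as an intercept-free logistic GLM on differences.
set.seed(1)
n = 500; p = 2; beta = c(1, -0.5)
X1 = matrix(rnorm(n * p), n, p); X2 = matrix(rnorm(n * p), n, p)
alpha = rnorm(n)                       # subject-specific intercepts
p1 = plogis(alpha + X1 %*% beta); p2 = plogis(alpha + X2 %*% beta)
Y1 = rbinom(n, 1, p1); Y2 = rbinom(n, 1, p2)
disc = (Y1 + Y2) == 1                  # keep discordant pairs only
Ystar = Y2[disc]                       # 1 iff the second observation is the 1
Xstar = (X2 - X1)[disc, ]              # difference features
fit = glm(Ystar ~ Xstar - 1, family=binomial)
coef(fit)                              # roughly (1, -0.5)
```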


Square tables (11.4)

  • An \(I \times I\) table often has (essentially) the same row and column categories.

  • There are several models for the cell probabilities

\[ \pi_{ij} \qquad 1 \leq i, j \leq I \]

or, thinking of log-linear model

\[ \mu_{ij} = \exp(\lambda_{ij}) \qquad 1 \leq i, j \leq I \]
  • Marginal homogeneity

\[ \pi_{i+} = \pi_{+i} \qquad 1 \leq i \leq I \]

or

\[ \mu_{i+} = \mu_{+i} \qquad 1 \leq i \leq I \]
  • Marginal homogeneity generally not expressible as log-linear model.


Symmetry

  • Model:

\[ \log \mu_{ij} = \lambda + \lambda_i + \lambda_j + \lambda_{ij} \]

with \(\lambda_{ij} = \lambda_{ji}\)

  • Note: symmetry implies marginal homogeneity (converse not generally true).

  • Model fit

\[ \widehat{\mu}_{ij} = \frac{N_{ij} + N_{ji}}{2} \]
  • Goodness of fit test with \(H_0\) being symmetry

\[ X^2 = \sum_{(i,j):i < j} \frac{(N_{ij} - N_{ji})^2}{N_{ij} + N_{ji}} \]
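As a quick illustration on a made-up \(3 \times 3\) table (not from the text), the symmetry fit and its goodness-of-fit statistic can be computed directly:

```r
# Symmetry model on an illustrative 3x3 table
tab = matrix(c(10,  4,  2,
                6, 20,  3,
                1,  5, 30), 3, 3, byrow=TRUE)
mu_hat = (tab + t(tab)) / 2                    # fitted values (N_ij + N_ji)/2
up = upper.tri(tab)                            # one entry per pair i < j
X2 = sum((tab - t(tab))[up]^2 / (tab + t(tab))[up])
df = sum(up)                                   # I(I-1)/2 pairs
c(X2 = X2, p = pchisq(X2, df, lower.tail=FALSE))  # X2 = 37/30, about 1.23
```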

Quasi-symmetry

  • Model:

\[ \log \mu_{ij} = \lambda + \lambda^R_i + \lambda^C_j + \lambda_{ij} \]

with \(\lambda_{ij} = \lambda_{ji}\) but \(\lambda^R\) not restricted to agree with \(\lambda^C\).
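Quasi-symmetry can be fit as a Poisson GLM by adding a factor for the unordered pair \(\{i, j\}\); a sketch on a made-up \(3 \times 3\) table (the coding below is one standard way, not taken from the notes):

```r
# Quasi-symmetry as a log-linear (Poisson) fit with a symmetric pair factor
tab = matrix(c(10,  4,  2,
                6, 20,  3,
                1,  5, 30), 3, 3, byrow=TRUE)
d = data.frame(N = as.numeric(tab),             # column-major: row index fastest
               R = factor(rep(1:3, 3)),
               C = factor(rep(1:3, each=3)))
i = as.numeric(d$R); j = as.numeric(d$C)
d$pair = factor(paste(pmin(i, j), pmax(i, j)))  # enforces lambda_ij = lambda_ji
qs_fit = glm(N ~ R + C + pair, family=poisson, data=d)
qs_fit$df.residual                              # (I-1)(I-2)/2 = 1 for I = 3
```

The design is rank-deficient (some coefficients are aliased), but the deviance and residual degrees of freedom are still correct.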

Quasi-independence

  • Model:

\[ \log \mu_{ij} = \lambda + \lambda^R_i + \lambda^C_j + \lambda^I_{i} \delta_{ij} \]

  • Exact independence is \(\lambda^I \equiv 0\).

  • A low-dimensional departure from independence.


Agreement between raters

agreement = matrix(c(22, 2, 2, 0,
                      5, 7,14, 0,
                      0, 2,36, 0,
                      0, 1,17,10),
                   4, 4, 
                   byrow=TRUE)
agreement
     [,1] [,2] [,3] [,4]
[1,]   22    2    2    0
[2,]    5    7   14    0
[3,]    0    2   36    0
[4,]    0    1   17   10
  • A quasi-independence model might be reasonable here: raters agree on easy-to-classify objects (the diagonal is higher than independence predicts).

carcinoma = data.frame(N=as.numeric(agreement), # Table 11.8
                       A=factor(c(rep(1,4), rep(2,4), rep(3,4), rep(4,4))),
                       B=factor(rep(1:4,4))) 
indep_fit = glm(N ~ A + B, family=poisson, data=carcinoma)
indep_resid = matrix(resid(indep_fit), 4, 4)
indep_resid # positive on diagonal
           [,1]       [,2]       [,3]      [,4]
[1,]  5.0439297 -0.4140653 -4.2770661 -2.099233
[2,] -0.4002454  2.2177506 -0.3128405 -2.099233
[3,] -4.1701116 -1.0460256  2.6798251 -2.537849
[4,] -3.5796032 -1.2657260  0.1540102  3.676325

departure = factor((carcinoma$A == carcinoma$B) * as.numeric(carcinoma$A))
quasi_indep_fit = glm(N ~ A + B + departure, family=poisson, data=carcinoma)
quasi_indep_resid = matrix(resid(quasi_indep_fit), 4, 4)
quasi_indep_resid  # residuals are 0 on diagonal
              [,1]       [,2]       [,3]          [,4]
[1,]  5.161914e-08  1.1958607 -0.7480157 -6.169760e-05
[2,]  1.486609e+00  0.0000000 -0.6636398 -1.395289e-04
[3,] -1.236669e+00  0.6307957  0.0000000 -7.928944e-05
[4,] -1.932544e+00 -1.3509531  1.0251702  0.000000e+00
anova(indep_fit, quasi_indep_fit)
  Resid. Df Resid. Dev Df Deviance
1         9  117.95686 NA       NA
2         5   13.17806  4 104.7788

Cohen’s \(\kappa\)

  • Definition:

\[ \kappa = \frac{\sum_i \pi_{ii} - \sum_i \pi_{i+} \pi_{+i}}{1 - \sum_i \pi_{i+} \pi_{+i}} \]
  • See textbook for estimate of variance of sample version \(\widehat{\kappa}\)

row_margin = apply(agreement, 1, sum) / sum(agreement)
col_margin = apply(agreement, 2, sum) / sum(agreement)
kappa_hat = (sum(diag(agreement)) / sum(agreement) - sum(row_margin * col_margin)) / (1 - sum(row_margin * col_margin))
kappa_hat
0.493005595523581

Paired preferences (Bradley-Terry model, 11.6)

baseball = matrix(c(NA,12, 6,10,10, 
                     6,NA, 9,11,13,
                    12, 9,NA,12, 9,
                     8, 7, 6,NA,12,
                     8, 5, 9, 6,NA), 
                  5, 5, byrow=TRUE)
rownames(baseball) = c('Boston', 
                       'New York', 
                       'Tampa Bay', 
                       'Toronto', 
                       'Baltimore')
colnames(baseball) = rownames(baseball)
baseball
          Boston New York Tampa Bay Toronto Baltimore
Boston        NA       12         6      10        10
New York       6       NA         9      11        13
Tampa Bay     12        9        NA      12         9
Toronto        8        7         6      NA        12
Baltimore      8        5         9       6        NA

  • Let \(\Pi_{ij}\) denote probability team \(i\) beats team \(j\).

  • Model

\[ \text{logit}(\Pi_{ij}) = \beta_i - \beta_j \]
  • Bradley-Terry model is an example of quasi-symmetry.


Paired preferences

library(BradleyTerry2)
processed_baseball = BradleyTerry2::countsToBinomial(baseball)
processed_baseball
     player1   player2 win1 win2
1     Boston  New York   12    6
2     Boston Tampa Bay    6   12
3     Boston   Toronto   10    8
4     Boston Baltimore   10    8
5   New York Tampa Bay    9    9
6   New York   Toronto   11    7
7   New York Baltimore   13    5
8  Tampa Bay   Toronto   12    6
9  Tampa Bay Baltimore    9    9
10   Toronto Baltimore   12    6

BradleyTerry2::BTm(cbind(win1, win2), 
                   player1=player1, 
                   player2=player2, 
                   data=processed_baseball, refcat='Baltimore')
Bradley Terry model fit by glm.fit 

Call:  BradleyTerry2::BTm(outcome = cbind(win1, win2), player1 = player1, 
    player2 = player2, refcat = "Baltimore", data = processed_baseball)

Coefficients:
   ..Boston   ..New York  ..Tampa Bay    ..Toronto  
     0.4539       0.4991       0.6354       0.2287  

Degrees of Freedom: 10 Total (i.e. Null);  6 Residual
Null Deviance:	    13.18 
Residual Deviance: 7.703 	AIC: 48.66

  • Can break down games into home and away

  • Let \(\Pi^*_{ij}\) denote probability home team \(i\) beats away team \(j\)

  • Model

\[ \text{logit}(\Pi^*_{ij}) = \alpha + \beta_i - \beta_j \]
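The notes don't include the home/away breakdown of the games, so here is a simulated sketch (all numbers are made up) showing that this model is an ordinary binomial GLM with a \(\pm 1\) design for the home and away teams:

```r
# Home-advantage Bradley-Terry model fit as a plain logistic GLM on
# simulated win counts (alpha and beta below are invented for illustration).
set.seed(2)
beta = c(0.45, 0.50, 0.64, 0.23, 0)       # team strengths; 5th is reference
alpha = 0.3                               # home advantage
games = expand.grid(home=1:5, away=1:5)
games = games[games$home != games$away, ] # 20 ordered pairs
p_home = plogis(alpha + beta[games$home] - beta[games$away])
wins = rbinom(nrow(games), 13, p_home)    # 13 games per ordered pair
# Design: +1 for the home team, -1 for the away team, drop reference column
X = matrix(0, nrow(games), 5)
X[cbind(seq_len(nrow(games)), games$home)] = 1
X[cbind(seq_len(nrow(games)), games$away)] = -1
fit = glm(cbind(wins, 13 - wins) ~ X[, 1:4], family=binomial)
coef(fit)  # intercept estimates alpha; slopes estimate beta[1:4]
```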