Models for matched pairs (11.1)

vote_data = matrix(c(175, 16, 54, 188), 2, 2, byrow=TRUE) # Table 11.1
colnames(vote_data) = c('D_2008', 'R_2008')
rownames(vote_data) = c('D_2004', 'R_2004')
vote_data
       D_2008 R_2008
D_2004    175     16
R_2004     54    188

Q: has party affiliation shifted?

  • Null is marginal homogeneity \(H_0: \pi_{1+} = \pi_{+1}\)

  • Difference is \(\delta=\pi_{1+} - \pi_{+1}\)

  • Estimate

\[ \widehat{\delta} = \widehat{\pi}_{1+} - \widehat{\pi}_{+1} \]
  • Standard error (under \(H_0\))

\[ SE(\widehat{\delta}) = \sqrt{\frac{N_{12} + N_{21}}{N_{++}^2}} \]

Cochran-Mantel-Haenszel test (6.4)

  • Each subject produces a 2x2 table with rows being years, columns being political parties.

  • E.g. a Bush supporter in 2004 who switched to Obama in 2008 would be represented as

voter = matrix(c(0,1,1,0), 2, 2)
rownames(voter) = c(2004, 2008)
colnames(voter) = c('D', 'R')
voter
     D R
2004 0 1
2008 1 0
  • We can thus form a \(2 \times 2 \times 433\) table representing each voter’s data.

  • Marginal homogeneity \(\equiv\) conditional independence in the \(2 \times 2 \times 433\) table.


Combining multiple \(2 \times 2\) tables

  • Let \(N\) be a \(2 \times 2 \times K\) table.

  • \(N_{11k} \vert \text{marginals of } N[:,:,k] \sim \text{Hypergeometric}(\dots)\)

  • Moments:

\[\begin{split} \begin{aligned} \mu_{11k} &= E[N_{11k} \vert \text{marginals of } N[:,:,k]] \\ &= \frac{N_{1+k}N_{+1k}}{N_{++k}} \\ \sigma^2_{11k} &= \text{Var}(N_{11k} \vert \text{marginals of } N[:,:,k]) \\ &= \frac{N_{1+k}N_{2+k}N_{+1k}N_{+2k}}{N^2_{++k}(N_{++k}-1)} \end{aligned} \end{split}\]
  • Cochran’s variation has slightly different variance…


Test statistic

\[ CMH = \frac{[\sum_k (N_{11k} - \mu_{11k})]^2}{\sum_k \sigma^2_{11k}} \]

Test of conditional independence

  • Under \(H_0\) of conditional independence within each table

\[ CMH \overset{d}{\to} \chi^2_1 \quad \text{as } K \to \infty \]
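As a sanity check, the statistic can be computed directly from the moments above and compared with `stats::mantelhaen.test`; a minimal sketch on a simulated \(2 \times 2 \times K\) array (all counts below are made up):

```r
# CMH statistic from its definition on a generic 2x2xK array,
# checked against stats::mantelhaen.test (uncorrected).
set.seed(3)
K = 20
N = array(rpois(2 * 2 * K, 10) + 1, c(2, 2, K))   # simulated strata counts
n11  = N[1, 1, ]
row1 = N[1, 1, ] + N[1, 2, ]; row2 = N[2, 1, ] + N[2, 2, ]
col1 = N[1, 1, ] + N[2, 1, ]; col2 = N[1, 2, ] + N[2, 2, ]
tot  = row1 + row2
mu   = row1 * col1 / tot                          # hypergeometric mean
sig2 = row1 * row2 * col1 * col2 / (tot^2 * (tot - 1))  # hypergeometric variance
CMH = sum(n11 - mu)^2 / sum(sig2)
CMH - unname(mantelhaen.test(N, correct=FALSE)$statistic)  # essentially 0
```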

McNemar’s test

  • Cochran-Mantel-Haenszel with one stratum per subject (433 subjects) on the \(2 \times 2 \times 433\) table.

  • \(X\): whether vote is Republican or Democrat

  • \(Y\): year of vote

  • \(Z\): subject label

  • Applying CMH: strata where both votes are R or both are D contribute no variation, so only discordant pairs contribute.
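This equivalence can be checked numerically; a sketch in which the per-voter array construction is my own and only the four counts come from Table 11.1:

```r
# Build the 2x2x433 array: one 2x2 table per voter (rows = year, cols = party).
counts = c(DD=175, DR=16, RD=54, RR=188)  # (2004 vote, 2008 vote), Table 11.1
cell = list(DD=matrix(c(1,1,0,0), 2, 2),  # D in both years
            DR=matrix(c(1,0,0,1), 2, 2),  # D in 2004, R in 2008
            RD=matrix(c(0,1,1,0), 2, 2),  # R in 2004, D in 2008
            RR=matrix(c(0,0,1,1), 2, 2))  # R in both years
strata = array(unlist(rep(cell, counts)), c(2, 2, sum(counts)))
cmh = mantelhaen.test(strata, correct=FALSE)
mcn = mcnemar.test(matrix(c(175, 16, 54, 188), 2, 2, byrow=TRUE), correct=FALSE)
c(cmh$statistic, mcn$statistic)  # both equal (54 - 16)^2 / (54 + 16)
```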

pi_2008 = (175 + 54) / (175 + 16 + 54 + 188) # total is 433
pi_2004 = (175 + 16) / (175 + 16 + 54 + 188)
delta = pi_2004 - pi_2008
SE = sqrt(54 + 16) / (175 + 16 + 54 + 188)
Z = delta / SE
Z
-4.5418687154707

  • We’ve observed 70 discordant pairs (people who changed votes).

  • Under \(H_0\), \(N_{12} | N_{12} + N_{21} \sim \text{Binomial}(N_{12} + N_{21}, 1/2)\)

  • Exact p-value can be computed using e.g. pbinom.

pbinom(16, 54 + 16, 0.5)
Z_mcnemar = (16 - 0.5 * (54 + 16)) / sqrt(0.5 * 0.5 * (54+16))
Z_mcnemar
c(1 - pchisq(Z_mcnemar^2, 1), 2 * pbinom(16, 54 + 16, 0.5))
[1] 2.92697776469405e-06
[1] -4.5418687154707
[1] 5.5757761732167e-06 5.8539555293881e-06

Conditional logistic regression (11.2)

  • Data can be represented as \(Y_i=(Y_{i,1}, Y_{i,2}), 1 \leq i \leq 433\) where \(Y_{i,1}\) are votes in 2004 and \(Y_{i,2}\) are votes in 2008 (with, say, \(Y_{i,1}=1\) if vote is D).

  • Might model these as

\[\begin{split} \text{logit}(P(Y_{i,j}=1)) = \begin{cases} \alpha & j=1 \\ \alpha + \beta & j=2 \end{cases} \end{split}\]
  • This is a marginal model.


Subject specific model

  • Might model these as

\[\begin{split} \text{logit}(P(Y_{i,j}=1)) = \begin{cases} \alpha_i & j=1 \\ \alpha_i + \beta & j=2 \end{cases} \end{split}\]

  • This has 434 parameters!

  • Can’t expect this to be easy to fit, or standard MLE theory to apply.

  • Eliminate \(\alpha_i\)’s by conditioning – this yields a conditional likelihood similar to those we saw when discussing pseudolikelihood.


  • Full likelihood is proportional to

\[\begin{split} \begin{aligned} L(\alpha_1, \dots, \alpha_{433}, \beta) &\propto \exp \biggl(\biggl(\sum_i \alpha_i (Y_{i,1}+Y_{i,2})\biggr) \\ & \qquad + \beta \biggl(\sum_i Y_{i,2}\biggr)\biggr) \end{aligned} \end{split}\]
  • Sufficient statistic for \(\alpha_i\) is \(Y_{i,1}+Y_{i,2}\)

  • Conditioning on \((Y_{i,1}+Y_{i,2})_{1 \leq i \leq n}\) eliminates \(\alpha_i\) from the likelihood.


  • No information when \(Y_{i,1}+Y_{i,2}\) is 0 or 2. Focuses again on discordant pairs…

  • For discordant pairs, let

\[\begin{split} Y_k^* = \begin{cases} 1 & Y_{k,2} = 1 \\ 0 & Y_{k,1} = 1 \end{cases} \end{split}\]
  • Conditional likelihood

\[ \prod_k \left[\frac{e^{\beta}}{1 + e^{\beta}}\right]^{Y^*_k} \left[\frac{1}{1 + e^{\beta}}\right]^{1 - Y^*_k} \]

  • This is an intercept-only logistic likelihood (a one-sample binomial likelihood)

\[ \widehat{\beta} = \text{logit}(\widehat{\pi}^*), \qquad \widehat{\pi}^* = \frac{N_{21}}{N_{21} + N_{12}} \]
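Plugging the discordant counts of Table 11.1 into this formula gives a quick numeric check:

```r
# Conditional MLE of beta from the discordant pairs of Table 11.1
N12 = 16   # D in 2004 -> R in 2008
N21 = 54   # R in 2004 -> D in 2008
pi_star = N21 / (N21 + N12)               # \widehat{\pi}^*
beta_hat = log(pi_star / (1 - pi_star))   # logit; equals log(N21 / N12)
beta_hat  # 1.2164
```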


Including covariates

  • Suppose the pairs have \(p\)-dimensional features \((X_{i,1}, X_{i,2})\).

  • Similar conditioning and calculation yields a product over discordant pairs

\[ \prod_k \left[\frac{e^{X_{k,2}^T\beta}}{e^{X_{k,1}^T\beta} + e^{X_{k,2}^T\beta}}\right]^{Y^*_k} \left[\frac{e^{X_{k,1}^T\beta}}{e^{X_{k,1}^T\beta} + e^{X_{k,2}^T\beta}}\right]^{1-Y^*_k} \]

  • A standard logistic likelihood with features \(X_k^* = X_{k,2}-X_{k,1}\).
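A simulated sketch of this reduction (all data below are made up): fitting an intercept-free logistic regression on the within-pair differences recovers \(\beta\).

```r
# Simulate paired data with subject-specific intercepts, then fit the
# conditional likelihood as an intercept-free logistic GLM on differences.
set.seed(1)
n = 500; p = 2; beta = c(1, -0.5)
X1 = matrix(rnorm(n * p), n, p); X2 = matrix(rnorm(n * p), n, p)
alpha = rnorm(n)                       # subject-specific intercepts
p1 = plogis(alpha + X1 %*% beta); p2 = plogis(alpha + X2 %*% beta)
Y1 = rbinom(n, 1, p1); Y2 = rbinom(n, 1, p2)
disc = (Y1 + Y2) == 1                  # keep discordant pairs only
Ystar = Y2[disc]                       # 1 iff the second observation is the 1
Xstar = (X2 - X1)[disc, ]              # difference features
fit = glm(Ystar ~ Xstar - 1, family=binomial)
coef(fit)                              # roughly (1, -0.5)
```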


Square tables (11.4)

  • An \(I \times I\) table often has (essentially) the same row and column categories.

  • There are several models for the cell probabilities

\[ \pi_{ij} \qquad 1 \leq i, j \leq I \]

or, thinking of log-linear model

\[ \mu_{ij} = \exp(\lambda_{ij}) \qquad 1 \leq i, j \leq I \]
  • Marginal homogeneity

\[ \pi_{i+} = \pi_{+i} \qquad 1 \leq i \leq I \]

or

\[ \mu_{i+} = \mu_{+i} \qquad 1 \leq i \leq I \]
  • Marginal homogeneity generally not expressible as log-linear model.


Symmetry

  • Model:

\[ \log \mu_{ij} = \lambda + \lambda_i + \lambda_j + \lambda_{ij} \]

with \(\lambda_{ij} = \lambda_{ji}\)

  • Note: symmetry implies marginal homogeneity (converse not generally true).

  • Model fit

\[ \widehat{\mu}_{ij} = \frac{N_{ij} + N_{ji}}{2} \]
  • Goodness of fit test with \(H_0\) being symmetry

\[ X^2 = \sum_{(i,j):i < j} \frac{(N_{ij} - N_{ji})^2}{N_{ij} + N_{ji}} \]
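As a quick illustration on a made-up \(3 \times 3\) table (not from the text), the symmetry fit and its goodness-of-fit statistic can be computed directly:

```r
# Symmetry model on an illustrative 3x3 table
tab = matrix(c(10,  4,  2,
                6, 20,  3,
                1,  5, 30), 3, 3, byrow=TRUE)
mu_hat = (tab + t(tab)) / 2                    # fitted values (N_ij + N_ji)/2
up = upper.tri(tab)                            # one entry per pair i < j
X2 = sum((tab - t(tab))[up]^2 / (tab + t(tab))[up])
df = sum(up)                                   # I(I-1)/2 pairs
c(X2 = X2, p = pchisq(X2, df, lower.tail=FALSE))  # X2 = 37/30, about 1.23
```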

Quasi-symmetry

  • Model:

\[ \log \mu_{ij} = \lambda + \lambda^R_i + \lambda^C_j + \lambda_{ij} \]

with \(\lambda_{ij} = \lambda_{ji}\) but \(\lambda^R\) not restricted to agree with \(\lambda^C\).
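Quasi-symmetry can be fit as a Poisson GLM by adding a factor for the unordered pair \(\{i, j\}\); a sketch on a made-up \(3 \times 3\) table (the coding below is one standard way, not taken from the notes):

```r
# Quasi-symmetry as a log-linear (Poisson) fit with a symmetric pair factor
tab = matrix(c(10,  4,  2,
                6, 20,  3,
                1,  5, 30), 3, 3, byrow=TRUE)
d = data.frame(N = as.numeric(tab),             # column-major: row index fastest
               R = factor(rep(1:3, 3)),
               C = factor(rep(1:3, each=3)))
i = as.numeric(d$R); j = as.numeric(d$C)
d$pair = factor(paste(pmin(i, j), pmax(i, j)))  # enforces lambda_ij = lambda_ji
qs_fit = glm(N ~ R + C + pair, family=poisson, data=d)
qs_fit$df.residual                              # (I-1)(I-2)/2 = 1 for I = 3
```

The design is rank-deficient (some coefficients are aliased), but the deviance and residual degrees of freedom are still correct.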

Quasi-independence

  • Model:

\[ \log \mu_{ij} = \lambda + \lambda^R_i + \lambda^C_j + \lambda^I_{i} \delta_{ij} \]

  • Exact independence is \(\lambda^I \equiv 0\).

  • A low-dimensional departure from independence.


Agreement between raters

agreement = matrix(c(22, 2, 2, 0,
                      5, 7,14, 0,
                      0, 2,36, 0,
                      0, 1,17,10),
                   4, 4, 
                   byrow=TRUE)
agreement
     [,1] [,2] [,3] [,4]
[1,]   22    2    2    0
[2,]    5    7   14    0
[3,]    0    2   36    0
[4,]    0    1   17   10
  • A quasi-independence model might be reasonable here: raters agree on easy-to-classify objects (the diagonal is higher than independence predicts).

carcinoma = data.frame(N=as.numeric(agreement), # Table 11.8
                       A=factor(c(rep(1,4), rep(2,4), rep(3,4), rep(4,4))),
                       B=factor(rep(1:4,4))) 
indep_fit = glm(N ~ A + B, family=poisson, data=carcinoma)
indep_resid = matrix(resid(indep_fit), 4, 4)
indep_resid # positive on diagonal
           [,1]       [,2]       [,3]      [,4]
[1,]  5.0439297 -0.4140653 -4.2770661 -2.099233
[2,] -0.4002454  2.2177506 -0.3128405 -2.099233
[3,] -4.1701116 -1.0460256  2.6798251 -2.537849
[4,] -3.5796032 -1.2657260  0.1540102  3.676325

departure = factor((carcinoma$A == carcinoma$B) * as.numeric(carcinoma$A))
quasi_indep_fit = glm(N ~ A + B + departure, family=poisson, data=carcinoma)
quasi_indep_resid = matrix(resid(quasi_indep_fit), 4, 4)
quasi_indep_resid  # residuals are 0 on diagonal
              [,1]       [,2]       [,3]          [,4]
[1,]  5.161914e-08  1.1958607 -0.7480157 -6.169760e-05
[2,]  1.486609e+00  0.0000000 -0.6636398 -1.395289e-04
[3,] -1.236669e+00  0.6307957  0.0000000 -7.928944e-05
[4,] -1.932544e+00 -1.3509531  1.0251702  0.000000e+00
anova(indep_fit, quasi_indep_fit)
  Resid. Df Resid. Dev Df Deviance
1         9  117.95686 NA       NA
2         5   13.17806  4 104.7788

Cohen’s \(\kappa\)

  • Definition:

\[ \kappa = \frac{\sum_i \pi_{ii} - \sum_i \pi_{i+} \pi_{+i}}{1 - \sum_i \pi_{i+} \pi_{+i}} \]
  • See textbook for estimate of variance of sample version \(\widehat{\kappa}\)

row_margin = apply(agreement, 1, sum) / sum(agreement)
col_margin = apply(agreement, 2, sum) / sum(agreement)
kappa_hat = (sum(diag(agreement)) / sum(agreement) - sum(row_margin * col_margin)) / (1 - sum(row_margin * col_margin))
kappa_hat
0.493005595523581

Paired preferences (Bradley-Terry model, 11.6)

baseball = matrix(c(NA,12, 6,10,10, 
                     6,NA, 9,11,13,
                    12, 9,NA,12, 9,
                     8, 7, 6,NA,12,
                     8, 5, 9, 6,NA), 
                  5, 5, byrow=TRUE)
rownames(baseball) = c('Boston', 
                       'New York', 
                       'Tampa Bay', 
                       'Toronto', 
                       'Baltimore')
colnames(baseball) = rownames(baseball)
baseball
          Boston New York Tampa Bay Toronto Baltimore
Boston        NA       12         6      10        10
New York       6       NA         9      11        13
Tampa Bay     12        9        NA      12         9
Toronto        8        7         6      NA        12
Baltimore      8        5         9       6        NA

  • Let \(\Pi_{ij}\) denote probability team \(i\) beats team \(j\).

  • Model

\[ \text{logit}(\Pi_{ij}) = \beta_i - \beta_j \]
  • Bradley-Terry model is an example of quasi-symmetry.


Paired preferences

library(BradleyTerry2)
processed_baseball = BradleyTerry2::countsToBinomial(baseball)
processed_baseball
     player1   player2 win1 win2
1     Boston  New York   12    6
2     Boston Tampa Bay    6   12
3     Boston   Toronto   10    8
4     Boston Baltimore   10    8
5   New York Tampa Bay    9    9
6   New York   Toronto   11    7
7   New York Baltimore   13    5
8  Tampa Bay   Toronto   12    6
9  Tampa Bay Baltimore    9    9
10   Toronto Baltimore   12    6

BradleyTerry2::BTm(cbind(win1, win2), 
                   player1=player1, 
                   player2=player2, 
                   data=processed_baseball, refcat='Baltimore')
Bradley Terry model fit by glm.fit 

Call:  BradleyTerry2::BTm(outcome = cbind(win1, win2), player1 = player1, 
    player2 = player2, refcat = "Baltimore", data = processed_baseball)

Coefficients:
   ..Boston   ..New York  ..Tampa Bay    ..Toronto  
     0.4539       0.4991       0.6354       0.2287  

Degrees of Freedom: 10 Total (i.e. Null);  6 Residual
Null Deviance:	    13.18 
Residual Deviance: 7.703 	AIC: 48.66

  • Can break down games into home and away

  • Let \(\Pi^*_{ij}\) denote probability home team \(i\) beats away team \(j\)

  • Model

\[ \text{logit}(\Pi^*_{ij}) = \alpha + \beta_i - \beta_j \]
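The notes don't include the home/away breakdown of the games, so here is a simulated sketch (all numbers are made up) showing that this model is an ordinary binomial GLM with a \(\pm 1\) design for the home and away teams:

```r
# Home-advantage Bradley-Terry model fit as a plain logistic GLM on
# simulated win counts (alpha and beta below are invented for illustration).
set.seed(2)
beta = c(0.45, 0.50, 0.64, 0.23, 0)       # team strengths; 5th is reference
alpha = 0.3                               # home advantage
games = expand.grid(home=1:5, away=1:5)
games = games[games$home != games$away, ] # 20 ordered pairs
p_home = plogis(alpha + beta[games$home] - beta[games$away])
wins = rbinom(nrow(games), 13, p_home)    # 13 games per ordered pair
# Design: +1 for the home team, -1 for the away team, drop reference column
X = matrix(0, nrow(games), 5)
X[cbind(seq_len(nrow(games)), games$home)] = 1
X[cbind(seq_len(nrow(games)), games$away)] = -1
fit = glm(cbind(wins, 13 - wins) ~ X[, 1:4], family=binomial)
coef(fit)  # intercept estimates alpha; slopes estimate beta[1:4]
```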