Education 257 SOLUTIONS HW2 2/17/03
1.
example done in full in
http://www.stanford.edu/class/ed257/exs/castletukey.lis

2.
Let's start out by looking at the cell means
MTB > table c1 c2;
SUBC> means c3.
1 2 3 4 5 6 7 8
1 29.000 23.500 40.500 26.500 31.500 34.500 21.000 32.500
2 42.500 31.000 48.000 33.000 44.500 41.000 28.000 33.500
3 31.500 28.000 40.500 37.000 39.000 38.000 26.000 30.000
ALL 34.333 27.500 43.000 32.167 38.333 37.833 25.000 32.000
9 10 ALL
1 37.500 25.000 30.150
2 43.500 28.500 37.350
3 43.000 28.500 34.150
ALL 41.333 27.333 33.883
CELL CONTENTS 
C3:MEAN
Examining a profile plot for counsel.dat cell means there
seems to be a main effect for counselors and a main effect
for therapy. There seems to be little evidence of an interaction.
Below is a minitab ANOVA on the COUNSEL.DAT data. In this analysis, we
treat counsellor as a random effect and we use the 'restricted mixed model'
in which levels of the interaction factor sum to zero across levels of the
fixed factor, but are not required to do so across levels of the random
factor.
MTB > read 'COUNSEL.DAT' c1-c3
60 ROWS READ
ROW C1 C2 C3
1 1 1 31
2 1 1 27
3 1 2 21
4 1 2 26
. . .
MTB > name c1 'method' c2 'counselr' c3 'outcome'
MTB > anova outcome = counselr  method;
SUBC> random counselr;
SUBC> restricted;
SUBC> ems;
SUBC> means method.
Factor Type Levels Values
counselr random 10 1 2 3 4 5 6 7 8 9
10
method fixed 3 1 2 3
Analysis of Variance for outcome
Source DF SS MS F P
counselr 9 2059.683 228.854 28.91 0.000
method 2 520.533 260.267 17.45 0.000
counselr*method 18 268.467 14.915 1.88 0.061
Error 30 237.500 7.917
Total 59 3086.183
Source Variance Error Expected Mean Square
component term (using restricted model)
1 counselr 36.823 4 (4) + 6(1)
2 method 3 (4) + 2(3) + 20Q[2]
3 counselr*method 3.499 4 (4) + 2(3)
4 Error 7.917 (4)
MEANS
method N outcome
1 20 30.150
2 20 37.350
3 20 34.150
Point estimates of the fixed effect of 'method' are computed by
subtracting the grand mean from each of the group means above or from
the table command. The grand mean 33.88 is obtained from the TABLE
output. The results are:
PARAM POINT
EST
alpha(1) -3.73
alpha(2) 3.46
alpha(3) .26
Tukey pairwise comparisons are computed across levels of the 'method';
remember to use the degrees of freedom and mean squares from the 'method'
main effect error term (which is the counselr*method interaction in this
example -- see the Error term portion of the expected mean squares table)
and the number of observations in each level of method, not the number in
each cell of the design.
q (3,18) = 3.61 at the .05 Type I experimentwise error rate
number of observations per level of 'method' = 20 (2 replications by each
of 10 counsellors)
14.915 = MS(counselr*method)
thus W = 3.12,
the endpoints of the 3 pairwise comparison interval estimates are obtained
from substituting the row means into
Xbar(i..) - Xbar(i'..) +/- W
Interval Estimates are:
Mean 2 - Mean 1 (4.08, 10.32)
Mean 3 - Mean 1 (0.88, 7.12)
Mean 3 - Mean 2 (-6.32, -0.08)
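If you want to check this arithmetic, here is a short Python sketch. Note that q(.05; 3, 18) = 3.61 is hard-coded from the studentized-range table, not computed by the script.

```python
from math import sqrt

# Check of the Tukey computations above; q is hard-coded from the
# studentized-range table rather than computed.
q = 3.61             # q(.05; 3, 18)
ms_err = 14.915      # MS(counselr*method), the error term for 'method'
n = 20               # observations per level of 'method'

W = q * sqrt(ms_err / n)          # Tukey halfwidth, ~3.12

means = {1: 30.150, 2: 37.350, 3: 34.150}
pairs = [(2, 1), (3, 1), (3, 2)]
intervals = {(i, j): (means[i] - means[j] - W, means[i] - means[j] + W)
             for i, j in pairs}
```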

3.
Model is:
y(ijk) = mu + alpha(i) + beta(j) + alphabeta(ij) + epsilon(ijk)
where mu is the grand mean of all evaluations.
y(ijk) is evaluation k observed within the group defined by
level i of the row factor and level j of the column factor.
alpha(i) is a random effect of patient at level i.
beta(j) is the fixed effect of physician at level j.
alphabeta(ij) is a random effect of the interaction between patient
and physician at level i of patient and level j of physician.
epsilon(ijk) is a random component of error for observation k
at level i of patient and level j of physician.

b.
With the evaluation measures stored in the file, we start by
MTB > read '03hw2p3.dat' c1
40 ROWS READ
C1
7.2 9.6 8.5 9.6 . . .
MTB > name c1 'eval' c2 'MD' c3 'patient'
MTB > set 'patient'
DATA> (1:4)10
DATA> end
MTB > set 'MD'
DATA> 4(1:5)2
DATA> end
************So here's the complete data structure
MTB > print c1-c3
ROW eval MD patient
1 7.2 1 1
2 9.6 1 1
3 8.5 2 1
4 9.6 2 1
5 9.1 3 1
6 8.6 3 1
7 8.2 4 1
8 9.0 4 1
9 7.8 5 1
10 8.0 5 1
11 4.2 1 2
12 3.5 1 2
13 2.9 2 2
14 3.3 2 2
15 1.8 3 2
16 2.4 3 2
17 3.6 4 2
18 4.4 4 2
19 3.7 5 2
20 3.9 5 2
21 9.5 1 3
22 9.3 1 3
23 8.8 2 3
24 9.2 2 3
25 7.6 3 3
26 7.1 3 3
27 7.3 4 3
28 7.0 4 3
29 9.2 5 3
30 8.3 5 3
31 5.4 1 4
32 3.9 1 4
33 6.3 2 4
34 6.0 2 4
35 6.1 3 4
36 5.6 3 4
37 5.0 4 4
38 5.4 4 4
39 6.5 5 4
40 6.9 5 4
So now the mixed model anova
MTB > anova eval = MD | patient;
SUBC> random patient;
SUBC> restricted;
SUBC> ems.
Factor Type Levels Values
MD fixed 5 1 2 3 4 5
patient random 4 1 2 3 4
Analysis of Variance for eval
Source DF SS MS F P
MD 4 3.812 0.953 0.71 0.602
patient 3 180.133 60.044 173.41 0.000
MD*patient 12 16.159 1.347 3.89 0.004
Error 20 6.925 0.346
Total 39 207.028
Source Variance Error Expected Mean Square
component term (using restricted model)
1 MD 3 (4) + 2(3) + 8Q[1]
2 patient 5.9698 4 (4) + 10(2)
3 MD*patient 0.5001 4 (4) + 2(3)
4 Error 0.3463 (4)
You should check that Minitab produced the correct test statistics
(i.e. that it chose the denominators according to the indications
from the E(MS) table), and also, implicitly, that the p-values
reflect the correct df for each test statistic (check by playing
with invcdf, for example). Clearly in these data there is no effect
for MD (which could indicate that all doctors are equally good?),
there is a large patient effect and also a significant interaction
(we would look at the profile plot etc. if we were doing more than
just using this for practice in setting up the mixed-model
analysis).
c. Post-hoc comparisons
Now we can get the physician means from a table of all 20 cell means
or from the anova subcommand (SUBC> means MD), as here....
MEANS
MD N eval
1 8 6.5750
2 8 6.8250
3 8 6.0375
4 8 6.2375
5 8 6.7875
Tukey pairwise comparisons for the physician factor (using number
of levels of physician = 5; for the error term use MS(interaction),
with df(physician*patient) = 12, and thus q = 4.52 at Type I error
rate = .05; and number of observations per level of physician = 8,
i.e. 2 replications times 4 levels of patient).
W = (4.52)*Sqrt[1.347/8] = 1.85. So we can see that all
confidence intervals include 0 (so none of the 10 pairwise
differences are significantly different from 0). We would expect
this result given the anova table above. In real life you most
likely wouldn't bother with a post-hoc pairwise comparison for a
main effect this seemingly nonexistent. But for the
exercise it's worth it.....
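A quick Python sketch (again with q hard-coded from the table) confirms that every pairwise difference among the physician means is smaller than W:

```python
from math import sqrt

# Check that all pairwise differences among the physician means are
# smaller than the Tukey halfwidth W, so every interval includes 0.
q = 4.52             # q(.05; 5, 12) from the table
ms_inter = 1.347     # MS(MD*patient), the error term for 'MD'
n = 8                # observations per level of physician

W = q * sqrt(ms_inter / n)        # ~1.85

means = [6.5750, 6.8250, 6.0375, 6.2375, 6.7875]
diffs = [abs(a - b) for i, a in enumerate(means) for b in means[i+1:]]
all_include_zero = all(d < W for d in diffs)
```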

Another look at part (b) "for old times' sake", before the ANOVA
routine made various parts automatic.
MTB > name c1 'patient' c2 'MD' c3 'eval'
MTB > set 'patient'
DATA> (1:4)10
DATA> end
MTB > set 'MD'
DATA> 4(1:5)2
DATA> end
MTB > set 'eval'
DATA> 7.2 9.6 8.5 9.6 9.1 8.6 8.2 9.0 7.8 8.0
DATA> 4.2 3.5 2.9 3.3 1.8 2.4 3.6 4.4 3.7 3.9
DATA> 9.5 9.3 8.8 9.2 7.6 7.1 7.3 7.0 9.2 8.3
DATA> 5.4 3.9 6.3 6.0 6.1 5.6 5.0 5.4 6.5 6.9
DATA> end
Here's the anova table that TWOWAY would produce
Analysis of Variance for eval
Source DF SS MS
patient 3 180.133 60.044
MD 4 3.812 0.953
patient*MD 12 16.159 1.347
Error 20 6.925 0.346
Total 39 207.028
Tests for interaction and main effects;
(t.s. = test statistic).
reference distrib
interaction t.s. 1.347/.346 = 3.89 (F 12,20)
patient t.s. 60.04/.346 =173.5 (F 3,20)
MD t.s. .953/1.347 = .707 (F 4,12)
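These three ratios are easy to verify in a few lines of Python (a sketch using the MS values from the table above):

```python
# Reproduce the mixed-model test statistics from the MS column: the
# random effects are tested against MS(Error), while the fixed MD
# effect is tested against the interaction MS, per the E(MS) table.
ms = {'patient': 60.044, 'MD': 0.953, 'inter': 1.347, 'error': 0.346}

F_inter = ms['inter'] / ms['error']       # reference dist F(12, 20)
F_patient = ms['patient'] / ms['error']   # reference dist F(3, 20)
F_MD = ms['MD'] / ms['inter']             # reference dist F(4, 12)
```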

4.
Model is:
y(ijk) = mu + alpha(i) + beta(j) + alphabeta(ij) + epsilon(ijk)
where mu is the grand mean of all observations.
y(ijk) is sales volume k observed within the group defined by alpha
level i and beta level j.
alpha(i) is a random effect of SMSA at level i.
beta(j) is a random effect of chain store at level j.
alphabeta(ij) is a random effect of the interaction between SMSA
and chain store, at level i of SMSA and level j of chain store.
epsilon(ijk) is a random component of error for observation k at level i
of SMSA and level j of chain store.
See NWK p.976. In short, all effects are independent of one
another (and of mu), and are distributed normally with a mean of
0. The 'by hand' way to do this problem is to run a simple anova
using TWOWAY in Minitab and then be smart enough to use the
correct MS for testing. Variance components are obtained by
simple substitution into formulas derived from E(MS), as was
presented in class. Here the computer is let loose to do all this
automatically. You can check the results by hand.
A random effects ANOVA done in minitab release > 7 is shown below.
To use the data file (after naming columns):
MTB> read '03hw2p4.dat' 'volume'
MTB> set 'smsa'
DATA> 3(1:4)2
DATA> end
MTB> set 'chain'
DATA> (1:3)8
DATA> end
on to the anova.....
MTB > anova volume = smsa | chain;
SUBC> random smsa chain;
SUBC> ems.
Factor Type Levels Values
smsa random 4 1 2 3 4
chain random 3 1 2 3
Analysis of Variance for volume
Source DF SS MS F P
smsa 3 136644 45548 5.05 0.044
chain 2 62973 31486 3.49 0.099
smsa*chain 6 54127 9021 22.03 0.000
Error 12 4914 409
Total 23 258658
******NOTE test statistics (F) use MS(interaction) for main effects.
The interaction of SMSA and chain store clearly is significant even
with individual F tests at level .01. Neither main effect is
significant (say with each test carried out at .017). Even though the
point estimates for the variance components are large (see below),
these are not significantly different from zero for chain and smsa.
Here's the expected mean square table produced by Minitab. It may be
clearer to write your own using math notation not available here.
For expressions of expected mean squares refer to NWK Table 24.5
p.981.
Source Variance Error Expected Mean Square
component term (using unrestricted model)
1 smsa 6087.8 3 (4) + 2(3) + 6(1)
2 chain 2808.2 3 (4) + 2(3) + 8(2)
3 smsa*chain 4305.9 4 (4) + 2(3)
4 Error 409.5 (4)
NOTE: SAS PROC VARCOMP gives identical variance component estimates.
Note: For fun you may want to reproduce variance component estimation
by hand or by calculator
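Here is one such by-hand reproduction, sketched in Python by solving the E(MS) equations from the table above (the MS values are taken from the Minitab output):

```python
# Solve the E(MS) equations for the variance components
# (unrestricted random-effects model):
#   E(MS_error) = s2e
#   E(MS_interaction) = s2e + 2*s2_inter
#   E(MS_smsa)  = s2e + 2*s2_inter + 6*s2_smsa
#   E(MS_chain) = s2e + 2*s2_inter + 8*s2_chain
ms_smsa, ms_chain, ms_inter, ms_error = 45548.0, 31486.0, 9021.0, 409.5

s2_error = ms_error
s2_inter = (ms_inter - ms_error) / 2
s2_smsa = (ms_smsa - ms_inter) / 6
s2_chain = (ms_chain - ms_inter) / 8
```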
================================================================
5.
2way factorial design, one replication per cell.
NWK problem 21.2 parts a, b
Coinoperated terminals. A university computer service conducted an
experiment in which one coinoperated computer terminal was placed at
each of four different locations on the campus last semester during the
midterm week and again during the final week of classes. The data that
follow show the number of hours each terminal was NOT in use during the
week at the four locations (factor A) and for the two different weeks
(factor B).
Factor A (location): i = 1,...,4
Factor B (week): j = 1, 2
The 8 data points are given on p.886
16.5, 11.8, 12.3, 16.6, 21.4, 17.3, 16.9, 21.0
a)
Problem asks us to construct a Profile Plot
Starting with the data looking like this
(extra id in c1) we can do the following in Minitab
MTB > print c1c4
Data Display
Row C1 C2 C3 C4
1 1 16.5 1 1
2 2 11.8 2 1
3 3 12.3 3 1
4 4 16.6 4 1
5 5 21.4 1 2
6 6 17.3 2 2
7 7 16.9 3 2
8 8 21.0 4 2
MTB > copy c2 c3 c12-c13;
SUBC> use c4 = 1.
MTB > copy c2 c3 c22-c23;
SUBC> use c4 = 2.
MTB > MPlot C12 C13 C22 C23.
Character Multiple Plot
 B
21.0+ B




17.5+ B
 A B A



14.0+

 A
 A

10.5+
++++++
1.20 1.80 2.40 3.00 3.60 4.20
A = C12 vs. C13 B = C22 vs. C23
mplot is a very useful command (though this one is quicker by
hand; I did it on the machine to insert into solutions)
What we see here is a big main effect for week (final > midterm)
and an effect for location. Not much interaction apparent.
b)
Conduct separate tests for location and week main effects. In each
test, use a level of significance of alpha = .05 and state the
alternatives, decision rule, and conclusion. Give an upper bound for the
family level of significance: use Kimball inequality. What is the
Pvalue for each test?
SOLUTION
we have one rep per cell so we need to fit the twoway anova
model without an interaction term. The twoway minitab command
will automatically do that; I show here use of the anova
command that explicitly says no interaction (just
main effects)
MTB > anova c2= c3 c4
Analysis of Variance (Balanced Designs)
Factor Type Levels Values
C3 fixed 4 1 2 3 4
C4 fixed 2 1 2
Analysis of Variance for C2
Source DF SS MS F P
C3 3 37.005 12.335 107.26 0.002
C4 1 47.045 47.045 409.09 0.000
Error 3 0.345 0.115
Total 7 84.395
As discussed in class the strategy is to hope interaction is
small so that the "error" term (interaction confounded) is
not too inflated. Tests are conservative in that the
obtained "error" term is likely too big.
We'll control the overall error rate to .05 by conducting each of the
tests for main effects at .0253 (or .025 would be fine), from
alphatot.tab, which is an implementation of Kimball's inequality.
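Where .0253 comes from: a Python sketch solving Kimball's bound for the per-test rate, given 2 main-effect tests and an overall rate of .05:

```python
# Kimball's inequality: overall alpha <= 1 - (1 - alpha_per_test)^g.
# Solve for the per-test rate that holds two tests at overall .05.
overall = 0.05
g = 2                     # two main-effect tests
alpha_per_test = 1 - (1 - overall) ** (1 / g)   # ~.0253
```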
So the relevant critical values are:
MTB > let k1 = 1 - .0253
MTB > invcdf k1;
SUBC> f 3 3.
Inverse Cumulative Distribution Function
F distribution with 3 d.f. in numerator and 3 d.f. in denominator
P( X <= x) x
0.9747 15.3075
MTB > invcdf k1;
SUBC> f 1 3.
Inverse Cumulative Distribution Function
F distribution with 1 d.f. in numerator and 3 d.f. in denominator
P( X <= x) x
0.9747 17.2871
In testing factor A main effects for location,
The test statistic to be used in the 2Way Fixed Factor model is:
F* = MSA/MSE
= 12.335/0.115
= 107.26
Since F* = 107.26 > 15.3 we can reject the null hypothesis
of no location effects.

In testing factor B main effects for week,
The test statistic to be used in the 2Way Fixed Factor model is:
F* = MSB/MSE
= 47.045/0.115
= 409.08
Since F* = 409.08 > 17.29, we can reject the null hypothesis
of no week effects.
You see even with this "conservative" test for the 1 rep per
cell, big effects will show up!
NWK problem 21.4
In referring to NWK 21.2, conduct the Tukey test for additivity; use
alpha = .025. State the alternatives, decision rule, and conclusion. If
the additive model is not appropriate, what might you do?
so we need to do some arithmetic (which I am no better
at than you are).
I get Dhat from (21.10a) = (4.13473)/((18.5025)(11.76125)) = .019
then the SSAB* in (21.11) = .0786
which leaves SSrem* = .2664 from (21.12a)
Then to carry out the 1-df test for nonadditivity in (21.13),
construct the test statistic F* = .0786/(.2664/2) = .59.
Compare with F with 1,2 degrees of freedom at the .025 level, which is
38.5. (note: df = 2 = 2*4 - 4 - 2). Clearly no significant interaction
via this method.
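The additivity arithmetic can be checked with a short Python sketch; the numerator 4.13473 and the sums of squared effects are taken from the computation above, not re-derived from the raw data:

```python
# Reproduce the Tukey one-df-for-nonadditivity arithmetic
# (NWK eqs. 21.10a-21.13).
num = 4.13473              # numerator of Dhat from (21.10a)
sum_a2 = 18.5025           # sum of squared alpha-hats (= SSA / b)
sum_b2 = 11.76125          # sum of squared beta-hats (= SSB / a)

D_hat = num / (sum_a2 * sum_b2)
SS_AB_star = D_hat ** 2 * sum_a2 * sum_b2    # (21.11)
SS_rem_star = 0.345 - SS_AB_star             # SSE - SSAB*, (21.12a)
F_star = SS_AB_star / (SS_rem_star / 2)      # (21.13), 1 and 2 df
```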
If there were a large interaction, there are cases where you
would want to use SSrem* as the error term in testing main
effects.
MTB > invcdf .975;
SUBC> f 1 2.
Inverse Cumulative Distribution Function
F distribution with 1 d.f. in numerator and 2 d.f. in denominator
P( X <= x) x
0.9750 38.5064
================================================================
6.
Note that this is a 2 way (4 X 3) anova design, with both factors
(subject of course, highest degree earned) fixed and where cell n's
are unequal.
Part a.
A profile plot of the cell means could be drawn with either factor A
or factor B on the x-axis, as below (connect the like symbols between
columns).
Comments about the plots: There appears to be a main effect for
subject of course, with earnings per course lowest for Humanities and
highest for Engineering courses, with Social Sciences and Management
courses at similar intermediate earning levels. This is true
regardless of the highest degree earned by the instructor, suggesting
no important interaction between the factors. There also appears to
be a main effect for highest degree earned, with instructors who have
a PhD earning more than those with either BA or MA degrees. Again
this is true regardless of subject of the course. No important
interaction is visible.
3.80 
 *PhD
E 3.60 
A  *PhD
R 3.40  *PhD
N 
I 3.20 
N 
G 3.00 
S  *MA
2.80  *BA
P  *PhD
E 2.60  *BA*MA
R  *MA*BA
2.40 
C 
O 2.20 
U 
R 2.00  *MA
S 
E 1.80  *BA

1.60 

1 2 3 4
Humanities Engineering
Social Sciences Management
Factor A: Subject of course
3.80 
 *ENG
E 3.60 
A  *SS
R 3.40  *MGT
N 
I 3.20 
N 
G 3.00 
S  *ENG
2.80  *ENG
P  *HUM
E 2.60  *MGT *MGT
R  *SS *SS
2.40 
C 
O 2.20 
U 
R 2.00  *HUM
S 
E 1.80  *HUM

1.60 

1 2 3
BA MA PhD
Factor B: Highest degree earned
Part b.
To test H(o): all alpha(i)s equal 0 vs. H(a): not all alpha(i)s equal
0, the test statistic is MSA/MSE . To find these values,
first the corresponding Degrees of Freedom obscured in the output must
be calculated. We know that a = 4, b = 3 and N(Total) = 45.
Source DF
A (Subject) 3
B (Degree) 2
AB interactions 6
Error 33
Total 44
(You could also find error DF by subtracting other DFs from total DF:
44  3  2  6 = 33.)
(NOTE: For this problem you need not calculate any DFs or Mean
Squares except for Factor A (Subject) and Error.)
The required Mean Squares obscured in the output can be calculated
from the Adjusted Sum of Squares (as you may recall from lecture the
Sequential SS are not
appropriate here as they depend on the order in which the factors are
entered into the GLM procedure; see UNBALANC.LOG).
Source Adj. MS = Adj. SS / DF
A (Subject) 1.4109 4.2326 / 3
B (Degree) 4.1144 8.2287 / 2
AB interaction 0.0074 0.0444 / 6
Error 0.0218 0.7180 / 33
The test statistic is then MSA/MSE = 1.4109/0.0218 = 64.72
The critical F value F(.01;3,33) is approximately 4.46 (see NWK table
B.4: F(.01;3,30) = 4.51 and F(.01;3,40) = 4.31). For purposes here a
critical value of about 4.5 (or a touch less) is fine.
Since 64.72 > 4.46 we reject H(o) that all alpha(i)s equal 0.
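A quick Python check of the mean squares and the F statistic, using the adjusted SS and df from the table above:

```python
# Recover the mean squares from the adjusted SS and df, then form F*.
MSA = 4.2326 / 3      # Adj SS / df for Factor A (Subject)
MSE = 0.7180 / 33     # Adj SS / df for Error
F_star = MSA / MSE    # ~64.8 (64.72 with the rounded MS values)
reject = F_star > 4.46
```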
Part c.
MTB > read '/usr/class/ed257/adjprof.dat' into c1-c3
45 ROWS READ
ROW C1 C2 C3
1 1.7 1 1
2 1.9 1 1
3 1.8 1 2
4 2.1 1 2
. . .
In order to replicate the GLM analysis:
MTB > glm c1 = c2|c3
Factor Levels Values
C2 4 1 2 3 4
C3 3 1 2 3
Analysis of Variance for C1
Source DF Seq SS Adj SS Adj MS F P
C2 3 4.1676 4.2326 1.4109 64.85 0.000
C3 2 8.3825 8.2287 4.1144 189.10 0.000
C2*C3 6 0.0444 0.0444 0.0074 0.34 0.910
Error 33 0.7180 0.7180 0.0218
Total 44 13.3124
Unusual Observations for C1
Obs. C1 Fit Stdev.Fit Residual St.Resid
39 2.30000 2.55000 0.10430 0.25000 2.40R
40 2.80000 2.55000 0.10430 0.25000 2.40R
R denotes an obs. with a large st. resid.
In order to obtain the unweighted means (i.e. Miller's) approximate
solution, first carry out an analysis of variance on the cell means.
You can obtain the cell means by issuing the following command.
MTB > table c2 c3;
SUBC> mean c1;
SUBC> count.
ROWS: C2 COLUMNS: C3
1 2 3 ALL
1 1.8000 1.9500 2.7000 2.4250
2 2 8 12
2 2.4500 2.5200 3.4500 2.7846
4 5 4 13
3 2.7500 2.8500 3.7400 3.2364
2 4 5 11
4 2.5500 2.5500 3.4200 3.0333
2 2 5 9
ALL 2.4000 2.5385 3.2364 2.8489
10 13 22 45
CELL CONTENTS 
C1:MEAN
COUNT
Now you want to enter the cell means, the row indicators, and the
column indicators into columns so that you can carry out the
analysis of variance on the cell means. (An alternative to typing is
to use your editor to cut in the means from above and use the set
command to create the indices in c12 c13.)
MTB > read data into c11-c13
DATA> 1.80 1 1
DATA> 1.95 1 2
DATA> 2.70 1 3
DATA> 2.45 2 1
DATA> 2.52 2 2
DATA> 3.45 2 3
DATA> 2.75 3 1
DATA> 2.85 3 2
DATA> 3.74 3 3
DATA> 2.55 4 1
DATA> 2.55 4 2
DATA> 3.42 4 3
DATA> end
12 ROWS READ
Carry out a twoway ANOVA.
MTB > twoway c11 c12 c13
ANALYSIS OF VARIANCE C11
SOURCE DF SS MS
C12 3 1.50389 0.50130
C13 2 2.17280 1.08640
ERROR 6 0.01413 0.00236
TOTAL 11 3.69083
In order to obtain the unweighted mean (Miller's) approximate solution,
calculate the harmonic mean for this set of data.
harmonic mean = [(1/12) x (1/2 + 1/2 + 1/8 + 1/4 + 1/5 + 1/4 + 1/2 +
1/4 + 1/5 + 1/2 + 1/2 + 1/5)]^(-1) = 3.0189 (approx. = 3)
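A short Python sketch of the same computation, using the 12 cell counts from the TABLE output:

```python
# Harmonic mean of the 12 cell counts from the TABLE output.
counts = [2, 2, 8, 4, 5, 4, 2, 4, 5, 2, 2, 5]
h_mean = len(counts) / sum(1 / n for n in counts)   # ~3.0189
```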

or if you like computational overkill, here's an example
Harmonic Mean from Mathematica (applied to the list of 12 cell counts):
{Mean -> 15/4, HarmonicMean -> 160/53, Median -> 4}
N[%]
{Mean -> 3.75, HarmonicMean -> 3.018867924528302, Median -> 4.}

Comparing the GLM and Miller solutions we see:
GLM Solution
SOURCE DF Adj SS Adj MS F
C2 3 4.2326 1.4109 64.85
C3 2 8.2287 4.1144 189.10
Interaction 6 .0444 .0074 .34
Error 33 .7180 .0218
MILLER Solution
SOURCE DF SS MS F
C12 3 3(1.50389)=4.51167 1.50389 68.99
C13 2 3(2.17280)=6.51840 3.25920 149.50
Interaction 6 3(0.01413)=0.04239 .00707 .32
Error 33 .7180 .0218
(Note: The "error" term we obtained from the twoway analysis of variance
on the cell means is actually an estimate of the interaction. Our estimate
of SSE for Miller's solution is obtained by taking the sum of the
squared within-cell deviations (i.e. X(ijk) - X(ij bar)).
This procedure gives within-cell error estimates identical to GLM.)
As we can see, the GLM solution and Miller's approximate solution yield
similar results. At any reasonable Type I error rate, we can reject the
null hypotheses of no main effects, for both subject matter and degree
earned. Likewise, neither solution offers evidence of an interaction
between subject matter and degree earned. This is pretty good considering
this unbalanced design is a little outside the guideline of nMax/nMin <= 3
for Miller's procedure.


7.
"More on interactions..."
This problem was constructed to address the question, What more can be
done about describing (or drawing inferences) about the interaction
terms beyond just (rejecting or not) the omnibus null hypothesis of no
interaction? Previously, we had discussed the importance of the
profile plot as the major descriptive technique. In lecture we covered
Section 20.3 of NWK ("Analysis of factor effects when interaction
important"), which involves estimating the row effects separately for
each level of the column factor. (Refer also to Figure 20.3b.) . And
the mathematics learning example (a 2x3 design) that was described in
class is a good template for this example.
Note: In the output below the columns c1-c4 in the data file
extraint.dat are labelled as c10-c13 -- that is, outcome in c10, row and
column indicators in c12, c13. Columns c1-c6 here contain the data for
each cell in the 2x3 design.
The data (5 replications in the 2x3 design) were
generated with withincell variance of 9 and cell means
12 15 18
10 11 12
Here's how I did it, putting each cell in its own column.
Generate the data:
MTB > random 5 c1;
SUBC> normal 12 3.
MTB > random 5 c2;
SUBC> normal 15 3.
MTB > random 5 c3;
SUBC> normal 18 3.
MTB > random 5 c4;
SUBC> normal 10 3.
MTB > random 5 c5;
SUBC> normal 11 3.
MTB > random 5 c6;
SUBC> normal 12 3.
Here's the data with each cell of the 2x3 in its own column
MTB > print c1c6
ROW C1 C2 C3 C4 C5 C6
1 8.1988 20.3003 17.1762 9.5724 9.7497 10.3652
2 8.7541 16.3191 18.7042 11.1114 6.6897 12.2657
3 14.5011 14.4865 16.4200 13.2167 13.9392 9.7098
4 8.0545 12.3120 17.5973 11.9217 14.0845 11.9133
5 7.8095 14.6511 21.7529 11.6024 7.9765 10.0781

Let's start the analysis
Describe twoway data using the stacked form that you have in extraint.dat
MTB > table c12 c13;
SUBC> stats c10.
ROWS: C12 COLUMNS: C13
1 2 3 ALL
1 5 5 5 15
9.464 15.614 18.330 14.469
2.837 2.982 2.084 4.563
2 5 5 5 15
11.485 10.488 10.866 10.946
1.323 3.396 1.147 2.086
ALL 10 10 10 30
10.474 13.051 14.598 12.708
2.343 4.047 4.241 3.919
CELL CONTENTS 
C10:N
MEAN
STD DEV
Obtain anova table
MTB > twoway c10 c12 c13
ANALYSIS OF VARIANCE C10
SOURCE DF SS MS
C12 1 93.07 93.07
C13 2 86.80 43.40
INTERACTION 2 122.10 61.05
ERROR 24 143.52 5.98
TOTAL 29 445.49
Clearly, the interaction is significant, as are the main effects.
Profile plot based on sample data will show marked interaction
which actually appears disordinal even though for
the population cell means the interaction is ordinal.
First, let's compare row cell means at levels of the column (1,2,3
respectively) by a series of twosample t inferences. First the
default 95% interval produced by twosample is shown. This is most
likely what is commonly done in the literature when such a
comparison is attempted. We know that MUCH better practice would
be to specify (following Bonferroni) a set of 98.3% intervals to
control the overall confidence coefficient to approx 95%. Here I
have my original data set with each cell in its own column; you
can get there by unstacking extraint.dat.
Comparing the two rows at the first level of the column factor:
MTB > twosample c1 c4
TWOSAMPLE T FOR C1 VS C4
N MEAN STDEV SE MEAN
C1 5 9.46 2.84 1.3
C4 5 11.48 1.32 0.59
95 PCT CI FOR MU C1 - MU C4: (-5.6, 1.58)
TTEST MU C1 = MU C4 (VS NE): T= -1.44 P=0.21 DF= 5
Same for the second level of the column factor:
MTB > twosample c2 c5
TWOSAMPLE T FOR C2 VS C5
N MEAN STDEV SE MEAN
C2 5 15.61 2.98 1.3
C5 5 10.49 3.40 1.5
95 PCT CI FOR MU C2 - MU C5: (0.3, 9.9)
TTEST MU C2 = MU C5 (VS NE): T= 2.54 P=0.039 DF= 7
And for the third level of the column factor:
MTB > twosample c3 c6
TWOSAMPLE T FOR C3 VS C6
N MEAN STDEV SE MEAN
C3 5 18.33 2.08 0.93
C6 5 10.87 1.15 0.51
95 PCT CI FOR MU C3 - MU C6: (4.86, 10.07)
TTEST MU C3 = MU C6 (VS NE): T= 7.02 P=0.0004 DF= 6
An interesting note on the above is that the interval estimate at the
second level of the column factor will include 0 for any reasonable
use of an overall confidence coefficient of 95% (i.e. p-value =~ .04).

Now let's use a more appropriate confidence
coefficient considering we are doing 3 of these: 98.3%, which
is close to Bonferroni with overall 95% for the 3. (I checked that
Minitab really does use .983 confidence here, not just .98 as
the output indicates.)
The confidence intervals are as you would expect; the second
includes 0.
MTB > twosample .983 c1 c4.
TWOSAMPLE T FOR C1 VS C4
N MEAN STDEV SE MEAN
C1 5 9.46 2.84 1.3
C4 5 11.48 1.32 0.59
98 PCT CI FOR MU C1 - MU C4: (-6.9, 2.90)
TTEST MU C1 = MU C4 (VS NE): T= -1.44 P=0.21 DF= 5
MTB > twosample .983 c2 c5.
TWOSAMPLE T FOR C2 VS C5
N MEAN STDEV SE MEAN
C2 5 15.61 2.98 1.3
C5 5 10.49 3.40 1.5
98 PCT CI FOR MU C2 - MU C5: (-1.2, 11.4)
TTEST MU C2 = MU C5 (VS NE): T= 2.54 P=0.039 DF= 7
MTB > twosample .983 c3 c6.
TWOSAMPLE T FOR C3 VS C6
N MEAN STDEV SE MEAN
C3 5 18.33 2.08 0.93
C6 5 10.87 1.15 0.51
98 PCT CI FOR MU C3 - MU C6: (3.98, 10.95)
TTEST MU C3 = MU C6 (VS NE): T= 7.02 P=0.0004 DF= 6

Now Compare with pairwise intervals from Tukey (e.g. NWK eq 20.32)
with familywise 95% confidence
Here I use the stacked data in the form you have
it in extraint.dat. Oneway is run on the outcome and the
indicator of group (1,...6)
MTB > oneway c10 c11;
SUBC> tukey.
ANALYSIS OF VARIANCE ON C10
SOURCE DF SS MS F p
C11 5 301.97 60.39 10.10 0.000
ERROR 24 143.52 5.98
TOTAL 29 445.49
INDIVIDUAL 95 PCT CI'S FOR MEAN
BASED ON POOLED STDEV
LEVEL N MEAN STDEV ++++
1 5 9.464 2.837 (*)
2 5 15.614 2.982 (*)
3 5 18.330 2.084 (*)
4 5 11.485 1.323 (*)
5 5 10.488 3.396 (*)
6 5 10.866 1.147 (*)
++++
POOLED STDEV = 2.445 8.0 12.0 16.0 20.0
Tukey's pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.00498
Critical value = 4.37
Intervals for (column level mean) - (row level mean)
            1         2         3         4         5
 2    -10.929
       -1.371
 3    -13.646    -7.496
       -4.087     2.063
 4     -6.801    -0.650     2.066
        2.758     8.908    11.624
 5     -5.803     0.347     3.063    -3.782
        3.755     9.905    12.621     5.776
 6     -6.182    -0.032     2.685    -4.161    -5.158
        3.376     9.527    12.243     5.398     4.401
The 2,5 entry is the most interesting. Tukey provides an interval
that does not include 0 with familywise confidence coeff 95%; it
about matches the two-sample based interval, which has a far lower
confidence coeff for the set of 3 intervals.

Finally, we can construct comparisons using the Bonferroni method just as
NWK did for the math learning study in 20.3. The formulas are in
NWK, eq.20.3220.35. The potential advantage of this over the
Tukey results above is that we can limit to just the 3 comparisons that
we seek. The main difference from the repeated t-intervals is the use of MSW.
Point estimates:
Dhat[1] = 9.464 - 11.485 = -2.021
Dhat[2] = 15.614 - 10.488 = 5.126
Dhat[3] = 18.330 - 10.866 = 7.464
Var(Dhat) = 2*MSW/n = 2(5.98)/5 = 2.392
B = t(1 - .05/6, 24) = t(.9917, 24) = 2.5754
Therefore the width of each interval is 2*2.5754*sqrt(2.392) = 2*3.98 = 7.96
Intervals
MU C1 - MU C4: (-2.021-3.98, -2.021+3.98) = (-6.00, 1.96)
MU C2 - MU C5: (1.15, 9.11)
MU C3 - MU C6: (3.48, 11.44)
As with the Tukey intervals, only the interval for MU C1 - MU C4 contains
zero. Notice that these intervals are somewhat narrower than the Tukey
intervals (7.96 versus 9.56 width). Since we're only interested in 3
of the 15 possible pairwise comparisons, the Bonferroni method appears to
be an improvement over Tukey.
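A Python sketch of the Bonferroni computation; B = 2.5754 is hard-coded from the t table rather than computed:

```python
from math import sqrt

# Reproduce the Bonferroni intervals for the three row comparisons.
B = 2.5754                 # t(1 - .05/6, 24), from the t table
MSW = 5.98
n = 5
halfwidth = B * sqrt(2 * MSW / n)    # ~3.98

d_hats = [9.464 - 11.485, 15.614 - 10.488, 18.330 - 10.866]
intervals = [(d - halfwidth, d + halfwidth) for d in d_hats]
contains_zero = [lo < 0 < hi for lo, hi in intervals]
```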

Here's a summary table for the comparison of the rows at each
of the 3 levels of the column factor.
                      2-sample t-int     Tukey            NWK sec 20.3
                      .983               overall .95
---------------       --------------     ---------------  -------------
CI FOR MU C1 - MU C4: (-6.9, 2.90)       (-6.801, 2.758)  (-6.00, 1.96)
CI FOR MU C2 - MU C5: (-1.2, 11.4)       (0.347, 9.905)   (1.15, 9.11)
CI FOR MU C3 - MU C6: (3.98, 10.95)      (2.685, 12.243)  (3.48, 11.44)

Problem #8. NWK Problem 23.423.5
First, we need to calculate marginal means...
collapse across j and k to get mu(i..)
mu(1..)=(130+138+140+144)/4=138
mu(2..)=(126+130+134+136)/4=131.5
mu(3..)=125
collapse across i and k to get mu(.j.) and collapse across i and j
to get mu(..k)
mu(.1.)=(130+126+122+140+134+122)/6=129
mu(.2.)=134
mu(..1)=(130+126+122+138+130+125)/6=128.5
mu(..2)=134.5
mu(...)=(138+131.5+125)/3=(129+134)/2=(128.5+134.5)/2=131.5
collapse across only one variable to get mu(ij.), mu(i.k), mu(.jk)
mu(11.)=135 mu(1.1)=134 mu(.11)=126
mu(12.)=141 mu(1.2)=142 mu(.12)=132
mu(21.)=130 mu(2.1)=128 mu(.21)=131
mu(22.)=133 mu(2.2)=135 mu(.22)=137
mu(31.)=122 mu(3.1)=123.5
mu(32.)=128 mu(3.2)=126.5
To check your calculations, see if (mu(11.)+mu(12.))/2 = mu(1..), etc.
To calculate main effects and interactions, just pluginski...
a. alpha(1)=mu(1..)-mu(...)=138-131.5= 6.5
alpha(2)=131.5-131.5= 0
alpha(3)= -6.5
b. beta(2)=mu(.2.)-mu(...)=134-131.5= 2.5
gamma(1)=mu(..1)-mu(...)=128.5-131.5= -3
**notation below: a=alpha, b=beta, c=gamma
c. ab(12)= mu(12.)-mu(1..)-mu(.2.)+mu(...)= .5
ac(21)= mu(2.1)-mu(2..)-mu(..1)+mu(...)= -.5
bc(12)= mu(.12)-mu(.1.)-mu(..2)+mu(...)= 0
********
d.
abc(111)=mu(111)-mu(11.)-mu(1.1)-mu(.11)+mu(1..)+mu(.1.)+mu(..1)-mu(...)=
130-135-134-126+138+129+128.5-131.5= -1
abc(322)=131-128-126.5-137+125+134+134.5-131.5= 1.5
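All of these plug-in computations can be checked with a short Python sketch. The cell means mu(ijk) below are reconstructed to be consistent with the marginal means worked out above; the full 3x2x2 table is not printed in this handout, so treat it as an assumption:

```python
# Cell means mu(ijk), reconstructed from the marginal means above.
# Factor A (i) has 3 levels; factors B (j) and C (k) have 2 each.
mu = {(1,1,1):130, (1,1,2):140, (1,2,1):138, (1,2,2):144,
      (2,1,1):126, (2,1,2):134, (2,2,1):130, (2,2,2):136,
      (3,1,1):122, (3,1,2):122, (3,2,1):125, (3,2,2):131}
A, B, C = (1, 2, 3), (1, 2), (1, 2)

grand = sum(mu.values()) / 12
m_i = {i: sum(mu[i,j,k] for j in B for k in C) / 4 for i in A}
m_j = {j: sum(mu[i,j,k] for i in A for k in C) / 6 for j in B}
m_k = {k: sum(mu[i,j,k] for i in A for j in B) / 6 for k in C}
m_ij = {(i,j): (mu[i,j,1] + mu[i,j,2]) / 2 for i in A for j in B}
m_ik = {(i,k): (mu[i,1,k] + mu[i,2,k]) / 2 for i in A for k in C}
m_jk = {(j,k): sum(mu[i,j,k] for i in A) / 3 for j in B for k in C}

alpha = {i: m_i[i] - grand for i in A}    # A main effects
gamma = {k: m_k[k] - grand for k in C}    # C main effects

def abc(i, j, k):
    # three-way interaction term, as in part d
    return (mu[i,j,k] - m_ij[i,j] - m_ik[i,k] - m_jk[j,k]
            + m_i[i] + m_j[j] + m_k[k] - grand)
```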
***********
NWK 23.5: Profile Plots
X-axis carries levels of factor A,
1's represent j=1, 2's represent j=2 for factor B
MTB > lplot c1 c3 c4 # for k=1 of factor C
k=1  2


135.0+




130.0+ 1 2



 1
125.0+ 2


 1
+++++a
1.20 1.60 2.00 2.40 2.80
MTB > lplot c2 c3 c4 # for k=2 of factor C
k=2  2


140.0+ 1


 2
 1
133.0+
 2



126.0+


 1
+++++a
1.20 1.60 2.00 2.40 2.80
From these plots, we can see a main effect for factor C (the k=2 plot is
higher than the k=1 plot), a factor B main effect (the 2's are higher than the
1's), and a factor A main effect (the 1's and 2's decrease along the x-axis).
The two plots reveal small AB interactions, and these differ between
the two plots, suggesting perhaps a small ABC interaction.
Also, Laura notes that outside what is asked here, other plots would
indicate an AC interaction.


9.
NWK Problem 23.923.11
NWK 23.9 a.
Using the market.dat data set.....
get residuals and fits using the RESIDUALS and FITS subcommands of the
ANOVA command:
MTB > anova c1 = c2|c3|c4;
SUBC> fits c5;
SUBC> residuals c6.
....
....
note: fits for this model are just the cell means; could print the fits out
or better yet use Table on the indices c2 c3 c4 to array them

MTB > name c5 'fits' c6 'resid'
MTB > plot c6 c5
resid 
 * *
 * *2 *
3.0+ *
 * ** *
 *
 * * ** *
 * * * *
0.0+ ** *
 *
 * * * 2
 2 3 *
 * * * * * *
3.0+ * * * *

 *

++++++fits
60 72 84 96 108 120
MTB > ## all looks pretty good....variances are about equal
Note also 2 of the cell means are identical in the plot; that's why only 11
vertical bands of points appear (one has 8 points)
NWK 23.10 a
I put the cell means into c15 and c16 (for k=1 and k=2 respectively)
and i,j indexes in c17 , c18.
MTB > lplot c15 c17 c18 # plot for k=1
 1 1
120+
 2
k=1  2


108+




96+

 1

 2
84+
+++++i
1.20 1.60 2.00 2.40 2.80
From this plot, it looks like A and B main effects, and no substantial AB
interaction.
MTB > lplot c16 c17 c18 # for k=2
112+ 1 1

k=2 


96+
 2
 2


80+ 1




64+
 2
+++++i
1.20 1.60 2.00 2.40 2.80
From this plot, again A and B main effects, and comparing to the first
plot, a main effect of C too. Also, the size of the B main effect (the
distance between the curves) seems to be bigger in the second plot,
suggesting a BC interaction. Other interactions appear to be minimal.
NWK 23.10 c.
MTB > anova c1 = c2|c3|c4
Analysis of Variance for quality
Source DF SS MS F P
fee 2 10044.3 5022.1 679.34 0.000
scope 1 1834.0 1834.0 248.08 0.000
super 1 3832.4 3832.4 518.40 0.000
fee*scope 2 1.6 0.8 0.11 0.898
fee*super 2 0.8 0.4 0.05 0.948
scope*super 1 574.8 574.8 77.75 0.000
fee*scope*super 2 3.9 2.0 0.27 0.767
Error 36 266.1 7.4
Total 47 16557.9
23.10d
H0: no ABC interaction vs Ha: ABC interaction present
H0: (alpha beta gamma)_ijk = 0 for all i,j,k vs Ha: not all (alpha beta gamma)_ijk = 0
test statistic: MSABC/MSE = .27
reject H0 if MSABC/MSE > F(2,36;.99) = 5.3 (approx.); here .27 << 5.3
Do not reject! (p-value = .767)
23.10e
Tests of twoway interactions and main effects proceed similarly.
For AB interaction test statistic is .11
For AC interaction test statistic is .05
Both of these are compared to F(2,36;.99)= 5.3
For BC interaction the test statistic is 77.75; compare with F(1,36;.99) = 7.4
So BC interaction clearly significant.
23.10f
Test statistic for effect of Factor A
5022/7.4 = 679. compare to F(2,36;.99)= 5.3. reject Ho of no A effect.
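For reference, the critical values and p-values used in parts (d)-(f) can be reproduced from the F distribution; a quick scipy sketch (not part of the original Minitab session):

```python
from scipy.stats import f

# Critical value for the df = (2, 36) tests at alpha = .01
crit_2_36 = f.ppf(0.99, 2, 36)   # the "5.3 (approx.)" used above
# Critical value for the BC test, df = (1, 36)
crit_1_36 = f.ppf(0.99, 1, 36)   # approximately 7.4
# p-value for the ABC test statistic of .27
p_abc = f.sf(0.27, 2, 36)        # close to Minitab's 0.767
print(crit_2_36, crit_1_36, p_abc)
```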
g. Using the Kimball inequality,
overall level of significance <= 1 - (1-alpha)^(number of tests).
If we had carried out all 7 tests,
upper bound = 1 - (1-.01)^7 = 1 - (.99)^7 = .0679
So there would be about a 7% chance of at least one incorrect rejection of
a null hypothesis.
For the 5 tests carried out above (all at .01) the overall error
rate is at most .049.
refer to NWK (23.24) on p.937, comment #1
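The Kimball bound arithmetic is easy to reproduce; a one-line sketch in Python:

```python
# Kimball inequality: familywise error <= 1 - (1 - alpha)^(number of tests)
alpha = 0.01
bound_7 = 1 - (1 - alpha) ** 7   # all 7 tests: about .068
bound_5 = 1 - (1 - alpha) ** 5   # the 5 tests actually done: about .049
print(round(bound_7, 4), round(bound_5, 4))
```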
h. The results of the ANOVA do indeed confirm what the profile plots
suggested: all main effects and the BC interaction appeared sizable.
=======================================================================
NWK 23.11 c

This question asks us to make pairwise comparisons among all 12
cell means using Tukey with overall confidence coefficient .90.
So the way we will go about it is to treat the 3x2x2 design as
a single classification with 12 groups (there are other approaches
we could take).
Set up the Tukey multiple comparisons:
W = q(1-alpha; abc, (n-1)abc) * sqrt(MSE/n)
a=3, b=2, c=2, n=4, MSE=7.4
W = q(.90; 12, 36)*sqrt(7.4/4) = 4.52*1.36 = 6.15
A confidence interval for mu(ijk) - mu(i'j'k') is:
xbar(ijk) - xbar(i'j'k') +/- 6.15
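The tabled q value can be cross-checked with scipy's studentized range distribution (available in scipy >= 1.7); a sketch of the W computation:

```python
from math import sqrt
from scipy.stats import studentized_range

# Tukey yardstick: W = q(1-alpha; number of means, error df) * sqrt(MSE/n)
a, b, c, n, mse = 3, 2, 2, 4, 7.4
q = studentized_range.ppf(0.90, a * b * c, (n - 1) * a * b * c)  # q(.90; 12, 36)
W = q * sqrt(mse / n)
print(q, W)   # close to the tabled 4.52 and the hand-computed 6.15
```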

To identify the best quality agencies, let's rank the means:
mean a b c
122.05 1 1 1
121.23 2 1 1
116.93 1 2 1
116.25 2 2 1
111.25 1 1 2
110.97 2 1 2
92.68 1 2 2
91.75 3 1 1
90.60 2 2 2
85.53 3 2 1
79.95 3 1 2
61.05 3 2 2
Pairs of means that differ by more than 6.15 will be significantly
different (i.e. the CIs for the difference will not include zero)
using the Tukey procedure.
So it seems that the highest-quality agencies have Local Supervisors
(c=1), All Contract Work Performed In House (b=1), and High or Average
Fee Levels (a=1 or 2). The means for these two cells are
significantly greater than the means for almost all the others.


10.
Knee problem, design considerations
part (f)
Note: remember, for Tukey we don't need to worry about splitting up the
familywise error rate.
width = 5 = 2*q(.95,I,dfw)*sqrt(MSW/n)
= 2*q(.95,3,dfw)*sqrt(19.8/n) using MSW from previous analysis
a little algebra...
2.5 = q(.95,3,dfw)*sqrt(19.8/n)
6.25 = [q(.95,3,dfw)]^2 * 19.8/n
n = 3.168 * [q(.95,3,dfw)]^2
Problem is, we don't know exactly what q is without knowing n, because
q depends on degrees of freedom within. So we should use any prior
information we have to suggest a best guess.
The widths of the intervals in our previous analysis were around 11.
Since we want to cut this width approximately in half, we'll need to
quadruple (approximately) our group sample sizes. So start with n=40
as a best guess, which gives dfw = 120 - 3 = 117.
Using Table B.9, q(.95,3,117) = 3.36 (approx.).
Therefore n = 3.168 * (3.36)^2 = 36, which is pretty close to our
original guess. Anything reasonably close to this number is acceptable.
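The guess-and-check above can be automated: a sketch that searches for the smallest n whose Tukey interval width is at most 5, assuming I = 3 groups and MSW = 19.8 as in the previous analysis, using scipy's studentized range distribution for q:

```python
from math import sqrt
from scipy.stats import studentized_range

def tukey_width(n, n_groups=3, msw=19.8, conf=0.95):
    """Full width of a Tukey CI for a pairwise difference, n per group."""
    dfw = n_groups * (n - 1)
    q = studentized_range.ppf(conf, n_groups, dfw)
    return 2 * q * sqrt(msw / n)

n = 2
while tukey_width(n) > 5:
    n += 1
print(n)   # lands near the hand calculation of n = 36
```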

11.
a)
NWK 26.7
What is the power of the test in Problem 16.12?
Sample sizes were n_1 = 8, n_2 = 10, n_3=6
Test was carried out with Type I error rate alpha set to .01
For the power calculation we are given:
mu_1 = 37 mu_2 = 35 mu_3 = 28
sigma = 4.5
Begin by finding the Weighted Mean following NWK sec 26.4
(also example on p.1055)
Weighted Mean, mu(.) = (8*37 + 10*35 + 6*28)/24 = 33.9167
Next, compute the quantity called phi from NWK p. 1055
phi is a measure of how unequal the mu_i are.
phi = sqrt[(8*(37 - 33.9167)^2 + 10*(35 - 33.9167)^2 + 6*(28 - 33.9167)^2)/3] / 4.5
    = 2.21418
The degrees of freedom for the omnibus test are 2 and 21
(df1 = r - 1 = 3 - 1 = 2; df2 = n_T - r = 24 - 3 = 21).
Interpolating in Table B.11 (NWK p. 1356) to look up the related
value of power:
Power = .70 or .71
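Instead of interpolating Table B.11, the power can be computed exactly from the noncentral F distribution (the noncentrality parameter is nc = r * phi^2, with r = 3 groups here); a scipy sketch:

```python
from scipy.stats import f, ncf

ns = [8, 10, 6]          # group sample sizes from Problem 16.12
mus = [37, 35, 28]       # assumed treatment means
sigma = 4.5
r = len(ns)              # 3 groups
n_t = sum(ns)            # 24 observations in total
mu_dot = sum(n * m for n, m in zip(ns, mus)) / n_t   # weighted mean, 33.9167
# Noncentrality: sum of n_i*(mu_i - mu.)^2 over sigma^2 (equals r*phi^2)
nc = sum(n * (m - mu_dot) ** 2 for n, m in zip(ns, mus)) / sigma ** 2
crit = f.ppf(0.99, r - 1, n_t - r)        # alpha = .01 critical value
power = ncf.sf(crit, r - 1, n_t - r, nc)  # close to the tabled .70-.71
print(round(power, 3))
```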
b)
NWK 26.19a p.1066.
For a single factor study and equal treatment sample sizes and:
6 groups , Type I error rate alpha set to .05, and
Type II error rate beta is desired to be .10 and
the quantity called DELTA = 50
Note: DELTA is the minimum range of factor level
means for which it is important to detect differences
between the mu(i) with high probability).
Determine the required sample sizes for sigma value of 50, 25, 20?
From the above, Power = 1 - beta = 1 - .10 = .90
Using Table B.12, and the related given and calculated info.:
NWK p. 1361
Sigma = 50 -> sample size = 34
Sigma = 25 -> sample size = 10
Sigma = 20 -> sample size = 7
What generalization is suggested by your results?
Clearly, the larger sigma is, the larger the sample size required
for a specified value of power.
(cf. NWK pp. 1056-1060)
Note: the choice of the assumed value for sigma is consequential
for the determination of needed sample sizes; being even a little
bit off on sigma makes a real difference.
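The same generalization falls out of an exact noncentral-F search in place of Table B.12; a sketch assuming the least-favorable configuration such tables are typically built on (two means DELTA apart, the rest at their midpoint, so the noncentrality reduces to nc = n*DELTA^2/(2*sigma^2)):

```python
from scipy.stats import f, ncf

def needed_n(sigma, delta=50, r=6, alpha=0.05, target=0.90):
    """Smallest per-group n whose power to detect a range delta hits target."""
    n = 2
    while True:
        nc = n * delta ** 2 / (2 * sigma ** 2)
        crit = f.ppf(1 - alpha, r - 1, r * (n - 1))
        if ncf.sf(crit, r - 1, r * (n - 1), nc) >= target:
            return n
        n += 1

sizes = [needed_n(s) for s in (50, 25, 20)]
print(sizes)   # compare with Table B.12's 34, 10, 7
```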

****************************************************************************

END