Education 161 Winter 2000
Assignment 2 Solutions Feb 1, 2000
1.
First order of business is to read in outcomes and set up data file.
MTB > read '[data from file]' c1
36 ROWS READ
C1
26 23 28 19 . . .
**So far so good****
Now we can use the SET command to construct row
and column indices (or use Calc > Make Patterned Data
from the menu).
**set up row index*****
MTB > set c2
DATA> (1:2)18.
DATA> end
**let's look at it***
MTB > print c2
C2
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2
**set up col index***
MTB > set c3
DATA> 2(1:3)6
DATA> end
MTB > print c3
C3
1 1 1 1 1 1 2 2 2 2 2 2 3 3 3
3 3 3 1 1 1 1 1 1 2 2 2 2 2 2
3 3 3 3 3 3
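For anyone replicating this outside Minitab, the same patterned indices can be built in Python with NumPy (a sketch, not part of the original session):

```python
import numpy as np

# Row index: 18 ones then 18 twos (the Minitab pattern (1:2)18)
row = np.repeat([1, 2], 18)

# Column index: 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3, repeated
# twice (the Minitab pattern 2(1:3)6)
col = np.tile(np.repeat([1, 2, 3], 6), 2)

print(row)
print(col)
```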
a. For profile plot get cell means:
MTB > table c2 c3;
SUBC> stats c1.
ROWS: C2 COLUMNS: C3
1 2 3 ALL
1 6 6 6 18
23.167 28.333 12.833 21.444
3.971 4.320 4.792 7.801
2 6 6 6 18
20.500 25.000 32.833 26.111
4.135 2.898 3.430 6.201
ALL 12 12 12 36
21.833 26.667 22.833 23.778
4.108 3.916 11.175 7.337
CELL CONTENTS --
C1:N
MEAN
STD DEV
I won't try to draw here. Main effects of both Time of Isolation and
Level of Reinforcement are best interpreted keeping in mind the
disordinal interaction indicated in the profile plot. The profile
plot indicates that recall increases steadily with time in isolation
only for verbally reinforced children. For unreinforced children,
recall increases from 20 to 40 minutes of isolation, but decreases
from the 40 minute level (and falls below the 20 minute level) for 60
minutes of isolation.
see Hopkins and Glass (HG) 18.2-18.5
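Since the profile plot isn't drawn here, a rough Python/matplotlib version can be sketched from the cell means in the TABLE output (the row labels are inferred from the discussion above: row 2, which increases steadily, is taken to be the verbally reinforced group):

```python
import matplotlib
matplotlib.use("Agg")  # render to file, no display needed
import matplotlib.pyplot as plt

isolation = [20, 40, 60]                    # minutes of isolation
unreinforced = [23.167, 28.333, 12.833]     # row 1 cell means
reinforced = [20.500, 25.000, 32.833]       # row 2 cell means

plt.plot(isolation, unreinforced, marker="o", label="no reinforcement")
plt.plot(isolation, reinforced, marker="o", label="verbal reinforcement")
plt.xlabel("Minutes of isolation")
plt.ylabel("Mean recall")
plt.legend()
plt.savefig("profile.png")
```

The crossing lines are what make the interaction disordinal.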
b. The model for these data is:
y(ijk) = mu + alpha(i) + beta(j) + alphabeta (ij) + epsilon(ijk)
where mu is the grand mean of all observations.
y(ijk) is recall for child k observed within the group
defined by reinforcement level i and isolation level j.
alpha(i) is the effect of reinforcement at level i.
beta(j) is the effect of isolation at level j.
alphabeta (ij) is the effect of the interaction between reinforcement
at level i and isolation at level j .
epsilon(ijk) is a random error
see HG 18.8
part c.
The ANOVA table below indicates significant main effects and interaction
between the two factors (Time of Isolation and Level of Reinforcement) using
overall error rate .05.
MTB > anove c1 = c2|c3
***Note: a wonderful feature of Minitab is that it executes misspelled
commands as long as they are uniquely determinable*****
Factor Type Levels Values
C2 fixed 2 1 2
C3 fixed 3 1 2 3
Analysis of Variance for C1
Source DF SS MS F P
C2 1 196.00 196.00 12.42 0.001
C3 2 156.22 78.11 4.95 0.014
C2*C3 2 1058.67 529.33 33.55 0.000
Error 30 473.33 15.78
Total 35 1884.22
To get the critical values for the series of 3 hypothesis tests,
we use Minitab:
MTB > invcdf .983;
SUBC> f 2 30.
0.9830 4.6817
MTB > invcdf .983;
SUBC> f 1 30.
0.9830 6.3871
This gives each test an individual error rate of .017 (cf. alphatot.tab).
We are able to reject the main effects and interaction null hypotheses.
see HG 18.17
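The same INVCDF lookups can be checked in Python with SciPy (an aside, not part of the original session):

```python
from scipy.stats import f

# Per-test rate .017 (so cumulative probability .983), as above
crit_2_30 = f.ppf(0.983, 2, 30)  # for the C3 and C2*C3 tests
crit_1_30 = f.ppf(0.983, 1, 30)  # for the C2 test
print(round(crit_2_30, 4), round(crit_1_30, 4))
```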
------------------------------------------------------------------
2.
part (f)
Note: remember, for Tukey we don't need to worry about splitting up the
familywise error rate.
width = 5 = 2*q(.95,I,dfw)*sqrt(MSW/n)
= 2*q(.95,3,dfw)*sqrt(19.8/n) using MSW from previous analysis
a little algebra...
2.5 = q(.95,3,dfw)*sqrt(19.8/n)
6.25 = [q(.95,3,dfw)]^2 * 19.8/n
n = 3.168 * [q(.95,3,dfw)]^2
Problem is, we don't know exactly what q is without knowing n, because
q depends on degrees of freedom within. So we should use any prior
information we have to suggest a best guess.
The widths of the intervals in our previous analysis were around 11.
Since we want to cut this width approximately in half, we'll need to
quadruple (approximately) our group sample sizes. So start with n=40
as a best guess, which gives dfw =120-3 =117.
Using Table values, q(.95,3,117) = 3.36 (approx.).
Therefore n = 3.168 * (3.36)^2 = 35.8--i.e., about 36 subjects in each of the
three groups--which is pretty close to our original guess.
Anything reasonably close to this number is acceptable.
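The guess-and-check step can be automated. Here is a sketch in Python using SciPy's studentized range distribution (requires scipy >= 1.7), iterating n until it stabilizes:

```python
from scipy.stats import studentized_range

n = 40.0  # starting guess, as above
for _ in range(20):
    dfw = 3 * n - 3
    q = studentized_range.ppf(0.95, 3, dfw)  # q(.95, 3, dfw)
    n_new = 3.168 * q ** 2
    if abs(n_new - n) < 0.5:
        break
    n = n_new
print(round(n_new))  # lands near 36 per group
```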
---------------------------------------------------
3.
(i)
So first let's take a look at the descriptive statistics for these
data (five groups, single classification).
MTB > desc c1-c5
N MEAN MEDIAN TRMEAN STDEV SEMEAN
C1 5 39.40 36.00 39.40 7.92 3.54
C2 5 44.20 41.00 44.20 10.40 4.65
C3 5 52.0 62.0 52.0 26.0 11.6
C4 5 40.80 43.00 40.80 10.13 4.53
C5 5 53.20 52.00 53.20 17.48 7.82
MIN MAX Q1 Q3
C1 32.00 51.00 33.00 47.50
C2 32.00 56.00 35.00 55.00
C3 10.0 75.0 27.5 71.5
C4 30.00 55.00 31.00 49.50
C5 35.00 80.00 38.00 69.00
The standard deviations range from 8 to 26, representing variances
ranging from 64 to about 680, a ratio of roughly 10 to 1 (rather non-equal).
One consideration is that the sample sizes are very small, so these sample
differences need not reflect large population differences; and since the
sample sizes are equal, we don't need to worry much about the effects of
unequal variances on the one-way anova tests. To carry this further, we
could look at the Brown-Forsythe or Welch alternatives to the standard
anova (done in part ii).
see HG Table 15.3
ii. Standard one-way anova (unstacked data). The F-test for the
null hypothesis of equal group means across the 5 groups (against
the alternative of unequal means) has a small value, 0.81.
Comparing with the critical value based on F(4,20) (F.95(4,20) = 2.87),
we cannot reject the null hypothesis of equal group means.
MTB > aovoneway c1-c5
ANALYSIS OF VARIANCE
SOURCE DF SS MS F p
FACTOR 4 808 202 0.81 0.536
ERROR 20 5016 251
TOTAL 24 5824
INDIVIDUAL 95 PCT CI'S FOR MEAN
BASED ON POOLED STDEV
LEVEL N MEAN STDEV ----------+---------+---------+------
C1 5 39.40 7.92 (-----------*-----------)
C2 5 44.20 10.40 (-----------*-----------)
C3 5 52.00 25.97 (-----------*------------)
C4 5 40.80 10.13 (-----------*-----------)
C5 5 53.20 17.48 (-----------*------------)
----------+---------+---------+------
POOLED STDEV = 15.84 36 48 60
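Two of the quoted numbers are easy to verify in Python: the largest-to-smallest variance ratio from part (i) and the critical value F.95(4,20). The raw data aren't reproduced here, so this only checks the summary figures:

```python
from scipy.stats import f

sds = [7.92, 10.40, 25.97, 10.13, 17.48]  # group SDs from the output
ratio = max(sds) ** 2 / min(sds) ** 2     # largest-to-smallest variance
crit = f.ppf(0.95, 4, 20)                 # critical value for the F test
print(round(ratio, 1), round(crit, 2))
```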
(iii).
For comparison, we are asked to try out the non-parametric
alternative to the standard one-way anova, Kruskal-Wallis.
We need to put the data in stacked form to carry this out.
MTB > stack c1 c2 c3 c4 c5 into c6;
SUBC> subscripts c7.
MTB > kruskal-wallis c6 c7
LEVEL NOBS MEDIAN AVE. RANK Z VALUE
1 5 36.00 9.5 -1.19
2 5 41.00 12.3 -0.24
3 5 62.00 17.0 1.36
4 5 43.00 10.1 -0.99
5 5 52.00 16.1 1.05
OVERALL 25 13.0
H = 4.32 d.f. = 4 p = 0.366
H = 4.33 d.f. = 4 p = 0.364 (adj. for ties)
And turning to the chi-square with (5-1) degrees of freedom (see NWK 18.7)
the critical value is 9.49 (type I error rate .05). The test statistic H
is 4.3. So we do not reject Ho, just as with the
parametric anova on raw data (or even transformed data if we had tried
to stabilize variance).
see HG 15.30
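The chi-square reference values quoted above can be confirmed with SciPy:

```python
from scipy.stats import chi2

crit = chi2.ppf(0.95, 4)  # critical value with 5 - 1 = 4 df
p = chi2.sf(4.33, 4)      # approximate p-value for the adjusted H
print(round(crit, 2), round(p, 3))
```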
-----------------------------------------------
4.
a. Interpretation of contrasts
Contrast 1: Is the average learning outcome score of animals
given both food and water different from the average score of animals
deprived of one or both substances?
Contrast 2: For animals given both food and water, does it make a
difference whether they receive it "ad lib" versus twice a day?
Contrast 3: Does the average learning of animals deprived of
either food or water differ from the learning of animals deprived of
both?
Contrast 4: Does the effect of being deprived of food differ from
the effect of being deprived of water?
see HG 17.8 but especially 17.9
b. Verifying orthogonality
First, write out the coefficients for all 4 contrasts:
Group
1 2 3 4 5
C1: 1/2 1/2 -1/3 -1/3 -1/3
C2: 1 -1 0 0 0
C3: 0 0 1/2 1/2 -1
C4: 0 0 1 -1 0
To verify orthogonality, multiply corresponding coefficients for a
pair of contrasts and add them up. If the pair is orthogonal, this
sum will be zero. Do this for each of the 4-choose-2 = 6 pairs.
C1 & C2: sigma(a[i]b[i]) = (1/2)(1)+(1/2)(-1)+0+0+0=0
C1 & C3: 0+0+(-1/3)(1/2)+(-1/3)(1/2)+(-1/3)(-1)=0
C1 & C4: 0+0+(-1/3)(1)+(-1/3)(-1)+0=0
C2 & C3: 0+0+0+0+0=0
C2 & C4: 0+0+0+0+0=0
C3 & C4: 0+0+(1/2)(1)+(1/2)(-1)+0=0
So all contrasts are orthogonal with one another. And since we have
5-1=4 degrees of freedom between, we have a full set of orthogonal
contrasts here.
see HG 17.16
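The six dot products can also be checked mechanically; a short Python/NumPy sketch:

```python
import numpy as np
from itertools import combinations

# Contrast coefficients from the table above, one row per contrast
C = np.array([
    [1/2, 1/2, -1/3, -1/3, -1/3],
    [1.0, -1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1/2, 1/2, -1.0],
    [0.0, 0.0, 1.0, -1.0, 0.0],
])

for i, j in combinations(range(4), 2):
    print(f"C{i+1} & C{j+1}:", np.dot(C[i], C[j]))
```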
c. First run the ANOVA, which will give us the sample group means and
other things we'll need later.
MTB > read '/usr/class/ed257/96hw1p5.dat' c1-c5
5 ROWS READ
ROW C1 C2 C3 C4 C5
1 18 20 6 15 12
2 20 25 9 10 11
3 21 23 8 9 8
4 16 27 6 12 13
. . .
MTB > aovoneway c1-c5
ANALYSIS OF VARIANCE
SOURCE DF SS MS F p
FACTOR 4 816.00 204.00 36.43 0.000
ERROR 20 112.00 5.60
TOTAL 24 928.00
INDIVIDUAL 95 PCT CI'S FOR MEAN
BASED ON POOLED STDEV
LEVEL N MEAN STDEV -+---------+---------+---------+-----
C1 5 18.000 2.550 (---*---)
C2 5 24.000 2.646 (---*---)
C3 5 8.000 2.121 (--*---)
C4 5 12.000 2.550 (---*---)
C5 5 11.000 1.871 (--*---)
-+---------+---------+---------+-----
POOLED STDEV = 2.366 6.0 12.0 18.0 24.0
The point estimate for a contrast, denoted l-hat, is
a[1]X-bar[1]+...+a[I]X-bar[I]
Use the sample means above to construct point estimates for our 4
contrasts:
l-hat[1] = (1/2)(18)+(1/2)(24)+(-1/3)(8)+(-1/3)(12)+(-1/3)(11) = 10.67
l-hat[2] = (1)(18)+(-1)(24) = -6
l-hat[3] = (1/2)(8)+(1/2)(12)+(-1)(11) = -1
l-hat[4] = (1)(8)+(-1)(12) = -4
d. Looking up alpha(total) = .10 and C=4 on alphatot.tab gives an
individual Type I error rate of .0259963. So our t critical value is
t(20, .9871).
To find the actual value of t, Minitab gives us:
MTB > invcdf .9871;
SUBC> t 20.
0.9871 2.4082
Interval estimates:
l-hat[1]: 10.67 +/- 2.4082*sqrt(5.6(1/4+1/4+1/9+1/9+1/9)/5) =
[8.34, 12.99]
l-hat[2]: -6 +/- 2.4082*sqrt(5.6(1+1)/5) =
[-9.60, -2.40]
l-hat[3]: -1 +/- 2.4082*sqrt(5.6(1/4+1/4+1)/5) =
[-4.12, 2.12]
l-hat[4]: -4 +/- 2.4082*sqrt(5.6(1+1)/5) =
[-7.60, -.40]
Only one of these intervals, the one for contrast 3, contains zero.
So we conclude that the effects of food & water versus some
deprivation (l-hat[1]), timing of receiving food & water (l-hat[2]), and being
deprived of food versus water (l-hat[4]) do make a difference on our
learning outcome measure, whereas being deprived of either food or
water versus being deprived of both does not matter. Note that the
interval for the 4th contrast comes pretty close to 0 but does not
contain it.
see HG 17.17, and formula 17.8A-C
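The four intervals can be reproduced in a few lines of Python, using the group means, MSW = 5.6, and n = 5 from the AOVONEWAY output:

```python
import math
from scipy.stats import t

means = [18, 24, 8, 12, 11]   # group means from the output above
msw, n, df = 5.6, 5, 20
contrasts = [
    [1/2, 1/2, -1/3, -1/3, -1/3],
    [1, -1, 0, 0, 0],
    [0, 0, 1/2, 1/2, -1],
    [0, 0, 1, -1, 0],
]
tcrit = t.ppf(0.9871, df)     # per-contrast rate .026, alpha_tot = .10

for a in contrasts:
    est = sum(ai * m for ai, m in zip(a, means))
    half = tcrit * math.sqrt(msw * sum(ai**2 for ai in a) / n)
    print(round(est - half, 2), round(est + half, 2))
```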
--------------------------------------------
5.
"More on interactions..."
Preamble:
This problem was constructed to address the question,
What more can be done about describing (or drawing inferences) about
the interaction terms beyond just (rejecting or not) the omnibus null
hypothesis of no interaction? Previously, we had discussed the
importance of the profile plot as the major descriptive technique.
A strategy for following up the graphical display involves
estimating the row effects separately for each level of the column
factor. The college mathematics learning
example (a 2x3 design) that was described in class is a good template
for this example.
Note: In the output below, the columns c1-c4 in the data file
extraint.dat are labelled as c10-c13--that is, the outcome is in c10 and the
row and column indicators are in c12 and c13. Columns c1-c6 here contain the
data for each cell in the 2x3 design.
The data (5 replications in the 2x3 design) were
generated with within-cell variance of 9 and cell means
12 15 18
10 11 12
Here's how I did it, putting each cell in its own column.
Generate the data:
MTB > random 5 c1;
SUBC> normal 12 3.
MTB > random 5 c2;
SUBC> normal 15 3.
MTB > random 5 c3;
SUBC> normal 18 3.
MTB > random 5 c4;
SUBC> normal 10 3.
MTB > random 5 c5;
SUBC> normal 11 3.
MTB > random 5 c6;
SUBC> normal 12 3.
Here's the data with each cell of the 2x3 in its own column.
MTB > print c1-c6
ROW C1 C2 C3 C4 C5 C6
1 8.1988 20.3003 17.1762 9.5724 9.7497 10.3652
2 8.7541 16.3191 18.7042 11.1114 6.6897 12.2657
3 14.5011 14.4865 16.4200 13.2167 13.9392 9.7098
4 8.0545 12.3120 17.5973 11.9217 14.0845 11.9133
5 7.8095 14.6511 21.7529 11.6024 7.9765 10.0781
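A Python analogue of the RANDOM/NORMAL commands (the seed is arbitrary, so the draws will not reproduce the numbers printed above):

```python
import numpy as np

rng = np.random.default_rng(161)        # arbitrary seed
cell_means = [[12, 15, 18], [10, 11, 12]]
# Five replicates per cell, within-cell SD 3
cells = {(r, c): rng.normal(mu, 3, size=5)
         for r, row in enumerate(cell_means, 1)
         for c, mu in enumerate(row, 1)}
for key, vals in sorted(cells.items()):
    print(key, np.round(vals, 2))
```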
------------------------------
Let's start the analysis
Describe two-way data using the stacked form that you have in extraint.dat
MTB > table c12 c13;
SUBC> stats c10.
ROWS: C12 COLUMNS: C13
1 2 3 ALL
1 5 5 5 15
9.464 15.614 18.330 14.469
2.837 2.982 2.084 4.563
2 5 5 5 15
11.485 10.488 10.866 10.946
1.323 3.396 1.147 2.086
ALL 10 10 10 30
10.474 13.051 14.598 12.708
2.343 4.047 4.241 3.919
CELL CONTENTS --
C10:N
MEAN
STD DEV
Obtain anova table
MTB > twoway c10 c12 c13
ANALYSIS OF VARIANCE C10
SOURCE DF SS MS
C12 1 93.07 93.07
C13 2 86.80 43.40
INTERACTION 2 122.10 61.05
ERROR 24 143.52 5.98
TOTAL 29 445.49
Clearly, the interaction is significant, as are the main effects.
A profile plot based on the sample data will show a marked interaction,
which actually appears disordinal even though for the population cell
means the interaction is ordinal.
First, let's compare row cell means at levels of the column (1,2,3
respectively) by a series of two-sample t inferences. First the
default 95% interval produced by twosample is shown. This is most
likely what is commonly done in the literature when such a
comparison is attempted. We know that MUCH better practice would
be to specify (following Bonferroni) a set of 98.5% intervals to
control the overall confidence coefficient to approx 95%. Here I
have my original data set with each cell in its own column; you
can get there by unstacking extraint.dat.
Comparing the two rows at the first level of the column factor:
MTB > twosample c1 c4
TWOSAMPLE T FOR C1 VS C4
N MEAN STDEV SE MEAN
C1 5 9.46 2.84 1.3
C4 5 11.48 1.32 0.59
95 PCT CI FOR MU C1 - MU C4: (-5.6, 1.58)
TTEST MU C1 = MU C4 (VS NE): T= -1.44 P=0.21 DF= 5
Same for the second level of the column factor:
MTB > twosample c2 c5
TWOSAMPLE T FOR C2 VS C5
N MEAN STDEV SE MEAN
C2 5 15.61 2.98 1.3
C5 5 10.49 3.40 1.5
95 PCT CI FOR MU C2 - MU C5: (0.3, 9.9)
TTEST MU C2 = MU C5 (VS NE): T= 2.54 P=0.039 DF= 7
And for the third level of the column factor:
MTB > twosample c3 c6
TWOSAMPLE T FOR C3 VS C6
N MEAN STDEV SE MEAN
C3 5 18.33 2.08 0.93
C6 5 10.87 1.15 0.51
95 PCT CI FOR MU C3 - MU C6: (4.86, 10.07)
TTEST MU C3 = MU C6 (VS NE): T= 7.02 P=0.0004 DF= 6
An interesting note on the above is that the interval estimate at the
second level of the column factor will include 0 for any reasonable
use of an overall confidence coefficient of 95% (i.e. p-value =~ .04)
------
Now let's use a more appropriate confidence
coefficient considering we are doing 3 of these--98.3%, which
is close to Bonferroni with overall 95% for the 3. (I checked that
Minitab really does do .983 confidence here, not just .98 as
the output indicates).
The confidence intervals are as you would expect; the second
includes 0.
MTB > twosample .983 c1 c4.
TWOSAMPLE T FOR C1 VS C4
N MEAN STDEV SE MEAN
C1 5 9.46 2.84 1.3
C4 5 11.48 1.32 0.59
98 PCT CI FOR MU C1 - MU C4: (-6.9, 2.90)
TTEST MU C1 = MU C4 (VS NE): T= -1.44 P=0.21 DF= 5
MTB > twosample .983 c2 c5.
TWOSAMPLE T FOR C2 VS C5
N MEAN STDEV SE MEAN
C2 5 15.61 2.98 1.3
C5 5 10.49 3.40 1.5
98 PCT CI FOR MU C2 - MU C5: (-1.2, 11.4)
TTEST MU C2 = MU C5 (VS NE): T= 2.54 P=0.039 DF= 7
MTB > twosample .983 c3 c6.
TWOSAMPLE T FOR C3 VS C6
N MEAN STDEV SE MEAN
C3 5 18.33 2.08 0.93
C6 5 10.87 1.15 0.51
98 PCT CI FOR MU C3 - MU C6: (3.98, 10.95)
TTEST MU C3 = MU C6 (VS NE): T= 7.02 P=0.0004 DF= 6
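As a cross-check, the Welch interval for c2 vs c5 at confidence .983 can be recomputed in Python from the printed data. Note that Minitab truncates the Welch degrees of freedom to an integer (7 here), while the fractional value is about 7.9, so the interval below comes out slightly narrower than Minitab's:

```python
import numpy as np
from scipy.stats import t

c2 = np.array([20.3003, 16.3191, 14.4865, 12.3120, 14.6511])
c5 = np.array([9.7497, 6.6897, 13.9392, 14.0845, 7.9765])

d = c2.mean() - c5.mean()
v2, v5 = c2.var(ddof=1) / 5, c5.var(ddof=1) / 5
se = np.sqrt(v2 + v5)
df = (v2 + v5) ** 2 / (v2**2 / 4 + v5**2 / 4)  # Welch-Satterthwaite
tcrit = t.ppf(1 - 0.017 / 2, df)               # .983 two-sided
print(round(d - tcrit * se, 1), round(d + tcrit * se, 1))
```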
----------------------------
Now compare with pairwise intervals from Tukey with family-wise
95% confidence. Here I use the stacked data in the form you have
it in extraint.dat. Oneway is run on the outcome and the
indicator of group (1,...,6).
MTB > oneway c10 c11;
SUBC> tukey.
ANALYSIS OF VARIANCE ON C10
SOURCE DF SS MS F p
C11 5 301.97 60.39 10.10 0.000
ERROR 24 143.52 5.98
TOTAL 29 445.49
INDIVIDUAL 95 PCT CI'S FOR MEAN
BASED ON POOLED STDEV
LEVEL N MEAN STDEV --+---------+---------+---------+----
1 5 9.464 2.837 (-----*----)
2 5 15.614 2.982 (-----*-----)
3 5 18.330 2.084 (-----*----)
4 5 11.485 1.323 (-----*----)
5 5 10.488 3.396 (----*-----)
6 5 10.866 1.147 (----*-----)
--+---------+---------+---------+----
POOLED STDEV = 2.445 8.0 12.0 16.0 20.0
Tukey's pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.00498
Critical value = 4.37
Intervals for (column level mean) - (row level mean)
1 2 3 4 5
2 -10.929
-1.371
3 -13.646 -7.496
-4.087 2.063
4 -6.801 -0.650 2.066
2.758 8.908 11.624
5 -5.803 0.347 3.063 -3.782
3.755 9.905 12.621 5.776
6 -6.182 -0.032 2.685 -4.161 -5.158
3.376 9.527 12.243 5.398 4.401
The (2,5) entry is the most interesting. With family-wise confidence
coefficient 95%, Tukey provides an interval that does not include 0,
roughly matching the default two-sample interval, which carried a far
lower confidence coefficient for the set of 3 intervals.
------------------------------------------
Finally, another approach is to construct comparisons using the Bonferroni
method. The potential advantage of this over the
Tukey results above is that we can limit to just the 3 comparisons that
we seek.
Point estimates:
D-hat[1] = 9.464-11.485 = -2.021
D-hat[2] = 15.614 - 10.488 = 5.126
D-hat[3] = 18.330-10.866 = 7.464
Var(D-hat) = 2*MSW/n = 2(5.98)/5 = 2.392
B = t(1-.05/6, 24) = t(.9917,24) = 2.5754
Therefore the width of each interval is 2*2.5754*sqrt(2.392) = 2*3.98 = 7.96
Intervals
MU C1 - MU C4: (-2.021-3.98, -2.021+3.98) = (-6.00,1.96)
MU C2 - MU C5: (1.15,9.11)
MU C3 - MU C6: (3.48,11.44)
As with the Tukey intervals, only the interval for MU C1 - MU C4 contains
zero. Notice that these intervals are somewhat narrower than the Tukey
intervals (7.96 versus 9.56 width). Since we're only interested in 3
of the 15 possible pairwise comparisons, the Bonferroni method appears to
be an improvement over Tukey.
---------------------------
Here's a summary table for the comparison of the rows at each
of the 3 levels of the column factor.
2-sample t Tukey Bonferroni
t-int .983 overall .95 overall .95
------- ------ --------
CI FOR MU C1 - MU C4: (-6.9, 2.90) (-6.801, 2.758) (-6.00,1.96)
CI FOR MU C2 - MU C5: (-1.2, 11.4) ( 0.347, 9.905) (1.15,9.11)
CI FOR MU C3 - MU C6: (3.98, 10.95) ( 2.685, 12.243) (3.48,11.44)
----------------------------------------------------------------------------
problem 6 solution
Hopkins and Glass Problems 1 and 2 on page 527
First things first: We have to enter the data into a useful data
structure. Having entered the vocab scores into c1, we need to
create group membership variables for each of the factors. One
way is to manually enter 12 ones and then 12 twos into c2 and name
it IQ, then enter the appropriate sequence of ones, twos, and threes
in c3 and name it Method. For small datasets like this, that is an
acceptable approach, but it becomes quite tedious and time consuming
with large datasets. Below you will find commands that help with
this process. To create a series of 12 ones and then 12 twos in a
column named IQ, we do this either from the command line or from
the Calc > Make Patterned Data menu item:
MTB > Name c2 = 'IQ'
MTB > Set 'IQ'
DATA> 1( 1 : 2 / 1 )12
DATA> End.
Note that your use of this procedure depends on the order in which
you entered the vocab scores. To make a repeating series of
4 ones, 4 twos, and 4 threes in a column named Method, we do this:
MTB > Name c3 = 'Method'
MTB > Set 'Method'
DATA> 2( 1 : 3 / 1 )4
DATA> End.
The first part of the data analysis is to look at the cell means
and plot them by factors.
MTB > table c2 c3;
SUBC> mean c1.
Tabulated Statistics
Rows: IQ Columns: Method
1 2 3 All
1 31.000 23.000 24.000 26.000
2 29.000 18.000 19.000 22.000
All 30.000 20.500 21.500 24.000
From this we might presume that there is a column effect, a possible row
effect, and probably no interaction effect. Typing the following commands
will give you a nice plot of cell means by factors, which shows the same
features as the examination of cell means. This is also available from the
Stat > ANOVA > Interaction Plot menu.
MTB > %Interact 'IQ' 'Method';
SUBC> Response 'vocab'.
Now, we can run a two-way ANOVA in a number of different ways. As long
as the cells are balanced, we might as well use the anova command,
although we could use 'twoway' or another technique we haven't talked
about... glm. Here is what the Minitab help file has to say about our
options:
"Two-way analysis of variance performs an analysis of variance for testing
the equality of population means when classification of treatments is by two
variables or factors. Data must be balanced (all cells must have the same
number of observations) and factors must be fixed. If you wish to specify
certain factors to be random, use Balanced ANOVA if your data are balanced;
use General Linear Models if your data are unbalanced or if you wish to
compare means using multiple comparisons."
Using anova:
MTB > ANOVA 'vocab' = IQ| Method
Analysis of Variance (Balanced Designs)
Factor Type Levels Values
IQ fixed 2 1 2
Method fixed 3 1 2 3
Analysis of Variance for vocab
Source DF SS MS F P
IQ 1 96.00 96.00 4.11 0.058
Method 2 436.00 218.00 9.34 0.002
IQ*Method 2 12.00 6.00 0.26 0.776
Error 18 420.00 23.33
Total 23 964.00
What sense do we make of the 3 F values? If we want an experiment-wide
Type I error rate of about .05, then the alpha used to calculate the
critical F-value for each hypothesis could be .05/3 = .0167, using the
Bonferroni inequality. (As mentioned in class, there are other ways of
controlling Type I error when several hypotheses are tested. We
could just as easily have decided to use alpha = .02 for each of the
main factors and alpha = .01 for the test of the interaction hypothesis.
To be fair, all of this should be decided before you see the results of
the tests.)
To test the null hypothesis that IQ doesn't matter, we got F(1, 18) = 4.11.
MTB > InvCDF .9833; (this number is 1-alpha)
SUBC> F 1 18.
Inverse Cumulative Distribution Function
F distribution with 1 DF in numerator and 18 DF in denominator
P( X <= x) x
0.9833 6.9601
Thus F < Fcrit, and we fail to reject the hypothesis that IQ doesn't
matter.
Note that this hypothesis can be stated more formally in terms of factor
means (all equal) or main effects (all zero).
To test the null hypothesis that Method doesn't matter, we got F(2, 18) =
9.34.
MTB > InvCDF .9833;
SUBC> F 2 18.
Inverse Cumulative Distribution Function
F distribution with 2 DF in numerator and 18 DF in denominator
P( X <= x) x
0.9833 5.1814
Thus F > Fcrit and we reject the null hypothesis that method doesn't matter
and accept the alternative that method does matter. Put more formally, the
treatment effects for levels of method do not all equal zero.
Lastly, we can use the same logic and procedure and fail to reject the
hypothesis that there is no interaction effect
(F = .26 < Fcrit = 5.1814).
see HG 18.6, 18.17, 18.19
Question 2) To get multiple comparisons for factorial designs using Minitab,
we have to use the General Linear Model (GLM) command. For balanced designs,
GLM should give us virtually the same results as the anova command. See
below that the F values we get for our 3 hypotheses are the same as we got
from anova.
MTB > GLM 'vocab' = IQ| Method;
SUBC> Brief 2;
SUBC> Pairwise Method;
SUBC> Tukey.
General Linear Model
Analysis of Variance for vocab, using Adjusted SS for Tests
Source DF Seq SS Adj SS Adj MS F P
IQ 1 96.00 96.00 96.00 4.11 0.058
Method 2 436.00 436.00 218.00 9.34 0.002
IQ*Method 2 12.00 12.00 6.00 0.26 0.776
Error 18 420.00 420.00 23.33
Total 23 964.00
Tukey 95.0% Simultaneous Confidence Intervals
Response Variable vocab
All Pairwise Comparisons among Levels of Method
Method = 1 subtracted from:
Method Lower Center Upper ---+---------+---------+---------+---
2 -15.67 -9.500 -3.335 (-------*--------)
3 -14.67 -8.500 -2.335 (--------*--------)
---+---------+---------+---------+---
-14.0 -7.0 0.0 7.0
Method = 2 subtracted from:
Method Lower Center Upper ---+---------+---------+---------+---
3 -5.165 1.000 7.165 (-------*--------)
---+---------+---------+---------+---
-14.0 -7.0 0.0 7.0
From these 95% CIs for the differences between the means of the 3 levels of
Method, we can see that method 1 is significantly different
from methods 2 and 3, and that method 2 is not significantly different
from method 3. GLM also gives us the T statistics for each of the
pairwise comparisons, as presented below.
Tukey Simultaneous Tests
Response Variable vocab
All Pairwise Comparisons among Levels of Method
Method = 1 subtracted from:
Level Difference SE of Adjusted
Method of Means Difference T-Value P-Value
2 -9.500 2.415 -3.933 0.0027
3 -8.500 2.415 -3.519 0.0066
Method = 2 subtracted from:
Level Difference SE of Adjusted
Method of Means Difference T-Value P-Value
3 1.000 2.415 0.4140 0.9103
By hand, we can get the intervals by calculating Tukey's HSD (Alpha = .05):
(3.61)(1.71) = 6.17 (see H&G, p.525 for notation)
So, each interval can be constructed as the point estimate
plus or minus 6.17. For example, mean(method1)-mean(method2) = -9.5,
plus or minus 6.17, gives the interval (-15.67, -3.33).
Same answer as above.
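The hand calculation of the HSD can be double-checked with SciPy's studentized range distribution (requires scipy >= 1.7); n = 8 scores per method level after pooling over the IQ factor:

```python
import math
from scipy.stats import studentized_range

msw, n_per_method, df_err = 23.33, 8, 18
q = studentized_range.ppf(0.95, 3, df_err)  # tabled value is 3.61
hsd = q * math.sqrt(msw / n_per_method)
print(round(q, 2), round(hsd, 2))
```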