Education 161 Winter 2000
Assignment 1 Due Jan 18, 2000
Note: data files are available in one of two locations:
path: /usr/class/ed161/[data file]
or using web-services at URL
http://www.stanford.edu/class/ed161/hw/[data file]
-----------------------------------
1. The 179 participants in the Cartoon experiment--description below,
data in cartoon.dat--each saw cartoon and realistic slides.
(a) construct an appropriate statistical test to see if there is any
difference between the scores on the two types of slides each person
obtained immediately after presentation.
(b) construct a 90% confidence interval for the difference between
these scores on the two types of slides.
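
One possible way to set up parts (a) and (b), sketched in Python rather than Minitab: because each participant has both scores, the analysis works on the within-person differences. The scores below are made-up illustrative values, not taken from cartoon.dat, and the function names and the critical-value argument are assumptions of this sketch.

```python
import math
import statistics

def paired_summary(first, second):
    """Return (mean difference, standard error, t statistic, df) for paired scores."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean_diff, se, mean_diff / se, n - 1

def paired_interval(mean_diff, se, t_crit):
    """Confidence interval: mean difference +/- t_crit * standard error."""
    return mean_diff - t_crit * se, mean_diff + t_crit * se

# Hypothetical scores for five people (NOT the cartoon.dat values).
cartoon = [7, 5, 8, 6, 9]
real = [6, 5, 6, 4, 8]

mean_diff, se, t_stat, df = paired_summary(cartoon, real)
# 2.132 is the tabled upper 5% point of t with 4 df, which gives a 90%
# interval for this toy example; the real data (n = 179) need the value
# for 178 df instead.
low, high = paired_interval(mean_diff, se, t_crit=2.132)
print(f"t = {t_stat:.3f} on {df} df, 90% CI = ({low:.3f}, {high:.3f})")
```

With the real data, the two lists would be columns C6 (CARTOON1) and C7 (REAL1).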
------------
Cartoon Data in cartoon.dat
When educators make an instructional film, they have two objectives:
Will the people who watch the film learn the material as efficiently as
possible? Will they retain what they have learned?
To help answer these questions, an experiment was conducted to
evaluate the relative effectiveness of cartoon sketches and realistic
photographs, in both color and black and white visual materials.
A short instructional slide presentation was developed. The topic
chosen for the presentation was the behavior of people in a group situation
and, in particular, the various roles or character types that group members
often assume. The presentation consisted of a five-minute lecture on
tape, accompanied by 18 slides. Each role was identified as an animal.
Each animal was shown on two slides: once in a cartoon sketch and
once in a realistic picture. All 179 participants saw all of the 18 slides,
but a randomly selected half of the participants saw them in black and
white while the other half saw them in color.
After they had seen the slides, the participants took a test (immediate
test) on the material. The 18 slides were presented in a random order,
and the participants wrote down the character type represented by each
slide. They received two scores: one for the number of cartoon characters
they correctly identified and one for the number of realistic characters
they correctly identified. Each score could range from 0 to 9, since there
were nine characters.
Four weeks later, the participants were given another test (delayed
test) and their scores were computed again. Some participants did not
show up for this delayed test, so their scores were given the missing
value code *.
The primary participants in this study were preprofessional and
professional personnel at three hospitals in Pennsylvania involved in an
in-service training program. A group of Penn State undergraduate students
also were given the test as a comparison. All participants were given
the OTIS Quick Scoring Mental Ability Test, which yielded a rough
estimate of their natural ability.
Some questions that are of interest here are as follows: Is there a
difference between color and black and white visual aids? Between cartoon
and realistic? Is there any difference in retention? Does any difference
depend on educational level or location? Does adjusting for OTIS scores
make any difference?
The data are given below. They have been sorted partially so that
various parts may easily be studied separately.
Description of Cartoon Data
Variable        Description
C1 ID           Identification number
C2 COLOR        0 = black and white, 1 = color (no participant saw both)
C3 ED           Education: 0 = preprofessional, 1 = professional,
                2 = college student
C4 LOCATION     Location: 1 = hospital A, 2 = hospital B, 3 = hospital C,
                4 = Penn State student
C5 OTIS         Score: from about 70 to about 130
C6 CARTOON1     Score on cartoon test given immediately after presentation
                (possible scores are 0, 1, 2, ..., 9)
C7 REAL1        Score on realistic test given immediately after presentation
                (possible scores are 0, 1, 2, ..., 9)
C8 CARTOON2     Score on cartoon test given four weeks (delayed) after
                presentation (possible scores are 0, 1, 2, ..., 9; * is
                used for a missing observation)
C9 REAL2        Score on realistic test given four weeks (delayed) after
                presentation (possible scores are 0, 1, 2, ..., 9; * is
                used for a missing observation)
==============================================================================
2.
The data in the file rat.dat give the drop in blood pressure for
three groups of six rats from a strain of hypertensive rats. The six rats in
the first group (C1) were treated with a low dose of an antihypertensive product,
the second group (C2) with a higher dose of the same antihypertensive
product, and the third group (C3) with an inert control. Note that the
variability in blood pressure decreases, even for rats in the control group.
Also note that negative values represent increases in blood pressure.
Construct a 95% confidence interval for the difference in population
means between the low dose group and the control group.
Use Minitab with these data to construct the interval
estimate, making no assumption about equality of the group variances.
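
For readers who want to check the Minitab result by hand, the unequal-variance (Welch) interval can be sketched as follows. The numbers are hypothetical stand-ins, not the rat.dat values, and welch_summary is a name invented for this sketch.

```python
import math
import statistics

def welch_summary(x, y):
    """Difference in means, its standard error, and Welch-Satterthwaite df."""
    vx = statistics.variance(x) / len(x)  # s1^2 / n1
    vy = statistics.variance(y) / len(y)  # s2^2 / n2
    diff = statistics.mean(x) - statistics.mean(y)
    se = math.sqrt(vx + vy)               # no pooling of the two variances
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return diff, se, df

# Hypothetical drops in blood pressure (NOT the rat.dat values).
low_dose = [10, 12, 9, 14, 11, 10]
control = [2, -1, 3, 0, 1, 1]

diff, se, df = welch_summary(low_dose, control)
print(f"difference = {diff:.2f}, SE = {se:.3f}, approximate df = {df:.2f}")
# The 95% interval is diff +/- t * se, where t is the upper 2.5% point of
# the t distribution on the approximate df (from a table or software).
```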
=============================================
3. Complete the Anova Table given below. Also state and carry
out a test of the omnibus null hypothesis with Type I error rate
.10.
SOURCE      SS     df     MS
Between     80      4     **
Within      **      *     **
Total      480     44
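
The missing entries follow from the additivity of sums of squares and degrees of freedom. A minimal sketch of that arithmetic (the variable names are this sketch's own):

```python
# Known entries from the table.
ss_between, ss_total = 80, 480
df_between, df_total = 4, 44

# Sums of squares and degrees of freedom both add up to the Total row.
ss_within = ss_total - ss_between
df_within = df_total - df_between

# Each mean square is SS / df; the F statistic is the ratio of mean squares.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"Within: SS = {ss_within}, df = {df_within}, MS = {ms_within}")
print(f"Between: MS = {ms_between}, F = {f_stat}")
```

The F statistic is then compared with the upper 10% point of the F distribution on 4 and 40 degrees of freedom.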
--------------------------------------------------------------------------
4. Salary disputes and their eventual resolutions often leave both employer
and employees embittered by the entire ordeal. To assess employee reactions
to a recently devised salary and fringe benefits plan, the personnel
department obtained random samples of 15 employees from each of three
divisions: manufacturing, marketing, and research. Each employee sampled
was asked to respond (in confidence) to a series of questions. Several
employees refused to cooperate, as reflected in the unequal sample sizes.
A summary of the data is given below.
                   Manufacturing   Marketing   Research
Sample Size             12             14          11
Sample mean            25.2           32.6        28.1
Sample Variance         3.6            4.8         5.3
a. Write a model for this data structure.
b. Carry out an omnibus test of all three employee groups having equal
population means using a standard one-way analysis of variance procedure.
Use Type I error rate .01.
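
Since only summary statistics are given, the ANOVA quantities in part b have to be built from the sample sizes, means, and variances. One way to organize that computation, as a sketch with variable names of its own choosing:

```python
# Summary statistics from the table (manufacturing, marketing, research).
sizes = [12, 14, 11]
means = [25.2, 32.6, 28.1]
variances = [3.6, 4.8, 5.3]

n_total = sum(sizes)
k = len(sizes)

# Grand mean is the size-weighted average of the group means.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# Between-groups SS from the group means; within-groups SS from the variances.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
ss_within = sum((n - 1) * v for n, v in zip(sizes, variances))

df_between, df_within = k - 1, n_total - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.2f} on {df_between} df")
print(f"SS within  = {ss_within:.2f} on {df_within} df")
print(f"F = {f_stat:.2f}")
```

The resulting F is compared with the upper 1% point of the F distribution on the between and within degrees of freedom.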
----------------------------------------------------------------------------
5. I've had more knee operations than you've had statistics
courses....
A rehabilitation center researcher was interested in examining the
relationship between physical fitness prior to surgery of persons
undergoing corrective knee surgery and time required in physical therapy
until successful rehabilitation. 24 male subjects ranging in age from 18
to 30 years who had undergone similar corrective knee surgery during the
past year were selected for the study.
In the data file knee.dat
c1 contains the number of days required for successful completion of
physical therapy and c2 contains an indicator of prior physical fitness
status-- 1 = below average; 2 = average; 3 = above average.
(So this data set is of the form of a time-to-mastery study.)
a) obtain the mean and variance of time to recovery for each group
b) present a graphical look at the scores for the three groups
by constructing aligned dotplots for the three groups
c) carry out an anova for this one-way classification and test the
omnibus null hypothesis of no differences between the group means
using Type I error rate .05.
d) display residuals from the fit of the anova model for each group.
e) carry out the post-hoc pairwise comparison procedure in order to
obtain interval estimates of each pairwise comparison using
experimentwise error rate .05.
==============================================================
6. In the class materials, the file smsg.dat contains the data for the
SMSG versus traditional mathematics instruction evaluation discussed
in the first week of Ed257 (refer to Web site description).
C1 is group (1=SMSG 2=trad.); C2 is class mean mathematics achievement.
Carry out a single classification anova (here the classification
variable only has two levels) and show the equivalence to a pooled t-test.
Unstacking these data and using aovoneway will avoid an error message
from some Minitab versions about unequal group sizes.
----------------------------------------------------------------
7. An experiment was conducted to compare the effectiveness of
five different weight-reducing agents. A random sample of 50
males was randomly divided into five equal groups, with
preparation A assigned to the first group, B to the second group,
and so on. Each person in the experiment was given a prestudy
physical and told how many pounds overweight he was. A comparison
of the mean number of pounds overweight for the groups showed no
significant differences. The study program was then begun, with
each group taking the prescribed preparation for a fixed period
of time. At the end of the study period, weight losses were
recorded. The data for the 5 groups are given in columns 1-5 in
file weightloss.dat in the class directory.
a. Use standard one-way analysis of variance to carry out a test of the
omnibus null hypothesis of equal effectiveness of the weight-reducing
agents. Use Type I error rate .05.
b. Construct interval estimates for all pairwise comparisons using the
Tukey Method with family-wise confidence coefficient .95.
-------------------------------------------------------------
8. (former in-class Quiz question) "Don't Sweat."
An experimenter sought to determine the effects
of different levels of anxiety on test scores. Thirty subjects
were randomly assigned to one of three levels (ten to each
level): (1) low-anxiety, (2) moderate-anxiety, and (3)
high-anxiety conditions. A person's score was the number of
items answered correctly on the test. These scores are given
below (C1-C3 in the Minitab output):
     1            2            3
    Low        Moderate       High
  26   49      51   50      52   53
  34   74      50   48      64   77
  46   61      33   60      39   56
  48   51      28   71      54   63
  42   53      47   42      58   59
From the (edited) Minitab output below:
a. Provide the entries in the ANOVA table overwritten by 'a', 'b', and 'c'.
b. Carry out a test of the omnibus null hypothesis for the equality of
treatment means using Type I error rate .10.
Provide the details of the test.
c. For the output from the Tukey multiple comparisons procedure
provide the values written over by 'd' and 'e'.
MTB > describe c1-c3
            N     MEAN   MEDIAN   TRMEAN    STDEV
C1         10    48.40    48.50    48.00    13.33
C2         10    48.00    49.00    47.62    12.26
C3         10    57.50    57.00    57.37     9.79
MTB > stack c1 c2 c3 c10;
SUBC> subscripts c11.
MTB > oneway c10 c11;
SUBC> tukey .10.
ANALYSIS OF VARIANCE ON C10
SOURCE     DF       SS       MS
C11        aa     bbbb      ccc
ERROR      27     3813      141
TOTAL      29     4390
POOLED STDEV = 11.88
Tukey's pairwise comparisons
    Family error rate = 0.100
Individual error rate = 0.0413
Intervals for (column level mean) - (row level mean)
              1           2
   2     -10.99
          11.79
   3     -20.49      dddddd
           2.29      eeeeee
(Of course, if you like, you can recreate this entire output
using the data in the problem.)
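
For anyone taking up that invitation without Minitab, here is a rough Python sketch that rebuilds the ANOVA quantities from the scores listed in the problem. The variable names are this sketch's own. The Tukey half-width is not computed from a studentized-range table; it is recovered from the one complete interval printed in the output, which works only because the three groups are the same size.

```python
import statistics

# Scores as listed in the problem (two sub-columns per anxiety level).
groups = {
    "Low": [26, 34, 46, 48, 42, 49, 74, 61, 51, 53],
    "Moderate": [51, 50, 33, 28, 47, 50, 48, 60, 71, 42],
    "High": [52, 64, 39, 54, 58, 53, 77, 56, 63, 59],
}

all_scores = [y for scores in groups.values() for y in scores]
grand_mean = statistics.mean(all_scores)
means = {name: statistics.mean(scores) for name, scores in groups.items()}

# One-way ANOVA sums of squares and degrees of freedom.
ss_between = sum(len(s) * (means[g] - grand_mean) ** 2 for g, s in groups.items())
ss_within = sum((y - means[g]) ** 2 for g, s in groups.items() for y in s)
df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
print(f"C11   {df_between}  {ss_between:.0f}  {ms_between:.0f}")
print(f"ERROR {df_within}  {ss_within:.0f}  {ms_within:.0f}   F = {f_stat:.2f}")

# With equal group sizes every Tukey interval has the same half-width,
# so it can be read off the printed (level 1) - (level 2) interval.
half_width = (11.79 - (-10.99)) / 2
center = means["Moderate"] - means["High"]  # (column 2) - (row 3)
lower, upper = center - half_width, center + half_width
print(f"Interval for level 2 - level 3: ({lower:.2f}, {upper:.2f})")
```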