Education 161 FINAL PROBLEMS, Winter 2000
Solutions for these problems are to be submitted in hard-copy
form. Given that these problems are untimed, some care should be
taken in presentation, clarity, format. Especially important is
to give full and clear answers to questions, not just to submit
unannotated computer output, although relevant output should
be included.
You may use any inanimate resources--no collaboration. This
work is done under Stanford's Honor Code.
Please read the questions carefully and answer the question that
is asked.
Papers will be scored into 3 categories: "Excellent" indicates
successful completion of all parts of all questions (within
perhaps one or two very trivial arithmetic errors);
"Satisfactory" indicates a good attempt was made at all parts of
all problems, but there were some serious errors or omissions;
"Incomplete" indicates inadequate effort or performance.
Problems due Wed March 15 5PM in Rogosa's Cubberley mailbox
or deliver to Alex Harris in his office between 3-5 PM 3/15.
No extensions/exceptions.
Note data files are available in one of two locations:
path: /usr/class/ed161/[data file]
or using web-services at URL
http://www.stanford.edu/class/ed161/hw/[data file]
================================================================
Problem 1
This problem and adapted data are taken from a text by an
illustrious educational researcher.
Research Setting:
The general concern is with childhood aggressive behavior
(and especially its persistence into adult violent behavior
and other negative outcomes).
Hudley&Graham (Child Development, 1993) conducted a study (based
on attribution theory) which had as one of its outcome measures
an assessment of negative intent. The rationale for this measure
is that a boy's aggressive behavior might be a consequence of
his misattributing the intent of others ("actors") in ambiguous
situations. Thus, an intervention that taught these boys to
interpret actors' intent as something other than negative in
ambiguous situations should ultimately lessen their aggressive
behavior.
Data:
In the Hudley&Graham study, African American boys, average age
about 10.5 years, were assigned at random to one of three
experimental groups: (1) a 12 session intervention to infer
non-hostile intent in ambiguous situations; (2) attention
training (a lesser program to deal with effects of participating
in a study); (3) control group (no intervention or training).
Data on 36 subjects are in file aggress.dat .
c1 has Negative Intent Rating and c2 contains membership in the
three experimental groups (intervention = 1; attention = 2;
control = 3).
a) Write a statistical model for this single classification data structure
b) carry out an anova for this one-way classification and test the
omnibus null hypothesis of no differences between the group means
for Negative Intent using Type I error rate .05.
c) carry out a post-hoc pairwise comparison procedure using the
Tukey Method in order to construct interval estimates for all
pairwise comparisons with family-wise confidence coeff .95.
d) in planning a follow-up study which will have equal numbers of
subjects in each group, how many subjects should there be in each
group so that the interval estimate for these pairwise comparisons
will have width of 1.5 units (using experimentwise error rate
.01, i.e. confidence coefficient .99)?
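For reference, the quantities involved in parts (a), (c) and (d) can be written out as follows. This is a sketch under the usual one-way fixed-effects assumptions, not a prescribed solution: the side condition on the alpha's is one common choice among several, q denotes a quantile of the studentized range distribution, and the planning value for sigma^2 is assumed to be taken from the MSE of the anova in part (b).

```latex
% One-way (single classification) model: group i = 1,2,3; subject j = 1,...,n_i
Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2) \ \text{independent}, \qquad
\sum_{i=1}^{3} n_i \alpha_i = 0

% Tukey interval for groups i and i' when every group has n subjects
(\bar{Y}_{i\cdot} - \bar{Y}_{i'\cdot}) \pm
q_{1-\alpha;\,3,\,3(n-1)} \sqrt{\frac{MSE}{n}}

% Interval width as a function of n, with a planning value for \sigma^2;
% for part (d) look for the smallest n with W(n) \le 1.5
W(n) = 2\, q_{0.99;\,3,\,3(n-1)} \sqrt{\frac{\hat{\sigma}^2}{n}}
```

Because q depends on n through the error degrees of freedom 3(n-1), the search for n is iterative rather than closed-form.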
------------------------------------
Problem 2
We are far enough away from Los Angeles that we should be able
to view this with detachment.....
Assume that a statistical consultant has been called in to assist
the police department of a large city in evaluating its human
relations course for new officers. After the human relations
course is completed, an outcome measure 'attitude toward minority
groups' is obtained (higher is more positive); assume that an
instrument previously validated by the consultant is being used.
A total of 45 officers are involved in this study. Each officer
in the training has a type of beat (Factor A): upper-class,
middle-class, inner-city. Also, the training program has been
developed in three versions/levels (Factor B): 5 hours of human
relations training, 10 hours of training, 15 hours of training.
The study design has 5 officers for each combination of beat
location and training duration. The data can be arrayed as
follows:
               5hrs training     10hrs training    15hrs training
upper-class    24 33 37 29 42    44 36 25 27 43    38 29 28 47 48
middle-class   30 21 39 26 34    35 40 27 31 22    26 27 36 46 45
inner-city     21 18 10 31 20    41 39 50 36 34    42 52 53 49 64
-----------------------------------------------------------------
a. Construct both profile plots for this two-way data structure: one with
length of course on the horizontal axis and one with type of beat.
Comment on possible main effects and interactions.
b. Write out the statistical model for this two-way classification
c. Carry out the two-way anova and conduct the series of hypothesis tests
for main effects and interaction.
Keep your overall error rate at or below .05 for the 3 tests.
State conclusions and interpretation.
d. If appropriate, investigate further the interpretation of main
effects analyzing row effects at each level of the column factor and/or
vice versa.
Procedures for doing so were illustrated in problem 5 of HW2 for a
2x3 design. Follow that approach
for this data structure by restricting attention (for purposes of
this problem) to just the upper-class and inner-city beats (temporarily
setting aside the middle-class beat data). For this subset of the
data (now a 2x3 design) compare the results of procedures based on
pairwise Tukey intervals and Bonferroni intervals.
[Instructor note: setting aside the middle-class beat data
is a simplification merely for the purposes of keeping the
work of this problem manageable. In real life, inferences
comparing all three beats could be constructed at each level
of training duration for example]
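As an illustration of the arithmetic behind parts (a) and (c), here is a minimal Python sketch (standard library only) that computes the cell means and the balanced two-way anova sums of squares from the table above. The variable names are illustrative choices, not part of the course materials, and the sketch stops at the F ratios; p-values or critical values would still come from Minitab or an F table.

```python
# Attitude scores transcribed from the Problem 2 table (5 officers per cell).
beats = ["upper-class", "middle-class", "inner-city"]
hours = [5, 10, 15]
scores = {
    ("upper-class", 5): [24, 33, 37, 29, 42],
    ("upper-class", 10): [44, 36, 25, 27, 43],
    ("upper-class", 15): [38, 29, 28, 47, 48],
    ("middle-class", 5): [30, 21, 39, 26, 34],
    ("middle-class", 10): [35, 40, 27, 31, 22],
    ("middle-class", 15): [26, 27, 36, 46, 45],
    ("inner-city", 5): [21, 18, 10, 31, 20],
    ("inner-city", 10): [41, 39, 50, 36, 34],
    ("inner-city", 15): [42, 52, 53, 49, 64],
}
n = 5  # officers per cell


def mean(values):
    return sum(values) / len(values)


cell_mean = {cell: mean(ys) for cell, ys in scores.items()}
all_scores = [y for ys in scores.values() for y in ys]
grand_mean = mean(all_scores)
row_mean = {a: mean([y for h in hours for y in scores[(a, h)]]) for a in beats}
col_mean = {h: mean([y for a in beats for y in scores[(a, h)]]) for h in hours}

# Sums of squares for the balanced two-way layout.
ss_a = n * len(hours) * sum((row_mean[a] - grand_mean) ** 2 for a in beats)
ss_b = n * len(beats) * sum((col_mean[h] - grand_mean) ** 2 for h in hours)
ss_cells = n * sum((m - grand_mean) ** 2 for m in cell_mean.values())
ss_ab = ss_cells - ss_a - ss_b
ss_error = sum((y - cell_mean[c]) ** 2 for c, ys in scores.items() for y in ys)
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

df_a, df_b = len(beats) - 1, len(hours) - 1
df_ab = df_a * df_b
df_error = len(all_scores) - len(beats) * len(hours)
ms_error = ss_error / df_error

# Cell means are the points plotted in either profile plot.
for a in beats:
    print(f"{a:<13}", *(f"{cell_mean[(a, h)]:6.1f}" for h in hours))

# F ratios; compare each with a tabled F critical value.
for label, ss, df in [("beat", ss_a, df_a), ("hours", ss_b, df_b),
                      ("beat x hours", ss_ab, df_ab)]:
    print(f"{label:<13} SS={ss:9.2f} df={df} F={(ss / df) / ms_error:6.2f}")
print(f"{'error':<13} SS={ss_error:9.2f} df={df_error}")
```

Restricting the `beats` list to the upper-class and inner-city rows gives the 2x3 subset used in part (d).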
-------------------------------------------------------------------------
3.
IQ scores and reading ability
The file readiq.dat contains data (from a text) on 60
elementary school boys, 30 of whom were rated as poor or
very poor readers--at least 2 years below grade level. The
remaining 30 boys read normally, but otherwise resembled
the poor readers in terms of schools, age, family background,
and other variables. The 30 boys with reading problems
consisted of 11 "very poor" readers and 19 who were merely "poor"
readers. In the data file c1= 1 for very poor; c1 = 2 for poor;
c1 = 3 for normal.
The relation of reading disability to IQ measures is currently seen
not to be as simple as "poor readers have lower intelligence".
We have in column c4 the full-scale WISC-R IQ score. In c2 we
have the attention/concentration sub-scale score (composed of
arithmetic, digit-span, coding subtests). In c3 we have the
spatial ability sub-scale score (composed of picture completion,
block design, object assembly subtests).
a) Use Minitab to obtain scatterplots
for the attention/concentration and spatial ability scores
for each of the reading ability levels (1,2,3) in c1. Obtain the
sample correlation coefficients for each of the three scatterplots.
For the normal readers construct an interval estimate with .95
confidence coefficient for the correlation.
b)
For the normal readers, use the subscale scores in c2 and
c3 to form a prediction equation for the full-scale WISC-R
scores in c4. What are the coefficients and squared multiple
correlation for this regression fit? Plot the residuals
versus the fits for this regression. Obtain a 95% prediction
interval for the full-scale score for an individual having
attention/concentration score of 32 and spatial ability score
of 30.
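One common way to build the interval estimate asked for in part (a) is the Fisher z transformation. The sketch below assumes that approach (the problem does not name a method) and uses a made-up correlation of 0.50 purely for illustration; only the group size of 30 normal readers comes from the problem statement.

```python
import math
from statistics import NormalDist


def fisher_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical sample correlation; replace with the value computed from
# readiq.dat for the normal readers (c1 = 3).
r_example = 0.50
low, high = fisher_interval(r_example, n=30)
print(f"r = {r_example:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The interval is not symmetric about r because the back-transformation from the z scale is nonlinear.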
----------------------------------------------------------
4.
SLEEP Looking forward to getting more?
A simple experiment compared the effectiveness of two sedatives
in promoting length of sleep, labelled here as Drug A and Drug B.
Two groups of size 10 were formed by random assignment (in c3
group membership is coded A = 1, B =2) . The outcome measure in
c1 is the number of hours of sleep obtained by the subject after
taking the drug; the covariate in c2 is the number of hours of
sleep obtained by the subject normally with no medication.
Data reside in file sleep.dat.
------------
a. Construct a 90% confidence interval for difference of
group means on the outcome measure in c1 (i.e. do not use covariate c2
information).
b. Now consider use of covariate information in c2.
What are the sample within-group c1-on-c2 slopes (outcome on covariate)?
Carry out a preliminary test of the ancova assumption of equal c1-on-c2
slopes in each group with Type I error rate .10.
c. Obtain a point and interval estimate for the
analysis of covariance treatment effect. Use confidence coefficient .90.
Compare the width of this confidence interval with part (a).
Did use of the covariate help in the estimation? Comment.
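The point estimate in part (c) can be sketched as follows: fit the slope of the outcome on the covariate within each group, pool the within-group slopes, and adjust the difference in outcome means for the difference in covariate means. The numbers below are invented toy values (not the contents of sleep.dat), chosen so that both groups lie on parallel lines; the function names are illustrative. The interval estimate additionally needs a t quantile and the ancova standard error, which this standard-library sketch leaves to Minitab.

```python
def mean(values):
    return sum(values) / len(values)


def cross_products(x, y):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))


def within_slope(x, y):
    """Least-squares slope of outcome y on covariate x within one group."""
    return cross_products(x, y) / cross_products(x, x)


def ancova_effect(xa, ya, xb, yb):
    """Pooled within-group slope and adjusted mean difference (B minus A)."""
    pooled = ((cross_products(xa, ya) + cross_products(xb, yb))
              / (cross_products(xa, xa) + cross_products(xb, xb)))
    effect = (mean(yb) - mean(ya)) - pooled * (mean(xb) - mean(xa))
    return pooled, effect


# Toy data for illustration only: covariate x (normal sleep), outcome y.
x_a, y_a = [6, 7, 8], [7, 8, 9]        # group A lies on y = x + 1
x_b, y_b = [6, 8, 10], [9, 11, 13]     # group B lies on y = x + 3

print("slope A:", within_slope(x_a, y_a))
print("slope B:", within_slope(x_b, y_b))
pooled_slope, adjusted_effect = ancova_effect(x_a, y_a, x_b, y_b)
print("pooled slope:", pooled_slope, "adjusted effect:", adjusted_effect)
```

In the toy data the raw difference in outcome means is 3, but one unit of that is explained by the covariate, leaving an adjusted effect of 2.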
------------------------------------------------
5.
But would you want to matriculate?
We consider data on admissions for Fall 1973 graduate study at
U.C. Berkeley in the six largest departments. The data on each
applicant consists of the applicant's gender (G), whether admitted (A)
and major department (D).
         Whether admitted, male    Whether admitted, female
Dept        Yes       No              Yes       No
a           512      313               89       19
b           353      207               17        8
c           120      205              202      391
d           138      279              131      244
e            53      138               94      299
f            22      351               24      317
Total      1198     1493              557     1278
a) For the males, which department has the largest proportion admitted?
Carry out a statistical test of the hypothesis that, for the males, the
proportion admitted is constant across the six departments considered here.
b) construct a 99% confidence interval for the population proportion of
females admitted (pooling over departments).
c) Construct a 2x2 table of gender by admit status. Carry out a test for
independence for this table. What might this result be taken to
indicate about gender equity etc in the admit process?
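The computations in parts (a) through (c) can be sketched in Python using only the standard library. The counts are transcribed from the table above; the helper name `chi_square` is an illustrative choice, the Pearson chi-square statistic is one standard test for these tables, and the interval in part (b) is assumed here to be the large-sample normal-approximation interval. The statistics still need to be compared with chi-square critical values for the stated degrees of freedom.

```python
from statistics import NormalDist

# (admitted, not admitted) counts by department, from the table above.
male = {"a": (512, 313), "b": (353, 207), "c": (120, 205),
        "d": (138, 279), "e": (53, 138), "f": (22, 351)}
female = {"a": (89, 19), "b": (17, 8), "c": (202, 391),
          "d": (131, 244), "e": (94, 299), "f": (24, 317)}


def chi_square(table):
    """Pearson chi-square statistic and df for a table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


# (a) Male admission proportions and the 6x2 test of constant proportion.
male_rate = {dept: yes / (yes + no) for dept, (yes, no) in male.items()}
male_stat, male_df = chi_square([list(counts) for counts in male.values()])

# (b) Normal-approximation 99% interval for the pooled female proportion.
f_yes = sum(yes for yes, _ in female.values())
f_no = sum(no for _, no in female.values())
p_female = f_yes / (f_yes + f_no)
z = NormalDist().inv_cdf(0.995)
half_width = z * (p_female * (1 - p_female) / (f_yes + f_no)) ** 0.5
female_ci = (p_female - half_width, p_female + half_width)

# (c) 2x2 gender-by-admit table, pooling over departments.
m_yes = sum(yes for yes, _ in male.values())
m_no = sum(no for _, no in male.values())
gender_table = [[m_yes, m_no], [f_yes, f_no]]
gender_stat, gender_df = chi_square(gender_table)

for dept, rate in male_rate.items():
    print(f"dept {dept}: male proportion admitted = {rate:.3f}")
print(f"6x2 chi-square = {male_stat:.2f} on {male_df} df")
print(f"female proportion = {p_female:.3f}, 99% CI = "
      f"({female_ci[0]:.3f}, {female_ci[1]:.3f})")
print(f"2x2 chi-square = {gender_stat:.2f} on {gender_df} df")
```

Note that the pooled 2x2 table discards department, so it is worth comparing its message with the department-by-department proportions before interpreting it.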
-------------------------------------------------
6.
If you're not crazy yet, you'll do ok.
A psychologist conducted a study to examine the nature of the
relation, if any, between an employee's emotional stability (C2)
and the employee's ability to perform in a task group (C1). Data
on 27 employees are in file stable.dat .
Emotional stability was measured by a written test, and ability
to perform in a task group (C1 = 1 if able, C1 = 0 if unable) was
evaluated by the supervisor.
a. From an OLS fit for a straight-line relation for predicting
C1 from C2, what level of emotional stability seems necessary
for a probability of successful performance of .75?
b. Use the Minitab blog (binary logistic regression) command to obtain
a fit for a logistic response function to these data.
What is the predicted probability of success for an employee with
the median value of emotional stability?
For the logistic fit, what level of emotional stability seems
necessary for a probability of successful performance of .75?
c. For both the OLS regression and the logistic curve estimation,
list the fitted-values for probability of success using the
emotional stability values in these data (C2). Comment on the
similarity (or lack thereof) of these two fits.
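Both "what level of C2 gives probability .75" questions amount to inverting a fitted function. A minimal sketch follows, assuming the logistic fit is parameterized as logit(p) = b0 + b1*C2; the coefficient values are hypothetical placeholders, not estimates from stable.dat, and the function names are illustrative.

```python
import math


def ols_x_for_probability(intercept, slope, p):
    """Solve p = intercept + slope * x for x (straight-line fit)."""
    return (p - intercept) / slope


def logistic_probability(b0, b1, x):
    """Fitted probability under logit(p) = b0 + b1 * x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def logistic_x_for_probability(b0, b1, p):
    """Solve logit(p) = b0 + b1 * x for x."""
    return (math.log(p / (1.0 - p)) - b0) / b1


# Hypothetical coefficients for illustration only; replace them with the
# estimates from your own OLS and logistic fits to stable.dat.
ols_a, ols_b = -0.5, 0.002
logit_b0, logit_b1 = -10.0, 0.02

x_ols = ols_x_for_probability(ols_a, ols_b, 0.75)
x_logit = logistic_x_for_probability(logit_b0, logit_b1, 0.75)
print(f"OLS line reaches .75 at C2 = {x_ols:.1f}")
print(f"logistic curve reaches .75 at C2 = {x_logit:.1f}")
```

Applying `logistic_probability` to each observed C2 value, alongside the straight-line fitted values, gives the two lists to compare in part (c); unlike the logistic fit, the straight line can produce fitted values outside the interval from 0 to 1.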
-------------------------
END