Education 257  FINAL PROBLEMS, Winter 2003, 
      March 13, 2003

Solutions for these problems are to be submitted in hard-copy
form. Given that these problems are untimed, some care should be
taken in presentation, clarity, and format.  It is especially important
to give full and clear answers to questions, not just to submit
unannotated computer output, although relevant output should
be included.
You may use any inanimate resources--no collaboration.  This
work is done under Stanford's Honor Code.
Please read the questions carefully and answer the question that
is asked.  
Papers will be scored into 3 categories: "Excellent" indicates
successful completion of all parts of all questions (within
perhaps one or two very trivial arithmetic errors);
"Satisfactory" indicates a good attempt was made at all parts of
all problems, but there were some serious errors or omissions;
"Incomplete" indicates inadequate effort or performance. 

Place completed hard copy in Rogosa's Cubberley mailbox 
by 5PM Friday 3/21

As usual, the path for data sets is /afs/  or
             /usr/class/ed257/HW
By student request, duplicates of these data sets are provided
through http links.

You should note that this turns out to be a very very easy
set of problems, so my expectations are high........

Why Worry?   

Doctoral research from one Pamela S. Clute, appearing in 
the Journal for Research in Mathematics Education (1984), investigated 
the relationship between anxiety and mathematics achievement.  The outcome 
("score") is the score on the exam for a college-level mathematics 
survey course.  The design was 2 x 3 x 2 with 7 replications per cell. 
Factor A ("method") was method of instruction-- a standard direct 
instructional method and a direct instruction discovery method (which 
developed the subject by a sequence of questions).  Factor B 
("anxiety") was the anxiety level of the student; students were 
assessed and placed into low, medium, or high anxiety levels.  Factor C 
("college") was the college at which the mathematics course was given: 
U.C. Riverside, CSU San Bernardino.  None of the three factors can be 
considered random factors. 
Based on the summary statistics given in the report of the research, the 
data were recreated (approximately). 
File mathanxiety.dat contains score (C1), method (C2), anxiety (C3), and college (C4).

a. Construct useful displays of the cell means for this factorial design
   and comment.
b. Carry out tests for the 3 main effects and interactions, 
   controlling your overall error rate so that it does not exceed .05.
   What effects appear significant?
c. Construct a profile plot depicting any sizable two-way or three-way
   interactions.
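
For part b, one common way to keep the overall error rate at or below .05
is a Bonferroni split across the seven tests of a three-factor design
(three main effects, three two-way interactions, one three-way
interaction). The sketch below tallies degrees of freedom for the
2 x 3 x 2 layout with 7 replications per cell and the resulting per-test
level. Bonferroni is an assumption made here for illustration; it is not
necessarily the procedure intended for this problem.

```python
from itertools import combinations
from math import prod

# Design as described in the problem: method (2) x anxiety (3) x college (2),
# with 7 replications per cell.
levels = {"method": 2, "anxiety": 3, "college": 2}
reps = 7


def effect_df(levels):
    """Degrees of freedom for every main effect and interaction."""
    names = list(levels)
    out = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # df of an effect is the product of (levels - 1) over its factors
            out[" x ".join(combo)] = prod(levels[n] - 1 for n in combo)
    return out


effects = effect_df(levels)
n_cells = prod(levels.values())      # 12 cells
n_total = n_cells * reps             # 84 observations
error_df = n_total - n_cells         # within-cell (error) df

# Bonferroni (an assumed choice): split .05 evenly over the 7 F tests.
per_test_alpha = 0.05 / len(effects)

for name, df in effects.items():
    print(f"{name:28s} df = {df}")
print("error df =", error_df, " per-test alpha =", round(per_test_alpha, 5))
```

Each F statistic would then be compared against the per-test level rather
than .05 itself.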

Subtlety is always lost on me.  

Many studies have supported the assertion that females are superior to 
males in decoding nonverbal cues.  Robert Rosenthal, in various 
studies, has looked at three different types of nonverbal cues, often 
termed Channels: specifically, face, body, and tone of voice.  

The data file is nonverbal.dat. The outcome measure is skill at 
decoding the nonverbal cues (in C1).  The design is a 2x3 with gender 
the row factor (C2) and nonverbal cue (Channel) the column factor 
(C3).  (Data are recreated from summary statistics.)  In our version 
of this study we have 4 female subjects in each level of Channel (C2 = 
1) and 2 male subjects in each level of Channel (C2 = 2). 

a. Examine cell means and comment on apparent main effects and interactions.
b. Carry out tests for main effects and interactions, controlling your
overall Type I error rate. Describe your results.
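
For part a, a minimal sketch of tabulating cell means, cell counts, and
unweighted marginal means (means of cell means) for a two-factor layout
with unequal cell sizes. The function names and the demo rows are
hypothetical; the demo values are made up and are not taken from
nonverbal.dat, whose exact file layout is not assumed here.

```python
from collections import defaultdict


def cell_means(rows):
    """rows: iterable of (score, gender, channel) tuples.

    Returns (means, counts), both keyed by (gender, channel).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for score, gender, channel in rows:
        sums[(gender, channel)] += score
        counts[(gender, channel)] += 1
    means = {cell: sums[cell] / counts[cell] for cell in sums}
    return means, dict(counts)


def unweighted_margin(means, axis):
    """Average the cell means over the other factor (axis 0 = gender,
    axis 1 = channel), giving each cell equal weight regardless of n."""
    groups = defaultdict(list)
    for cell, m in means.items():
        groups[cell[axis]].append(m)
    return {level: sum(v) / len(v) for level, v in groups.items()}


# Made-up rows for illustration only: (score, gender, channel)
demo_rows = [(10.0, 1, 1), (12.0, 1, 1), (6.0, 2, 1), (20.0, 1, 2), (8.0, 2, 2)]
means, counts = cell_means(demo_rows)
gender_margin = unweighted_margin(means, axis=0)
print(means)
print(counts)
print(gender_margin)
```

With unequal cell sizes, unweighted marginal means can differ from the
raw averages over all subjects at a level, so it helps to say which one
is being reported.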



This problem and adapted data are taken from a text by an
illustrious educational researcher.
Research Setting:
The general concern is with childhood aggressive behavior
(and especially its persistence into adult violent behavior
and other negative outcomes). 
Hudley & Graham (Child Development, 1993) conducted a study (based
on attribution theory) which had as one of its outcome measures 
an assessment of negative intent. The rationale for this measure
is that a boy's aggressive behavior might be a consequence of 
his misattributing the intent of others ("actors") in ambiguous
situations.  Thus, an intervention that taught these boys to
interpret actors' intent as something other than negative in
ambiguous situations should ultimately lessen their aggressive
behavior.
In the Hudley & Graham study, African American boys, average age
about 10.5 years, were assigned at random to one of three
experimental groups: (1) a 12 session intervention to infer
non-hostile intent in ambiguous situations; (2) attention
training (a lesser program to deal with effects of participating
in a study); (3) control group (no intervention or training).
Data on 36 subjects are in file aggress.dat.
C1 has Negative Intent Rating and C2 contains membership in the
three experimental groups (intervention = 1; attention = 2;
control = 3).

a) Write a statistical model for this single classification data structure.
b) Carry out an anova for this one-way classification and test the
   omnibus null hypothesis of no differences between the group means
   for Negative Intent using Type I error rate .05.
c) Carry out a post-hoc pairwise comparison procedure using the
   Tukey Method in order to construct interval estimates for all 
   pairwise comparisons with family-wise confidence coeff .95.
d) In planning a follow-up study which will have equal numbers of
   subjects in each group, how many subjects should there be in each
   group so that the interval estimate for these pairwise comparisons
   will have width of 1.5 units (using experimentwise error rate 
   .01, i.e. confidence coefficient .99)?
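
The notation for parts a and d can be sketched as follows. This is one
standard parameterization, not necessarily the one used in lecture. The
symbols n (subjects per group in the follow-up study), k = 3 groups, W
(full interval width), and the use of the MSE from part b as a planning
value for the error variance are assumptions introduced here.

```latex
% One-way (single classification) model: group i = 1,2,3; subject j = 1,...,n_i
\[
  Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad
  \varepsilon_{ij} \overset{\text{iid}}{\sim} N(0,\sigma^2), \qquad
  \sum_{i=1}^{3} \alpha_i = 0 .
\]

% Tukey simultaneous interval for a pairwise difference, equal group size n
\[
  \bar{Y}_{i\cdot} - \bar{Y}_{i'\cdot} \;\pm\;
  q_{\alpha;\,k,\,k(n-1)} \sqrt{\frac{\mathrm{MSE}}{n}} .
\]

% Full width W of that interval, and the resulting condition on n
\[
  W = 2\, q_{\alpha;\,k,\,k(n-1)} \sqrt{\frac{\mathrm{MSE}}{n}}
  \quad\Longrightarrow\quad
  n \;\ge\; \frac{4\, q_{\alpha;\,k,\,k(n-1)}^{2}\, \mathrm{MSE}}{W^{2}} .
\]
```

Because the studentized range quantile q itself depends on n through the
error degrees of freedom k(n-1), the inequality is solved by trial: pick
n, look up q for alpha = .01, and increase n until the width is no more
than 1.5.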


[Instructor's note: I don't cotton to the aromatherapy nonsense,
but I have to acknowledge the following story. In the last version
of Ed257 one day Andrew Ho came to class attempting to consume a
late lunch consisting of some of the "food" from the basement
emporium. To my amazement, the pungent smell from Andrew's plate
prevented me from being able to complete a paragraph, and after repeated
attempts, I had to ask Andrew to eat in the hallway so I could continue.]

Can pleasant aromas help a student learn better? 

Hirsch and Johnston, of the Smell & Taste Treatment and Research 
Foundation, believe that the presence of a floral scent can improve a 
person's learning ability in certain situations. In their experiment, 
22 people worked through a set of two pencil and paper mazes six 
times, three times while wearing a floral-scented mask and three times 
wearing an unscented mask. Individuals were randomly assigned to wear 
the floral mask on either their first three tries or their last three 
tries. Participants put on their masks one minute before starting the 
first trial in each group to minimize any distracting effect. Subjects 
recorded whether they found the scent inherently positive, inherently 
negative, or if they were indifferent to it. Testers measured the 
length of time it took subjects to complete each of the six trials.

In file scent.dat
C1 ID:
C2 Sex: M=male, F=female
C3 Age: Age in years
C4 Smoker: Y if subject smoked, N if did not
C5 Opinion: "pos" if subject found the odor inherently positive, 
            "indiff" if indifferent, "neg" if inherently negative
C6 Order: 1 if did unscented trials first, 2 if did scented trials first
C7 U-Trial 1: length of time required for first unscented trial
C8 U-Trial 2: length of time required for second unscented trial
C9 U-Trial 3: length of time required for third unscented trial
C10 S-Trial 1: length of time required for first scented trial
C11 S-Trial 2: length of time required for second scented trial
C12 S-Trial 3: length of time required for third scented trial 

There are various structures of these data to investigate in understanding
the possible effect of aroma. Work hard to find a good analysis for these
(repeated measures) data. One way to use the outcome measures is to
look at improvement, which could be expressed as the percentage change in 
speed of completion from the first trial to the third trial for each maze.
Are there better approaches?
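
As an illustration of the improvement measure mentioned above, here is a
small sketch under one reading of "percentage change in speed": speed is
taken as 1/time, so the change from trial 1 to trial 3 is
(t1/t3 - 1) x 100. The function names are hypothetical and the numbers
in the demo are made up, not taken from scent.dat.

```python
def pct_change_in_speed(t_first, t_third):
    """Percentage change in speed from the first to the third trial.

    Speed is taken as 1/time (an assumption), so
    (1/t_third - 1/t_first) / (1/t_first) * 100 = (t_first/t_third - 1) * 100.
    """
    if t_first <= 0 or t_third <= 0:
        raise ValueError("completion times must be positive")
    return (t_first / t_third - 1.0) * 100.0


def scent_minus_unscented(u1, u3, s1, s3):
    """Within-subject difference in improvement: scented minus unscented."""
    return pct_change_in_speed(s1, s3) - pct_change_in_speed(u1, u3)


# Made-up times (seconds) for one hypothetical subject
demo_unscented = pct_change_in_speed(50.0, 50.0)   # no change in speed
demo_scented = pct_change_in_speed(60.0, 40.0)     # 50 percent faster
demo_diff = scent_minus_unscented(50.0, 50.0, 60.0, 40.0)
print(demo_unscented, demo_scented, demo_diff)
```

A per-subject difference like this keeps the paired structure of the
data, but it ignores the second trial and the order of conditions (C6),
which is part of why other approaches are worth considering.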


Average salary paid to teachers and expenditures per pupil are two 
commonly used measures of the amount of money spent on education. Data 
on these two measures are provided by state, and states are classified 
by region of the country.   (Data from the 1980's)
What can be learned from these state-level data? Are these variables
associated? How do you interpret that? Are there regional differences?

Description: file edspend.dat
Average salary paid to teachers and expenditures per pupil on 
education in the 50 states and the District of Columbia.
Number of cases: 51
Variable Names:
   C1. State: State
   C2. Region: Region
   C3. Pay: Amount of pay in thousands
   C4. Spend: Average amount spent per student in thousands 
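
One starting point for the association question is the Pearson
correlation between Pay and Spend, overall and then within region. The
sketch below is a plain implementation with made-up numbers; the function
name is hypothetical and nothing here is read from edspend.dat.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Made-up values for illustration only (thousands of dollars)
demo_pay = [20.0, 22.0, 25.0, 27.0, 30.0]
demo_spend = [2.5, 3.0, 3.2, 3.9, 4.1]
demo_r = pearson_r(demo_pay, demo_spend)
print(round(demo_r, 3))
```

A correlation computed on state averages describes states, not
individual teachers or districts, which bears on how the association
should be interpreted.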

Part II
In course example file grow.dat are
data from the Berkeley Growth Study  (Nancy Bayley).  
These data are for Child #8 in the BGS study with age in months in C2
(ranging from 1 to 60) and intellectual performance (outcome) in C1.

Try out methods for straightening the C1 on C2 scatterplot by transformations
of C1. Obtain a prediction equation for intellectual performance as a function
of age.    
Give the fit and an interval estimate for intellectual performance 
at 10 months of age.
Repeat for 60 months of age.
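
A rough sketch of screening transformations of the outcome: fit a
straight line of each transformed C1 on C2 and compare how linear the
relation looks. The candidate transformations, the function names, and
the demo data (generated from an exponential curve, not taken from
grow.dat) are all assumptions made for illustration; the outcome is
assumed to be strictly positive.

```python
from math import exp, log, sqrt

# A few candidate transformations of the outcome (assumes outcome > 0).
TRANSFORMS = {
    "raw": lambda y: y,
    "sqrt": sqrt,
    "log": log,
    "neg_reciprocal": lambda y: -1.0 / y,  # sign keeps the ordering
}


def fit_line(x, y):
    """Least-squares line of y on x. Returns (intercept, slope, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


def compare_transforms(age, outcome):
    """Fit transformed outcome on age for each candidate transformation."""
    return {name: fit_line(age, [f(v) for v in outcome])
            for name, f in TRANSFORMS.items()}


# Made-up data: an exactly exponential curve, so the log scale is linear.
demo_age = list(range(1, 11))
demo_outcome = [exp(1.0 + 0.2 * a) for a in demo_age]
results = compare_transforms(demo_age, demo_outcome)
for name, (b0, b1, r2) in results.items():
    print(f"{name:15s} intercept={b0:.3f} slope={b1:.3f} r2={r2:.4f}")
```

R-squared values on different scales are only a rough guide, so residual
plots on each scale are still worth examining. A prediction made on a
transformed scale, along with its interval endpoints, would be
back-transformed to the original units.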