Workshop Statistics: Discovery with Data, Second Edition
Topic 13: Designing Studies
Activity 13-1: An Apple a Day
(a)
(i) observational study
(ii) survey
(iii) experiment
(iv) anecdote
(b) No, anecdotes aren't of much use because they describe only an isolated
situation which may not be representative of larger groups.
(c) Individuals' opinions
(d) observational/experimental unit: people; variable/type(1):
eat apples?/categorical; variable/type(2): need to visit doctor?/categorical
(e) explanatory: eat apples?; response: need to visit doctor?
(f) (g) No, there may be other factors influencing the response variable.
For example, those who eat apples may tend to have healthier diets in general,
and that may be what actually reduces their need to visit the doctor.
(h) Now we've determined who eats apples instead of letting them choose
for themselves. Thus, we should get all types of people (e.g. healthy
and non-healthy eaters). This isolates the apple eating as the only difference
in the two groups, so if we see a difference we can attribute this difference
to the apple eating.
(i) There is no second group of people who are not eating apples to
compare them to.
Activity 13-2: Foreign Language and SATs
(a) explanatory: foreign language study?/categorical; response: SAT
Verbal score/quantitative
(b) This is an observational study because we have no control over
who takes a foreign language.
(c) These sample data seem to indicate that those studying a foreign
language perform better on the verbal portion of the SAT than those who
do not study a foreign language. Every case shows a higher SAT Verbal
score for the students who have studied a foreign language than those who
have not.
(d) Those students studying a foreign language are most likely on a
college bound course track, and are therefore most likely working harder
and taking more challenging courses. Those students not studying
a foreign language are most likely not on a college bound course track,
and are therefore most likely not working as hard, nor are they taking
the more challenging courses. This could explain why those students
studying a foreign language are scoring higher on the SAT Verbal than those
students not studying a foreign language, even if foreign language study
does not improve students' verbal skills. [Could also talk about their
overall verbal ability.]
(e) There appears to be a moderate positive association between verbal
SAT score and prior verbal aptitude.
(f) The verbal aptitude scores of every student who did not study a
foreign language are all lower than the verbal aptitude scores of every
student who studied a foreign language.
(g) We cannot rule out the possibility that foreign language study
has no effect on students' verbal SAT scores but those students do better
because of a higher verbal aptitude that was present even before the foreign
language study began. The data provide us with no way of distinguishing
between the effects of foreign language study and the effects of previous
verbal aptitude.
(h) Among the students who did not study a foreign language, the proportion
who are female is .5. Among the students who studied a foreign language,
the proportion who are female is .6.
(i) Since the groups appear similar with regard to gender, the gender
variable is not confounded with studying a language.
(j) Answers will vary from student to student. One example is overall
health habits.
Activity 13-3: Foreign Language and SATs (cont.)
(a) A good experiment would have equal numbers of high and low aptitude
students in each group.
(b) A good method of assigning the students to treatment groups would
be through randomization.
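The random assignment described in (b) can be sketched in Python. This is only an illustration, not a prescribed method: the roster is taken from the sample table in (c), while the function name `randomly_assign` and the seed value are assumptions made for the example.

```python
import random

# Names and aptitude scores as listed in the sample answer to (c).
students = {
    "Alice": 41, "Bob": 33, "Carol": 41, "Dennis": 34, "Ellen": 38,
    "Frank": 52, "Greta": 32, "Harry": 34, "Isaac": 36, "Julie": 37,
    "Karla": 82, "Larry": 67, "Max": 65, "Nancy": 89, "Oscar": 74,
    "Peter": 78, "Qian": 84, "Randy": 67, "Sally": 82, "Tara": 75,
}

def randomly_assign(names, seed=None):
    """Shuffle the names and split them into two equal-sized groups.

    Hypothetical helper for illustration; the seed only makes the
    example repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

no_language, language = randomly_assign(students, seed=13)
print("No foreign language study:", sorted(no_language))
print("Foreign language study:   ", sorted(language))
```

Because chance alone decides the groups, each student is equally likely to land in either one, regardless of aptitude.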
Students' answers to (c)-(d) may differ since
the data is chosen randomly. These are meant to be sample answers.
(c)
      No foreign language study     Foreign language study
      Name      Aptitude            Name      Aptitude
  1   Alice     41                  Bob       33
  2   Carol     41                  Dennis    34
  3   Frank     52                  Ellen     38
  4   Harry     34                  Greta     32
  5   Isaac     36                  Karla     82
  6   Julie     37                  Peter     78
  7   Larry     67                  Qian      84
  8   Max       65                  Randy     67
  9   Nancy     89                  Sally     82
 10   Oscar     74                  Tara      75
(d)
                              Minimum   Mean   Maximum
  Foreign language study      32        60.5   84
  No foreign language study   34        53.6   89
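The summary values in (d) can be checked against the sample assignment in (c) with a few lines of Python. This is a minimal sketch; the helper name `summarize` is an assumption made for the example.

```python
from statistics import mean

# Aptitude scores from the sample assignment in (c).
no_language = [41, 41, 52, 34, 36, 37, 67, 65, 89, 74]
language = [33, 34, 38, 32, 82, 78, 84, 67, 82, 75]

def summarize(scores):
    """Return (minimum, mean rounded to one decimal place, maximum)."""
    return min(scores), round(mean(scores), 1), max(scores)

print("Foreign language study:   ", summarize(language))     # (32, 60.5, 84)
print("No foreign language study:", summarize(no_language))  # (34, 53.6, 89)
```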
(e) Randomization should be able to reasonably balance out this variable
between the two groups, even without identifying it ahead of time.
(f) In this case, we could reasonably attribute the difference
to the foreign language study alone because the experiment was controlled
and the observational units were assigned to the two groups randomly.
Randomization should sufficiently equalize the other possible variables
to eliminate them as explanations.
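One way to see why randomization tends to balance aptitude between the groups is to repeat the random assignment many times and record the difference in mean aptitude each time. The sketch below does this with the twenty aptitude scores from (c). The function name `simulate_differences`, the number of repetitions, and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean

# All twenty aptitude scores from the sample table in (c).
aptitudes = [41, 41, 52, 34, 36, 37, 67, 65, 89, 74,
             33, 34, 38, 32, 82, 78, 84, 67, 82, 75]

def simulate_differences(scores, repetitions=1000, seed=0):
    """Repeatedly split the scores into two random halves and return
    the difference in group means for each repetition."""
    rng = random.Random(seed)
    half = len(scores) // 2
    differences = []
    for _ in range(repetitions):
        shuffled = scores[:]
        rng.shuffle(shuffled)
        differences.append(mean(shuffled[:half]) - mean(shuffled[half:]))
    return differences

differences = simulate_differences(aptitudes)
print("average difference over all repetitions:", round(mean(differences), 2))
print("largest absolute difference seen:",
      round(max(abs(d) for d in differences), 1))
```

Individual assignments can still produce noticeably different groups, as the sample answer to (d) shows, but over many repetitions the differences should center near 0.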
Activity 13-4: Parkinson's Disease and Embryo Treatment
(a) explanatory variable: whether or not the patient received fetal tissue
treatment
(b) randomization
(c) No, there may be lurking variables, or other influences such as
knowing the surgery was supposed to help.
(d) Yes, the patients in this study were blind to whether they were
in the control group or the treatment group. This is good because
they don't know whether they received the treatment or not. Therefore,
the knowledge that they did or did not receive the treatment can't affect
their perception of their health.
(e) Answers will vary from student to student.
(f) The experiment would be compromised if the evaluator's judgment were
biased by knowing which group a patient was in. If the evaluator is also
blind, then this bias won't occur.
Activity 13-5: Pregnancy, AZT, and HIV (cont.)
(a)
explanatory: AZT user?, categorical - binary
response: baby HIV positive?, categorical - binary
(b) Since this is a controlled experiment where some women took AZT
and some women did not, there is an easy comparison between the two.
(c) The women were randomly assigned to either the experimental or
control group, so this study made use of the principle of randomization.
(d) By using a placebo group, and not telling the women if they were
in the AZT group or the placebo group, this study takes into account the
principle of blindness.
Activity 13-6: Effectiveness of Gasoline Additive
Students' answers to (a)-(b) and (g) may differ
since the data is chosen randomly. These are meant to be sample answers.
Also, please note that the "12" in part (a) should read "18." These
answers are based on groups of 18 cars.
(a) Treatment: 25, 23, 23, 25, 25, 26, 22, 26, 23, 28, 22, 18, 18,
17, 17, 17, 20, 18
(b) Treatment average: 21.83; Control average: 23.33; Difference
in averages (treatment - control): -1.5
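The arithmetic in (b) can be verified from the treatment values listed in (a). The control group's individual values are not listed here, so this sketch takes the control average of 23.33 as given and only recomputes the treatment average and the difference.

```python
from statistics import mean

# Treatment group mileages from the sample answer to (a).
treatment = [25, 23, 23, 25, 25, 26, 22, 26, 23, 28, 22, 18, 18,
             17, 17, 17, 20, 18]

# Taken directly from the answer to (b); the individual control
# values are not listed, so this number is not recomputed.
control_average = 23.33

treatment_average = round(mean(treatment), 2)
difference = round(treatment_average - control_average, 2)

print("Treatment average:", treatment_average)         # 21.83
print("Difference (treatment - control):", difference)  # -1.5
```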
(c) 0
(d) Please note that the dotplot in the text
is incorrect. The following dotplot is a correct simulation:
Yes, randomization typically does create similar groups. The dotplot
is symmetric, with its peak at about 0, so the treatment mean and control
mean tend to be pretty close to each other.
(e) Average gas mileage and car type are variables that probably affect
the response variable.
(f) (g) Treatment group average: 22.50; Control group average: 22.66;
Difference (treatment - control): -.16
(h)-(i) Answers will vary from class to class but should show less
variation than (d).