Workshop Statistics: Discovery with Data, Second Edition

Topic 13: Designing Studies

Activity 13-1: An Apple a Day

(a) (b) No, anecdotes aren't of much use because they describe only an isolated situation which may not be representative of larger groups.
(c) Individuals' opinions
(d) observational/experimental unit: people;  variable/type(1): eat apples?/categorical;  variable/type(2): need to visit doctor?/categorical
(e) explanatory: eat apples?;  response: need to visit doctor?
(f)
(g) No, other factors may be influencing the response variable.  For example, those who eat apples may tend to have healthier diets in general, and that may be what actually reduces their need to visit the doctor.
(h) Now we've determined who eats apples instead of letting people choose for themselves.  Thus, each group should include all types of people (e.g. healthy and non-healthy eaters).  This isolates the apple eating as the only difference between the two groups, so if we see a difference in the response we can attribute it to the apple eating.
(i) There is no second group of people who are not eating apples to compare them to.
 

Activity 13-2: Foreign Language and SATs

(a) explanatory: foreign language study?/categorical;  response: SAT Verbal score/quantitative

(b) This is an observational study because we have no control over who takes a foreign language.
(c) These sample data seem to indicate that those studying a foreign language perform better on the verbal portion of the SAT than those who do not study a foreign language.  Every case shows a higher SAT Verbal score for the students who have studied a foreign language than those who have not.
(d) Those students studying a foreign language are most likely on a college bound course track, and are therefore most likely working harder and taking more challenging courses.  Those students not studying a foreign language are most likely not on a college bound course track, and are therefore most likely not working as hard, nor are they taking the more challenging courses.  This could explain why those students studying a foreign language are scoring higher on the SAT Verbal than those students not studying a foreign language, even if foreign language study does not improve students' verbal skills. [Could also talk about their overall verbal ability.]
(e) There appears to be a moderate positive association between verbal SAT score and prior verbal aptitude.
(f) The verbal aptitude scores of every student who did not study a foreign language are all lower than the verbal aptitude scores of every student who studied a foreign language.
(g) We cannot rule out the possibility that foreign language study has no effect on students' verbal SAT scores but those students do better because of a higher verbal aptitude that was present even before the foreign language study began.  The data provide us with no way of distinguishing between the effects of foreign language study and the effects of previous verbal aptitude.
(h) The proportion of students that did not study a foreign language that are female is .5.  The proportion of students that studied a foreign language that are female is .6.
(i) Since the groups appear similar with regard to gender, the gender variable is not confounded with studying a foreign language.
(j) Answers will vary from student to student. One example is overall health habits.
 

Activity 13-3: Foreign Language and SATs (cont.)

(a) A good experiment would have equal numbers of high and low aptitude students in each group.
(b) A good method of assigning the students to treatment groups would be through randomization.
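The randomization in (b) could be carried out in many ways; the following is only a minimal Python sketch of one of them.  The function name `randomly_assign` and the seed value are assumptions made for illustration, and the names are the twenty students listed in the sample table for (c).

```python
import random

# The twenty students listed in the sample table for part (c).
STUDENTS = [
    "Alice", "Bob", "Carol", "Dennis", "Ellen", "Frank", "Greta", "Harry",
    "Isaac", "Julie", "Karla", "Larry", "Max", "Nancy", "Oscar", "Peter",
    "Qian", "Randy", "Sally", "Tara",
]


def randomly_assign(students, seed=None):
    """Shuffle the students and split them into two equal-sized groups.

    Hypothetical helper: the seed is only there to make a run repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(students)   # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


no_language, language = randomly_assign(STUDENTS, seed=13)
print("No foreign language study:", sorted(no_language))
print("Foreign language study:   ", sorted(language))
```

Every student has the same chance of landing in either group, so a different seed will generally produce a different assignment.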
Students' answers to (c)-(d) may differ since the data is chosen randomly.  These are meant to be sample answers.
(c)
      No foreign language study      Foreign language study
      Name      Aptitude             Name      Aptitude
  1   Alice     41                   Bob       33
  2   Carol     41                   Dennis    34
  3   Frank     52                   Ellen     38
  4   Harry     34                   Greta     32
  5   Isaac     36                   Karla     82
  6   Julie     37                   Peter     78
  7   Larry     67                   Qian      84
  8   Max       65                   Randy     67
  9   Nancy     89                   Sally     82
 10   Oscar     74                   Tara      75
(d)
                               Minimum   Mean   Maximum
 Foreign language study        32        60.5   84
 No foreign language study     34        53.6   89
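The summary values in (d) can be checked against the sample assignment in (c).  The short Python sketch below does this; the helper name `summarize` is an assumption made for illustration, and the scores are copied from the sample table.

```python
from statistics import mean

# Aptitude scores from the sample table in part (c).
no_language = [41, 41, 52, 34, 36, 37, 67, 65, 89, 74]
language = [33, 34, 38, 32, 82, 78, 84, 67, 82, 75]


def summarize(scores):
    """Return (minimum, mean rounded to one decimal, maximum)."""
    return min(scores), round(mean(scores), 1), max(scores)


print("Foreign language study:   ", summarize(language))
print("No foreign language study:", summarize(no_language))
```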
(e) Randomization should be able to reasonably balance out this variable between the two groups, even without identifying it ahead of time.
(f) In this case, we would be able to reasonably attribute the difference to the foreign language study alone.  The experiment was controlled and the observational units were assigned to the two groups randomly, so randomization should sufficiently equalize the other possible variables and eliminate them as explanations.
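The claim that randomization tends to balance a variable such as prior aptitude can be explored by repeating the random assignment many times.  The Python sketch below is one illustrative way to do so, not a prescribed method: the function name `simulate_differences`, the 1000 repetitions, and the seed are assumptions, while the twenty aptitude scores come from the sample table in (c).

```python
import random
from statistics import mean

# All twenty aptitude scores from the sample table in part (c).
APTITUDES = [41, 41, 52, 34, 36, 37, 67, 65, 89, 74,
             33, 34, 38, 32, 82, 78, 84, 67, 82, 75]


def simulate_differences(scores, repetitions=1000, seed=0):
    """Repeatedly split the scores into two random halves and record the
    difference in mean aptitude between the halves each time."""
    rng = random.Random(seed)
    half = len(scores) // 2
    differences = []
    for _ in range(repetitions):
        shuffled = list(scores)
        rng.shuffle(shuffled)
        differences.append(mean(shuffled[:half]) - mean(shuffled[half:]))
    return differences


differences = simulate_differences(APTITUDES)
print("Average difference in mean aptitude:", round(mean(differences), 2))
```

Any single assignment may show a noticeable gap between the groups, but across many repetitions the differences should center near 0, which is the sense in which randomization balances the groups.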
 

Activity 13-4: Parkinson's Disease and Embryo Treatment

(a) explanatory variable: whether or not the patient received fetal tissue treatment

(b) randomization
(c) No, there may be lurking variables, or other influences such as knowing the surgery was supposed to help.
(d) Yes, the patients in this study were blind to whether they were in the control group or the treatment group.  This is good because they don't know whether they received the treatment or not.  Therefore, the knowledge that they did or did not receive the treatment can't affect their perception of their health.
(e) Answers will vary from student to student.
(f) It would be detrimental to the experiment if the evaluator might be biased in his/her judgement.  If the evaluator is also blind, then this bias won't occur.
 

Activity 13-5: Pregnancy, AZT, and HIV (cont.)

(a)
(b) Since this is a controlled experiment where some women took AZT and some women did not, there is an easy comparison between the two.
(c) The women were randomly assigned to either the experimental or control group, so this study made use of the principle of randomization.
(d) By using a placebo group, and not telling the women if they were in the AZT group or the placebo group, this study takes into account the principle of blindness.
 

Activity 13-6: Effectiveness of Gasoline Additive

Students' answers to (a)-(b), and (g) may differ since the data is chosen randomly.  These are meant to be sample answers.  Also, please note that the "12" in part (a) should read "18."  These answers are based on groups of 18 cars.
(a) Treatment: 25, 23, 23, 25, 25, 26, 22, 26, 23, 28, 22, 18, 18, 17, 17, 17, 20, 18
(b) Treatment average: 21.83;  Control average: 23.33;  Difference in averages (treatment - control): -1.5
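The arithmetic in (b) can be checked with a few lines of Python.  This is only a sketch: the treatment values are the sample values from (a), and because the control values are not listed here, the control average is taken as stated in (b) rather than recomputed.

```python
from statistics import mean

# Sample treatment-group values from part (a), 18 cars.
treatment = [25, 23, 23, 25, 25, 26, 22, 26, 23,
             28, 22, 18, 18, 17, 17, 17, 20, 18]

# Assumption: the control average is taken from the answer to (b),
# since the individual control values are not listed.
control_average = 23.33

treatment_average = round(mean(treatment), 2)
difference = round(treatment_average - control_average, 2)

print("Treatment average:", treatment_average)
print("Difference (treatment - control):", difference)
```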
(c) 0
(d) Please note that the dotplot in the text is incorrect.  The following dotplot is a correct simulation:

Yes, randomization typically does create similar groups.  The dotplot is symmetric with its peak at about 0, so the treatment mean and control mean tend to be pretty close to each other.
(e) Average gas mileage and car type are variables that probably affect the response variable.
(f)
(g) Treatment group average: 22.50;  Control group average: 22.66;  Difference (treatment - control): -.16
(h)-(i) Answers will vary from class to class but should show less variation than (d).