INVESTIGATING STATISTICAL CONCEPTS, APPLICATIONS, AND METHODS
BRIEF SOLUTIONS TO INVESTIGATIONS
Last Updated April 17, 2008
CHAPTER 1
Investigation 1.1.1: Popcorn Production and Lung Disease
(a) 21/116 = .181
(b) proportion in each group
(c)
| | Low exposure | High exposure | Total |
| Airway obstructed | 6 | 15 | 21 |
| Airway not obstructed | 52 | 43 | 95 |
| Total | 58 | 58 | 116 |
(e) There appears to be a higher rate of airway obstruction in the “high exposure” group.
(f) Low exposure: 6/58 = .103; High exposure: 15/58 = .259
(g) .259-.103 = .156, seems reasonably large
(h) .650-.494 = .156, same difference but doesn’t “feel” as large?
(i) .259/.103 = 2.51
(j) 21/95 = .22
(k) (15/43)/(6/52) = 3.02
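The calculations in (f) through (k) can be checked with a short Python sketch. It is not part of the original solutions: the variable and function names are illustrative, and the counts are read from the two-way table in (c). Because it works from unrounded proportions, the last digit can differ from the answers above, which divide and subtract the rounded values .259 and .103.

```python
# Counts from the two-way table in (c); names are illustrative.
obstructed = {"low": 6, "high": 15}
not_obstructed = {"low": 52, "high": 43}

def proportion(group):
    """Conditional proportion with airway obstruction in one exposure group."""
    return obstructed[group] / (obstructed[group] + not_obstructed[group])

def odds(group):
    """Odds of airway obstruction (obstructed : not obstructed)."""
    return obstructed[group] / not_obstructed[group]

p_low, p_high = proportion("low"), proportion("high")
difference = p_high - p_low                # part (g)
relative_risk = p_high / p_low             # part (i), from unrounded proportions
odds_ratio = odds("high") / odds("low")    # part (k)

print(f"low: {p_low:.3f}, high: {p_high:.3f}")
print(f"difference: {difference:.3f}")
print(f"relative risk: {relative_risk:.2f}")
print(f"odds ratio: {odds_ratio:.2f}")
```

The odds ratio is further from 1 than the relative risk here, which is typical when the outcome is not rare.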
Investigation 1.2.1: Smoking and Lung Cancer
(a) males
(b) EV = amount of smoking (categorical); RV = whether have lung cancer (categorical)
(c)

(d) 14/90 = .156; 8/114 = .070; ratio = 2.217
(e) (14×114)/(8×90)
(f) (213×114)/(8×278) = 10.92
(g) (122×114)/(8×60) = 28.98, the odds of lung cancer are almost 30 times higher for the chain smokers compared to the non-smokers
(h) The odds of lung cancer are 12.77 times higher for the smokers compared to the non-smokers
(i) Yes, as the amount of smoking increases so does the odds ratio (compared to non-smokers)
(j) There could be something else different about those who choose to smoke, e.g., diet, exercise
(k) Older people are more likely to smoke (having started before all the negative publicity) and to have cancer (just by being around longer!)
(l) No, the researchers forced the amounts of patients with and without lung cancer to be similar instead of seeing how often these outcomes occurred “naturally.”
(m) No, can always be other explanations (e.g., diet, exercise)
(n) Not clear how representative these patients were…
Investigation 1.2.2: Lung Cancer and Smoking (cont.)
(a) EV = smoking; RV = lung cancer death or not.

(b) Cohort study since identified and followed the explanatory variable groups and observed the resulting response.
(c) .005 - .00047 = .0046, a very small difference
(d) RR = (.005/.00047) = 10.64, OR = 10.77 (will be some rounding differences)
(e) Don’t have to rely on memory, can see how health changes over time, all patients are healthy to begin with
(f) Same as before, could be other differences about those who smoke
(g) Yes
Investigation 1.3.1: Near-Sightedness and Night-Lights
(a) OU = children; variables = eye condition (categorical) and light condition (categorical)
(b) EV = lighting, RV = eye condition
(c) Probably best described as cross-classified, since both variables were recorded about each child simultaneously
(d)
| | Room light | Night-light | Darkness | Total |
| Far-sighted | 12 | 39 | 40 | 91 |
| | 22 | 115 | 114 | 251 |
| Near-sighted | 41 | 78 | 18 | 137 |
| Total | 75 | 232 | 172 | 479 |
(e)

The occurrence of myopia (near-sightedness) appears to increase as the amount of light in the child’s room increases.
(f) .286, .55, .336, .105, .16, .168, .232
About 29% of the children were near-sighted overall, but this proportion rose to .55 for children who slept with a room light and was only .105 for children who slept in darkness. The occurrence of hyperopia (far-sightedness) was fairly constant, with a slightly higher proportion among children who slept in darkness.
(g) Could be other causes such as genetics, other child-rearing issues that are related to both the type of lighting used and the eye condition of the children.
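The conditional proportions in (f) can be reproduced with a small Python sketch. It is an illustration rather than part of the original solutions: the dictionary and function names are assumptions, and only the far-sighted, near-sighted, and total rows of the table in (d) are used.

```python
# Counts from the table in (d), keyed by lighting condition.
# Only the far-sighted, near-sighted, and total rows are used here.
far_sighted = {"room light": 12, "night-light": 39, "darkness": 40}
near_sighted = {"room light": 41, "night-light": 78, "darkness": 18}
totals = {"room light": 75, "night-light": 232, "darkness": 172}

def conditional_proportions(row, column_totals):
    """Proportion of each lighting group that falls in the given row."""
    return {light: row[light] / column_totals[light] for light in column_totals}

near_by_light = conditional_proportions(near_sighted, totals)
far_by_light = conditional_proportions(far_sighted, totals)
overall_near = sum(near_sighted.values()) / sum(totals.values())

print(f"overall near-sighted: {overall_near:.3f}")
for light in totals:
    print(f"{light}: near {near_by_light[light]:.3f}, far {far_by_light[light]:.3f}")
```

Conditioning on the explanatory variable (lighting) is what makes the three groups comparable even though their sizes differ.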
Investigation 1.3.2: Graduate Admissions Discrimination
(a) men: .445, women: .252
(b) Yes, men were accepted to these programs at a higher rate than women
(c) program, gender, whether accepted
(d) .619, .059, .824, .070
(e) the issue is that women applied more often to the program that was harder to get into overall.
(f) Since more women applied to program F than program A, the overall acceptance rate for women will be closer to that of program F than that of program A.
(g) (108/449)(.824) + (341/449)(.070) = .25
(h) [825(.619)+373(.059)]/1198 = .44
(i) The two equations will be AmPm + Fm(1-Pm) and AwPm + Fw(1-Pm). Since Am < Aw and Fm < Fw, the first expression is guaranteed to be smaller.
(j) The two equations will be AmPm +Am(1-Pm) = Am and AwPw + Aw(1-Pw) = Aw. Since Aw > Am, this will be true about the overall rate as well.
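The weighted-average argument in (e) through (h) can be sketched in Python. This is a minimal illustration, not part of the original solutions: the dictionary layout and the function name `overall_rate` are assumptions, while the applicant counts and program acceptance rates are the ones used in (d), (g), and (h).

```python
# Applicant counts and acceptance rates taken from parts (d), (g), and (h).
applicants = {
    "men": {"A": 825, "F": 373},
    "women": {"A": 108, "F": 341},
}
acceptance_rate = {
    "men": {"A": 0.619, "F": 0.059},
    "women": {"A": 0.824, "F": 0.070},
}

def overall_rate(group):
    """Weighted average of program rates, weighted by where the group applied."""
    total = sum(applicants[group].values())
    return sum(
        applicants[group][program] / total * acceptance_rate[group][program]
        for program in applicants[group]
    )

for group in applicants:
    print(f"{group}: overall acceptance rate {overall_rate(group):.3f}")

# Women have the higher rate within each program...
assert all(acceptance_rate["women"][p] > acceptance_rate["men"][p] for p in "AF")
# ...yet the lower rate overall, because most women applied to program F.
assert overall_rate("women") < overall_rate("men")
```

The reversal comes entirely from the weights: about 76% of the women applied to program F, compared with about 31% of the men.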
Investigation 1.4.1: Foreign Language and SAT Scores
(a) EV = foreign language study (categorical); RV = SAT verbal (quantitative)
(b) Possibilities include ambition, overall academic achievement, verbal ability. For example, maybe those who take a foreign language are more likely to be interested in attending college and therefore study harder for the SAT.
(c) Randomly assign students to take a foreign language or not
(d) Want the two groups to be as similar as possible.
(e) The power of suggestion could be enough to help improve their performance.
Investigation 1.4.2: Have a Nice Trip
(a) This would be a problem as gender would be confounded with the recovery strategy employed. If one group did better you wouldn’t be able to decide whether it was the strategy used or their gender.
(b) Want everything about the two groups to be as similar as possible.
(c)-(d) Results will vary
(e) Difference won’t always be zero but distribution should be centered around zero and should be equally likely to be positive as negative.
(f)-(g) Results will vary but the two outcomes will probably not be identical.
(h) Distribution should center symmetrically around zero.
(i) Center: 0, Largest: around .67, smallest: around -.67
(j) No, but most randomizations produce a difference that is close to zero
(k) Yes, as seen by the distribution being centered around zero
(l) Yes, as seen by the distribution being centered around zero
(m) Yes, as seen by the distribution being centered around zero
Investigation 1.4.3: Have a Nice Trip (cont.)
(a) Make sure you have the same number of men and women in the two groups
(b) Equal
(c) The difference in proportions will always be zero, by your design.
(d) Should be less variation than when didn’t block on gender
(e) Since height is related to gender, by making the groups more similar with respect to gender, will also be more similar with respect to height.
(f) This time, the distributions look pretty similar. Presumably gender is not related to either of these two variables.
Investigation 1.5.1: Friendly Observers
(a) The subjects were assigned to group A or group B and were not told how the two groups were being treated differently. Since the response variable (score on game) was measured objectively, there is not really a subjective rater who should be blind to group membership.
(b) EU = subjects, var1 = vested interest or not (categorical, EV), var2 = beat threshold or not (categorical, RV)
(c) .25, .67; 6
(d)
(e) .25 - .67 = -.42
We observe a smaller proportion of successes (threshold beaters) in Group A (observer with vested interest) as conjectured by the researchers.
(f) Yes, randomization may not have completely balanced out the variables in the two groups and the difference we are seeing could be based on some of these extraneous variables and not on the observer’s interest level.
(g)-(j) Answers will vary
(k) 5 or 6, half of the 11 total
(l) somewhat
(m) somewhat
(n) yes, since it would be very unlikely to be a product of an “unlucky” randomization (as judged by the dotplot, a result this extreme is unlikely to happen by the randomization process alone)
(o) results will vary
(p)-(q) example results

relative frequency: 0, 0, .004, .045, .159, .299, .277, .173, .042, .001, 0, 0, 0
(r) About 5.5
(s) about .05
(u) some evidence since it’s unlikely to get that few successes in Group A when there really is no difference between the two groups.
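The card-shuffling simulation behind (o) through (s) can be sketched in Python. This is an illustrative stand-in for the applet or hands-on simulation, not the original tool: the function name, the seed, and the number of repetitions are assumptions, while the setup (24 subjects, 11 successes, 12 subjects per group, 3 observed successes in Group A) follows the answers above.

```python
import random

def simulate_group_a_successes(reps=10_000, seed=1):
    """Re-randomize 24 subjects (11 successes, 13 failures) into two groups
    of 12 and record the number of successes landing in Group A."""
    rng = random.Random(seed)          # seeded so the sketch is reproducible
    outcomes = [1] * 11 + [0] * 13     # 1 = beat the threshold
    counts = []
    for _ in range(reps):
        rng.shuffle(outcomes)
        counts.append(sum(outcomes[:12]))  # first 12 cards go to Group A
    return counts

counts = simulate_group_a_successes()
mean_successes = sum(counts) / len(counts)
# Empirical p-value: how often Group A gets 3 or fewer successes by chance.
p_value = sum(c <= 3 for c in counts) / len(counts)

print(f"mean successes in Group A: {mean_successes:.2f}")
print(f"estimated P(X <= 3): {p_value:.3f}")
```

With enough repetitions the mean should settle near 5.5 and the estimated p-value near .05, in line with (r) and (s); individual runs will vary.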
Investigation 1.6.1: Random Babies
(a) answers will vary
(b) probably not
(c) example results

(d) Most likely: 0 or 1, least likely: 4
(e) should be close to 1
(f) Graph bounces around when the number of trials is small but then begins to converge to .375.
(g) results will vary, should be around .04.
(h) impossible since if 3 moms match, the fourth must as well.
(i) should eventually converge to 1.
(j)
1234 1243 1324 1342 1423 1432
2134 2143 2314 2341 2413 2431
3124 3142 3214 3241 3412 3421
4123 4132 4213 4231 4312 4321
(k) 1/24
(l) 2143, 2341, 2413, 3142, 3412, 3421, 4123, 4312, 4321
(m)
4 2 2 1 1 2
2 0 1 0 0 1
1 0 2 1 0 0
0 1 1 2 0 0
(n) There are 9 zeros so the probability is 9/24.
(o) P(X=1) = 8/24
P(X=2) = 6/24
P(X=3) = 0/24
P(X=4) = 1/24
(p) Answers will vary
(q) should be similar
(r) 15/24
(s) 15/24 = 1-(9/24)
(t) 0(9/24) + 1(8/24) + 2(6/24) + 3(0/24) + 4(1/24) = 24/24 = 1.
(u) should be similar
(v) no, no
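The enumeration in (j) through (t) can be reproduced exactly with a short Python sketch. It is an illustration rather than part of the original solutions: the names are assumptions, and babies and mothers are simply numbered 1 to 4 as in the listing in (j).

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Each permutation is one way to hand babies 1-4 to mothers 1-4.
arrangements = list(permutations(range(1, 5)))

def matches(arrangement):
    """Number of mothers who receive their own baby."""
    return sum(baby == mother for mother, baby in enumerate(arrangement, start=1))

counts = Counter(matches(a) for a in arrangements)
distribution = {x: Fraction(counts[x], len(arrangements)) for x in range(5)}
expected_value = sum(x * p for x, p in distribution.items())

for x in distribution:
    print(f"P(X={x}) = {counts[x]}/{len(arrangements)}")
print(f"P(at least one match) = {1 - distribution[0]}")
print(f"E(X) = {expected_value}")
```

Using `Fraction` keeps the probabilities exact, so the complement rule in (s) and the expected value in (t) can be checked without rounding.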
Investigation 1.6.2: Animal Models for Stroke Treatment
(a) X can range from 3 to 7 (since there are at most 7 rats in either group)
(b)-(c) results will vary
Example results

(d) It is very surprising to find all 7 in one group (happens about 3% of the time by chance alone)
(e) C(14,7) = 3432
(f) C(10,7) = 120
(g) P(X=7) = 120/3432 = .035, close to the above simulation results
(h) We would be willing to draw a cause and effect conclusion since we have evidence this result didn’t happen just by chance and since it was a randomized comparative experiment, there shouldn’t be any confounding variables.
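The counting in (e) through (g) can be checked with Python's `math.comb`. This is a minimal sketch, not part of the original solutions: the variable names are illustrative, and the value of 10 successes is inferred from the C(10,7) term in (f).

```python
from math import comb

# Values used in (e)-(g); the 10 successes are inferred from C(10,7).
total_rats, successes, group_size = 14, 10, 7

# Ways to choose which 7 of the 14 rats form the first group.
all_assignments = comb(total_rats, group_size)
# Ways for that group to consist entirely of successes (7 of the 10).
all_success_assignments = comb(successes, group_size)

p_all_seven = all_success_assignments / all_assignments
print(all_assignments, all_success_assignments, round(p_all_seven, 3))
```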
Investigation 1.7.1: More Friendly Observers
(a) 2,704,156; no
(b) C(11,3) = 165
(c) Also need to consider the number of ways to assign 9 of the failures to group A.
(d) C(13,9)
(e) C(11,3)C(13,9)
(f) P(X=3) = C(11,3)C(13,9)/C(24,12) = .0436
(g) This is just 3 exactly, we want 3 or fewer (a result at least as extreme as what was observed)
(h) C(11, x)C(13, 12-x)/C(24,12)
(i) C(M, x)C(24-M, 12-x)/C(24,12)
(j) C(M, x)C(N-M, 12-x)/C(N,12)
(k) .00582, .00032, .0000048
(l) .0498
(m) Rather unlikely to occur as a result of the randomization process alone
(n)

(o)
Hypergeometric with N = 24, M = 12, and n = 11
x  P( X <= x )
3  0.0497664
(p) should be similar
(q) probabilities should sum to 1
(r) E(X) = 5.5
12(11/24) = 5.5
(s) {Y > 7}, 7
(t)
Hypergeometric with N = 24, M = 12, and n = 11
x  P( X <= x )
7  0.950234
1-.9502 = .0498.
(u)
| | Group A | Group B | Total |
| Beat threshold | 6 | 16 | 22 |
| Did not beat threshold | 18 | 8 | 26 |
| Total | 24 | 24 | 48 |
(v) 6/24 = .25; 16/24 = .67
(w) Would look identical
(x) prediction
(y) Let X = number of successes in Group A. Want P(X ≤ 6) = .0042
(z) This p-value is quite a bit smaller and provides much stronger evidence that the experimental results did not happen by chance alone.
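The hypergeometric probabilities in (f), (l), and (y) can be computed directly from the formula in (j). The sketch below is an illustration and not the Minitab procedure used above: the function names are assumptions, and the parameters follow the notation of (j), with N subjects, M successes, and n subjects in Group A. The Minitab output swaps the roles of M and n, which gives the same probabilities because the distribution is symmetric in those two parameters.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x): x successes in a group of n drawn from N subjects,
    M of whom are successes."""
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def hypergeom_cdf(x, N, M, n):
    """P(X <= x), summing the pmf from 0 up to x."""
    return sum(hypergeom_pmf(k, N, M, n) for k in range(x + 1))

# Original study: 24 subjects, 11 successes, 12 subjects in Group A.
p_exactly_3 = hypergeom_pmf(3, N=24, M=11, n=12)
p_at_most_3 = hypergeom_cdf(3, N=24, M=11, n=12)

# Doubled table from (u): 48 subjects, 22 successes, 24 in Group A.
p_at_most_6 = hypergeom_cdf(6, N=48, M=22, n=24)

print(f"P(X = 3)  = {p_exactly_3:.4f}")
print(f"P(X <= 3) = {p_at_most_3:.4f}")
print(f"P(X <= 6) = {p_at_most_6:.4f}")
```

Doubling every count keeps the observed proportions at .25 and .67 but shrinks the p-value by roughly a factor of ten, which is the point of (z).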
Investigation 1.7.2: Minority Baseball Coaches
(a)
| | Minority | Not minority | Total |
| 1st base | 15 | 15 | 30 |
| 3rd base | 6 | 24 | 30 |
| Total | 21 | 39 | 60 |
X = number of minorities at 3rd, want P(X ≤ 6) = .015
This p-value is small enough to convince us that these results would not arise from a chance mechanism alone.
(b) This was an observational study (since race was not imposed by the researchers) so we can’t conclude “cause-and-effect” but we can say that the race and base position variables appear to be related.
CHAPTER 2
Investigation 2.1.1: Anticipating Variable Behavior
Answers will vary but should be justified, e.g., the number of possible distinct outcomes, the shape of the distribution, the perceived variability in the distribution, the frequency of the category corresponding to the value of zero…
Investigation 2.1.2: Cloud Seeding
(a) This is an experiment since the researchers imposed the seeded/unseeded condition on the clouds (the experimental units).
(b) EV = whether or not seeded (categorical); RV = volume of rain (quantitative)

(c) Randomization was used so that the characteristics of the cloud groupings would be as similar as possible prior to imposing the treatment.
(d) To prevent any hidden “bias” that could creep into the pilots’ behavior or those making the measurements. Seems less of an issue in this context, but doesn’t hurt.
(e) The seeded clouds show a slight tendency for larger volumes of rainfall. The distribution is centered at a slightly higher value and has more of the extreme results (e.g., 1600 and above).
(f) unseeded: min = 1.0, Q1 = 24.4, median = (41.1+47.3)/2 = 44.2, Q3 = 163, max = 1202.6
seeded: min = 4.1, Q1 = 92.4, median = (200.7+242.5)/2 = 221.6, Q3 = 430, max = 2745.6
All values are in units of acre-feet.
(g) The seeded clouds have higher values for all 5 numbers in the five-number summary indicating a tendency for larger amounts of rainfall.
(h) 1.5(430-92.4) = 506.4
92.4-506.4 < 0, no low outliers
430+506.4 = 936.4
Any clouds with more than 936.4 acre-feet of rainfall are outliers. There are four such outliers.
(i) Show min at 4.1, box from 92.4 to 430 with line at 221.6, whisker to 703.4 and then outliers at 978, 1656, 1697.8, and 2745.6.
(j) The boxplots show graphically that the distribution of the seeded clouds is shifted slightly to the right from the unseeded clouds. The box is also wider indicating more variability in the rainfall volumes.
(k) Asks for prediction
(l) The means are larger than the respective medians.
(m) 6 out of 26 (23%) in both cases. This indicates that the mean is not falling in the “middle” of the distribution as the median would.
(n) possibly not as well as the median which is guaranteed to be “in the middle” of all the data values.
(o) Using Minitab:

(p) The spreads of the distributions (as judged by the width of the boxes and the whiskers themselves) are more similar, and the shapes are slightly more similar (both a bit more symmetric).
(q) Yes, the seeded clouds show a higher tendency for log(rainfall) as well.
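The 1.5 × IQR rule applied in (h) can be sketched in Python. This is an illustration, not part of the original solutions: the function name `outlier_fences` is an assumption, the quartiles are the seeded-cloud values from (f), and the list holds only the whisker endpoint and the four large values named in (i), not the full data set.

```python
def outlier_fences(q1, q3, multiplier=1.5):
    """Lower and upper fences for the 1.5 x IQR outlier rule."""
    step = multiplier * (q3 - q1)
    return q1 - step, q3 + step

# Quartiles for the seeded clouds from (f), in acre-feet.
lower, upper = outlier_fences(92.4, 430)

# The whisker endpoint and the four largest seeded values listed in (i).
large_values = [703.4, 978, 1656, 1697.8, 2745.6]
outliers = [v for v in large_values if v > upper]

print(f"fences: {lower:.1f} to {upper:.1f}")
print("high outliers:", outliers)
```

Because rainfall cannot be negative, the negative lower fence confirms that no low outliers are possible for these data.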
Investigation 2.1.3: Geyser Eruptions
(a) This is an observational study since the researchers did not randomly impose the year on some eruptions, but observed the eruptions as they occurred.
(b) Also transposing the variables, the boxplots are:

These boxplots show a tendency for longer intereruption times in 2003, as the box is shifted to the right and the lower quartile of 2003 is still above the upper quartile of 1978.
(c) Yes since the boxwidth (the interquartile range) is smaller in 2003, this is evidence that the times are less variable/more consistent. There are 2 outliers in 2003 of unusually short intereruption times for that year.
(d) 1978: 95-42 = 53; 2003: 110-56 = 54 minutes.
(e) new 2003 range = 39, much smaller than before.
(f) No, because based on (e), the range appears to be highly sensitive to outliers in the data set.
(g) From Minitab: 1978: 23; 2003: 11
(h) yes, 2003 has a smaller interquartile range so it appears to have more consistent times. Smaller spread corresponds to smaller IQR.
(i) minutes²
(j) 1978: 12.97 minutes; 2003: 8.46 minutes
(k) smaller spread corresponds to a smaller standard deviation value.
(l) new SD = 6.87, new IQR = 11.
The IQR hasn’t changed but the SD is now almost 2 minutes smaller.
(m) These approximations should be read from the graph and five-number summary. About 25% of the 1978 intereruption times were less than 60 minutes, compared to only 2 of the 2003 values. Similarly, 50% of the 1978 intereruption times were less than 75 minutes, whereas fewer than 25% of the 2003 times were.
(n) Histograms:


We get roughly the same percentages as above.
(o) Both the histograms (especially 1978) do reveal a bimodal shape that was hidden in the boxplot display.

The distribution of intereruption times is bimodal. The second, very short, peak is around 60 minutes.
(p)

This histogram is also bimodal with a peak around 60 minutes and a much larger concentration of intereruption times around 85-105 minutes. There are a few extreme outlying times below 50 minutes and around 154 minutes.
Investigation 2.1.4: Hypothetical Quiz Scores
(a)-(d) Asks for prediction.
(e)
| | Class A | Class B | Class C | Class D | Class E | Class F |
| Q1 | 4 | 2 | 3 | 1 | 5 | 6 |
| Q3 | 7 | 8 | 7 | 9 | 5 | 8 |
| IQR | 3 | 6 | 4 | 8 | 0 | 2 |
Class A has the least variability of A-C. Class D has more variability than class C. Based on the IQR, Class E has the least variability of all.
(f) These results are consistent, with Class F having the least, then Class A.
Investigation 2.1.5: Body Temperatures
(a) Calls for personal opinion.
(b) Could look at d