INVESTIGATING STATISTICAL CONCEPTS, APPLICATIONS, AND METHODS
BRIEF SOLUTIONS TO INVESTIGATIONS
Last Updated Nov. 26
Investigation 1-1: Popcorn Production and Lung Disease
(a) 21/116 = .181
(b) proportion in each group
(c)
| | Low exposure | High exposure | Total |
| Airway obstructed | 6 | 15 | 21 |
| Airway not obstructed | 52 | 43 | 95 |
| Total | 58 | 58 | 116 |
(e) There appears to be a higher rate of airway obstruction in the “high exposure” group.
(f) Low exposure: 6/58 =.103; High exposure: 15/58 = .259
(g) .259-.103 = .156, seems reasonably large
(h) .650-.494 = .156, same difference but doesn’t “feel” as large?
(i) .259/.103 = 2.51
(j) 21/95 = .22
(k) (15/43)/(6/52) = 3.02
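The calculations in (f), (g), (i), and (k) can be checked directly from the counts in the table. The following is a minimal sketch; the dictionary and function names are illustrative only. Working from unrounded counts gives a difference of about .155 and a relative risk of exactly 2.50, slightly different from the .156 and 2.51 obtained above from the rounded proportions.

```python
# Counts from the two-way table in part (c).
obstructed = {"low": 6, "high": 15}
not_obstructed = {"low": 52, "high": 43}

def risk(group):
    """Conditional proportion with airway obstruction in one exposure group."""
    return obstructed[group] / (obstructed[group] + not_obstructed[group])

def odds(group):
    """Odds of airway obstruction (obstructed : not obstructed) in one group."""
    return obstructed[group] / not_obstructed[group]

diff = risk("high") - risk("low")           # difference in proportions, part (g)
relative_risk = risk("high") / risk("low")  # part (i)
odds_ratio = odds("high") / odds("low")     # part (k)

print(f"risks: low = {risk('low'):.3f}, high = {risk('high'):.3f}")
print(f"difference = {diff:.3f}, RR = {relative_risk:.2f}, OR = {odds_ratio:.2f}")
```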
Investigation 1-2: Smoking and Lung Cancer
(a) males
(b) EV = amount of smoking (categorical); RV = whether have lung cancer (categorical)
(c)

(d) 14/90 = .156; 8/114 = .070; ratio = 2.217
(e) (14×114)/(8×90)
(f) (213×114)/(8×278) = 10.92
(g) (122×114)/(8×60) = 28.98, the odds of lung cancer are almost 30 times higher for the chain smokers compared to the non-smokers
(h) The odds of lung cancer are 12.77 times higher for the smokers compared to the non-smokers
(i) Yes, as the amount of smoking increases so does the odds ratio (compared to non-smokers)
(j) There could be something else different about those who choose to smoke, e.g., diet, exercise
(k) Older people are more likely to smoke (before all the negative publicity) and to have cancer (just by being around longer!)
(l) No, the researchers forced those amounts to be similar instead of seeing how often these outcomes occurred “naturally.”
(m) No, can always be other explanations (e.g., diet, exercise)
(n) Not clear how representative these patients were…
(o) (114×14)/(8×90), the same
(p) (14×114)/(8×90), the same
(r) (14/104)/(8/122) = 2.05
(q) (114/122)/(90/104) = 1.08
(s) (114/204)/(8/22) = 1.54
(t) odds ratio did not change but the relative risk did
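The point in (t) can be verified numerically with the counts used in (d) through (s): 14 cases and 90 controls in the smoking group, 8 cases and 114 controls among non-smokers. This is only a sketch; the variable names and the "smoker" label for the smoking group in (d) are assumptions made for illustration.

```python
# Counts used in parts (d)-(s): one smoking group versus non-smokers.
cases = {"smoker": 14, "nonsmoker": 8}        # patients with lung cancer
controls = {"smoker": 90, "nonsmoker": 114}   # patients without lung cancer

# Odds ratio, conditioning on smoking status (odds of having cancer).
or_given_smoking = (cases["smoker"] / controls["smoker"]) / (
    cases["nonsmoker"] / controls["nonsmoker"])

# Odds ratio, conditioning on disease status (odds of being a smoker).
or_given_disease = (cases["smoker"] / cases["nonsmoker"]) / (
    controls["smoker"] / controls["nonsmoker"])

def proportion(part, other):
    """Proportion of `part` out of `part + other`."""
    return part / (part + other)

# Ratio of cancer proportions, smokers versus non-smokers, as in part (r).
rr_cancer = proportion(cases["smoker"], controls["smoker"]) / proportion(
    cases["nonsmoker"], controls["nonsmoker"])

# Ratio of smoking proportions, cases versus controls (roles of the variables switched).
rr_smoking = proportion(cases["smoker"], cases["nonsmoker"]) / proportion(
    controls["smoker"], controls["nonsmoker"])

print(f"OR = {or_given_smoking:.3f} either way ({or_given_disease:.3f})")
print(f"RR of cancer = {rr_cancer:.2f}, ratio of smoking proportions = {rr_smoking:.2f}")
```

The odds ratio is the same cross-product whichever variable is conditioned on, while the ratio of proportions changes.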
Investigation 1-3: Lung Cancer and Smoking (cont.)
(a) EV = smoking; RV = lung cancer death or not.

(b) Cohort study since identified and followed the explanatory variable groups and observed the resulting response.
(c) .005 - .00047 = .0046, a very small difference
(d) RR = (.005/.00047) = 10.64, OR = 10.77 (will be some rounding differences)
(e) Don’t have to rely on memory, can see how health changes over time, all patients are healthy to begin with
(f) Same as before, could be other differences about those who smoke
(g) Yes
(i) .002386, .0045, 10.7, 10.7
(j) Bars look much more similar, .5, .0045, 1.009, 1.018
Risk is not as dramatic and the RR and OR are similar.
(k) .56, .4, 2, 6
Baseline is similar but now a bigger difference and the odds ratio and relative risk are not similar to each other.
(l) approximately 0, approximately 1
Subjects in both groups are far less likely to have died from lung cancer than to be in the other response variable category, and the rate of lung cancer death is essentially the same between the two groups.
(m) The difference in the proportions could be the same in different tables but the odds ratio and relative risk can tell a different story. This arises based on how different the baseline risk is from .5 (when the conditional proportions are close to 0 or 1, the relative risk and odds ratio will appear more extreme and will be closer to each other in value than when the baseline risk is close to .5). When the conditional proportions are similar, both the odds ratio and relative risk will be close to 1.
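To illustrate (m), the sketch below computes the relative risk and odds ratio for three pairs of conditional proportions that appear in earlier answers. The function names are illustrative, and because the inputs are the rounded proportions, the results can differ slightly from values computed from counts.

```python
def relative_risk(p_baseline, p_comparison):
    """Ratio of the two conditional proportions."""
    return p_comparison / p_baseline

def odds_ratio(p_baseline, p_comparison):
    """Ratio of the odds p/(1-p) in the two groups."""
    return (p_comparison / (1 - p_comparison)) / (p_baseline / (1 - p_baseline))

# (baseline, comparison) proportions taken from earlier answers.
pairs = [
    (0.00047, 0.005),  # Investigation 1-3 (c): baseline very close to 0
    (0.103, 0.259),    # Investigation 1-1 (g): difference of .156
    (0.494, 0.650),    # Investigation 1-1 (h): same difference, baseline near .5
]

results = []
for p0, p1 in pairs:
    rr, o_r = relative_risk(p0, p1), odds_ratio(p0, p1)
    results.append((rr, o_r))
    print(f"baseline {p0}: difference = {p1 - p0:.4f}, RR = {rr:.2f}, OR = {o_r:.2f}")
```

As the baseline moves from near 0 toward .5, the relative risk shrinks and the odds ratio drifts further from the relative risk, even though the last two pairs have the same difference in proportions.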
Investigation 1-4: Near-Sightedness and Night Lights
(a) OU (observational units) = children; variables = eye condition (categorical) and light condition (categorical)
(b) EV = lighting, RV = eye condition
(c) cross-classified since both variables were recorded about each child simultaneously
(d)
| | Room light | Night light | Darkness | Total |
| Far-sighted | 12 | 39 | 40 | 91 |
| Normal refraction | 22 | 115 | 114 | 251 |
| Near-sighted | 41 | 78 | 18 | 137 |
| Total | 75 | 232 | 172 | 479 |
(e)

The occurrence of myopia (near-sightedness) appears to increase as the amount of light in the child’s room increases.
(f) .286, .55, .336, .105, .16, .168, .232
About 29% of children were near-sighted, but this proportion increased to .55 for the children with a room light, but was only .105 when no lighting was used. The occurrence of hyperopia was fairly constant with a slightly increased proportion among children who slept in darkness.
(g) Could be other causes such as genetics, other child-rearing issues that are related to both the type of lighting used and the eye condition of the children.
Investigation 1-5: Graduate Admissions Discrimination
(a) men: .445, women: .252
(b) Yes, men were accepted to these
(c) program, gender, whether accepted
(d) .619, .059, .824, .070
(e) the issue is that women applied more often to the program that was harder to get into overall.
(f) (108/449)(.824) + (341/449)(.070) = .25
(g) [825(.619)+373(.059)]/1198 = .44
(h)
| | Program A | Program F | Total |
| Accepted | 27 | 86 | 113 |
| Denied | 81 | 255 | 336 |
| Total | 108 | 341 | 449 |
(i) The weighted average is equal to the average of the two acceptance rates between the two programs.
(j)
| | Program A | Program F | Total |
| Accepted | 70 | 43 | 113 |
| Denied | 154 | 182 | 336 |
| Total | 224 | 225 | 449 |
(k) The weighted average is equal to the average of the two acceptance rates between the two programs.
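The weighted averages in (f) and (g), and the overall rates for the tables in (h) and (j), can be reproduced as follows. This is a minimal sketch; `weighted_rate` is an illustrative helper name.

```python
def weighted_rate(applicants, rates):
    """Overall acceptance rate: program rates weighted by number of applicants."""
    total = sum(applicants)
    return sum(n * r for n, r in zip(applicants, rates)) / total

# Parts (f) and (g): programs A and F, using the rates from part (d).
women = weighted_rate([108, 341], [0.824, 0.070])
men = weighted_rate([825, 373], [0.619, 0.059])

# Part (h): same acceptance rate in both programs.
table_h = weighted_rate([108, 341], [27 / 108, 86 / 341])

# Part (j): (nearly) equal numbers of applicants to each program.
rates_j = [70 / 224, 43 / 225]
table_j = weighted_rate([224, 225], rates_j)
simple_average_j = sum(rates_j) / 2

print(f"women = {women:.2f}, men = {men:.2f}")
print(f"table (h) = {table_h:.4f}")
print(f"table (j) = {table_j:.4f}, simple average = {simple_average_j:.4f}")
```

Women's overall rate is pulled toward the .070 rate because most women applied to Program F, which is the imbalance described in (e).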
Investigation 1-6: Foreign Language and SAT Scores
(a) EV = foreign language study (categorical); RV = SAT verbal (quantitative)
(b) Possibilities include ambition, overall academic achievement, verbal ability. For example, maybe those who take a foreign language are more likely to be interested in attending college and therefore study harder for the SAT.
(c) Randomly assign students to take a foreign language or not
(d) Want the two groups to be as similar as possible.
Investigation 1-7: Have a Nice Trip
(a) This would be a problem as gender would be confounded with the recovery strategy employed. If one group did better you wouldn’t be able to decide whether it was the strategy used or their gender.
(b) Want everything about the two groups to be as similar as possible.
(c)-(d) Results will vary
(e) Difference won’t always be zero but distribution should be centered around zero and should be equally likely to be positive as negative.
(f)-(g) Results will vary.
(h) Distribution should again center around zero.
(i) Center: 0, Largest: around .67, smallest: around -.67
(j) No, but most randomizations produce a difference that is close to zero
(k) Yes, as seen by the distribution being centered around zero
(l) Yes, as seen by the distribution being centered around zero
(m) Yes, as seen by the distribution being centered around zero
(n) Answers will vary but does seem tricky to force individuals to study certain subjects and especially to smoke or not.
(o) Power of suggestion can influence how well they do.
(p) Could assign the other group to take more English classes without telling them why, could give people cigarettes that do not contain tobacco? (Debatable)
Investigation 1-8: Have a Nice Trip
(a) Make sure you have the same number of men and women in the two groups
(b) Equal
(c) The difference in proportions will always be zero, by your design.
(d) Should be less variation than when didn’t block on gender
(e) Since height is related to gender, by making the groups more similar with respect to gender, will also be more similar with respect to height.
(f) This time, the distributions look pretty similar. Presumably gender is not related to either of these two variables.
Investigation 1-9: Friendly Observers
(a) The subjects were assigned to group A or group B and were not told how the two groups were being treated differently. Since the response variable (score on game) was measured objectively, there is not really a subjective rater who should be blind to group membership.
(b) EU = subjects, var1 = vested interest or not (categorical, EV), var 2 = beat threshold or not (categorical, RV)
(c) .25, .67, 6
(d)
(e) .25-.67 =-.42
We observe a smaller proportion of successes (threshold beaters) in Group A (observer with vested interest) as conjectured by the researchers.
(f) Yes, randomization may not have completely balanced out the variables in the two groups and the difference we are seeing could be based on some of these extraneous variables and not on the observer’s interest level.
(g)-(j) Answers will vary
(k) 5 or 6, half of the 11 total
(l) somewhat
(m) somewhat
(n) yes, since it would be very unlikely to be a product of an “unlucky” randomization (as judged by the dotplot, a result this extreme is unlikely to happen by the randomization process alone)
(o) results will vary
(p) example results

(r) About 5.5
(s) about .05
(u) some evidence since it’s unlikely to get that few successes in Group A when there really is no difference between the two groups.
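The card-shuffling simulation in this investigation can be sketched in code. The setup assumes what the answers above imply: 24 subjects, 11 of whom beat the threshold, randomly split into two groups of 12, with 3 successes observed in Group A. The seed, the number of repetitions, and all names are arbitrary choices for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

N_SUBJECTS, GROUP_A_SIZE, N_SUCCESSES = 24, 12, 11
OBSERVED_A = 3   # successes observed in Group A (3/12 = .25)
REPS = 20_000    # number of simulated re-randomizations

# 1 = beat the threshold, 0 = did not; outcomes are fixed, only the grouping changes.
outcomes = [1] * N_SUCCESSES + [0] * (N_SUBJECTS - N_SUCCESSES)

counts = []
for _ in range(REPS):
    random.shuffle(outcomes)                     # re-randomize subjects to groups
    counts.append(sum(outcomes[:GROUP_A_SIZE]))  # successes landing in Group A

mean_count = sum(counts) / REPS
approx_p = sum(c <= OBSERVED_A for c in counts) / REPS

print(f"average successes in Group A: {mean_count:.2f}")
print(f"proportion of shuffles with {OBSERVED_A} or fewer: {approx_p:.3f}")
```

The average should land near the 5.5 in (r) and the proportion near the .05 in (s), with some run-to-run variation.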
Investigation 1-10: College Committee Formation
(a)-(c) Results will vary.
(d) Most likely: 0, least likely: 2, average around 2/3
(e) Answers will vary
(f) To increase the precision of the estimates, we would want to do more randomizations
(h) Answers will vary
(i) Answers will vary but should see a similar pattern as before. These results should be more “precise.”
(j) Most: 2, least: 0
(k) It is rather surprising as you should find it does not occur very often as a product of the randomization process alone.
(l) Example results:

(m) Appears to be converging to around .07.
(n) Should be converging to around 2/3.
(o) AB, AC, AD, AE, AF, BC, BD, BE, BF, CD, CE, CF, DE, DF, EF
(p) 15
(q) 1/15
(r) 1/15 = .067
(s) Should be similar
(t) 8/15, 6/15
(u) 1/15+8/15 = 9/15
(v) 6!/(2!4!) = 15
(w) C(6,3) = 20
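The listing in (o) and the probabilities in (q) through (u) can be reproduced by enumerating every possible pair. In this sketch, labeling A and B as the two people of interest is an assumption made purely for illustration; any two of the six give the same probabilities.

```python
from fractions import Fraction
from itertools import combinations

people = "ABCDEF"
group = {"A", "B"}  # ASSUMPTION: which two of the six are the people of interest

pairs = list(combinations(people, 2))  # all equally likely two-person committees

# Tally how many committees contain 0, 1, or 2 members of the group of interest.
counts = {0: 0, 1: 0, 2: 0}
for pair in pairs:
    counts[len(group & set(pair))] += 1

probs = {k: Fraction(v, len(pairs)) for k, v in counts.items()}
expected = sum(k * p for k, p in probs.items())

print(len(pairs), "possible committees")
print({k: f"{v}/{len(pairs)}" for k, v in counts.items()})
print("expected number from the group:", expected)
```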
Investigation 1-11: Selecting Senators
(a) 0, 1, 2, 3, 4, 5
(b) Calls for prediction.
(c) C(100,5) = 75,287,520
(d) C(14,1) = 14
(e) Also need to randomly decide which men will be on the subcommittee
(f) C(86,4) = 2,123,555
(g) 29,729,770
(h) P(X=1) = .395
(i) P(X=2) = C(14,2)C(86,3)/C(100,5) = .124
(j) P(X=x) = C(14,x)C(86,5-x)/C(100,5)
(k) P(X=x) = C(r,x)C(100-r, 5-x)/C(100,5)
(l) P(X=x) = C(r,x)C(N-r, 5-x)/C(N,5)
(k)
| x | 0 | 1 | 2 | 3 | 4 | 5 |
| P(X=x) | .463 | .395 | .124 | .018 | .001 | .000 |

(l) sum to one
(m) E(X) = .70 = 5(14/100)
(n) 0, which is not equal to the expected value
(o) P(X=3, 4, 5) = P(X=3)+P(X=4)+P(X=5) = .0188
(p) Larger as it will be more unlikely to get an “unusual” mix
(q) P(X=2, 3) = .0484+ .002251 = .051
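The hypergeometric probabilities in the table, the expected value in (m), and the tail probability in (o) can be computed from the formula C(r,x)C(N-r,5-x)/C(N,5) with N = 100 and r = 14. A minimal sketch using only the standard library (the function name is illustrative):

```python
from math import comb

def hypergeom_pmf(x, N=100, r=14, n=5):
    """P(X = x): x of the r 'special' members among n chosen at random from N."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

pmf = [hypergeom_pmf(x) for x in range(6)]
expected = sum(x * p for x, p in enumerate(pmf))
upper_tail = sum(pmf[3:])  # P(X = 3) + P(X = 4) + P(X = 5)

for x, p in enumerate(pmf):
    print(f"P(X={x}) = {p:.3f}")
print(f"sum = {sum(pmf):.3f}, E(X) = {expected:.2f}, P(X >= 3) = {upper_tail:.4f}")
```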
Investigation 1-12: More Friendly Observers
(a) 2,704,156; no
(b) P(X=3) = C(11,3)C(13,9)/C(24,12) = .0436
(c) .00582, .00032, .0000048
(d) .0498
(e) Rather unlikely to occur as a result of the randomization process alone
(f)
| | Group A | Group B | Total |
| Beat threshold | 6 | 16 | 22 |
| Did not beat threshold | 18 | 8 | 26 |
| Total | 24 | 24 | 48 |
(g) 6/24 = .25; 16/24 = .67
(h) Would look identical
(i) prediction
(j) Let X = number of successes in Group A. Want P(X ≤ 6) = .0042
(k) This p-value is quite a bit smaller and provides much stronger evidence that the experimental results did not happen by chance alone.
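The two exact p-values compared in (k) are lower-tail hypergeometric sums: P(X ≤ 3) for the original 24 subjects and P(X ≤ 6) for the table in (f) with every count doubled. A short sketch, with an illustrative function name:

```python
from math import comb

def lower_tail(k, N, successes, n):
    """P(X <= k), where X = number of the `successes` that fall among the
    n subjects randomly assigned to Group A out of N subjects in total."""
    favorable = sum(comb(successes, x) * comb(N - successes, n - x)
                    for x in range(k + 1))
    return favorable / comb(N, n)

original = lower_tail(3, N=24, successes=11, n=12)  # part (d)
doubled = lower_tail(6, N=48, successes=22, n=24)   # part (j)

print(f"original study: {original:.4f}")
print(f"doubled counts: {doubled:.4f}")
```

The conditional proportions are identical in the two tables, but the larger sample size makes the same difference much harder to attribute to the randomization alone.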
Investigation 1-13: Minority Baseball Coaches
(a)
| | Minority | Not minority | Total |
| 1st base | 15 | 15 | 30 |
| 3rd base | 6 | 24 | 30 |
| Total | 21 | 39 | 60 |
X = number of minorities at 3rd, want P(X ≤ 6) = .015
This p-value is small enough to convince us that these results would not arise from a chance mechanism alone.
(b) This was an observational study (since race was not imposed by the researchers) so we can’t conclude “cause-and-effect” but we can say that the race and base position variables appear to be related.
CHAPTER 2
Investigation 2-1: Anticipating Variable Behavior
Answers will vary but should be justified, e.g., the number of possible distinct outcomes, the shape of the distribution, the perceived variability in the distribution, the frequency of the category corresponding to the value of zero…
Investigation 2-2: Cloud Seeding
(a) This is an experiment since the researchers imposed the seeded/unseeded condition on the clouds (the experimental units).
(b) EV = whether or not seeded (categorical); RV = volume of rain (quantitative)

(c) Randomization was used so that the characteristics of the cloud groupings would be as similar as possible prior to imposing the treatment.
(d) To prevent any hidden “bias” that could creep into the pilots’ behavior or those making the measurements. Seems less of an issue in this context, but doesn’t hurt.
(e) The seeded clouds show a slight tendency for larger volumes of rainfall. The distribution is centered at a slightly higher value and has more of the extreme results (e.g, 1600 and above).
(f) unseeded: min = 1.0, Q1 = 24.4, median = (41.1+47.3)/2 = 44.2, Q3 = 163, max = 1202.6
seeded: min = 4.1, Q1 = 92.4, median = (200.7+242.5)/2 = 221.6, Q3 = 430, max = 2745.6
All values are in units of acre-feet.
(g) The seeded clouds have higher values for all 5 numbers in the five-number summary indicating a tendency for larger amounts of rainfall.
(h) 1.5(430-92.4) = 506.4
92.4-506.4 < 0, no low outliers
430+506.4 = 936.4
Any clouds with more than 936.4 acre-feet of rainfall are outliers. There are four such outliers.
(i) Show min at 4.1, box from 92.4 to 430 with line at 221.6, whisker to 703.4 and then outliers at 978, 1656, 1697.8, and 2745.6.
(j) The boxplots show graphically that the distribution of the seeded clouds is shifted slightly to the right from the unseeded clouds. The box is also wider indicating more variability in the rainfall volumes.
(k) Asks for prediction
(l) The means are larger than the respective medians.
(m) 6 out of 26 (23%) in both cases. This indicates that the mean is not falling in the “middle” of the distribution as the median would.
(n) possibly not as well as the median which is guaranteed to be “in the middle” of all the data values.
(o) Using Minitab:

(p) The spreads of the distributions (as judged by the width of the boxes and the whiskers themselves) are more similar, and the shapes are slightly more similar (both a bit more symmetric).
(q) Yes, the seeded clouds show a higher tendency for log(rainfall) as well.
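The 1.5 × IQR outlier check in (h) can be written out as a short calculation. This sketch uses only the seeded quartiles from (f) and the five largest seeded values quoted in (i); the full data set is not reproduced here, and the variable names are illustrative.

```python
# Seeded-cloud quartiles from part (f), in acre-feet.
q1, q3 = 92.4, 430.0
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr   # values below this would be low outliers
upper_fence = q3 + 1.5 * iqr   # values above this are high outliers

# Smallest value and the five largest values quoted in parts (f) and (i).
minimum = 4.1
largest_values = [703.4, 978.0, 1656.0, 1697.8, 2745.6]
outliers = [v for v in largest_values if v > upper_fence]

print(f"IQR = {iqr:.1f}, fences = ({lower_fence:.1f}, {upper_fence:.1f})")
print("low outliers:", minimum < lower_fence)
print("high outliers:", outliers)
```

The upper whisker stops at 703.4 because it is the largest observation that does not exceed the upper fence.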
Investigation 2-3: Geyser Eruptions
(a) This is an observational study since the researchers did not randomly impose the year on some eruptions, but observed the eruptions as they occurred.
(b) Also transposing the variables, the boxplots are:

These boxplots show a tendency for longer intereruption times in 2003 as the box is shifted to the right and the lower quartile of 2003 is still above the upper quartile of 1978.
(c) Yes since the boxwidth (the interquartile range) is smaller in 2003, this is evidence that the times are less variable/more consistent. There are 2 outliers in 2003 of unusually short intereruption times for that year.
(d) 1978: 95-42 = 53; 2003: 110-56 = 54 minutes.
(e) new 2003 range = 39, much smaller than before.
(f) No, because based on (e), the range appears to be highly sensitive to outliers in the data set.
(g) From Minitab: 1978: 23; 2003: 11
(h) yes, 2003 has a smaller interquartile range so it appears to have more consistent times. Smaller spread corresponds to smaller IQR.
(i) minutes²
(j) 1978: 12.97 minutes; 2003: 8.46 minutes
(k) smaller spread corresponds to a smaller standard deviation value.
(l) new SD = 6.87, new IQR = 11.
The IQR hasn’t changed but the SD is now almost 2 minutes smaller.
(m) These approximations should be read from the graph and five number summary. About 25% of the 1978 intereruption times were less than 60 minutes compared to all but 2 of the 2003 values. Similarly, 50% of 1978 eruptions were less than 75 minutes, and even less than 25% of the 2003 eruptions were.
(n) Histograms:


We get roughly the same percentages as above.
(o) Both the histograms (especially 1978) do reveal a bimodal shape that was hidden in the boxplot display.

The distribution of intereruption times is bimodal. The second, very short, peak is around 60 minutes.
(p)

This histogram is also bimodal with a peak around 60 minutes and a much larger concentration of intereruption times around 85-105 minutes. There are a few extreme outlying times below 50 minutes and around 154 minutes.
Investigation 2-4: Bumpiness, Variety, and Variability
(a)-(d) Asks for prediction.
(e)
| | Class A | Class B | Class C | Class D | Class E | Class F |
| Q1 | 3.5 | 2 | 3 | 1 | 1 | 6 |
| Q3 | 6.5 | 8 | 7 | 9 | 9 | 8 |
| IQR | 3 | 6 | 4 | 8 | 8 | 2 |
Class A has the least variability of A-C. Class D has more variability than class C. Based on the IQR, Class D and E have the same variability. Class F has the least variability of all.
(f) These results are consistent, with Class F having the least, then class A. Here we do see a difference between classes D and E, with D having a slightly smaller standard deviation.
Investigation 2-5: Body Temperatures
(a) Calls for personal opinion.
(b) Could look at dotplots, boxplots, or histograms.
With dotplots:

We see that both distributions are rather symmetric, with the females appearing to have a slight tendency for higher body temperatures. The mean body temperature for the females in this sample is 98.394 degrees compared to 98.105 degrees for the males (median 98.40 vs. 98.10). The female body temperatures also show slightly more variability (SD=.743 degrees vs. .699 degrees, though the IQR has .8 for the females and 1.0 for the males). If we look at the boxplots, we see that the larger standard deviation for the females arises in large part from about 5 outliers.
(c) A temperature of 98.6° appears rather typical for the females but is close to the upper quartile (98.6) for males. Would be nice to know the conversion between the Fahrenheit and Celsius scales to answer the second question.

(d) female: (98.6-98.394)/.743 = .277
male: (98.6-98.105)/.699 = .708
(e) With a higher z-score, a temperature of 98.6° is “further” above the male average than the female average.
(f) female: (98-98.394)/.743 = -.53
male: (98-98.105)/.699 = -.15
A temperature of 98° appears to be more unusual for the females since the absolute value of the z-score is larger.
(g) A negative z-score indicates the observation lies below the mean.
(h)
| | Mean | Standard dev |
| Female | 36.885 | .413 |
| Male | 36.725 | .388 |
(i) The new mean is (5/9)(98.394-32) for the women and (5/9)(98.105-32) for the men, transformations of the means on the Fahrenheit scale. For the standard deviations, we use just the scale term: (5/9)(.743) and (5/9)(.699).
(j) (5/9)(98.6-32) = 37
(k) female: z = (37-36.885)/.413 = .28
male: z = (37-36.725)/.388 = .71
These are the same (apart from some rounding discrepancies) as the z-scores obtained on the Fahrenheit scale.
(l) 0
(m) 68%
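The z-score calculations in (d) and (k), and the fact that they agree across temperature scales, can be checked with a short sketch. The means and standard deviations are those reported in (b); the function and variable names are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def f_to_c(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (5 / 9) * (temp_f - 32)

# (mean, SD) in degrees Fahrenheit, from part (b).
stats_f = {"female": (98.394, 0.743), "male": (98.105, 0.699)}

z = {}
for sex, (mean_f, sd_f) in stats_f.items():
    z_f = z_score(98.6, mean_f, sd_f)
    # The mean shifts and rescales; the SD only rescales (no -32 shift).
    mean_c, sd_c = f_to_c(mean_f), (5 / 9) * sd_f
    z_c = z_score(f_to_c(98.6), mean_c, sd_c)
    z[sex] = (z_f, z_c)
    print(f"{sex}: z (Fahrenheit) = {z_f:.3f}, z (Celsius) = {z_c:.3f}")
```

The shift of 32 cancels in the numerator and the factor 5/9 cancels between numerator and denominator, which is why the z-score does not depend on the scale.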
Investigation 2-6: The Fan Cost Index
(b)

(c)

(d) The five number summary (in dollars) and mean/SD are below.
| Variable | League | Minimum | Q1 | Median | Q3 | Maximum |
| 2003 fci | A | 112.02 | 130.37 | 143.69 | 163.73 | 248.44 |
| 2003 fci | N | 94.61 | 127.32 | 147.32 | 165.11 | 182.56 |

| Variable | League | Mean | StDev |
| 2003 fci | A | 151.92 | 34.60 |
| 2003 fci | N | 145.81 | 24.88 |
(e) The costs are rather similar in that there is much overlap of the boxes, and while the median FCI value is slightly higher for the National League, the mean American League FCI value is higher. The standard deviation for the American League is slightly larger though the IQR is slightly lower ($33.36 vs. $37.79). Both distributions appear fairly symmetric.
(f) American; National; The FCI for
(g) National; American; The FCI for
(h) Calls for predictions.
(i)

Now
(j) Median since it is calculated based on the position of the observations and not their numerical values. An extreme numerical value will always affect the calculation of the mean.
(k) The IQR since it is calculated based on the position of the observations and not their numerical values. An extreme numerical value will always affect the calculation of the standard deviation and the range.
(l)

mean = $3.45, SD = $8.93, median = $2.13, IQR = $13
The distribution of price differences is fairly symmetric, centered near zero, but with a fairly large spread. If we compare the two leagues:

There is much more variation in the differences for the American League than the National League (SD $11.08 vs. $6.88, IQR $15.76 vs. $11.35). Both distributions center around 3 dollars, although the median
(m) Largest percentage change:
Largest 2003 FCI:
Largest change:
While
(n) Also shifting to a more sensible scale:


These prices tend to occur at integer values. This makes sense as they are often sold by vendors walking the stands and it is more convenient to not have to make change.
(o) There is a $4.08 program (
(p) They are the Canadian teams and the prices have been converted to US dollars. These values are probably integers in Canadian dollars.
(q) No
(r) They are not all actually the same size.
(s)
Investigation 2-7: House Prices
(a) Answers will vary but should look for a “typical” value.
(b) Answers will vary but could look at the “prediction” errors.
(c) Answers will vary
(d) ideally zero!
(e) 582.5
(f) 4660-8m = 0 yields m = 582.5
(g) mean = 582.5, median = 507
The mean balances all of the prediction errors.
(h) Σx_i – nm = 0 yields m = Σx_i/n
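The balancing property in (f) through (h) can be illustrated numerically. The eight prices in this sketch are HYPOTHETICAL: they were chosen only so that their sum (4660), count (8), and median (507) match the summaries reported above, and they are not the actual data from the investigation.

```python
from statistics import mean, median

# HYPOTHETICAL prices, constructed to match n = 8, sum = 4660, median = 507.
prices = [300, 400, 450, 500, 514, 600, 700, 1196]

def total_error(guess):
    """Sum of the prediction errors (x_i - guess) over all houses."""
    return sum(x - guess for x in prices)

m_mean, m_median = mean(prices), median(prices)

print(f"mean = {m_mean}, sum of errors = {total_error(m_mean)}")
print(f"median = {m_median}, sum of errors = {total_error(m_median)}")
```

Using the mean as the guess makes the positive and negative errors cancel exactly. Using the median leaves a net error of 4660 - 8(507) = 604, because the sum of the errors depends only on the total and the number of observations.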