INVESTIGATING STATISTICAL CONCEPTS, APPLICATIONS, AND METHODS

ERRATA

Last updated January 21

(these errata reflect changes that were not made in the preliminary edition)

p. 5, paragraph after (h)

insert “the” between “that is” and “risks”

p. 6, study conclusions

insert “of” between “a difference in the risk” and “airway”

p. 14, question (l)

Instead of asking whether this is an estimate of the percentage that "are smokers," the question should ask whether this is an estimate of the percentage that "have lung cancer."

p. 15, definition of cohort study

Current definition is a little simplified.  The subjects do not have to be “selected according to” the explanatory variable (though that is one example), but the explanatory variable should be observable at the start of the study and the outcomes occur “naturally” (chosen by the subjects, not by the researchers).  In a prospective cohort study, the subjects are then followed over a period of time and the response variable then observed.

p. 18, question (s) – as labeled

Might want to reverse order of intermediate risk calculations

Note: Questions (q)-(s) can be relabeled and you might want to say “calculate” for the first one and then “compare this” for the next two.

p. 19, discussion

the last relative risk calculation should be 1.54 not 1.44

p. 32, before (f)

You might want to insert “Weighted Average” as a new section heading

p. 33-34, questions (i) and (k)

Replace with “Is the overall acceptance rate for women equal to the average of their acceptance rates in each program?”  This will also affect the conclusion statement.  The easiest fix is to delete the first sentence.

You may prefer to replace these pages with these.
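The distinction behind the replacement question can be sketched numerically. The counts below are hypothetical and are not the data from the text; they only illustrate that an overall acceptance rate equals the weighted average of the program rates (weights being each program's share of applicants), not their simple average.

```python
# Hypothetical counts for illustration only; these are NOT the data from the text.
# Each entry: program name -> (number of women who applied, number accepted).
programs = {
    "Program A": (20, 16),
    "Program B": (180, 36),
}

total_applied = sum(applied for applied, _ in programs.values())
total_accepted = sum(accepted for _, accepted in programs.values())

# Overall acceptance rate: pool all applicants together.
overall_rate = total_accepted / total_applied

# Simple (unweighted) average of the per-program acceptance rates.
rates = [accepted / applied for applied, accepted in programs.values()]
simple_average = sum(rates) / len(rates)

# Weighted average: each program's rate weighted by its share of applicants.
weighted_average = sum(
    (applied / total_applied) * (accepted / applied)
    for applied, accepted in programs.values()
)

print(f"overall rate:     {overall_rate:.3f}")
print(f"simple average:   {simple_average:.3f}")
print(f"weighted average: {weighted_average:.3f}")
```

With these made-up counts the simple average differs from the overall rate because most applicants are in the program with the lower rate, while the weighted average matches the overall rate.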

p. 37, missing figure

p. 54, last lines

change the inequalities on the right-hand side to less-than-or-equal-to (≤) inequalities.

p. 60, before question (i)

Discussion and confidence interval formula should be removed

p. 65, introduction

Delete second sentence and change paragraph to: “In the previous investigation there were only 15 outcomes in our sample space and it was feasible to list each outcome.  The 2004 U.S. Senate consists of 86 men and 14 women.  Suppose that they are randomly assigned to a subcommittee of 5 members (and a second group of 95 non-subcommittee members).  Let the random variable X represent the number of women on the subcommittee.”
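As a rough check on the revised setup, the distribution of X can be tabulated directly. This is a minimal sketch, assuming the subcommittee is a simple random sample of 5 from the 100 senators, so that X follows a hypergeometric distribution; the names `pmf` and `probs` are illustrative and do not come from the text.

```python
from math import comb

MEN, WOMEN, COMMITTEE = 86, 14, 5  # counts given in the revised paragraph
TOTAL = MEN + WOMEN


def pmf(k):
    """P(X = k): probability that exactly k of the 5 members are women."""
    return comb(WOMEN, k) * comb(MEN, COMMITTEE - k) / comb(TOTAL, COMMITTEE)


# Probability distribution of X over its possible values 0, 1, ..., 5.
probs = {k: pmf(k) for k in range(COMMITTEE + 1)}

# Expected number of women on the subcommittee.
expected_women = sum(k * p for k, p in probs.items())

for k, p in probs.items():
    print(f"P(X = {k}) = {p:.4f}")
print(f"E(X) = {expected_women:.2f}")
```

Counting combinations this way avoids listing the sample space, which is no longer feasible with 100 senators.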

p. 65, before part (d)

change “interesting” to “interested”

p. 67, part (k)

MTB> plot c2*c1;

SUBC> project.

p. 69, practice 1-20(i)

change “bottom” to “button”

p. 73, calculation details

we intended for the calculation details to be displayed

In the above table, we could have found the p-value as the probability of obtaining 8 or more successes in Group B:

P(Y ≥ 8) =

Or the probability of 9 or more “non-threshold beaters” in Group A:

P(Z ≥ 9) =

p. 78, practice problem 1-23

you should probably clarify that the files were assigned “gender” at random

p. 83

you will want to number the figures 1-6 for easier reference

p. 92, 2nd bullet

They need to click OK after they select the With Groups option to move to the next window.

p. 95, practice problem 2-2

Consider asking students to pick one comparison in (a) and one in (b), and asking whether the wooden vs. steel comparison is as they expected.

p. 96, part (b)

After choosing the “Multiple Y’s, Simple” option, will need to click OK to move to the next window.

p. 101, Histogram of class A can be rescaled so the vertical axis matches that of all the other graphs.

p. 103, practice 2-3

These are the numerical values:

         Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Raleigh   39  42  50  59  67  74  78  77  71  60  51  43
SF        49  52  53  56  58  62  63  64  65  61  55  49
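For anyone who wants to work with these values directly, here is a minimal sketch that computes the mean and range for each city. It assumes the practice problem involves comparing center and spread, which the erratum itself does not state; the helper `summarize` is an illustrative name.

```python
from statistics import mean

# Monthly values, January through December, as listed in the erratum.
raleigh = [39, 42, 50, 59, 67, 74, 78, 77, 71, 60, 51, 43]
sf = [49, 52, 53, 56, 58, 62, 63, 64, 65, 61, 55, 49]


def summarize(values):
    """Return the mean and the range (max minus min) of a list of values."""
    return {"mean": mean(values), "range": max(values) - min(values)}


raleigh_summary = summarize(raleigh)
sf_summary = summarize(sf)

print("Raleigh:", raleigh_summary)
print("SF:     ", sf_summary)
```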

p. 104, practice 2-4(b)

The initial number of intervals is 12, not 14.

p. 109 practice 2-7

For the ACT scores, consider changing the parameters to a mean of 21 and a standard deviation of 5.  Since the ACT max composite score is 36, can change Jeff’s score to 28.
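With the suggested parameters, Jeff's standardized score is easy to verify. This is a minimal sketch, assuming the problem asks for a z-score of the form (value minus mean) divided by standard deviation; the function name `z_score` is illustrative.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd


ACT_MEAN, ACT_SD = 21, 5  # suggested revised parameters
JEFF_ACT = 28             # suggested revised score for Jeff

jeff_z = z_score(JEFF_ACT, ACT_MEAN, ACT_SD)
print(f"Jeff's ACT z-score: {jeff_z:.2f}")
```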

p. 110, part (a)

will need to make sure they do not type in the quotation marks

p. 112, question (l)

change to “and examine numerical and graphical summaries to analyze the distribution of the price differences between the two years.  Write a paragraph summarizing how the years compare.”

p. 119, part (q)

By “generic,” we mean for a generic data set with n=8 values.

p. 127, part (l)

may need to change first Minitab line to: MTB> let c8 = (c6>=15.92)

p. 128, figure of t distribution can be improved:

p. 131, part (a)

Can also ask them to calculate the mean and standard deviation as part of their numerical summaries, as they will need that information later.

p. 133, practice 2-13

need to reletter the questions

p. 150, question (a)

Have them describe the sampling distribution in words (not the shape, center, and spread)

p. 160, practice 3-10

Could consider using n=20 rather than n=100

p. 164 figure

The spikes at 22 and above should be highlighted

p. 166, investigation 3-6

replace with this

p. 174, part (u)

need to reletter (t)-(bb), including on p. 175 referring back to new part (x)

p. 176

Change title to “Inverse Cumulative Probabilities”

The Minitab screen capture should refer to 10 as the number of trials

p. 194, 198

answers will vary slightly depending on decimal expression of 2/3 used.

p. 197

need to reletter (d)-(g); the new (e) should refer back to (d)

p. 202, part (c)

use 1000 different sessions instead of 200

bad page break between p. 202 and p. 203

p. 204, part (o)

Will click Count instead of Calculate

p. 205, part (r)

change number of samples to 1000

p. 207 reletter second part  (y) as part (z)

p. 214, part (f)

p. 240, part (o)

Use a height of 152 instead of 151.8

p. 246, practice 4-9, part (a)

change “of” to “or” after “Minitab output”

p. 253, figure

q is playing the role of p

p. 254, part (g)

p. 265, figures

.583 should be .584 and .746 should be .75

p. 266, part (k) should be part (m)

last paragraph: delete “in” after “the same as those made”

p. 266-7

Some additional practice problems are available here

p. 268, part (i)

pi should be p

p. 269

need to reletter

p. 270

Under the first set of technical conditions, delete the parenthetical remark

p. 271, practice 4-15

Parts (a) and (b) should have the same confidence level

p. 273, part (i), consider adding Hint: What is true about their  values?

p. 292, CI box

change “fs” to “if”

p. 300, part (t)

need square bracket at end.

In screen capture of Test of Significance Calculator, need mean = 195.88

p. 301 study conclusions, paragraph 3

The 90% confidence interval for μ is (188.9, 202.8) and the 90% prediction interval is (160.51, 231.25).

p. 308, first line

need space in “and the”

p. 309, part (d)

change “from this population” to “from this sample”

p. 313, part (m)

delete this question as every student has the same sample

p. 317, practice problem

Should be 4-27, not 4-26

p. 319, table

could add the hypergeometric distribution, in addition to the binomial distribution, for the case where the CLT does not apply

p. 321, The observed number of responses in 2002 should be 591 instead of 587

p. 323, we assume phat1 is for 1998 and phat2 is for 2002

p. 325, study conclusion

p. 327, The observed number of responses in 2002 should be 591 instead of 587

p. 329, part (l), compare back to the interval in (k) instead (95%)

Study Conclusions: When using 591, the one-sided p-value will be .051 and the 90% confidence interval is essentially 0%-7.2%.

p. 347, page reference

page 5-6 should be page 326

p. 360, practice 5-13

should refer to practice 3-31 (p. 220)  not investigation 3-13

p. 372, table notation

log odds ratio formulas should use  for the standard error

p. 377, in first full paragraph, line 6

“excepted” should be “expected”

p. 378, probability fact

could replace the condition involving at least 20% of the expected counts with the condition that the average expected count be at least 5

p. 391, part (e) graphs

crutches is misspelled in both graphs

p. 393, missing formulas

p. 399, study conclusions

F should be 29.23

p. 405, figures

p. 406, part (e)

The bars are missing on  and  (or delete the word “average” in each definition)

p. 410, part (a)

Another version of the applet is at http://www.rossmanchance.com/applets/GuessCorrelation/GuessCorrelation.html (this is the version that the http://www.rossmanchance.com/iscam/files page refers to).

In this applet, once the Enter button is pressed, the true correlation value appears.

p. 414, scatterplot

The correlation coefficient (r = .711) is not displayed

p. 429

Sentences are repeated from the bottom of p. 428.