CHAPTER 1
Predicting Elections from Faces?
Do voters make judgments about
political candidates based on their facial appearance? Can you correctly
predict the outcome of an election, more often than not, simply by choosing the
candidate whose face is judged to be more competent-looking? Researchers
investigated this question in a study published in Science (Todorov,
Mandisodza, Goren, and Hall, 2005).
Participants were shown pictures of
two candidates and asked which had the more competent-looking face. Researchers
then predicted the winner to be the candidate whose face was judged to look
more competent by most of the participants. For the 32 U.S. Senate races in
2004, this method predicted the winner correctly in 23 of them.
(a) In what proportion of the races
did the “competent face” method predict the winner correctly?
(b) Describe (in words) the null model
to be investigated with this study.
(c) Describe how you could (in
principle) use a coin to produce a simulation analysis of whether these data
provide strong evidence that the “competent face” method would correctly
predict the election winner more than half the time. Include enough detail that
someone else could implement the full analysis and draw a reasonable
conclusion.
(d) Use the Coin Tossing applet to
conduct a simulation (using 1000 repetitions), addressing the question of
whether the researchers’ results provide strong evidence in support of the
researchers’ conjecture that the “competent face” method would correctly predict
the election winner more than half the time. Submit a print-out of the applet
output (you can use the “print screen” key), and indicate where the observed
research result falls in that distribution. Also report the approximate p-value
from this simulation analysis.
(e) Write a paragraph, as if to the
researchers, describing what your simulation analysis reveals about whether the
data provide strong evidence in support of their conjecture.
These researchers also predicted the
outcomes of 279 races for the U.S. House of Representatives in 2004. The
“competent face” method correctly predicted the winner in 189 of those races.
(f) Use the applet to conduct a
simulation analysis of these data. Again submit a print-out of the “what if?”
distribution, and indicate where the observed research result falls in that
distribution. Also report the approximate p-value, and summarize your
conclusion, again as if to the researchers.
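The coin-tossing simulation described in parts (c) and (d) can also be sketched in code. The following is a minimal sketch, not the applet itself: the function name, the seed, and the use of Python's `random` module are assumptions made for illustration. Each repetition tosses a fair coin once per race and counts the heads; the approximate p-value is the fraction of repetitions with at least as many heads as the observed number of correct predictions.

```python
import random

def simulate_null_counts(n_races, reps, seed=0):
    """Toss a fair coin n_races times and count heads; repeat reps times."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    return [sum(rng.random() < 0.5 for _ in range(n_races)) for _ in range(reps)]

n_races, observed = 32, 23  # Senate results given in the problem statement
counts = simulate_null_counts(n_races, reps=1000)

# Approximate p-value: share of simulated counts at least as large as observed.
approx_p = sum(c >= observed for c in counts) / len(counts)
print(f"observed proportion = {observed / n_races:.3f}")
print(f"approximate p-value = {approx_p:.3f}")
```

Changing `n_races` and `observed` to the House values (279 and 189) gives the corresponding analysis for part (f).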
Dolphin communication?
A famous study from
the 1960s explored whether two dolphins (Doris and Buzz) could communicate
abstract ideas. The dolphins were trained to push the right button if a
headlight was shone steadily, but the left button if the headlight blinked on
and off. Then the researcher placed a large wooden wall in the middle of
the pool. Doris was on one side of the wall and could see the headlight,
whereas Buzz was on the other side of the wall where he could not see the
headlight. When the light was shone, Doris would swim near the wall and
whistle. Buzz would then whistle back and press a button. If he pushed
the correct button (corresponding to the light Doris was shown), they both got
a fish. Dr. Bastian repeated this procedure again and again, and Buzz
pushed the correct button 22 out of 25 times. Is this convincing evidence that
Buzz and Doris could communicate?
(a) Calculate the statistic for this study.
(b) State the null and alternative hypotheses (in symbols and in words) to
investigate this research question.
(c) Use technology to calculate an exact p-value for this significance
test. Include a copy of your computer results (e.g., screen capture).
(d) Write a one-sentence interpretation of this p-value.
(e) Summarize the conclusions you would draw about the research question based
on this p-value.
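For part (c), one possible way to obtain an exact binomial p-value is sketched below. It assumes the null model is a 50/50 guess on each trial and that the alternative is one-sided (greater than); the helper name `binom_upper_tail` is hypothetical, and any statistical package would serve equally well.

```python
from math import comb

def binom_upper_tail(k, n, pi):
    """P(X >= k) for X ~ Binomial(n, pi), summed term by term."""
    return sum(comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(k, n + 1))

n_trials, n_correct = 25, 22           # Buzz's results from the problem statement
p_hat = n_correct / n_trials           # the observed statistic
p_value = binom_upper_tail(n_correct, n_trials, 0.5)
print(f"p-hat = {p_hat:.2f}, exact p-value = {p_value:.6f}")
```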
Two versions
Which Tire?
A statistics class at Cal Poly
recently found 24 of 54 students in class chose the right front tire. You will
conduct a test of whether the data provide evidence that Cal Poly students tend
to choose the right front tire more often than would be expected if the four
tire choices were equally likely.
a) Identify the observational units
and variable in this study. Also classify the variable as categorical or
quantitative. If the variable is categorical, also indicate whether it is
binary.
b) State the appropriate null and
alternative hypotheses, in symbols and in words.
c) Use software (either R or Minitab)
to produce a bar graph of the student responses. Submit this graph, and comment
on what it reveals.
d) Use software (either R or Minitab)
to determine the (exact binomial) p-value for the test of your hypotheses in
b).
e) Write a sentence describing what
this p-value is the probability of.
f) Write a couple of sentences
summarizing the conclusion that you would draw from this analysis and also explaining
the reasoning process that underlies your conclusion.
g) Suppose that a colleague of mine
conducts this same study in her class, which has exactly half as many students
as our class. Suppose further that her class obtains the same proportion of students
choosing the right front tire. Determine the exact p-value in this case.
Describe how the p-value and your conclusion would be different in her class as
opposed to our class, and comment on why this makes intuitive sense.
Which Tire?
We collected
data in class related to a well-known campus legend. Each of you was asked to
name one of the four tires, as if you had to make up
which tire on your car had recently gone flat.
My prior conjecture is that more of you than would be
expected by chance alone would pick the right front tire.
(a) Identify
the observational units and variable of interest in this study.
(b) Define
the parameter of interest in this study, assuming that your results are
representative of an overall tire picking process by Cal Poly students.
(c) State the
null and alternative hypotheses to reflect my conjecture, in symbols and in
words.
Here are some
class results:
Left front | Left rear | Right front | Right rear
11 | 5 | 19 | 2
(d) What
proportion of students picked the right front tire? Is this in the direction of
my research conjecture?
(e) Would it
be valid (see p. 51) to apply the one-proportion z-test with these data? Explain.
(f)
Regardless of your answer to (e), calculate the one-proportion z-test using technology (see p. 51).
Include your output and report the test statistic and p-value (both with and
without a continuity correction).
Hint: To do the continuity correction in Minitab or the applet,
go back to the normal distribution, using the mean and SD specified by the CLT.
(g) Write a
one-sentence interpretation of the test statistic in this context.
(h) What test decision (reject or fail to reject
the null hypothesis) would you make based on this p-value?
(i)
Regardless of your answer to (e), calculate and interpret a 95% confidence
interval for the parameter of interest (see p. 57).
(j) Explain
what is meant by the phrase “95% confidence” in this context.
(k) A
colleague of mine conducted this study with 2 sections of Stat 301 students
last winter. Of his 54 students, 24
picked the right front tire. Without carrying out any new
inference procedure calculations, explain how the test statistic, p-value,
and confidence interval for his data will compare to ours. Justify your answers.
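The one-proportion z-test in part (f), with and without a continuity correction, can be sketched as follows. This is an illustration under stated assumptions: the null value .25 reflects four equally likely tires, the alternative is taken to be one-sided (greater than), and `one_prop_z` is a hypothetical helper, not a function from Minitab or the applet.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z(successes, n, pi0, continuity=False):
    """Upper-tailed one-proportion z-test; returns (z, p_value)."""
    sd = sqrt(pi0 * (1 - pi0) / n)  # SD of p-hat under the null, per the CLT
    # With the correction, X >= k is treated as X >= k - 0.5 on the count scale.
    count = successes - 0.5 if continuity else successes
    z = (count / n - pi0) / sd
    return z, 1 - NormalDist().cdf(z)

n = 11 + 5 + 19 + 2                     # class results from the table
z_plain, p_plain = one_prop_z(19, n, 0.25)
z_cc, p_cc = one_prop_z(19, n, 0.25, continuity=True)
print(f"no correction:   z = {z_plain:.3f}, p-value = {p_plain:.5f}")
print(f"with correction: z = {z_cc:.3f}, p-value = {p_cc:.5f}")
```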
Plausible values?
In Investigation 1.3, you considered the sample of 361 patients with 71 deaths and whether .15 was a plausible value for the probability of a death during a heart transplantation at St. George’s hospital at this time.
(a) Use the Binomial
Distribution applet to test other values for π,
e.g., .42, .43, …. Create a list of the values (multiples of .01 are
fine) that have a p-value of at least .05. That is, what are the plausible
values of π based
on the observed sample result? Include a screen capture of your applet results
for the smallest value of π you
consider plausible and the largest value of π you
consider plausible.
Hint: You will
want to change the direction of the alternative so that the observed result is
always in the tail of the distribution. In other words, when testing a value larger
than 71/361, use less than (the observed result falls in the lower tail); when
testing a value smaller than 71/361, use greater than.
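The search for plausible values can also be sketched as a loop over candidate values. The sketch below is one possible approach, not the applet's method: the grid from .10 to .30 and the helper names are assumptions. For each candidate it computes the one-sided binomial p-value in whichever tail contains the observed count and keeps the candidates whose p-value is at least .05.

```python
from math import comb

def binom_pmf(x, n, pi):
    """P(X = x) for X ~ Binomial(n, pi)."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

def one_sided_p(k, n, pi0):
    """Tail probability on the side where the observed count k falls."""
    if k / n < pi0:  # observed is below what pi0 predicts: use the lower tail
        return sum(binom_pmf(x, n, pi0) for x in range(0, k + 1))
    return sum(binom_pmf(x, n, pi0) for x in range(k, n + 1))  # upper tail

n, k = 361, 71                                   # sample from Investigation 1.3
candidates = [i / 100 for i in range(10, 31)]    # assumed grid: .10, .11, ..., .30
plausible = [pi0 for pi0 in candidates if one_sided_p(k, n, pi0) >= 0.05]
print("plausible values run from", min(plausible), "to", max(plausible))
```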
Two versions
Competitive Advantage of
Uniform Color?
Does uniform color give athletes an advantage
over their competitors? To investigate this question, Hill and Barton (Nature,
2005) examined the records in the 2004 Olympic Games for four combat sports:
boxing, tae kwon do, Greco-Roman wrestling, and freestyle wrestling.
Competitors in these sports were randomly assigned to wear either a red or a
blue uniform. The researchers investigated whether competitors wearing one
color won significantly more often than those wearing the other color. They
analyzed results for a total of 457 matches.
Of these, red won the match 248 times,
while blue won 209 times.
(a) Identify
the observational units and binary, categorical variable of interest. Indicate which outcome you will consider
“success.”
(b) Explain
how and why randomness was used in this study.
(c) State the
null and alternative hypotheses based on this research question. (Hint: State them before you look at the data.)
(d) Calculate
a binomial p-value for investigating this research conjecture.
(e) Compute a
95% binomial confidence interval for the parameter. Write a sentence
interpreting what this interval says, including how you are defining the
parameter.
(f) Is the
confidence interval result consistent with your p-value result? Explain.
(g) Now determine a 90% confidence
interval for the parameter. Comment on how it differs from the 95% interval. [Hint:
Refer to both the midpoints of the intervals and their widths.]
(h) Summarize
your results as if to an athletic director at a university. Include discussion about how you are willing
to “generalize” these results beyond these 457 matches.
Competitive Advantage of Uniform
Colors (cont.)
For the Competitive
Advantage of Uniform Color data, assuming π = .5:
(a) Is the sample size in this study large enough for the Central Limit Theorem for a sample proportion to apply?
(b) Assuming the CLT applies, specify the shape, mean, and standard deviation predicted by the CLT for the distribution of sample proportions. Include a well-labeled “sketch” (by hand or computer rendered) of this distribution. Hint: By well-labeled, I mean a horizontal axis label and indication of scaling on the horizontal axis showing the mean and one standard deviation on each side of the mean.
(c) Compare the predicted distribution with simulation results from either the Reese’s Pieces applet or this version of the Coin Tossing applet (comment on each of shape, mean, and SD). Be sure to include your output.
(d) Use technology (see p. 44 or just use the Reese’s applet) to estimate the probability of obtaining a sample proportion of at least .543 using the normal approximation. How does this compare to the p-value you found using the binomial distribution?
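The comparison in part (c) between the CLT prediction and simulation results can be sketched in code as well. This is a rough stand-in for the applets, with an arbitrary seed and 1000 repetitions chosen for illustration: it simulates sample proportions for 457 matches under π = .5 and sets their mean and standard deviation beside the values the CLT predicts.

```python
import random
from math import sqrt
from statistics import mean, stdev

n, pi = 457, 0.5
clt_mean = pi                          # CLT: sample proportions center at pi
clt_sd = sqrt(pi * (1 - pi) / n)       # CLT: SD of the sample proportion

rng = random.Random(1)                 # fixed seed (an assumption) for repeatability
props = [sum(rng.random() < pi for _ in range(n)) / n for _ in range(1000)]
sim_mean, sim_sd = mean(props), stdev(props)

print(f"CLT:        mean = {clt_mean:.4f}, SD = {clt_sd:.4f}")
print(f"simulation: mean = {sim_mean:.4f}, SD = {sim_sd:.4f}")
```

A histogram of `props` would be needed to judge shape; the sketch only compares center and spread.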
(second version)
a) State the appropriate null and
alternative hypotheses, both in symbols and in words.
b) Use the Binomial Distribution
applet to simulate 1000 repetitions of these matches, under the null
hypothesis. Submit a screen capture of the applet results.
The researchers found
that the competitor wearing red defeated the competitor wearing blue in 248
matches, and the competitor wearing blue emerged as the winner in 209 matches.
c) Use the applet simulation results
to approximate the two-sided p-value from these data. Also report which values
are being counted to determine this approximate p-value.
d) Use R or Minitab to determine the
exact binomial (two-sided) p-value. Submit the output with your answer.
e) Summarize what your analysis
reveals about how much evidence the data provide for concluding that uniform
color does give one athlete an advantage over the other.
f) Use R or Minitab to determine a 95%
confidence interval for the parameter. Also write a sentence interpreting what
this interval says.
g) Now determine a 99% confidence
interval for the parameter. Comment on how it differs from the 95% interval. [Hint:
Refer to both the midpoints of the intervals and their widths.]
h) Are these
confidence intervals consistent with your earlier test (parts a-e)? Explain
briefly.
Rock-Paper-Scissors
Play the game rock/paper/scissors against the computer using the
website http://www.nytimes.com/interactive/science/rock-paper-scissors.html.
Select the novice version of the computer to play against. Play for at
least 30 rounds, but keep going for as long as you’d like. Keep track of
which option you choose (rock or paper or scissors) for every round that you
play (the computer will record this for you but that information will soon
scroll off the screen, so make your own notes). Try to recreate how you would play
against a person and don’t view your prior results when making your next
selection.
(a) Identify the observational units in this study.
(b) Identify the variable of interest.
An article published in College Mathematics Journal
(Eyler, Shalla, Doumaux, and McDevitt, 2009) found that players tend to not
prefer scissors, choosing it less than 1/3 of the time. We will
investigate whether your data suggest that you
tend to choose scissors less than one-third of the time.
(c) Calculate the statistic in this study and create a bar graph of your
results. [In R and
Minitab, you can do this with just the "summarized" data; you don't
have to enter the individual outcomes.] Are your results in the
direction conjectured by these researchers (choosing scissors less than 1/3 of
the time)?
(d) Define the parameter of interest in this study.
(e) State appropriate null and alternative hypotheses about this parameter
according to the theory suggested in the CMJ
article.
(f) Explain how you could use an ordinary six-sided die to simulate a what-if
distribution under this null hypothesis. Be sure to indicate what each possible
outcome of the die (1, 2, 3, 4, 5, 6) would represent.
(g) Based on your sample results, are you convinced that you choose scissors
less than one-third of the time in the long run? Clearly explain your
reasoning.
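One way the die-based simulation in part (f) might look in code is sketched below. The mapping of faces 1 and 2 to "scissors" (probability 2/6 = 1/3) is one possible choice, and the observed count of 7 scissors in 30 rounds is a hypothetical placeholder, not real data; replace it with your own results.

```python
import random

def roll_die_rounds(n_rounds, rng):
    """Roll a fair die n_rounds times; faces 1-2 stand for 'scissors'."""
    return sum(rng.randint(1, 6) <= 2 for _ in range(n_rounds))

rng = random.Random(2024)        # arbitrary seed so the run is repeatable
n_rounds = 30
hypothetical_scissors = 7        # placeholder value, NOT data from the study

null_counts = [roll_die_rounds(n_rounds, rng) for _ in range(1000)]

# The conjecture is "less than 1/3", so count results this small or smaller.
approx_p = sum(c <= hypothetical_scissors for c in null_counts) / len(null_counts)
print(f"approximate p-value = {approx_p:.3f}")
```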
Standard Deviation
Properties
The standard
deviation of a sample proportion is given by $\mathrm{SD}(\hat{p}) = \sqrt{\pi(1-\pi)/n}$.
(a) Take the
derivative of this function with respect to $\pi$ and find the value of $\pi$
that maximizes $\mathrm{SD}(\hat{p})$ for a fixed value of n. Hint: You may also
want to graph the function to verify you have found a maximum.
(b) Now
consider changing the sample size for a fixed value of $\pi$.
Does the standard deviation decrease more by adding 500 subjects to a
sample size of 500 or to a sample size of 2500?
Explain.
(c) Take the
(first and) second derivative of $\mathrm{SD}(\hat{p})$ as a function of n
to determine whether this function of n
is concave up or concave down.
(d) Explain
what your analyses in (b) and (c) reveal about the “diminishing returns” of
increasing sample size. Hint: You may
want to create and examine a graph of $\mathrm{SD}(\hat{p})$ vs. n for
values of n from, say, 0 to 3000.
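A quick numerical check can accompany the calculus in this problem. The sketch below assumes the usual formula for the standard deviation of a sample proportion, sqrt(π(1 − π)/n), and fixes π = .5 for part (b) purely for illustration; the function name `sd_phat` is hypothetical.

```python
from math import sqrt

def sd_phat(pi, n):
    """Standard deviation of a sample proportion: sqrt(pi * (1 - pi) / n)."""
    return sqrt(pi * (1 - pi) / n)

# Part (a): scan a grid of pi values for the one giving the largest SD (n fixed).
grid = [i / 100 for i in range(1, 100)]
best_pi = max(grid, key=lambda p: sd_phat(p, 500))

# Part (b): how much does adding 500 subjects reduce the SD? (pi = .5 assumed)
pi = 0.5
drop_from_500 = sd_phat(pi, 500) - sd_phat(pi, 1000)
drop_from_2500 = sd_phat(pi, 2500) - sd_phat(pi, 3000)

print(f"SD is largest on the grid at pi = {best_pi}")
print(f"decrease going from 500 to 1000:  {drop_from_500:.5f}")
print(f"decrease going from 2500 to 3000: {drop_from_2500:.5f}")
```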
Halloween Treat Choices
Obesity has become a widespread health
concern, especially in children. Researchers believe that giving children easy
access to food increases their likelihood of eating extra calories. A recent
study (Schwartz, Chen, and Brownell, 2003) examined whether children would be
willing to take a small toy instead of candy at Halloween. They had seven homes
in 5 different towns in Connecticut present children with a plate of 4 toys
(stretch pumpkin men, large glow-in-the-dark insects, Halloween theme stickers,
and Halloween theme pencils) and a plate of 4 different name brand candies
(lollipops, fruit flavored chewy candies, fruit-flavored crunchy wafers, and
“sweet and tart” hard candies) to see whether children were more likely to
choose the candy or the toy. The houses alternated whether the toys were on the
left or on the right. Data were recorded for 284 children between the ages of 3
and 14 (who did not ask for both types of treats).
(a) State appropriate null and
alternative hypotheses involving this parameter (in symbols and in words), for
testing whether there is strong evidence of a preference for either the toys or
the candy.
(b) Explain what a type I error would
represent in this context.
(c) Explain what a type II error would represent in this context.
(d) Suppose 60% of trick-or-treaters prefer the toy. Explain what “power” would represent in this context. (Loosely outline how you would determine it; you do not need to perform the calculations.)
Margin-of-Error
Properties
The margin of
error of a confidence interval for a population proportion using the Wald procedure is
$z^* \sqrt{\hat{p}(1-\hat{p})/n}$.
Some books recommend a shortcut formula that approximates this margin of
error for a 95% CI for $\pi$ quite simply by $1/\sqrt{n}$.
(a) Explain
why this is a reasonable approximation. Hint:
Refer to question 1 as well.
(b) Show that
this approximation is conservative, in that it slightly overestimates the
actual margin of error.
(c)
Reconsider the practice problem on p. 60 (Muffin) and re-answer part (b) –
solving for the necessary sample size - using this approximation.
(d) Suggest
two different ways (that the researcher has direct control over) to reduce the
margin of error in a study.
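The relationship between the Wald margin of error and the shortcut can be explored numerically before proving it. The sketch below assumes the Wald margin is 1.96 × sqrt(p̂(1 − p̂)/n) at the 95% level and the shortcut is 1/sqrt(n); the sample size of 1000 and the function names are arbitrary choices for illustration.

```python
from math import sqrt

def wald_margin(p_hat, n, z=1.96):
    """Wald margin of error: z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

def shortcut_margin(n):
    """Shortcut approximation for a 95% interval: 1 / sqrt(n)."""
    return 1 / sqrt(n)

n = 1000  # assumed sample size, for illustration only
rows = [(p / 10, wald_margin(p / 10, n), shortcut_margin(n)) for p in range(1, 10)]
for p_hat, wald, shortcut in rows:
    print(f"p-hat = {p_hat:.1f}: Wald = {wald:.4f}, shortcut = {shortcut:.4f}")
```

The ratio of the two margins does not depend on n, so the same pattern appears for any sample size.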
Cola Discrimination?
A teacher doubted whether his students
could distinguish between two different brands of cola soft drink (say, Coke
and Pepsi). He presented each of his 48 students with three cups of cola. Two
contained the same brand, and the third contained the other brand. Each student
was asked to identify the cup containing cola that differed from the other two
cups. Let π represent the probability that a student can correctly
identify the “odd” brand. The hypotheses to be tested are H0: π = 1/3 vs. Ha: π > 1/3.
a) Describe (in words) what Type I
error means in this situation.
b) Describe (in words) what Type II
error means in this situation.
c) Describe (in words) what power
means in this situation.
For the remaining questions, you may
use either the Power Simulation applet for an approximate answer or R/Minitab
for an exact answer. (Include screen captures of applet results or R/Minitab
output with your answers.)
d) Determine the rejection region for
this test, using the α = .05 significance level.
e) Calculate the power of this test,
using the α = .05 significance level, when the success probability is
actually π = .5. (Also be sure to write this probability as Pr(X ___ k),
where you indicate the appropriate probability distribution of X, fill
in the blank with the appropriate inequality, and indicate the appropriate
value of k.)
f) How would the power change if the
success probability were larger? Explain why this makes sense intuitively. Then
calculate the power when π = 2/3, and comment on whether this supports
your answer.
g) How would the power change if the
significance level were smaller? Explain why this makes sense intuitively. Then
calculate the power using α = .01 (for an alternative value of π =
.5), and comment on whether this supports your answer.
h) How would the power change if the
sample size were larger? Explain why this makes sense intuitively. Then
calculate the power using n = 96 (with α = .05 for an alternative
value of π = .5), and comment on whether this supports your answer.
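An exact calculation of the rejection region and power can be sketched with the binomial distribution directly. This is one possible approach under the hypotheses stated in the problem (H0: π = 1/3 vs. Ha: π > 1/3); the helper names are hypothetical, and R or Minitab would give the same quantities.

```python
from math import comb

def upper_tail(k, n, pi):
    """P(X >= k) for X ~ Binomial(n, pi)."""
    return sum(comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(k, n + 1))

def rejection_cutoff(n, pi0, alpha):
    """Smallest k such that P(X >= k | pi0) <= alpha; reject H0 when X >= k."""
    for k in range(n + 2):
        if upper_tail(k, n, pi0) <= alpha:
            return k

n, pi0, alpha = 48, 1 / 3, 0.05
k = rejection_cutoff(n, pi0, alpha)

# Power is the probability of landing in the rejection region
# when the success probability is really the alternative value.
power_half = upper_tail(k, n, 0.5)
power_two_thirds = upper_tail(k, n, 2 / 3)
print(f"reject H0 when X >= {k}")
print(f"power at pi = 1/2: {power_half:.4f}")
print(f"power at pi = 2/3: {power_two_thirds:.4f}")
```

Rerunning with `alpha = 0.01` or `n = 96` gives the quantities needed for parts (g) and (h).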
Baseball Big Bang?
A reader wrote in to the “Ask Marilyn”
column in Parade magazine to say that his grandfather told him that in
3/4 of all baseball games, the winning team scores more runs in one inning than
the losing team scores in the entire game. (This phenomenon is known as a “big
bang.”) Marilyn responded that this probability seemed to be too high to be
believable. Let π denote the actual probability that a Major League
Baseball game results in a “big bang.”
a) Restate the grandfather’s assertion
as a null hypothesis, in symbols and in words.
b) Report Marilyn’s response as an
alternative hypothesis, in symbols and in words.
To investigate this claim, I examined
the 45 Major League Baseball games played on September 17 – 19, 2010. I found
that 21 of these 45 games contained a big bang.
c) Calculate the sample proportion of
games that had a big bang, and denote it with the appropriate symbol.
d) If the grandfather’s claim is true,
how many standard deviations below the mean is the observed sample proportion?
Also denote this with the appropriate symbol.
e) Use the normal distribution to
determine the approximate p-value, first without using the continuity
correction and then with using the continuity correction. Also produce (and
submit) an appropriately labeled shaded graph for each of these normal
calculations.
f) Would you conclude that the sample
data provide strong evidence to support Marilyn’s contention that the
proportion cited by the grandfather is too high to be the actual value? Explain
your reasoning, as if writing to the grandfather, who has never taken a
statistics course.
g) Marilyn went on to assert that she
believes the actual probability of a big bang to be .5. Conduct a two-sided
test of this hypothesis. Report the hypotheses, test statistic and p-value.
Again perform the calculations with and without using the continuity
correction. Also calculate the exact p-value from the binomial distribution.
Produce (and submit) appropriately labeled shaded graphs for all of these
calculations. Comment on whether the continuity correction is helpful here.
State the test decision at the α = .10 significance level, and summarize
your conclusion.
Competitive Advantage from Uniform
Color? (cont.)
Recall the study of 457 matches in
four combat sports (boxing, tae kwon do, Greco-Roman wrestling, freestyle
wrestling) at the 2004 Olympic Games. Competitors in these sports were randomly
assigned to wear either a red or a blue uniform. The researchers found that the
competitor wearing red defeated the competitor wearing blue in 248 matches, and
the competitor wearing blue emerged as the winner in 209 matches.
a) Identify the observational units
and variable in this study.
b) Verify the conditions for using the
Wald (z-) procedure to determine a 95% confidence interval for the
probability that the competitor wearing red wins a match.
c) Calculate this 95% confidence
interval.
d) Interpret what this interval
reveals: We are 95% confident that …
e) Interpret what the 95% confidence level
means in this context.
f) Repeat c) for a 99% confidence
interval.
g) Describe how these two intervals
compare, in terms of both their midpoints and widths.
h) Do these intervals suggest that one
uniform color or the other provides a competitive advantage? Explain.
i) Suppose that the sample size had
been four times as large, and the sample proportion had been identical to the
actual study. Determine a 95% confidence interval in this case, and comment on
how it compares to the interval in (c). [Hint: Be as specific as
possible, and be sure to comment on both midpoint and width.]
j) Determine how large a sample size
would be necessary to estimate the actual probability to within 5 percentage
points with 90% confidence.
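The Wald interval and the sample-size calculation in this problem can be sketched as follows. This is an illustration only: the function names are hypothetical, critical values come from Python's `statistics.NormalDist`, and the sample-size function assumes the conservative guess of .5 unless another value is supplied.

```python
from math import sqrt, ceil
from statistics import NormalDist

def wald_interval(successes, n, level):
    """Wald (z) interval: p-hat +/- z* x sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def sample_size(margin, level, guess=0.5):
    """Smallest n whose Wald margin is at most `margin`, given a guess for p-hat."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil(z**2 * guess * (1 - guess) / margin**2)

lo95, hi95 = wald_interval(248, 457, 0.95)          # red wins out of all matches
lo99, hi99 = wald_interval(248, 457, 0.99)
lo4x, hi4x = wald_interval(4 * 248, 4 * 457, 0.95)  # same p-hat, four times the n
n_needed = sample_size(0.05, 0.90)

print(f"95% CI: ({lo95:.4f}, {hi95:.4f})")
print(f"99% CI: ({lo99:.4f}, {hi99:.4f})")
print(f"95% CI with 4x the sample: ({lo4x:.4f}, {hi4x:.4f})")
print(f"sample size for +/- .05 at 90% confidence: {n_needed}")
```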
Emotional Support?
In the mid-1980s sociologist Shere
Hite undertook a study of women’s attitudes toward relationships, love, and sex
by distributing 100,000 questionnaires through women’s groups. Of the 4500
women who returned the questionnaires, 96% said that they give more emotional
support than they receive from their husbands or boyfriends. Around the same
time, an ABC
News/Washington Post poll
surveyed a national random sample of 767 women, finding that 44% claimed to
give more emotional support than they receive.
Consider the population of interest
for both surveys to be all American women.
a) Identify (in words) the parameter
of interest for both polls.
b) With each poll, determine a 90%,
95%, and 99% confidence interval for the parameter.
Calculate one of these confidence intervals
by hand, and feel free to use technology (applet, R, Minitab) for the others.
c) Which poll has the smaller
margin-of-error? Explain why this poll has the smaller margin-of-error.
d) Which poll’s results do you think
are more representative of the truth about the population of all American
women? Explain.
e) Which polling method do you think
is more likely to be biased in a particular direction?
Explain your answer, and also indicate
whether you think that poll’s statistic is an overestimate or underestimate of
the population parameter.
f) Determine the sample size that
would be needed to estimate the population parameter to within ±.025 with 95%
confidence. Use both .5 and the statistic from the ABC News/Washington Post poll
to perform this calculation, and comment on how your answers differ.
g) Based only on your confidence
intervals from the ABC News/Washington Post poll, does .5 appear to be a
plausible value for the proportion of all American women who would answer “yes”
to this question? Explain.
h) Conduct a (normal-based)
significance test of whether .5 is a plausible value for the proportion of all
American women who would answer “yes” to this question, based on the data from
the ABC News/Washington Post poll. Report the hypotheses, test
statistic, and p-value. State your test decision at the α = .10, .05, and
.01 significance levels, and summarize your conclusion.
Penny For Your Thoughts?
In June of 2004, the Harris
organization asked a random sample of 2136 adult Americans:
“Would you favor or oppose abolishing
the penny so that the nickel would be the lowest denomination coin?” It turned
out that 41% responded in favor of abolishing the penny.
a) Is this an example of random
sampling from a finite population or from an ongoing process? Explain briefly.
b) Identify the parameter of interest,
both in words and with a symbol.
c) Verify the conditions for using the
Wald (z-) procedure to determine a confidence interval based on the
sample result. Then calculate this confidence interval by hand, using the 99%
confidence level. Also interpret what this interval means.
d) Suppose that you want to conduct a
study to estimate the population proportion of Cal Poly students who favor
abolishing the penny to within ± .04 with 90% confidence. Determine the
smallest sample size that would achieve this goal. [Hint: Use the sample
proportion from the Harris poll in determining this sample size.]
e) Would your answer to d) depend upon
whether the population of interest was all American adults or all California
adults? Explain.
Now suppose that you want to use a
random sample of 250 Cal Poly students to test whether less than half of all
Cal Poly students favor abolishing the penny, using the α = .05
significance level.
f) State the appropriate null and
alternative hypotheses, in symbols.
g) Based on the normal approximation,
determine the values of the sample proportion that would lead to rejecting the
null hypothesis at the α = .05 significance level. Justify your answer
with appropriate calculations and/or graphs, perhaps including computer output.
h) Again using the normal
approximation, determine the power of this test when the actual proportion of
Cal Poly students who favor abolishing the penny is .45. Justify your answer
with appropriate calculations and/or graphs, perhaps including computer output.
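The normal-approximation calculations in parts (g) and (h) can be sketched in code. The sketch assumes the hypotheses are H0: π = .5 vs. Ha: π < .5, so the rejection region lies in the lower tail; the variable names are illustrative, and no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist

n, pi0, alpha, pi_alt = 250, 0.5, 0.05, 0.45

# Rejection region: sample proportions at or below this cutoff reject H0.
z_crit = NormalDist().inv_cdf(alpha)                 # lower-tail critical value
cutoff = pi0 + z_crit * sqrt(pi0 * (1 - pi0) / n)

# Power: probability of falling at or below the cutoff when pi is really .45.
sd_alt = sqrt(pi_alt * (1 - pi_alt) / n)
power = NormalDist(pi_alt, sd_alt).cdf(cutoff)

print(f"reject H0 when p-hat <= {cutoff:.4f}")
print(f"power when pi = {pi_alt}: {power:.4f}")
```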