Terminology Detour

What you have approximated here is called the p-value of the study.  The p-value measures the probability of observing a result at least as extreme as the one the researchers actually observed, assuming the null hypothesis is true.  We approximated this p-value by repeatedly simulating the random process specified by the null hypothesis (50/50) a large number of times and determining how often we obtained a result at least as extreme as the researchers' (14 or more).  You can obtain better and better approximations of this p-value by using more and more repetitions in your simulation of 16 infants.

A small p-value indicates that the observed data would be unlikely to occur by random chance alone if the null hypothesis were true.  Such a result is said to be statistically significant, meaning it provides convincing evidence against the null hypothesis.  In that case we are no longer comfortable believing that the null hypothesis is true and that we simply got a fluke outcome.  Instead, the more believable conclusion is that the null hypothesis is not true and something other than random chance is at play.
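
As a minimal sketch of the simulation described above, the code below repeatedly models 16 infants each choosing the helper toy with probability 1/2 (the 50/50 null hypothesis) and counts how often 14 or more choose the helper by chance alone.  The function name and the number of repetitions are illustrative choices, not part of the study itself.

```python
import random

def approximate_p_value(n_infants=16, observed=14, repetitions=10_000):
    """Approximate the p-value by simulating the 50/50 null model."""
    at_least_as_extreme = 0
    for _ in range(repetitions):
        # Each infant chooses the helper with probability 1/2 under the null model.
        helpers = sum(random.random() < 0.5 for _ in range(n_infants))
        if helpers >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / repetitions

print(approximate_p_value())  # typically a small value, around 0.002
```

Increasing the number of repetitions makes the approximation more stable, echoing the point above that more repetitions give a better estimate of the p-value.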

The smaller the p-value, the stronger the evidence against the null hypothesis.  There are no hard-and-fast cut-off values for gauging the smallness of a p-value, but generally speaking:

  • A p-value above .10 constitutes little or no evidence against the null hypothesis.
  • A p-value below .10 but above .05 constitutes moderately strong evidence against the null hypothesis.
  • A p-value between .01 and .05 constitutes reasonably strong evidence against the null hypothesis. (Most people consider this convincing.)
  • A p-value below .01 constitutes very strong evidence against the null hypothesis.
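
Purely as an illustration, these rough guidelines can be restated as a small helper function.  The function name and the wording of the returned descriptions are just a direct translation of the list above, not a standard rule.

```python
def evidence_strength(p_value):
    """Translate a p-value into the rough evidence descriptions listed above."""
    if p_value > 0.10:
        return "little or no evidence against the null hypothesis"
    elif p_value > 0.05:
        return "moderately strong evidence against the null hypothesis"
    elif p_value > 0.01:
        return "reasonably strong evidence against the null hypothesis"
    else:
        return "very strong evidence against the null hypothesis"

print(evidence_strength(0.002))  # "very strong evidence against the null hypothesis"
```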