Terminology Detour

What you have approximated here is called the p-value of the study. The p-value is the probability of observing a result at least as extreme as the one actually observed in the research study, under the assumption that infants have no tendency in the long run to pick the helper toy. We approximated this p-value by repeating the random choice process a large number of times, assuming a 50/50 probability for each infant, and determining how often the "random chance alone" model produces a result at least as extreme as the actual study result. You can obtain better and better approximations of this p-value by using more and more repetitions in your simulation of 16 random choices.

A small p-value indicates that the observed data would be surprising if they arose by random chance alone with an underlying probability of 50/50. Such a result is said to be statistically significant, meaning that the observed result provides convincing evidence against the random-choice-alone explanation. We are then no longer comfortable believing that we got a fluke outcome by random chance alone. Instead, the more believable conclusion is that something other than random chance is at play.
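If you would like to see the simulation idea written out, here is a minimal Python sketch. It rests on several assumptions that are not part of the study description: the function names are made up for illustration, "at least as extreme" is taken to mean "at least as many infants choosing the helper toy", and the observed count of 14 in the usage lines is only a placeholder that you should replace with the actual study result.

```python
import random
from math import comb


def simulate_p_value(observed, n_infants=16, reps=10_000, seed=None):
    """Approximate the p-value by simulating the 'random chance alone' model.

    observed: number of infants who picked the helper toy (an assumed input).
    Returns the fraction of repetitions with at least `observed` helper choices.
    """
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        # One repetition: each infant picks the helper toy with probability 0.5.
        helpers = sum(rng.random() < 0.5 for _ in range(n_infants))
        if helpers >= observed:
            extreme += 1
    return extreme / reps


def exact_p_value(observed, n_infants=16):
    """Exact one-sided binomial tail probability under 50/50, for comparison."""
    tail = sum(comb(n_infants, k) for k in range(observed, n_infants + 1))
    return tail / 2 ** n_infants


if __name__ == "__main__":
    observed = 14  # hypothetical placeholder, not taken from the text above
    for reps in (100, 1_000, 100_000):
        approx = simulate_p_value(observed, reps=reps, seed=1)
        print(f"{reps:>7} repetitions: approximate p-value = {approx:.4f}")
    print(f"exact binomial tail:  {exact_p_value(observed):.4f}")
```

As the number of repetitions grows, the simulated proportion should settle near the exact binomial tail probability. With the placeholder count of 14, that exact value is 137/65536, or about 0.0021.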

Above all, keep in mind that the smaller the p-value, the stronger the evidence against the random-choice-alone explanation.

 
