ISCAM III Glossary
Terms 
investigation 
Definition 
Inv B 
The
probability of the union of disjoint events (no shared outcomes) is the sum
of the probabilities of the individual events. 

Inv 1.2 
A statement of the parameter
values specified by the research conjecture. 

Inv 1.1 
A
graphical display of categorical data with a bar for each category. The
height of the bar indicates the frequency or the proportion of observations
in that category. Bars are typically the same width, with gaps
between them.

Inv 5.12 
The model assuming linearity
between x and y, equal variance in responses at each x,
and normality of the responses at each x. 

Inv
1.12 
A
sampling method that consistently overrepresents or underrepresents distinct
segments of the population 

Inv 1.2 
A categorical variable with only
two possible outcomes (e.g., heads, tails) 

Inv 1.1
Expl 
A
probability distribution modeling the number of successes in a fixed number
of independent trials with a constant probability of success. 
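
As a minimal sketch of this definition, the probability of exactly k successes in n trials can be computed from the standard binomial formula. The function name `binomial_pmf` and the coin-toss values are illustrative assumptions, not part of the glossary.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when X counts successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative example: exactly 3 heads in 5 tosses of a fair coin.
prob_three_heads = binomial_pmf(3, 5, 0.5)
```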

Inv 2.9 
Selects a
sample of size n from the original sample with replacement. 
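
A minimal sketch of one such resample, assuming the original sample is held in a Python list. The helper name `bootstrap_sample` and the data values are hypothetical.

```python
import random

def bootstrap_sample(sample, rng):
    """Draw len(sample) values from the original sample, with replacement."""
    return rng.choices(sample, k=len(sample))

rng = random.Random(1)                    # seeded only so the sketch is repeatable
original = [2.1, 3.4, 3.9, 5.0, 7.2]      # hypothetical data
resample = bootstrap_sample(original, rng)
```

Because sampling is with replacement, a resample may repeat some original values and omit others.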

Inv 2.2 
A
graphical display of the five-number summary. The box extends from the lower
quartile to the upper quartile with a vertical line at the median. Whiskers
extend to the min and max values or to the most extreme non-outlier values (using
the 1.5 × IQR rule).

Inv 1.2 
A variable that places
observational units into categories (e.g., small, medium, or large), rather
than measuring a numerical value. 

Inv 1.7 
The
sampling distribution of a sample proportion is approximately normal for
large sample sizes, with mean equal to the population proportion/process
probability $\pi$ and standard deviation equal to $\sqrt{\pi(1-\pi)/n}$, where $n$ is the sample size.
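
The standard result for this standard deviation is sqrt(pi(1 - pi)/n), where pi is the population proportion (or process probability) and n is the sample size. The sketch below evaluates it; the function name and the values pi = 0.5, n = 100 are illustrative assumptions.

```python
from math import sqrt

def sd_sample_proportion(pi, n):
    """Theoretical standard deviation of a sample proportion."""
    return sqrt(pi * (1 - pi) / n)

# Illustrative values: pi = 0.5 and n = 100.
sd = sd_sample_proportion(0.5, 100)
```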

Inv 5.1 
A right-skewed probability
distribution that models the behavior of the chi-square statistic under the
null hypothesis.

Inv 5.1 
A
statistic summarizing the discrepancies between the observed counts in a
two-way table and the expected counts under the null hypothesis.
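
One common form of this statistic sums (observed - expected)^2 / expected over all cells, with each expected count equal to (row total) × (column total) / (table total). The sketch below assumes that form; the function name and the counts are hypothetical.

```python
def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count assuming independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

observed_counts = [[10, 20], [30, 40]]   # hypothetical 2x2 table
chi2 = chi_square_statistic(observed_counts)
```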

Inv 5.8 
The percentage of variability in
the response variable that is explained by the regression on the explanatory
variable. 

Inv B 
The
probability of the complement of the event equals one minus the probability
of the event. 

Inv 3.1 
Calculating separate proportions
for each category of the explanatory variable 

Inv 1.5 
A set
of plausible values of the parameter based on the observed sample statistic. 

Inv 1.5 
The long-run proportion of
intervals that capture the parameter value. If the procedure is valid, the
observed coverage rate under repeated random sampling will match the stated
confidence level.

Inv 3.2 
A variable
that changes between the explanatory variable groups and potentially impacts
the response variable. 

Inv 2.5 
A group in a comparative
experimental study that receives no treatment or a placebo treatment. 

Inv
1.12 
A
sample selected from a population using the most readily available
observational units or process; generally not considered representative of
the population or process. 

Inv 5.7 
A numerical measure of the
linear association between two quantitative variables. 
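
A minimal sketch of the usual (Pearson) correlation coefficient, computed from deviations about the means. The function name and data are illustrative assumptions.

```python
from math import sqrt

def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# A perfectly linear, increasing relationship gives r = 1.
r = correlation([1, 2, 3], [2, 4, 6])
```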

Inv 1.1 
The
distribution of a simulated sample of data, generated according to an assumed
null model 

Inv 1.10 
The multiplier of the standard
error in a confidence interval corresponding to the nominal confidence level. 

Inv 2.2 
A function applied to data that rescales the variable,
often changing the shape and spread of the distribution. Transformations can
be useful for normalizing a distribution to allow use of normal-based methods
or for linearizing bivariate data to allow use of regression models.

Inv 2.5, Inv 4.2, Inv 5.1 
A number related to the
number of “independent” observations in the calculation of a statistic. It is
used to index a particular member of a probability distribution family. 

Inv B 
A
random variable that can take on a finite number or a countable number of
possible values. 

Inv A, Inv 1.1 
A graphical display of
quantitative data where each observational unit is represented by a dot above
the horizontal axis. 

Inv 1.5 
The
correspondence between a two-sided test of significance and a confidence
interval.

Inv 1.8 
For any mound-shaped, symmetric
distribution, approximately 68% of observations fall within one standard
deviation of the mean, 95% within two standard deviations, and 99.7% within
three standard deviations.
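
For an exactly normal distribution these percentages can be checked directly. The sketch below uses the standard library's `statistics.NormalDist`; the helper name `proportion_within` is an assumption for illustration.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()   # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_one, within_two, within_three = (proportion_within(k) for k in (1, 2, 3))
```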

Inv 5.2 
The
expected number of observations in a cell of a two-way table, assuming
independence between the row and column variables = (row total) × (column
total) / (table total)

Inv B 
In a probability distribution, a
weighted average of possible outcomes of a random variable, with weights
determined by the probability (or density) of the outcome, representing the
long-run average outcome of the random variable.
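
For a discrete random variable, this weighted average is the sum of each outcome times its probability. A minimal sketch, assuming the distribution is stored as a dictionary mapping outcomes to probabilities (a hypothetical representation):

```python
def expected_value(pmf):
    """Weighted average of outcomes, weighted by their probabilities."""
    return sum(outcome * prob for outcome, prob in pmf.items())

# X = number of heads in 2 tosses of a fair coin
heads_in_two_tosses = {0: 0.25, 1: 0.50, 2: 0.25}
mean_heads = expected_value(heads_in_two_tosses)
```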

Inv 3.3 
A study
that actively imposes the explanatory variable (or “treatments”) on the
observational (“experimental”) units. 

Inv 3.2 
The variable in a study that we
believe may be explaining the variation/behavior of the response variable.
In an experiment, this is the variable manipulated by the researchers. 

Inv 5.8 
Making
predictions at explanatory variable values far outside the range used to
derive the regression equation 

Inv 3.7 
Fixes the marginal totals in a
two-way table and uses the hypergeometric distribution to calculate the
probability of at least as many successes in group A as observed in the
actual research study.
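
A sketch of this one-sided calculation, summing hypergeometric probabilities from the observed count upward. The function and argument names and the small data set are hypothetical.

```python
from math import comb

def fisher_one_sided_p(successes_a, n_a, total_successes, total_n):
    """P(at least successes_a successes in group A), with all margins fixed."""
    def prob(k):
        # Hypergeometric probability of exactly k successes in group A
        return (comb(total_successes, k)
                * comb(total_n - total_successes, n_a - k)
                / comb(total_n, n_a))
    most_possible = min(n_a, total_successes)
    return sum(prob(k) for k in range(successes_a, most_possible + 1))

# Hypothetical study: 10 units, 4 successes overall, group A has 3 units
# and 2 of them are successes.
p_value = fisher_one_sided_p(2, 3, 4, 10)
```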

Inv 2.2 
The
minimum, lower quartile, median, upper quartile, and maximum 

Inv A, Inv 2.1 
A graphical display of
quantitative data that groups the values into bins and then displays bars for
each bin with height equal to the frequency or relative frequency of the
observations in that bin 

Inv
1.15, Inv 3.7 
A
probability distribution that models the number of successes X in a
random sample of n objects selected without replacement from a
population of N objects with M successes and N − M failures.
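
A minimal sketch of the hypergeometric probability, using the glossary's symbols (N objects in the population, M successes, sample size n). The function name and the numbers are illustrative assumptions.

```python
from math import comb

def hypergeometric_pmf(k, N, M, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from N objects, M of which are successes."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

# Illustrative values: N = 10 objects, M = 4 successes, sample of n = 3.
prob_two = hypergeometric_pmf(2, N=10, M=4, n=3)
```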

Inv 1.1 Prob Detour 
Random trials from a random
process where the probability of success or failure on a trial does not
depend on the outcomes of any other trials. 

Inv 5.9 
An
observation whose removal substantially changes the association between two
variables. 

Inv 2.2 
The difference between the upper
quartile (75th percentile) and the lower quartile (25th
percentile); a measure of variability

Inv 5.8 
The
line that minimizes the sum of the squared residuals (aka regression line) 
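
For a single explanatory variable, the minimizing slope and intercept have a closed form: slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). The sketch below assumes that standard result; the function name and data are hypothetical.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = least_squares_line([1, 2, 3, 4], [2, 4, 5, 8])
# Residual for the first observation: observed minus predicted
first_residual = 2 - (intercept + slope * 1)
```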

Inv 1.5 
The cutoff for the p-value that
leads us to reject the null hypothesis. The probability of a Type I
error.

Inv
1.10 
The
half-width of a confidence interval; the value that is added to and subtracted
from the value of the statistic to determine the endpoints of the confidence
interval.

Inv A, Inv 2.2 
A value such that at least 50%
of the observations in the data set are less than or equal to that value and at least
50% of the observations in the data set are greater than or equal to that value.

Inv 2.2 
A
boxplot that extends the whiskers to the most extreme non-outlying values and
displays outliers (according to the 1.5 × IQR rule) separately

Inv B, Inv 1.1 Prob Det. 
Sets of outcomes of a random
process that do not share any outcomes in common. 

Inv
1.15 
An
error in the data collection process that is not related to how the sample
was selected (e.g., poor question wording) 

Inv 1.7 
A probability model for
mound-shaped, symmetric, continuous distributions. Completely characterized
by the mean and standard deviation. Probabilities correspond to areas
under the curve; typically found using technology.

Inv 1.1 
A
distribution of statistics where the statistics have been randomly generated
based on an assumed chance model 

Inv 1.2 
A statement of the parameter
values specified by the null model, typically representing "no
effect" or "no difference"

Inv 1.1 
A
chance model associated with a null hypothesis. Usually the “by chance
alone” model. 

Inv 3.3 
A study in which no variables
are manipulated by the researchers. Instead, data are recorded as they occur
naturally.

Inv 1.2 
The
people or objects about which data are recorded. 

Inv 3.10 
The ratio of the number of
successes to the number of failures; equivalently the ratio of the
probability of success to the probability of failure. 

Inv
3.10 
The
ratio of the odds of success between two groups. 
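
A minimal sketch of the calculation from success and failure counts in two groups. The function names and counts are hypothetical.

```python
def odds(successes, failures):
    """Ratio of the number of successes to the number of failures."""
    return successes / failures

def odds_ratio(successes_a, failures_a, successes_b, failures_b):
    """Odds of success in group A divided by odds of success in group B."""
    return odds(successes_a, failures_a) / odds(successes_b, failures_b)

# Hypothetical counts: group A has 30 successes and 70 failures,
# group B has 10 successes and 90 failures.
ratio = odds_ratio(30, 70, 10, 90)
```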

Inv 1.10 
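For estimating a process
probability or a population proportion: $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$, where $\hat{p}$ is the sample proportion and $n$ is the sample size; valid when there are at least 10 successes and at least
10 failures.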
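
The standard form of this interval is the sample proportion plus or minus z* times sqrt(p-hat(1 - p-hat)/n). The sketch below assumes that form and obtains z* from the standard library's `NormalDist`; the function name and the counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_interval(successes, n, confidence=0.95):
    """Normal-based confidence interval for a proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)           # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 60 successes in 100 observations.
lower, upper = one_sample_z_interval(60, 100)
```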

Inv 2.2 
An
observation that does not follow the general pattern of the other
observations, typically an extreme minimum or maximum value. One way to
"test" for outliers is identifying any observations that fall more
than 1.5 × IQR from the nearest quartile as outliers. 
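
A sketch of the 1.5 × IQR check. Quartile conventions differ between textbooks and software; this version assumes the "inclusive" method of the standard library's `statistics.quantiles`, which may not match the convention used in the investigations. The function name and data are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(data):
    """Values more than 1.5 * IQR below the lower quartile or above the upper quartile."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [value for value in data if value < lower_fence or value > upper_fence]

values = [1, 2, 3, 4, 5, 6, 7, 8, 100]   # hypothetical data
flagged = iqr_outliers(values)
```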

Inv 4.9 
A confidence interval for the
mean difference in response from a paired study design. 

Inv 4.9 
A test
of the mean of the differences in response in a paired study. 

Inv 1.2 
A numerical summary describing
the larger process that generated the data or the population from which
the sample was selected.

Inv 3.5 
The
potential effect on the response variable of the power of suggestion (e.g.,
patients feeling better because they are told they are receiving medicine to
help them feel better).

Inv 1.1 
A believable or reasonable
claim, often about a parameter value. For example, a null model that
is not rejected because the result of the study is not surprising under the
null model. 

Inv
1.11 
Adding
two successes and two failures to the sample before computing a one-sample z-interval
to improve the long-run coverage rate of the procedure.
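
A minimal sketch of this adjustment, assuming the usual normal-based interval is then computed from the adjusted counts. The adjustment of two successes and two failures is normally paired with 95% confidence, which is the default here. The function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_interval(successes, n, confidence=0.95):
    """z-interval computed after adding 2 successes and 2 failures."""
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - margin, p_adj + margin

# Hypothetical small sample: 8 successes in 16 observations.
lower, upper = plus_four_interval(8, 16)
```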

Inv 3.8 
A t-test for comparing
two means assuming the two population standard deviations are equal and using
the pooled estimate of the standard deviation in the standard error
calculation

Inv
1.12 
The
entire collection of observational units we are interested in. 

Inv 1.6 
The probability of rejecting the
null hypothesis at a particular alternative value of the parameter 

Inv
1.17 
The
consideration of whether an “effect” has meaning in a practical sense, given
the context and the magnitude of the effect 

Inv 2.6 
A confidence interval for
individual (future) observations (rather than the population mean) 

Inv B 
Long-run
proportion of times that an event occurs when its random process is repeated
indefinitely

Inv B 
See random process: A
sequence of outcomes generated under identical conditions, usually with
outcomes that cannot be perfectly predicted in advance. 

p-value 
Inv 1.1 
Probability
that a random process alone would produce a statistic as extreme as, or more extreme than,
the observed statistic in the actual study
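
One way to approximate this probability is by simulation: generate many samples under the null model and record how often the simulated statistic is at least as extreme as the observed one. The sketch below assumes a one-sided ("at least as many successes") comparison for a binary process; the function name, seed, and numbers are hypothetical.

```python
import random

def simulated_p_value(observed, n, p_null, reps=10_000, seed=0):
    """Proportion of simulated samples with at least `observed` successes
    when each of n trials succeeds with probability p_null."""
    rng = random.Random(seed)   # seeded only so the sketch is repeatable
    at_least_as_extreme = 0
    for _ in range(reps):
        successes = sum(rng.random() < p_null for _ in range(n))
        if successes >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / reps

# Hypothetical study: 14 successes in 16 trials, null model of 50/50 chance.
p_value = simulated_p_value(observed=14, n=16, p_null=0.5)
```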
Quantitative variable 
Inv 1.2 
A variable that takes on
numerical characteristics (where it makes sense to average the values of the
outcomes) 
Inv 3.4 
Assigning
experimental units to treatments at random, so that each unit is equally likely to
receive each of the treatments; the goal is to create treatment groups that are
balanced on all potential confounding variables.

Inv B 
A sequence of outcomes generated
under identical conditions, usually with outcomes that cannot be perfectly
predicted in advance. 

Inv B 
A
variable that assigns numbers to outcomes from a random process. For
example, X = number of heads in 5 tosses of a fair coin. 

Inv 2.5 
A study in which the
researchers decide, using random assignment, which explanatory variable group
each experimental unit will be in.

Inv 5.8 
See Least Squares Line. 

Inv 1.6 
The values of the statistic that
lead us to reject the null hypothesis for a particular level of significance 

Inv 3.9 
The
ratio of the conditional proportions of successes between two groups. 
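
A minimal sketch of the calculation from counts in two groups. The function name and counts are hypothetical.

```python
def relative_risk(successes_a, n_a, successes_b, n_b):
    """Conditional proportion of successes in group A divided by that in group B."""
    return (successes_a / n_a) / (successes_b / n_b)

# Hypothetical counts: 30 of 100 successes in group A, 10 of 100 in group B.
risk_ratio = relative_risk(30, 100, 10, 100)
```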

Inv 5.8 
The “prediction error” between
the observed result and the predicted result 

Inv 2.2 
A
numerical summary that is not strongly affected by extreme observations
(e.g., the median is a resistant measure of center)

Inv 3.2 
In a study, the variable that we
think of as being explained by the explanatory variable. In an
experiment, this is the outcome variable of interest. 

Inv
1.12 
The observational
units for which we obtain measurements, a subset of the observational units
in the population. 

Inv 1.2 
The number of observational
units in the study (for which data have been recorded). Typically denoted by n. 

Inv
1.12 
An
enumerated list of every member of the population used to select the sample. 

Inv B 
The list of all possible
outcomes of a random process 

Inv
1.12 
The
property that the value of a statistic will vary from sample to sample but
with a predictable pattern. 

Inv 5.6 
A graphical display of the
association between two quantitative variables. 

Inv 3.1 
A graph
for displaying a categorical response variable, with a separate bar for each
category of the explanatory variable. 

Inv 2.7 
A test of significance using the
binomial distribution to count the number of quantitative values above a
certain number (e.g., the number of positive differences in a paired study).

Inv
1.12 
A
sampling method that gives every sample of size n an equal chance of
being the selected sample. 

Inv B 
Artificially recreating the
outcomes of a random process, often using technology. 

Inv A,
Inv B, Inv 1.7 
The
square root of the variance; a measure of spread in the outcomes of a
distribution or random variable; roughly the average deviation from the mean
of the distribution. 

Inv 1.10 
An estimate of the standard
deviation of a statistic based on sample data. 

Inv 1.8 
Calculates
the number of standard deviations an observation lies from the mean of the
distribution. 
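
A minimal sketch of the calculation, z = (observation - mean) / standard deviation. The function name and numbers are illustrative assumptions.

```python
def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd

# Illustrative values: an observation of 130 from a distribution
# with mean 100 and standard deviation 15.
z = z_score(130, 100, 15)
```

A negative result indicates an observation below the mean.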

Inv 1.1 
A numerical summary of a sample
of data. Common examples are the sample proportion (categorical data) or the
sample mean (quantitative data) 

Inv 1.1 
An
observed result that is found to be unlikely to happen by chance alone under
the null model (small p-value).

Inv 2.7 
Selects observations from
a sampling frame at fixed intervals (e.g., every kth observation) 

Inv 1.9 
A
measure of the discrepancy between the observed statistic and the parameter
value(s) specified by the null hypothesis 

Inv A 
A graph of the variable vs. the
time order of the observations 

Inv 3.1 
A
test/interval comparing two sample proportions using the normal approximation
(aka two-proportion z-test)

Inv 1.4 
A significance test for which no
particular direction is specified in the alternative hypothesis, using
"not equal to" in the alternative hypothesis. 

Inv 3.1 
A
summary of counts cross-referenced by two categorical variables. Typically
the explanatory variable is used as the column variable.

Inv 1.6 
Rejecting the null hypothesis
when it is true. 

Inv 1.6 
Failing
to reject the null hypothesis when it is false. 

Unbiased sampling method 
Inv 1.12 
A sampling method for which the
generated statistics average out to the population parameter of interest. 
Inv 1.2 
Any
characteristic that varies from observational unit to observational unit 

Inv B 
A weighted average of the
squared deviations between the outcomes of the random variable and the expected
value.

A
distribution of statistics where the statistics have been generated according
to an assumed null model. 

Inv 1.8 
Calculates the number of
standard deviations that an observation lies from the mean of the
distribution. 