Investigating the Effects of Task Characteristics and Educational Interventions on Intuitive Statistical Biases
Abstract
Past research indicates that people have some capacity to intuitively (i.e., informally) detect,
estimate, and apply information derived from important data characteristics (e.g., mean,
sample size, standard deviation [SD]) in various statistical judgment tasks. This research has
also found that people are generally capable of making intuitive between-group comparisons,
which involve comparing groups of data and making judgments about them. However, in
such comparisons people also tend to exhibit a bias against normatively integrating the
groups' within-group variability (e.g., SD) with the between-group
variability (e.g., mean differences). This can
lead to errors in judgment, because accurate inferences about any differences between the
groups require considering the between-group variability in relation to the within-group
variability. Few studies have investigated targeted educational interventions for
overcoming this intuitive statistical bias. These studies have tended to be based on
incorporating targeted educational material specific to the bias in a semester-long
introductory statistics course. A brief laboratory-based educational intervention would
allow for a more tightly controlled experiment with a larger sample size. For this purpose,
a brief computer-delivered educational intervention was developed to offer subjects
practice and feedback opportunities while engaging in between-group comparison tasks.
These experimental learning condition tasks, and associated assessment tasks, were
presented in the context of making judgments between products based on customers’
product ratings. The ratings for the groups were displayed visually on frequency
distribution graphs. Practice opportunities consisted of allowing subjects to manipulate
relevant statistical characteristics of the between-group comparisons (i.e., the mean
difference, SD, sample size), one at a time. Feedback opportunities informed subjects
how many further manipulations of the statistical characteristic currently in
focus would be necessary to produce a significant difference between the
groups. A series of three experiments using a pretest–posttest design was run to test the
hypotheses that: (a) subjects would exhibit a pretest bias against normatively integrating
within-group variability into their between-group comparisons; (b) subjects receiving
practice, feedback, or both would show greater improvement on their
between-group comparisons at posttest than those in a control condition; and (c)
subjects receiving both practice and feedback would show greater improvement at posttest
than those who received either alone. Subjects were randomly
assigned to receive practice and/or feedback, or to a control condition with neither
practice nor feedback. The dependent measures for the experiments included: (a) novel
between-group comparison tasks designed for these experiments (the Forced-Choice Task,
used in all three experiments, and the Strength-Of-Evidence Task, used in Experiment 1),
which displayed the data for the groups visually on frequency distribution graphs; and (b) a
between-group comparison task that has been shown to elicit the bias in previous
research (the Intuitive ANOVA task, used in Experiments 2 and 3). The findings of these
experiments revealed that subjects exhibited the bias strongly when measured with the
Intuitive ANOVA task, but that the bias was greatly diminished or
unobservable when measured with the Forced-Choice Task or the Strength-Of-Evidence
Task, consistent with recent findings by other researchers. These experimental learning
conditions were also found to have no reliable effect on subjects’ performance on any of
the dependent measures. Implications, limitations, and future directions are discussed.
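The statistical principle underlying the bias can be illustrated with a minimal sketch. The data values and function below are hypothetical (they are not the dissertation's stimuli or analysis): two product-rating comparisons share the same 1.0-point mean difference, but only relating that between-group difference to the within-group spread, here via a pooled-variance two-sample t statistic, reveals how different the strength of evidence is in the two cases.

```python
from statistics import mean, stdev

def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for independent groups."""
    na, nb = len(a), len(b)
    # Pooled within-group variance.
    sp2 = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    # Between-group difference scaled by its standard error.
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# Hypothetical ratings: same mean difference (4.0 vs 3.0), different spread.
low_var_a  = [4.0, 4.1, 3.9, 4.0, 4.1, 3.9]   # small within-group SD
low_var_b  = [3.0, 3.1, 2.9, 3.0, 3.1, 2.9]
high_var_a = [4.0, 6.0, 2.0, 5.0, 3.0, 4.0]   # large within-group SD
high_var_b = [3.0, 5.0, 1.0, 4.0, 2.0, 3.0]

print(t_statistic(low_var_a, low_var_b))    # large t: strong evidence
print(t_statistic(high_var_a, high_var_b))  # small t: weak evidence
```

A judge who attends only to the mean difference treats these two comparisons as equivalent, which is precisely the bias the experiments probed; the normative judgment weighs the difference against the within-group variability.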