Investigating the Effects of Task Characteristics and Educational Interventions on Intuitive Statistical Biases

Date

2019-11

Publisher

Faculty of Graduate Studies and Research, University of Regina

Abstract

Past research indicates that people have some capacity to intuitively (i.e., informally) detect, estimate, and apply information derived from important data characteristics (e.g., mean, sample size, SD) in various statistical judgment tasks. This research has also found that people are generally capable of making intuitive between-group comparisons, which involve comparing groups of data and making judgments about those data. However, people also tend to exhibit a bias in such comparisons against normatively integrating the within-group variability (e.g., standard deviation; SD) of the groups with the between-group variability (e.g., mean differences). This can lead to errors in judgment, because accurate inferences about any differences between the groups require considering the between-group variability in relation to the within-group variability.

Few studies have investigated targeted educational interventions for overcoming this intuitive statistical bias, and those that have done so have tended to embed educational material specific to the bias in a semester-long introductory statistics course. A brief laboratory-based educational intervention would allow for a more tightly controlled experiment with a larger sample size. For this purpose, a brief computer-delivered educational intervention was developed to offer subjects practice and feedback opportunities while engaging in between-group comparison tasks. These experimental learning-condition tasks, and the associated assessment tasks, were presented in the context of making judgments between products based on customers’ product ratings. The ratings for the groups were displayed visually on frequency distribution graphs. Practice opportunities allowed subjects to manipulate relevant statistical characteristics of the between-group comparisons (i.e., the mean difference, SD, and sample size), one at a time. Feedback opportunities informed subjects how many manipulations of the statistical characteristic currently in focus would be necessary for there to be a significant difference between the groups.

A series of three experiments using a pretest–posttest design was run to test the hypotheses that: (a) subjects would exhibit a pretest bias against normatively integrating within-group variability into their between-group comparisons; (b) subjects receiving practice opportunities, feedback opportunities, or both would show greater improvement on their between-group comparisons at posttest than those in a control condition; and (c) subjects receiving both practice and feedback would show greater improvement at posttest than those receiving either practice or feedback alone. Subjects were randomly assigned to receive practice and/or feedback, or to a control condition with neither practice nor feedback. The dependent measures for the experiments included: (a) novel between-group comparison tasks designed for these experiments (the Forced-Choice Task, used in all three experiments, and the Strength-Of-Evidence Task, used in Experiment 1), which displayed the data for the groups visually on frequency distribution graphs; and (b) a between-group comparison task that has been shown to elicit the bias in previous research (the Intuitive ANOVA Task, used in Experiments 2 and 3). The findings of these experiments revealed that subjects exhibited the bias strongly when measured with the Intuitive ANOVA Task, but that the bias was greatly diminished, or unobservable, when measured with the Forced-Choice Task or the Strength-Of-Evidence Task, consistent with recent findings by other researchers. The experimental learning conditions were found to have no reliable effect on subjects’ performance on any of the dependent measures. Implications, limitations, and future directions are discussed.
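
To make the normative standard described in the abstract concrete, the sketch below shows how the same mean difference between two groups can or cannot be statistically significant depending on the within-group variability and sample size. The thesis does not specify the test underlying the intervention's feedback, so an independent-samples t-test computed from summary statistics is assumed here purely for illustration; the function name and example values are hypothetical.

    # Minimal illustration (not the thesis's actual procedure): how mean difference,
    # within-group SD, and sample size jointly determine statistical significance.
    import numpy as np
    from scipy import stats

    def significant_difference(mean_a, mean_b, sd, n, alpha=0.05):
        """Pooled two-sample t-test from summary values for two equal-sized groups
        that share the same within-group SD."""
        se = sd * np.sqrt(2.0 / n)          # standard error of the mean difference
        t = (mean_a - mean_b) / se          # between-group relative to within-group variability
        df = 2 * n - 2
        p = 2 * stats.t.sf(abs(t), df)      # two-tailed p value
        return p < alpha, t, p

    # Same mean difference (0.4), different within-group variability:
    print(significant_difference(mean_a=4.2, mean_b=3.8, sd=0.5, n=30))  # significant
    print(significant_difference(mean_a=4.2, mean_b=3.8, sd=2.0, n=30))  # not significant

The second call shows why ignoring within-group variability is non-normative: a mean difference that looks identical on the surface no longer supports an inference of a real group difference once the spread within each group is large.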

Description

A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Clinical Psychology, University of Regina. xxii, 287 p.
