When teaching students to use assessment instruments, it is essential also to teach them why any measure they use must have sound psychometric properties. By learning what qualities make an instrument useful and meaningful, students can be more discerning when confronted with new instruments or modifications of traditional measures. "In the absence of additional interpretive data, a raw score on any psychological test is meaningless" (Anastasi & Urbina, 1998, p. 67). This statement underscores the importance of gathering appropriate normative data for all assessment instruments. Without a reference sample against which to compare individual scores, a single raw score tells the examiner little of scientific value. Likewise, information about the reliability of a measure is essential to understanding each individual score it generates. When a measure has been found to be reliable, the examiner can interpret variations in scores more accurately, because differences between scores are more likely to reflect individual differences than measurement error (Nunnally & Bernstein, 1994). Furthermore, reliability is a prerequisite for validity: an instrument that does not measure consistently cannot measure its intended construct accurately.
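To illustrate why a raw score needs a reference sample, the following minimal sketch converts one raw score to a z score and a T score (mean 50, standard deviation 10) under two different sets of norms. The function names and all normative values are hypothetical and chosen only for illustration; they do not come from any published instrument.

```python
def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Express a raw score in standard deviation units relative to a norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Rescale the z score to the conventional T metric (mean 50, SD 10)."""
    return 50 + 10 * z_score(raw, norm_mean, norm_sd)


# Hypothetical raw score and two hypothetical reference samples.
raw = 34
print(z_score(raw, norm_mean=28, norm_sd=4), t_score(raw, norm_mean=28, norm_sd=4))
print(z_score(raw, norm_mean=40, norm_sd=6), t_score(raw, norm_mean=40, norm_sd=6))
```

The same raw score of 34 falls 1.5 standard deviations above the mean of the first hypothetical sample and 1 standard deviation below the mean of the second, which is why the raw score alone supports no interpretation.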
The assessment instruments considered most useful are those that accurately measure the constructs they are intended to measure, demonstrating both sensitivity, the rate at which individuals who have a particular trait or pattern are correctly identified (the true positive rate), and specificity, the rate at which individuals who do not have the personality trait being studied are correctly identified (the true negative rate). In addition, the overall correct classification rate, or hit rate, indicates how accurately test scores classify both individuals who meet the criteria for the specific trait and those who do not. A measure can demonstrate high sensitivity but low specificity, that is, an inability to correctly exclude individuals who do not meet the construct definition. When this occurs, individuals who have the trait are consistently classified correctly, but individuals who do not truly fit the construct definition are also placed in the category. As a result, many false positives are included along with the correctly classified cases, and the precision of the measure suffers. It is therefore important to consider both the sensitivity and the specificity of any measure being used, so that examiners can better understand the possible meanings of their findings. For a more detailed discussion of these issues, see the chapter by Wasserman and Bracken in this volume.
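The relationship among sensitivity, specificity, and hit rate can be sketched with a short calculation. The example below is a minimal illustration, not a procedure taken from the chapter: the class and function names are hypothetical, the counts are invented, and "precision" is assumed here to mean the proportion of positive classifications that are correct (positive predictive value).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationCounts:
    """Counts from comparing test classifications with true trait status."""
    true_pos: int   # have the trait, classified as having it
    false_neg: int  # have the trait, classified as not having it
    true_neg: int   # lack the trait, classified as lacking it
    false_pos: int  # lack the trait, classified as having it


def sensitivity(c: ClassificationCounts) -> float:
    """True positive rate: share of those with the trait who are identified."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ClassificationCounts) -> float:
    """True negative rate: share of those without the trait who are excluded."""
    return c.true_neg / (c.true_neg + c.false_pos)


def hit_rate(c: ClassificationCounts) -> float:
    """Overall correct classification across both groups."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


def positive_predictive_value(c: ClassificationCounts) -> float:
    """Share of positive classifications that are correct (assumed reading of 'precision')."""
    return c.true_pos / (c.true_pos + c.false_pos)


# Invented counts: 50 individuals with the trait, 100 without.
counts = ClassificationCounts(true_pos=45, false_neg=5, true_neg=60, false_pos=40)
print(f"sensitivity = {sensitivity(counts):.2f}")
print(f"specificity = {specificity(counts):.2f}")
print(f"hit rate    = {hit_rate(counts):.2f}")
print(f"PPV         = {positive_predictive_value(counts):.2f}")
```

With these invented counts, the measure identifies 90% of individuals who have the trait but correctly excludes only 60% of those who do not, so nearly half of its positive classifications are false positives even though the hit rate is 70%.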