Self-report methods offer clear advantages over other assessment techniques. These methods are simple, quick, inexpensive, flexible, and often provide information that would be difficult or impossible to obtain any other way. Yet each advantage corresponds to specific disadvantages that may go unnoticed by researchers. For example, the ubiquity of self-report techniques results from the fact that they are so easy to administer. However, this ease of use may result in an overreliance on self-reports even when more appropriate but more difficult-to-obtain methods are available. Similarly, the simplicity of self-reports may belie the complex processes that underlie self-reported judgments. Researchers may take self-reports at face value and ignore the subtle ways that unwanted method variance sneaks into these reports. Self-reports are also very flexible. Researchers can choose open-ended questions or closed-ended response scales; they can vary the time frame of the question, the specific response options used, and the precise wording of the questions. The drawback of this flexibility is that these seemingly unimportant decisions can have serious consequences for the results of the self-report assessment.
As this volume makes clear, conducting multi-method investigations of validity can increase confidence in any single method of assessment. However, in situations where multimethod assessment cannot be used, the choice of a specific method must be guided by an explicit consideration of the advantages and disadvantages of that approach. There are a number of types of self-reports, and there are different advantages and disadvantages depending on the purpose of the assessment.
One major distinction is between self-reports of objectively verifiable phenomena like behaviors and events and self-reports of psychological constructs (e.g., beliefs, intentions, and attitudes; Schwarz, Groves, & Schuman, 1998). Different processes likely operate when constructing these two types of judgments, and thus, different concerns may arise depending on how the measure is being used. Presumably, when participants are asked to report on behaviors, an objective criterion exists and the validity of the self-report can be assessed by determining the extent to which the self-report matches the criterion. For example, a researcher may be interested in the number of alcoholic beverages a person consumes over the course of a week. Rather than following that individual over time and recording these instances, the researcher may simply ask the person to retrospectively report on this behavior. The validity of this report can be assessed by comparing it to an objective measure.
Self-reports of attitudes, intentions, and other psychological variables are somewhat more complicated. In this case, there is no objective criterion to verify the self-reports, and errors in self-reports are difficult to detect. As Schwarz et al. (1998) have noted in the context of attitude research, "If we want to talk of 'errors' in attitude measurement at all, we can only do so relative to what we were trying to measure in the questionnaire, not relative to any objective standard that reflects respondents' 'true' attitudes" (p. 158). Thus, a flawed self-report is one that is not logical or one shown to be influenced by some feature or stimulus that is theoretically unrelated to the attitude in question. A number of experimental studies have shown that such errors do occur. Participants often respond in illogical ways or they may respond differently depending on irrelevant contextual factors.
Finally, a self-report can be used as a form of behavior, in and of itself (Critchfield et al., 1998). When researchers use self-reports in this way, they are not interested in the extent to which the report is "correct." Instead, they are solely interested in the ways that variations in responses correlate with relevant predictor or outcome variables. In fact, much of the research investigating the cognitive processes underlying self-report methodology uses self-report methodology in this way. Researchers in these studies are not interested in the content of the responses per se, but in the ways that those responses are affected by various experimental factors. For example, in their famous study examining the way mood affects life satisfaction judgments, Schwarz and Clore (1983) found that individuals reported higher life satisfaction on a warm, sunny day than on a cold, rainy one. Schwarz and Clore were not interested in life satisfaction (i.e., they were not interested in getting a true measure of an individual's standing on this construct). Instead, they were interested in the cognitive processes that individuals used to construct satisfaction judgments, and the satisfaction reports themselves were a form of behavior that indicated the underlying process.
The distinctions among the various types of self-report methodology matter because the factors that influence the validity of self-reports and the ways in which we validate self-report measures often vary depending on how the measure is being used. For example, a personality researcher interested in assessing extraversion may ask participants to respond to an item like, "I enjoy going to parties." The researcher may have one of three expectations about responses to this item. First, he or she may expect responses on this item to be similar to self-reports of behavior. If so, responses to the item should strongly correlate with the frequency with which a person goes to parties. If the response does not correlate with the behavior, this suggests that the item is not valid.
Alternatively, the item could be thought of as a self-report of an attitude toward parties. In this case, responses to the item are not necessarily expected to correlate strongly with the number of times that a person goes to parties, but should predict the enjoyment a person experiences when he or she does go to parties. Validation of the global self-report could be accomplished by comparing responses on this item to online assessment of enjoyment actually experienced during a party.
Finally, responses to the item "I enjoy going to parties" may be seen as a form of behavior that can predict some other criterion, even if the item is not a valid measure of the behavior or attitude it appears to tap. For example, a respondent may consider himself or herself to be an extravert and recognize that the item "I enjoy going to parties" is an extraversion item. This respondent may then respond positively to the item, even if he or she does not particularly enjoy parties. Alternatively, the respondent may try to answer the question accurately but because of flawed memory or judgment processes he or she may make a mistake. In either case, if responses to the item predict relevant outcomes like the number of sensation seeking behaviors in which people engage or the number of friends that individuals have, then the item holds some degree of validity. This is the principle behind empirical criterion keying, in which items are selected based on the extent to which they can predict some meaningful criterion (Anastasi, 1988; Meehl, 1945). Thus, even if we can show flaws in the processes that lead to self-reported judgments, these flaws do not necessarily invalidate the self-report measure. To assert that a measure lacks validity, researchers must also show that the measure fails to predict relevant criteria. Often, studies that purport to demonstrate the invalidity of self-report measures do so by showing that participants use irrelevant sources of information when constructing judgments. However, the measures themselves may still be valid, even if judgments are constructed in a nonintuitive or flawed manner.
Much of the research on the fallibility of self-reports comes from survey research, and the goals of survey research often differ from the goals of other forms of psychological measurement. In survey research, researchers often focus on mean levels or frequencies within a specific population. For instance, researchers may wish to assess the likelihood of a certain population voting for a particular political candidate. If some feature of the questionnaire leads to an overestimation of support for a candidate, then the self-reported survey response is invalid. But in much psychological research, the absolute level of a characteristic is not meaningful, and researchers use scores on a self-report inventory as correlates or predictors of other outcomes. As Schwarz et al. (1998) have noted, many of the response effects identified in the survey literature have a larger effect on mean levels and other characteristics of item distributions than on correlational results. Thus, when possible, we will distinguish between these two types of effects.