As the preceding review demonstrates, many health psychology studies use multiple measures. But only those described under (b) through (d) truly reflect the use of multimethod strategies in the sense of Campbell and Fiske (1959). To date, only a few studies have been conducted in each of these categories. This scarcity is also apparent in how rarely the topic of multimethod strategies has appeared to date in the major health psychology journals, such as Health Psychology, the British Journal of Health Psychology, or Psychology & Health.
Particularly evident in health psychology is the predominance of self-report measures. Chapter 3 discusses the benefits and drawbacks of using self-report measures in psychological studies. The predominant use of these measures causes a variety of possible problems, including (a) shared response bias, (b) lack of construct validity, (c) method specificity, and (d) tainted predictor-criterion relationships (i.e., conceptual overlap between predictors and criterion). First, certain constant sources of error can bias responses on all the different self-report measures used in a study. These can be response styles or response sets such as acquiescence, self-deception, social desirability, defensiveness, or idiosyncrasies in the use of numbers. For example, defensiveness could lead certain individuals to underreport both perceived stress and perceived symptoms, thereby falsely increasing the correlation between the two variables. If no other, non-self-report measures were assessed in the same study, the possibility arises that high correlations simply reflect common method biases (Spector, 1994). This point was raised, for example, by Larsen (1992), who found an association between neuroticism and inflated self-reports of the frequency and severity of gastrointestinal, respiratory, and depressive symptoms both at the time of encoding and at later recall. In other words, individuals high in neuroticism showed inflated scores on self-reports, thereby creating a common method bias in the data (Larsen, 1992). The inclusion of an objective measure such as medical records could prevent erroneous inferences drawn from self-report data.
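The defensiveness example can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not an analysis from any of the cited studies: true stress and true symptoms are generated as independent standard normal scores, and a hypothetical defensiveness trait lowers both self-reports by the same amount (the weight of 0.7 is an arbitrary illustrative choice).

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_shared_bias(n=5000, bias_weight=0.7, seed=1):
    """Return (r_true, r_reported) for simulated stress and symptom scores.

    Assumption: true stress and true symptoms are independent, so any
    correlation between the reports is produced by the shared response style.
    """
    rng = random.Random(seed)
    true_stress, true_symptoms = [], []
    reported_stress, reported_symptoms = [], []
    for _ in range(n):
        stress = rng.gauss(0, 1)
        symptoms = rng.gauss(0, 1)
        defensiveness = rng.gauss(0, 1)  # hypothetical response style
        true_stress.append(stress)
        true_symptoms.append(symptoms)
        # Defensive respondents underreport on both questionnaires.
        reported_stress.append(stress - bias_weight * defensiveness)
        reported_symptoms.append(symptoms - bias_weight * defensiveness)
    r_true = pearson(true_stress, true_symptoms)
    r_reported = pearson(reported_stress, reported_symptoms)
    return r_true, r_reported


if __name__ == "__main__":
    r_true, r_reported = simulate_shared_bias()
    print(f"correlation of true scores:     {r_true:.2f}")
    print(f"correlation of reported scores: {r_reported:.2f}")
```

Under these assumptions the true scores are essentially uncorrelated, whereas the self-reports correlate clearly above zero, although nothing links stress and symptoms except the shared response style. A non-self-report criterion that is unaffected by defensiveness would not show this spurious association.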
Second, scores of self-report measures may in some circumstances not be a valid reflection of the construct that the instrument purportedly measures (e.g., hostility) but may rather reflect an individual's standing on an unrelated construct (e.g., defensiveness). This could be the case, for example, when people respond to items in a certain way (e.g., responding defensively to a measure of hostility) but the use of this response style is not discovered by the researchers. The scores on the measure are then interpreted as measuring hostility, whereas, in fact, they reflect defensiveness. In this case, associations found between the predictor (e.g., the hostility measure) and an outcome (e.g., cardiovascular disease) may in fact reflect an independent association between defensiveness and cardiovascular disease. Indeed, such associations have been found, for example, between defensive responding and hypertensive status (e.g., Mann & James, 1998) and between defensive responding and higher blood pressure (e.g., Shapiro, Goldstein, & Jamner, 1995). Rutledge, Linden, and Davies (2000) demonstrated the problem just outlined in a study predicting cardiovascular health. They found that response styles (e.g., self-deception) in personality questionnaires were themselves predictive of poor cardiovascular health. Response styles were found to be important independent predictors of blood pressure changes across a 3-year interval, leading the authors to conclude that they are important personality traits that play a role in the regulation of blood pressure levels, rather than confounds in the prediction of cardiovascular health.
Third, if solely self-report measures are used to establish validity, important facets of the theoretical construct may be overlooked because the self-report measure might simply not be able to capture this particular aspect of the construct. For example, there might be aspects of quality of life or pain that individuals cannot easily express in verbal terms. The most commonly cited examples to illustrate this problem, as well as the most rigorous attempts to resolve this measurement problem, can be found in the area of stress research and therein particularly the measurement of stressors (Hurrell et al., 1998). Different types of indicators (self-report, proxy report, observational, quantitative measures of the work environment) are increasingly used in combination to address this problem. It is now commonly recognized that perceptions of the work environment are not a proxy for the objective work environment and that both objective and subjective concepts of stress deserve attention on their own and in combination (Spector, 1994).
A final problem of the sole use of self-report measures is the possibility of conceptual (i.e., item) overlap between the predictor variables and the criterion, meaning that the items might essentially be assessing the same construct, which may then be falsely interpreted as a psychologically meaningful correlational or even causal predictor-outcome relationship (Burns, 2000; Hurrell et al., 1998). Kasl (1978) referred to this problem as the "triviality trap" (p. 14). An example, again from stress research, would be if measures assessing stressors (aspects of work and work environment) and measures assessing strain (reactions to stress) have overlapping items (Hurrell et al., 1998). Furthermore, in cross-sectional studies, respondents' answers to self-report measures assessing the predictor variables can affect their responses to subsequent self-report measures assessing the criterion variable and vice versa. For instance, filling out a psychometric scale measuring perceived self-efficacy regarding exercising can affect the exercise frequency or endurance that people report when asked in the context of the same questionnaire. Hence, the relationship between predictors and criterion variable becomes tainted, again leading to the false belief that meaningful, valid relationships between independent constructs were found when, in fact, the associations are not genuine (Hurrell et al., 1998). One measure to safeguard against tainted predictor-criterion relationships is better construct explications. If the predictor constructs and the criterion construct are each clearly defined and clearly delineated from each other and other constructs in the study, the problem of conceptual overlap is less likely to occur. If, however, items are not unique to a certain measure, this results in poor discriminant validity of the assessed constructs and their association with the criterion. 
In sum, more careful construct explication at the design stage (where measures are chosen) is required to secure the detection of valid associations.
A strategy for overcoming the problems that arise from the sole use of self-report measures is triangulation, which simply means that a particular phenomenon is assessed in multiple modalities. In the area of stress research, for example, self-report measures of strain (reactions to stressful work conditions) can be backed up with more objective indicators such as physiological measures or observational data (Hurrell et al., 1998). If the multimodal assessment methods all yield the same result, one can be reasonably confident that the observed associations are valid. If discrepancies emerge, they will require follow-up investigations, and those may lead to further insights into the phenomenon under study. In fact, convergent validity coefficients between measures assessing the same construct in different modalities are as a rule relatively modest in health psychology, often not exceeding r = .20. This reflects not only that the individual measures assess different aspects in different modalities but also the unreliability of the measures themselves.
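How much unreliability alone can lower an observed convergent correlation can be illustrated with the classical attenuation formula, r_observed = r_true × sqrt(rel_x × rel_y). The sketch below assumes classical test theory with uncorrelated measurement errors; the function names and the numbers (a true correlation of .40 and reliabilities of .70 and .60) are hypothetical and not taken from the studies cited above.

```python
import math


def attenuated_r(true_r, rel_x, rel_y):
    """Observed correlation expected when both measures are unreliable.

    Assumes classical test theory: errors are random and uncorrelated.
    """
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_r(observed_r, rel_x, rel_y):
    """Estimate of the true-score correlation from an observed correlation."""
    return observed_r / math.sqrt(rel_x * rel_y)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    observed = attenuated_r(true_r=0.40, rel_x=0.70, rel_y=0.60)
    print(f"expected observed correlation: {observed:.3f}")
    print(f"corrected back to true scores: {disattenuated_r(observed, 0.70, 0.60):.3f}")
```

The correction addresses only random measurement error. It does not remove the part of the discrepancy that arises because different modalities capture different aspects of the construct.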
Strong data analytic techniques may take care of some of the aforementioned problems, namely, method bias and predictor-criterion overlap. For example, multivariate data analysis techniques involving structural equation modeling (SEM; e.g., confirmatory factor analysis, regression models, or path analysis; see Eid, Lischetzke, & Nussbeck, chap. 20, this volume) explicitly recognize measurement as difficult and potentially biased. In SEM, measurement error is explicitly modeled so that unbiased estimates for the relations between theoretical constructs, represented by latent (i.e., unmeasured) factors, can be derived. This is accomplished by requiring researchers to start by specifying and testing a measurement model before proceeding to examine the structural relationships that their theory suggests. Convergent and discriminant validity can be assessed by estimating the goodness of fit of the measurement model (Anderson & Gerbing, 1988). SEM thus allows an estimate of how much the model is affected by the way the constructs are measured.
SEM or other powerful data analysis techniques, however, cannot take care of the basic problem that the specific measures or combinations of measures used may not capture all relevant dimensions of the predictors (e.g., Cohen, Kessler, & Gordon, 1995). In other words, SEM cannot "repair" the damage caused if measures were chosen that are not good indicators of the theoretical constructs or if the measures are unreliable. Thus, for valid theory testing, a well-thought-out choice of measures and an improvement in the (self-report) measures themselves are essential.
In addition to being aware of problems in using self-report measures in research and addressing them with modern data analytical techniques, health psychology could benefit from more meta-analytic studies. Meta-analyses can provide critical information for the design of correlational or experimental studies. Specifically, meta-analysis allows an estimation of the relations among constructs much more reliably than can be done in single studies. The results of meta-analyses can thus reveal which theoretical constructs consistently show reliable relations with other constructs and can thereby help formulate a meaningful nomological network for the prediction of health behavior or health behavior change. Based on the results of meta-analyses, theories of health behavior and health behavior change can be modified and refined, and then exposed to renewed empirical testing.
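Why a pooled estimate is more reliable than any single study can be seen in a simple fixed-effect pooling of correlations via Fisher's z transformation. This is a minimal sketch, assuming that all studies estimate one common correlation and weighting each study by n − 3; the function name and the three (r, n) pairs are hypothetical, and a random-effects model would be more appropriate when studies are heterogeneous.

```python
import math


def pool_correlations(studies):
    """Fixed-effect pooled correlation with an approximate 95% interval.

    studies: list of (r, n) tuples, one per study, each with n > 3.
    Returns (pooled_r, ci_low, ci_high).
    """
    weights = [n - 3 for _, n in studies]           # inverse variance of Fisher z
    z_values = [math.atanh(r) for r, _ in studies]  # Fisher z transformation
    total_weight = sum(weights)
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    se = 1 / math.sqrt(total_weight)
    # Transform the pooled z and its interval back to the correlation metric.
    return (
        math.tanh(z_bar),
        math.tanh(z_bar - 1.96 * se),
        math.tanh(z_bar + 1.96 * se),
    )


if __name__ == "__main__":
    # Hypothetical study results, for illustration only.
    hypothetical_studies = [(0.15, 80), (0.30, 120), (0.22, 200)]
    pooled, low, high = pool_correlations(hypothetical_studies)
    print(f"pooled r = {pooled:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Because the standard error shrinks with the total sample size across studies, the interval around the pooled correlation is narrower than the interval any one of the studies could provide on its own.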