Guidelines For Using Experimental Assessment Methods

The previous sections of this chapter have shown how experimental methods can help address problems of psychological assessment for which solutions based on more traditional methods are still lacking. It should therefore come as no surprise that experimental assessment methods have become increasingly common in psychological research during the past two decades.

In the area of memory assessment, for example, two such tools have become widely used: the process dissociation procedure for measuring controlled ("explicit") and automatic ("implicit") memory processes (e.g., Jacoby, 1991, 1998), and the source monitoring paradigm (e.g., Johnson, Hashtroudi, & Lindsay, 1993), designed to assess simultaneously item memory (i.e., memory for a piece of information) and source memory (i.e., memory for the source or context of a piece of information). Most of the measurement models developed for these and other cognitive paradigms belong to a very general class called multinomial processing tree models (Batchelder & Riefer, 1999; Riefer, Knapp, Batchelder, Bamber, & Manifold, 2002).
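To make the logic of the process dissociation procedure concrete, the following sketch applies the standard estimation equations associated with Jacoby (1991), under which the probability of responding "old" is C + (1 - C)A in the inclusion condition and (1 - C)A in the exclusion condition. The function name and the illustrative probabilities below are ours, not part of the procedure itself.

```python
# Minimal sketch of the process dissociation equations (Jacoby, 1991).
# Under the standard independence assumptions:
#   P(inclusion) = C + (1 - C) * A    (controlled OR automatic retrieval)
#   P(exclusion) = (1 - C) * A        (automatic retrieval despite control failing)

def process_dissociation(p_inclusion: float, p_exclusion: float):
    """Return point estimates of the controlled (C) and automatic (A)
    memory components from observed inclusion/exclusion probabilities."""
    c_hat = p_inclusion - p_exclusion      # C = P(incl) - P(excl)
    if c_hat >= 1.0:                       # guard against division by zero
        raise ValueError("C estimate must be below 1 to recover A")
    a_hat = p_exclusion / (1.0 - c_hat)    # A = P(excl) / (1 - C)
    return c_hat, a_hat

# Hypothetical data: 75% "old" responses under inclusion, 30% under exclusion
c, a = process_dissociation(0.75, 0.30)
print(f"controlled C = {c:.2f}, automatic A = {a:.2f}")  # C = 0.45, A = 0.55
```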

As another example, from the field of biopsychology, consider the subtraction method routinely used in neuroimaging studies to detect brain regions that are associated with specific cognitive processes. The subtraction method also belongs to the class of experimental assessment techniques because it is based on the within-subjects comparison of brain activities under two experimental conditions that presumably differ only in the cognitive activity that is performed in response to the demands of a task. Still another example is the Implicit Association Test (IAT) recently introduced by Greenwald, McGhee, and Schwartz (1998) to assess "implicit" or unconscious attitudes in social psychological and personality research. Like some of the techniques considered previously, the IAT is based on the within-subjects comparison of response times registered under two experimental conditions: a congruent condition that favors fast responding and an incongruent condition that hinders fast responding to the extent that there is an implicit association between the two concepts of interest. These examples may suffice to illustrate that the list of possible applications of experimental assessment techniques is indeed long and includes all branches of psychology.
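As an illustration of this within-subjects logic, the sketch below summarizes hypothetical IAT data as the standardized difference between mean response times in the incongruent and congruent blocks. This simple summary statistic is ours for illustration only; it resembles, but should not be confused with, the scoring algorithms used in published IAT research.

```python
# Illustrative sketch (not the chapter's own procedure): summarizing an IAT
# as the within-subject contrast between incongruent and congruent blocks.
from statistics import mean, stdev

def iat_effect(congruent_rts: list[float], incongruent_rts: list[float]) -> float:
    """Mean latency difference (incongruent - congruent), standardized by the
    standard deviation of all of the subject's latencies. A positive value
    indicates faster responding in the congruent condition, i.e., an implicit
    association between the paired concepts."""
    pooled = congruent_rts + incongruent_rts
    return (mean(incongruent_rts) - mean(congruent_rts)) / stdev(pooled)

# Hypothetical response times in milliseconds for one subject
congruent = [612, 655, 598, 701, 640]
incongruent = [803, 765, 842, 780, 811]
print(f"IAT effect: {iat_effect(congruent, incongruent):.2f}")
```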

As we have seen, however, experimental assessments, like more traditional assessment methods, are not without problems. First, the validity of an experimental assessment method depends on the validity of the measurement model or law on which it is based. Therefore, these techniques should not be applied in practice unless strong evidence supporting the underlying model or law has been accumulated. This necessary process of model validation may even lead to better measurement models than the one with which the validation process started. In the area of memory assessment, this has happened quite frequently, for example, in the case of measurement models developed for the process dissociation procedure (e.g., Buchner, Erdfelder, Steffens, & Martensen, 1997; Buchner et al., 1995; Erdfelder & Buchner, 1998b; Steffens, Buchner, Martensen, & Erdfelder, 2000; Yonelinas & Jacoby, 1996; Yu & Bellezza, 2000) or source monitoring tasks (Batchelder & Riefer, 1990; Bayen, Murnane, & Erdfelder, 1996; Dodson, Holland, & Shimamura, 1998; Klauer & Wegener, 1998; Meiser & Bröder, 2002). The randomized response and polygraph lie detection techniques have taken similar routes (see earlier discussion, this chapter).

In testing measurement models or laws for purposes of psychological assessment, three aspects should be kept in mind. First, one should avoid saturated models, which can fit any empirical data structure simply because the number of estimated parameters equals the number of data points to which the model is fitted. To obtain testable models, it is much better to ensure that the number of independent data points exceeds the number of parameters estimated from these data. Second, nonsaturated measurement models should be tested at the empirical level implied by the assessment technique. If the assessment technique refers to aggregates, then the empirical tests should refer to the same aggregates. In contrast, if the assessment method refers to individuals, then the empirical model tests should also pertain to individuals. Model validity at one level does not imply validity at the other. Third, model validation requires more than establishing acceptable goodness-of-fit indices. Systematic validation studies have to establish the construct validity of the model parameters by showing that they differ between populations or treatment conditions in a way that is consistent with their psychological interpretation (see Bayen et al., 1996; Buchner et al., 1995; Erdfelder & Buchner, 1998a; Klauer & Wegener, 1998; Meiser & Bröder, 2002).
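The first of these points can be made concrete with a simple goodness-of-fit check. In the sketch below, the observed frequencies, the fitted frequencies, and the parameter count are all hypothetical: a nonsaturated model leaves positive degrees of freedom, so the likelihood-ratio statistic G² can actually be tested against a chi-square distribution, whereas a saturated model would leave df = 0 and fit perfectly by construction.

```python
# Goodness-of-fit sketch for a nonsaturated measurement model: the
# likelihood-ratio statistic G^2 is referred to a chi-square distribution
# with df = (independent data points) - (free parameters). The fitting step
# is omitted; `expected` would come from the model's maximum-likelihood
# parameter estimates.
import math
from scipy.stats import chi2

def g_squared(observed: list[int], expected: list[float]) -> float:
    """Likelihood-ratio statistic G^2 = 2 * sum(obs * ln(obs / exp))."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

observed = [48, 32, 12, 8]                    # hypothetical category frequencies
expected = [45.0, 35.0, 14.0, 6.0]            # frequencies implied by the fitted model
n_free_parameters = 2                         # parameters estimated from the data
df = (len(observed) - 1) - n_free_parameters  # 3 independent proportions - 2 parameters
g2 = g_squared(observed, expected)
print(f"G^2 = {g2:.2f}, df = {df}, p = {chi2.sf(g2, df):.3f}")
```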

If a model has passed all of these validation hurdles, the reliability of parameter estimates is a final issue. The standard way of enhancing reliability by increasing the number of data points is not always applicable because it can be costly, time consuming, or may even interfere with the validity of the assessment method. In such situations it is useful to study how the confidence intervals of the to-be-assessed parameters depend on the values of other model parameters that are of minor importance in the assessment context. In Clark and Desharnais's (1998) cheater detection RRT model, for example, the test administrator may choose any pair of probabilities p1 and p2 underlying the two random devices required for this method. Although the values p1 and p2 may be less relevant psychologically, they do affect the error of the estimate of the target parameter π and thus the overall reliability of the assessment method. By carefully selecting both the context of the test and the values of background parameters such as p1 and p2 in the model of Clark and Desharnais (1998), test administrators can often maximize the reliability of experimental assessment at no additional cost.
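The following sketch illustrates this point numerically. It assumes a forced-response reading of the cheater detection design — two independent samples in which a respondent answers the sensitive question truthfully with probability p1 or p2, respectively, and is otherwise directed to answer "yes" — so that honest carriers (π) always say "yes", honest noncarriers (β) say "yes" only when forced, and cheaters always say "no". This parameterization and all numbers are our reconstruction for illustration, not details taken from Clark and Desharnais (1998).

```python
# Sketch of how the choice of the randomization probabilities p1 and p2
# affects the standard error of the target parameter pi in a two-sample
# cheater detection RRT design. Under the assumed forced-response logic:
#   P("yes" | sample i) = pi + (1 - p_i) * beta,
# so the moment estimators follow from the two samples' "yes" rates.
import math

def se_pi(pi: float, beta: float, p1: float, p2: float, n: int) -> float:
    """Approximate standard error of the moment estimator of pi, treating the
    two samples' "yes" rates as independent binomial proportions."""
    lam1 = pi + (1.0 - p1) * beta  # expected "yes" rate, sample 1
    lam2 = pi + (1.0 - p2) * beta  # expected "yes" rate, sample 2
    # pi_hat = (1 - w) * lam1_hat + w * lam2_hat, with weight w below
    w = (1.0 - p1) / (p2 - p1)
    var = (1 - w) ** 2 * lam1 * (1 - lam1) / n + w ** 2 * lam2 * (1 - lam2) / n
    return math.sqrt(var)

# Holding pi and beta fixed, widely separated p1 and p2 stabilize the estimate.
for p1, p2 in [(0.45, 0.55), (0.30, 0.70), (0.10, 0.90)]:
    print(f"p1={p1:.2f}, p2={p2:.2f}: SE(pi_hat) = {se_pi(0.15, 0.75, p1, p2, 500):.3f}")
```

Holding π and β fixed, the standard error of the estimate of π shrinks as p1 and p2 move apart — exactly the kind of design information a test administrator can exploit before collecting any additional data.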

