Basis of Inferences and Impressions

The interpretation of assessment data involves four sets of alternatives with respect to how assessors go about drawing inferences and forming impressions about what these data indicate. Interpretations can be based on either empirical or conceptual approaches to decision making; they can be guided either by statistically based decision rules or by clinical judgment; they can emphasize either nomothetic or idiographic characteristics of respondents; and they can include more or less reliance on computer-generated interpretive statements. Effective assessment usually involves informed selection among these alternatives and some tailoring of the emphasis given each of them to fit the particular context of the individual assessment situation.

Empirical and Conceptual Guidelines

The interpretation of assessment information can be approached in several ways. In what may be called an intuitive approach, assessment decisions stem from impressions that have no identifiable basis in the data. Instead, interpretations are justified by statements like "It's just a feeling I have about her," or "I can't say where I get it from, but I just know he's that way." In what may be called an authoritative approach, interpretations are based on the pronouncements of well-known or respected assessment psychologists, as in saying, "These data mean what they mean because that's what Dr. Expert says they mean." The intuition of unusually empathic assessors and reliance on authority by well-read practitioners who choose their experts advisedly may on occasion yield accurate and useful impressions. Both approaches have serious shortcomings, however. Unless intuitive assessors can identify specific features of the data that help them reach their conclusions, their diagnostic sensitivity cannot be taught to other professionals or translated into scientifically verifiable procedures. Unless authoritative assessors can explain in their own words the basis on which experts have reached the conclusions being cited, they are unlikely to impress others as being professionally knowledgeable themselves or as knowing what to think in the absence of being told by someone else what to think.

Moreover, neither intuitive nor authoritative approaches to interpreting assessment information are likely to be as consistently reliable as approaches based on empirical and conceptual guidelines. Empirical guidelines to decision making derive from the replicated results of methodologically sound research. When a specific assessment finding has repeatedly been found to correlate highly with the presence of a particular psychological characteristic, it is empirically sound to infer the presence of that characteristic in a respondent who displays that assessment finding. Conceptual guidelines to decision making consist of psychological constructs that provide a logical bridge between assessment findings and the inferences drawn from them. If subjectively felt distress contributes to a person's remaining in and benefiting from psychotherapy (for which there is considerable evidence; see Garfield, 1994; Greencavage & Norcross, 1990; Mohr, 1995), and if a test includes a valid index of subjectively felt distress (which many tests do), then it is reasonable to expect that a positive finding on this test index will increase the predicted likelihood of a favorable outcome in psychotherapy.

Both empirical and conceptual guidelines to interpretation bring distinct benefits to the assessment process. Empirical perspectives are valuable because they provide a foundation for achieving certainty in decision making. The adequacy of psychological assessment is enhanced by quantitative data concerning the normative distribution and other psychometric properties of measurements that reflect dimensions of psychological functioning. Lack of such data limits the confidence with which assessors can draw conclusions about the implications of their findings. Without being able to compare an individual's test responses with normative expectations, for example, or without a basis for estimating false positive and false negative possibilities in the measures they have used, assessors can only be speculative in attaching interpretive significance to their findings. Similarly, the absence of externally validated cutting scores detracts considerably from the certainty with which assessors can translate test scores into qualitative distinctions, such as whether a person is mildly, moderately, or severely depressed.
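The dependence of interpretive confidence on false positive and false negative rates can be illustrated with a brief calculation. The following is a minimal sketch rather than a validated procedure: the function name and the sensitivity, specificity, and base-rate figures are hypothetical values chosen only for illustration, and the computation is the standard application of Bayes' theorem to a positive test finding.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              base_rate: float) -> float:
    """Probability that a characteristic is present given a positive finding.

    All three inputs are proportions between 0 and 1. The numbers used
    below are hypothetical and do not describe any actual test.
    """
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical index with 90% sensitivity and 90% specificity,
# applied in two settings that differ only in base rate.
low_base_rate = positive_predictive_value(0.90, 0.90, 0.10)
high_base_rate = positive_predictive_value(0.90, 0.90, 0.50)

print(f"Base rate 10%: {low_base_rate:.2f}")
print(f"Base rate 50%: {high_base_rate:.2f}")
```

Under these assumed figures, the same positive finding warrants considerably less confidence in the setting where the characteristic is uncommon, which is why assessors who lack such estimates can only be speculative about what a finding signifies.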

Conceptual perspectives are valuable in the assessment process because they provide some explanation of why certain findings are likely to identify certain kinds of psychological characteristics or predict certain kinds of behavior. Having such explanations in hand offers assessors the pleasure of understanding not only how their measures work but also why they work as they do; they help assessors focus their attention on aspects of their data that are relevant to the referral question to which they are responding; and they facilitate the communication of results in terms that address characteristics of the person being examined and not merely those of the data obtained. As a further benefit of conceptual formulations of assessment findings, they foster hypotheses concerning previously unknown or unexplored linkages between assessment findings and dimensions of psychological functioning and thereby help to extend the frontiers of knowledge.

Empirical guidelines are thus necessary to the scientific foundations of assessment psychology, as a basis for certainty in decision making, but they are not sufficient to bring this assessment to its full potential. Conceptual guidelines do not by themselves provide a reliable basis for drawing conclusions with certainty. However, by enriching the assessment process with explanatory hypotheses, they point the way to advances in knowledge.

For the purposes that each serves, then, both empirical and conceptual guidelines have an important place in the interpretation of assessment information. At times, concerns about preserving the scientific respectability of assessment have led to assertions that only empirical guidelines constitute an acceptable basis for decision making and that unvalidated conceptual guidelines have no place in scientific psychology. McFall and Treat (1999), for example, maintain that "the aim of clinical assessment is to gather data that allow us to reduce uncertainty concerning the probability of events" (p. 215). From their perspective, the information value of assessment data resides in scaled numerical values and conditional probabilities.

As an alternative point of view, let it be observed that the river of scientific discovery can flow through inferential leaps of deductive reasoning that suggest truths long before they are confirmed by replicated research findings. Newton grasped the reason that apples fall from trees well in advance of experiments demonstrating the laws of gravity, Einstein conceived his theory of relativity with full confidence that empirical findings would eventually prove him correct, and neither has suffered any challenges to his credentials as a scientist. Even though empirical guidelines are, on the average, more likely to produce reliable conclusions than are conceptual formulations, as already noted, logical reasoning concerning the implications of clearly formulated concepts can also generate conclusions that serve useful purposes and stand the test of time.

Accordingly, the process of arriving at conclusions in individual case assessment can involve creative as well as confirmatory aspects of scientific thinking, and the utilization of assessment to generate hypotheses and fuel speculation may in the course of scientific endeavor increase rather than decrease uncertainty in the process of identifying new alternative possibilities to pursue. This perspective is echoed by DeBruyn (1992) in the following comment: "Both scientific decision making in general, and diagnostic decision making in particular, have a repetitive side, which consists of formulas and algorithmic procedures, and a constructive side, which consists of generating hypotheses and theories to explain things or to account for unexpected findings" (p. 192).

Statistical Rules and Clinical Judgment

Empirical guidelines for decision making have customarily been operationalized by using statistical rules to arrive at conclusions concerning what assessment data signify. Statistical rules for interpreting assessment data comprise empirically derived formulas, or algorithms, that provide an objective, actuarial basis for deciding what these data indicate. When statistical rules are applied to the results of a psychological evaluation, the formula makes the decision concerning whether certain psychological characteristics are present (as in deciding whether a respondent has a particular trait or disorder) or whether certain kinds of actions are likely to ensue (as in predicting the likelihood of a respondent's behaving violently or performing well in some job). Statistical rules have the advantage of ensuring that examiners applying a formula correctly to the same set of data will always arrive at the same conclusion concerning what these data mean. As a disadvantage, however, the breadth of the conclusions that can be based on statistical rules and their relevance to referral questions are limited by the composition of the database from which they have been derived.

For example, statistical rules may prove helpful in determining whether a student has a learning disability, but say nothing about the nature of this student's disability; they may predict the likelihood of a criminal defendant's behaving violently, but offer no clues to the kinds of situations that are most likely to evoke violence in this particular criminal defendant; or they may help identify the suitability of a person for one type of position in an organization, but be mute with respect to the person's suitability for other types of positions in the same organization. In each of these instances, moreover, a statistical rule derived from a group of people possessing certain demographic characteristics (e.g., age, gender, socioeconomic status, cultural background) and having been evaluated in a particular setting may lack validity generalization to persons with different demographic characteristics evaluated in some other kind of setting. Garb (2000) has similarly noted in this regard that "statistical-prediction rules are of limited value because they have typically been based on limited information that has not been demonstrated to be optimal and they have almost never been shown to be powerful" (p. 31).

In other words, the scope of statistical rules is restricted to findings pertaining to the particular kinds of persons, psychological characteristics, and circumstances that were anticipated in building them. For many of the varied types of people seen in actual assessment practice, and for many of the complex and specifically focused referral questions raised about these people, statistical rules that by themselves provide fully adequate answers may be in short supply.

As a further limitation of statistical rules, they share with all quantified assessment scales some unavoidable artificiality that accompanies translating numerical scores into qualitative descriptive categories. On the Beck Depression Inventory (BDI; Beck, Steer, & Garbin, 1988), for example, a score of 14 to 19 is taken to indicate mild depression and a score of 20 to 28 indicates moderate depression. Hence two people who have almost identical BDI scores, one with a 19 and the other with a 20, will be described much differently by the statistical rule, one as mildly depressed and the other as moderately depressed. Likewise, in measuring intelligence with the Wechsler Adult Intelligence Scale-III (WAIS-III; Kaufman, 1990), a Full Scale IQ score of 109 calls for describing a person's intelligence as average, whereas a person with almost exactly the same level of intelligence and a Full Scale IQ of 110 falls in the high average range. According to the WAIS-III formulas, a person with a Full Scale IQ of 91 and a person with a Full Scale IQ of 119 would also be labeled, respectively, as average and high average. Some assessors minimize this problem by adding some further specificity to the WAIS-III categories, as in labeling a 109 IQ as the high end of the average range and a 110 IQ as the low end of the high average range. Although additional categorical descriptions for more narrowly defined score ranges can reduce the artificiality in the use of statistical rules, there are limits to how many quantitative data points on a scale can be assigned a distinctive qualitative designation.
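The way in which cutting scores produce this artificiality can be made concrete with a short sketch. The code below is illustrative only and is not taken from any test manual or scoring program: the names BDI_BANDS, WAIS_BANDS, and label_score are hypothetical, only the score ranges mentioned above are encoded, and the lower bound of 90 for the average IQ range is assumed from the conventional Wechsler classification rather than stated in the text.

```python
from typing import List, Optional, Tuple

Band = Tuple[int, int, str]

# Only the ranges discussed in the text are encoded; scores falling
# outside them are deliberately left unlabeled in this sketch.
BDI_BANDS: List[Band] = [
    (14, 19, "mild depression"),
    (20, 28, "moderate depression"),
]
WAIS_BANDS: List[Band] = [
    (90, 109, "average"),        # lower bound of 90 is an assumption
    (110, 119, "high average"),
]


def label_score(score: int, bands: List[Band]) -> Optional[str]:
    """Return the qualitative label whose range contains the score, if any."""
    for low, high, label in bands:
        if low <= score <= high:
            return label
    return None


# A one-point difference crosses a category boundary.
print(label_score(19, BDI_BANDS), "|", label_score(20, BDI_BANDS))
print(label_score(109, WAIS_BANDS), "|", label_score(110, WAIS_BANDS))

# A 28-point difference spans only two adjacent categories.
print(label_score(91, WAIS_BANDS), "|", label_score(119, WAIS_BANDS))
```

The rule is perfectly consistent, in that the same score always receives the same label, but the label alone conveys nothing about how close a score lies to a boundary. Adding narrower bands to the lists refines the descriptions without removing the discontinuity at each cutting point.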

Conceptual guidelines for decision making have been operationalized in terms of clinical judgment, which consists of the cumulative wisdom that practitioners acquire from their experience. Clinical guidelines may come to represent the shared beliefs of large numbers of practitioners, but they emerge initially as impressions formed by individual practitioners. In contrast to the objective and quantitative features of statistical rules, clinical judgments constitute a subjective and qualitative basis for arriving at conclusions. When clinical judgment is applied to assessment data, decisions are made by the practitioner, not by a formula. Clinical judgments concerning the interpretive significance of a set of assessment data are consequently less uniform than actuarial decisions and less likely to be based on established fact. On the other hand, the applicability of clinical judgments is infinite, and their breadth and relevance are limited not by any database, but only by the practitioner's capacity to reason logically concerning possible relationships between psychological characteristics identified by the assessment data and psychological characteristics relevant to addressing referral questions, whatever their complexity and specificity.

The relative merit of statistical rules and clinical judgment in the assessment process has been the subject of considerable debate since this distinction was first formulated by Meehl (1954) in his book Clinical Versus Statistical Prediction. Subsequent publications of note concerning this important issue include articles by Grove and Meehl (1996), Grove, Zald, Lebow, Snitz, and Nelson (2000), Holt (1958, 1986), Karon (2000), Meehl (1986), and Swets, Dawes, and Monahan (2000), and a book by Garb (1998) entitled Studying the Clinician. Much of the literature on this topic has consisted of assertions and rebuttals concerning whether statistical methods generally produce more accurate assessment results than clinical methods. In light of the strengths and weaknesses inherent in both statistical prediction and clinical judgment, as elaborated in the chapter by Garb in this volume, such debate serves little purpose and is regrettable when it leads to disparagement of either approach to interpreting assessment data.

As testimony to the utility of both approaches, it is important to note that the creation of good statistical rules for making assessment decisions typically begins with clinically informed selection of both (a) test items, structured interview questions, and other measure components to be used as predictor variables, and (b) psychological conditions, behavioral tendencies, and other criterion variables to which the predictor variables are expected to relate. Empirical methods of scale construction and cross-validation are then employed to shape these clinically relevant assessment variables into valid actuarial measures of these clinically relevant criterion variables. Hence good statistical rules should almost always produce more accurate results than clinical judgment, because they encompass clinical wisdom plus the sharpening of this wisdom by replicated research findings. Clinical methods of assessment at their best depend on the impressions and judgment of individual practitioners, whereas statistical methods at their best constitute established fact that has been built on clinical wisdom. To rely only on clinical judgment in decision-making situations for which adequate actuarial guidelines are available is tantamount to playing cards with half a deck. Even the best judgment of the best practitioner can at times be clouded by inadvertent bias, insufficient awareness of base rates, and other sources of influence discussed in the final section of this chapter and elaborated in the chapter by Reynolds and Ramsay in this volume. When one is given a reasonable choice, then, assessment decisions are more advisedly based on established fact rather than clinical judgment.

On the other hand, the previously noted diversity of people and of the circumstances that lead to their being referred for an evaluation means that assessment questions regularly arise for which there are no available statistical rules, and patterns of assessment data often resemble but do not quite match the parameters for which replicated research has demonstrated certain correlates. When statistical rules cannot fully answer questions being asked, what are assessors to do in the absence of fully validating data? Decisions could be deferred, on the grounds that sufficient factual basis for a decision is lacking, and recommendations could be delayed, pending greater certainty about what recommendation to make. Alternatively, assessors in a situation of uncertainty can supplement whatever empirical guidelines they do have at their disposal with logical reasoning and cumulative clinical wisdom to arrive at conclusions and recommendations that are more responsive and at least a little more likely to be helpful than saying nothing at all.

As these observations indicate, statistical rules and clinical judgment can properly be regarded as complementary components of effective decision making, rather than as competing and mutually exclusive alternatives. Each brings value to assessment psychology and has a respectable place in it. Geisinger and Carlson (2002) comment in this regard that the time has come "to move beyond both purely judgmental, speculative interpretation of test results as well as extrapolations from the general population to specific cases that do not much resemble the remainder of the population" (p. 254).

Assessment practice should accordingly be subjected to and influenced by research studies, lest it lead down blind alleys and detract from the pursuit of knowledge and the delivery of responsible professional service. Concurrently, however, lack of unequivocal documentation should not deter assessment psychologists from employing procedures and reaching conclusions that in their judgment will assist in meeting the needs of those who seek their help. Commenting on balanced use of objective and subjective contributions to assessment decision making, Swets et al. (2000) similarly note that "the appropriate role of the SPR [Statistical Prediction Rule] vis-à-vis the diagnostician will vary from one context to another" and that the most appropriate roles of each "can be determined for each diagnostic setting in accordance with the accumulated evidence about what works best" (p. 5). Putting the matter in even simpler terms, Kleinmuntz (1990) observed that "the reason why we still use our heads, flawed as they may be, instead of formulas is that for many decisions, choices and problems, there are as yet no available formulas" (p. 303).

Nomothetic and Idiographic Emphasis

Empirical guidelines and statistical rules constitute a basically nomothetic approach to interpreting assessment information, whereas conceptual guidelines and clinical judgment underlie a basically idiographic approach. Nomothetic interpretations address ways in which people resemble other kinds of people and share various psychological characteristics with many of them. Hence, these interpretations involve comparisons between the assessment findings for the person being examined and assessment findings typically obtained from groups of people with certain known characteristics, as in concluding that "this person's responses show a pattern often seen in people who feel uncomfortable in social situations and are inclined to withdraw from them." The manner in which nomothetic interpretations are derived and expressed is thus primarily quantitative in nature and may even specify the precise frequency with which an assessment finding occurs in particular groups of people.

Idiographic interpretations, by contrast, address ways in which people differ from most other kinds of people and show psychological characteristics that are fairly unique to them and their particular circumstances. These interpretations typically comprise statements that attribute person-specific meaning to assessment information on the basis of general notions of psychological processes, as in saying that "this person gives many indications of being a passive and dependent individual who is more comfortable being a follower than a leader and will as a consequence probably have difficulty functioning effectively in an executive position." Deriving and expressing idiographic interpretations is thus a largely qualitative procedure in which examiners are guided by informed impressions rather than by quantitative empirical comparisons.

In the area of personality assessment, both nomothetic and idiographic approaches to interpretation have a long and distinguished tradition. Nomothetic perspectives derive from the work of Cattell (1946), for whom the essence of personality resided in traits or dimensions of functioning that all people share to some degree and on which they can be compared with each other. Idiographic perspectives in personality theory were first clearly articulated by Allport (1937), who conceived the essence of personality as residing in the uniqueness and individuality of each person, independently of comparisons to other people. Over the years, assessment psychologists have at times expressed different convictions concerning which of these two traditions should be emphasized in formulating interpretations. Practitioners typically concur with Groth-Marnat (1997) that data-oriented descriptions of people rarely address the unique problems a person may be having and that the essence of psychological assessment is an attempt "to evaluate an individual in a problem situation so that the information derived from the assessment can somehow help with the problem" (p. 32). Writing from a research perspective, however, McFall and Townsend (1998) grant that practitioners must of necessity provide idiographic solutions to people's problems, but maintain that "nomothetic knowledge is a prerequisite to valid idiographic solutions" (p. 325). In their opinion, only nomothetic variables have a proper place in the clinical science of assessment.

To temper these points of view in light of what has already been said about statistical and clinical prediction, there is no reason that clinicians seeking solutions to idiographic problems cannot or should not draw on whatever nomothetic guidelines may help them frame accurate and useful interpretations. Likewise, there is no reason that idiography cannot be managed in a scientific fashion, nor is a nomothetic-idiographic distinction between clinical science and clinical practice likely to prove constructive in the long run. Stricker (1997) argues to the contrary, for example, that science incorporates an attitude and a set of values that can characterize office practitioners as well as laboratory researchers, and that "the same theoretical matrix must generate both science and practice activities" (p. 442).

Issues of definition aside, then, there seems little to be gained by debating whether people can be described better in terms of how they differ from other people or how they resemble them. In practice, an optimally informative and useful description of an individual's psychological characteristics and functioning will encompass the person's resemblance to and differences from other people in similar circumstances about whom similar referral questions have been posed. Nomothetic and idiographic perspectives thus complement each other, and a balanced emphasis on both promotes the fullest possible understanding of a person being examined.

Computer-Generated Interpretive Statements

Most published tests include software programs that not only assist in the collection of assessment data, as already discussed, but also generate interpretive statements describing the test findings and presenting inferences based on them. Like computerized data collection, computer-based test interpretation (CBTI) brings some distinct advantages to the assessment process. By virtue of its automation, CBTI guarantees a thorough scan of the test data and thereby eliminates human error that results from overlooking items of information in a test protocol. CBTI similarly ensures that a pattern of test data will always generate the same interpretive statement, uniformly and reliably, thus eliminating examiner variability and bias as potential sources of error. CBTI can also facilitate the teaching and learning of assessment methods, by using computer-generated narratives as an exercise requiring the learner to identify the test variables likely to have given rise to particular statements. The potential benefits of computerizing test interpretations, as well as some drawbacks of doing so, are elaborated in the chapter by Butcher in this volume (see also Butcher, 2002). Four limitations of CBTI have a particular bearing on the extent to which examiners should rely on computer-generated statements in formulating and expressing their impressions.

First, although test software generates interpretive statements by means of quantitative algorithmic formulas, these computer programs are not entirely empirically based. Instead, they typically combine empirically validated correlates of test scores with clinical judgments about what various patterns of scores are likely to signify, and many algorithms involve beliefs as well as established fact concerning what these patterns mean. Different test programs, and even different programs for the same test, vary in the extent to which their interpretive statements are research based. Although CBTI generally increases the validity and utility of test interpretations, then, considerable research remains to be done to place computerized interpretation on a solid empirical basis (see Garb, 2000). In the meantime, computer-generated interpretations will embody at least some of the strengths and weaknesses of both statistical and clinical methods of decision making.

Second, the previously noted limitation of statistical rules with respect to designating quantitative score ranges with qualitative descriptors carries over into CBTI algorithms. Cutting points must be established, below which one kind or degree of descriptive statement is keyed and above which a different kind or degree of description will be generated. As a consequence, two people who show very similar scores on some index or scale may be described by a computer narrative in very different terms with respect to psychological characteristics measured by this index or scale.

Third, despite often referring specifically to the person who took the test (i.e., using the terms he, she, or this person) and thus giving the appearance of being idiographic, computer-generated interpretations do not describe the individual person who was examined. Instead, these interpretations describe test protocols, in the sense that they indicate what research findings or clinical wisdom say about people in general who show the kinds of test scores and patterns appearing in the protocol being scanned. Hence computer narratives are basically nomothetic, and most of them phrase at least some interpretive statements in terms of normative comparisons or even, as previously noted, specific frequencies with which the respondent's test patterns occur in certain groups of people. However, because no two people are exactly alike and no one person matches any comparison group perfectly, some computer-generated interpretive statements may not describe an individual respondent accurately. For this reason, well-developed test software narratives include a caveat indicating that (a) the interpretive statements to follow describe groups of people, not necessarily the person who took the test; (b) misleading and erroneous statements may occur as a reflection of psychological characteristics or environmental circumstances unique to the person being examined and not widely shared within any normative group; and (c) other sources of information and the assessor's judgment are necessary to determine which of the statements in an interpretive narrative apply to the respondent and which do not.

Fourth, the availability of computer-generated interpretive statements raises questions concerning their proper utilization in the preparation of an assessment report. Ideally, assessors should draw on computer narratives for some assistance, as for example in being sure that they have taken account of all of the relevant data, in checking for discrepancies between their own impressions and the inferences presented by the machine, and perhaps in getting some guidance on how best to organize and what to emphasize in their report. Less ideal is using CBTI not merely for supportive purposes but as a replacement for assessors' being able and willing to generate their own interpretations of the measures they are using.

Most of the assessment psychologists responding to the previously mentioned McMinn et al. (1999) survey reported that they never use CBTI as their primary resource for case formulation and would question the ethicality of doing so.

Even among ethical assessors, however, CBTI can present some temptations, because many computerized narratives present carefully crafted sentences and paragraphs that communicate clearly and lend themselves to being copied verbatim into a psychological report. Professional integrity would suggest that assessors relying on computer-generated conclusions should either express them in their own words or, if they are copying verbatim, should identify the copied material as a quotation and indicate its source. Beyond ethicality and integrity, unfortunately, the previously mentioned software accessibility that allows untrained persons to collect and score test protocols by machine also makes it possible for them to print out narrative interpretations and reproduce them fully or in part as a report, passing them off as their own work without any indication of source. Aside from representing questionable professional ethics, the verbatim inclusion of computer-generated interpretations in assessment reports is likely to be a source of confusion and error, because these printouts are normatively rather than idiographically based and hence often include statements that are not applicable to the person being examined.

Malingering and Defensiveness

Malingering and defensiveness consist of conscious and deliberate attempts by persons being examined to falsify the information they are giving and thereby to mislead the examiner. Malingering involves intent to present oneself as being worse off psychologically than is actually the case and is commonly referred to as faking bad. Defensiveness involves seeking to convey an impression of being better off than one actually is and is commonly called faking good. Both faking bad and faking good can range in degree from slight exaggeration of problems and concerns or of assets and capabilities, to total fabrication of difficulties never experienced or accomplishments never achieved. These two types of efforts to mislead examiners arise from different kinds of motivation, but both of them can usually be detected from patterns of inconsistency that appear in the assessment data unless respondents have been carefully coached to avoid them.

Identifying Motivations to Mislead

People who fake bad during psychological assessments are usually motivated by some specific reason for wanting to appear less capable or more disturbed than they really are. In clinical settings, for example, patients who are concerned about not getting as much help or attention as they would like to receive may exaggerate or fabricate symptoms in order to convince a mental health professional that they should be taken into psychotherapy, that they should be seen more frequently if they are already in outpatient treatment, or that they should be admitted to an inpatient facility (or kept in a residential setting if they are already in one). In forensic settings, plaintiffs seeking damages in personal injury cases may malinger the extent of their neuropsychological or psychosocial impairments in hopes of increasing the amount of the settlement they receive, and defendants in criminal actions may malinger psychological disturbance in hopes of being able to minimize the penalties that will be imposed on them. In employment settings, claimants may malinger inability to function in order to begin or continue receiving disability payments or unemployment insurance.

People who fake good during psychological assessments, in an effort to appear more capable or better adjusted than they really are, also show a variety of motivations related to the setting in which they are being evaluated. Defensive patients in clinical settings may try to conceal the extent of their difficulties when they hope to be discharged from a hospital to which they were involuntarily committed, or when they would like to be told or have others told that they do not have any significant psychological problems for which they need treatment. In forensic settings, making the best possible impression can be a powerful inducement to faking good among divorced parents seeking custody of their children and among prison inmates requesting parole. In personnel settings, applicants for positions, candidates for promotion, and persons asking for reinstatement after having been found impaired have good reasons for putting their best foot forward during a psychological evaluation, even to the extent of overstating their assets and minimizing their limitations.

Detecting Malingering and Defensiveness

Attempts to mislead psychological assessors usually result in patterns of inconsistency that provide reliable clues to malingering and defensiveness. In the case of efforts to fake bad, these inconsistencies are likely to appear in three different forms. First, malingerers often produce inconsistent data within individual assessment measures. Usually referred to as intratest scatter, this form of inconsistency involves failing relatively easy items on intelligence or ability tests while succeeding on much more difficult items of the same kind, or responding within the normal range on some portions of a personality test but in an extremely deviant manner on other portions of the same test.

A second form of inconsistency frequently found in the assessment data of malingerers occurs between test results and the examiner's behavioral observations. In some instances, for example, people who appear calm and relaxed during an interview, talk clearly and sensibly about a variety of matters, and conduct themselves in a socially appropriate fashion then produce test protocols similar to those seen in people who are extremely anxious or emotionally upset, incapable of thinking logically and coherently, out of touch with reality, and unable to participate comfortably in interpersonal relationships. Such discrepancies between test and interview data strongly suggest the deployment of deceptive tactics to create a false impression of disturbance.

The third form of inconsistency that proves helpful in detecting malingering consists of a sharp discrepancy between the interview and test data collected by the examiner and the respondent's actual circumstances and past history as reported by collateral sources or recorded in formal documents. In these instances, the person being evaluated may talk and act strangely during an interview and give test responses strongly suggestive of serious psychological disturbance, but never previously have seen a mental health professional, received counseling or psychotherapy, been prescribed psychotropic medication, or been considered by friends, relatives, teachers, or employers to have any emotional problems. Such contrasts between serious impairments or limitations suggested by the results of an examination and a life history containing little or no evidence of these impairments or limitations provide good reason to suspect malingering.

Defensiveness in an effort to look good is similarly likely to result in inconsistencies in the assessment data that help to detect it. Most common in this regard are guarded test protocols and minimally informative interview responses that fall far short of reflecting a documented history of psychological disorder or problem behavior. Although being guarded and tight-lipped may successfully conceal difficulties, it also alerts examiners that a respondent is not being forthcoming and that the data being obtained probably do not paint a full picture of the person's psychological problems and limitations. As another possibility, fake-good respondents may, instead of being guarded and closed-mouthed, become quite talkative and expansive in an effort to impress the examiner with their admirable qualities and many capabilities, in which case the assessment information becomes noteworthy for claims of knowledge, skills, virtues, and accomplishments that far exceed reasonable likelihood. These and other guidelines for the clinical detection of efforts to mislead assessors by faking either good or bad are elaborated by Berry, Wetter, and Baer (2002), McCann (1998, chapters 3-4), and Rogers (1997a).

Most self-report inventories include validity scales, based on inconsistent and difficult-to-believe responses, that can often help to identify malingering and defensiveness (Greene, 1997; see also the chapter by Naglieri and Graham in this volume). A variety of specific interview, self-report, and ability measures have also been developed along these lines to assist in identifying malingering, including the Structured Interview of Reported Symptoms (SIRS; Rogers, Gillis, Dickens, & Bagby, 1991; see also Rogers, 1997b), the M test for detecting efforts to malinger schizophrenia (Beaber, Marston, Michelli, & Mills, 1985; see also Smith, 1997), and the Test of Memory Malingering (TOMM; Tombaugh, 1997; see also Pankratz & Binder, 1997). Commonly used projective and other expressive measures do not include formal validity scales, but they are nevertheless quite sensitive to inconsistencies in performance that suggest malingering or defensiveness (Schretlen, 1997; see also the chapter by Ben-Porath in the present volume). Moreover, because relatively unstructured expressive measures convey much less meaning to respondents than self-report questionnaires concerning what their responses might signify, there is reason to believe that they may be less susceptible to impression management or even that the fakability of an assessment instrument is directly related to its face validity (Bornstein, Rossner, Hill, & Stepanian, 1994). This does not mean that unstructured measures like the Rorschach Inkblot Method and Thematic Apperception Test are impervious to malingering and defensiveness, which they are not, but only that efforts to mislead may be more obvious and less likely to convey a specific desired impression on these measures than on relatively structured measures.


A companion issue to the ease or difficulty of faking assessment measures is the extent to which respondents can be taught to deceive examiners with a convincingly good-looking or bad-looking performance. Research findings indicate that even psychologically naive participants who are given some information about the nature of people with certain disorders or characteristics can shape their test behaviors to make themselves resemble a target group more closely than they would have without such instruction. Misleading results are even more likely to occur when respondents are coached specifically in how to answer certain kinds of questions and avoid elevating validity scales (Ben-Porath, 1994; Rogers, Gillis, Bagby, & Monteiro, 1991; Storm & Graham, 2000). The group findings in these research studies have not yet indicated whether a generally instructed or specifically coached respondent can totally mislead an experienced examiner in actual practice, without generating any suspicion that the obtained results may not be valid, and this remains a subject for further investigation.

With further respect to individual assessments in actual practice, however, there are reports in the literature of instances in which attorneys have coached their clients in how to answer questions on self-report inventories (e.g., Lees-Haley, 1997; Wetter & Corrigan, 1995; Youngjohn, 1995), and a Web site available on the Internet claims to provide a list of supposed good and bad responses for each of the 10 Rorschach inkblots. As previously mentioned in discussing test security, prior knowledge of test questions and answers can detract from the practical utility of psychological assessment methods that feature right and wrong answers. The confounding effect of pretest information on unstructured measures, for which correct or preferable answers are difficult to specify out of context, may be minimal, but the susceptibility of these measures to successful deception by well-coached respondents is another topic for future research. Less uncertain are the questionable ethics of persons who coach test-takers in dishonesty and thereby thwart the legitimate purposes for which these respondents are being evaluated.

Integrating Data Sources

As noted at the beginning of this chapter, psychological assessment information can be derived from administering tests, conducting interviews, observing behavior, speaking with collateral persons, and reviewing historical documents. Effective integration of data obtained from such multiple sources calls for procedures based on the previously described additive, confirmatory, and complementary functions served by a multimethod test battery. In some instances, for example, a respondent may during an interview report a problem for which there is no valid test index (e.g., having been sexually abused), and may demonstrate on testing a problem that is ordinarily not measured by interview data (e.g., poor perceptual-motor coordination). These two data sources can then be used additively to identify that the person has both the problem reported in the interview and the impairment demonstrated on testing. In another instance, a person who describes himself or herself during an interview as being a bright, well-educated individual with good leadership skills and a strong work ethic, and who then produces reliable documents attesting to these same characteristics, offers assessors an opportunity for confirmatory use of these different data sources to lend certainty to a positive personnel report.

A third and somewhat more complicated set of circumstances may involve a respondent who behaves pleasantly and deferentially toward the assessor, reports being a kindly and even-tempered person, and produces limited and mostly conventional test responses that fall in the normal range. At the same time, however, the respondent is described by friends and relatives as a rageful and abusive person, and police reports show an arrest record for assault and domestic violence. Familiar to forensic psychologists consulting in the criminal justice system, this pattern of discrepant data can usually be explained by using them in a complementary fashion to infer defensiveness and a successful fake-good approach to the interviewing and testing situations. As a further example in educational settings, a student whose poor grades suggest limited intelligence but whose test performance indicates considerable intelligence gives assessors a basis for drawing in a complementary fashion on the divergent data to infer the likelihood of psychologically determined underachievement.

Because of the increased understanding of people that can accrue from integrating multiple sources of information, thorough psychological evaluation utilizes all of the available data during the interpretation phase of the assessment process. This consideration in conducting psychological assessments touches on the question of how much data should be collected in the first place. Theoretically, there can never be too much information in an assessment situation. There may be redundant information that provides more confirmatory evidence than is needed, and there may be irrelevant information that serves no additive function in answering the referral question, but examiners can choose to discard the former and ignore the latter. Moreover, all test, interview, and observational data that may be collected reflect some psychological characteristics of the person showing this behavior and therefore signify something potentially helpful to know about the person being assessed.

On the other hand, there are practical limits to how much assessment information should be collected to guide the formulation of interpretations. Above all, psychological assessors are responsible for conducting evaluations in a cost-effective manner that provides adequate responses to referral questions with the least possible expense of time and money. As noted previously, practitioners who provide and charge for services that they know will make little difference are exploiting the recipients of their services and jeopardizing their own professional respectability. Assessment psychologists may differ in the amount and kind of data they regard as sufficient to conduct a fully adequate evaluation, but they generally recognize their ethical obligations to avoid going beyond what they genuinely believe will be helpful.

With further respect to providing answers to referral questions, two additional guidelines can help assessment psychologists in drawing wisely and constructively on the assessment data at their disposal. First, by taking full account of indications of both psychological strengths and weaknesses in people they examine, assessors can present a balanced description of their assets and liabilities. Psychological assessment has often addressed mainly what is wrong with people while giving insufficient attention to their adaptive capacities, positive potentials, and admirable qualities. In keeping with contemporary trends in psychology toward emphasizing wellness, happiness, optimism, and other positive features of the human condition (see Seligman & Csikszentmihalyi, 2000), assessment psychology serves its purposes best when the interpretive process gives full measure to adaptive capacities as well as functioning limitations.

Second, by recognizing that the inferences and impressions they derive from assessment data are likely to vary in the strength of the evidence supporting them, examiners can couch their interpretive statements in language that conveys their level of confidence in what they have to say. Most respondents provide clear and convincing evidence of at least some psychological characteristics, which examiners can then appropriately report in what may be called the language of certainty. The language of certainty states in direct terms what people are like and how they are likely to conduct themselves, as in saying, "This student has a marked reading disability," or "Mr. A. appears to be an impulsive person with limited self-control," or "Ms. B. is an outgoing and gregarious person who seeks out and enjoys interpersonal relationships." For other characteristics of a person being evaluated, the evidence may be fragmentary or suggestive rather than compelling and conclusive, in which case impressions are properly reported in what may be called the language of conjecture. Conjectural language suggests or speculates about possible features of a person's nature or likely behavior, as in saying, "There is some evidence to suggest that this child may have an auditory processing deficit," or "She occasionally shows tendencies to be inflexible in her approach to solving problems, which might limit the creativity of her decision-making as an executive," or "The data provide some basis for speculating that his lack of effort represents a passive-aggressive way of dealing with underlying anger and resentment he feels toward people who have demanded a lot from him."
