As described in other chapters in this volume, considerable effort has been made to improve the quality of assessment information (e.g., by constructing new tests). However, it is also important that advances be made in the way that assessment information is used to make judgments and decisions. Two general methods for making judgments and decisions will be described and critiqued in this chapter: clinical judgment and mechanical prediction.
Having suffered through statistics classes, students and professionals may be put off by the term mechanical prediction. They may even feel weak and bewildered when confronted with terms such as actuarial prediction, automated assessment, and statistical prediction. Terminology in this area is sometimes confusing, so it will behoove us to take a moment to clarify the meaning of these and other terms.
In the context of personality assessment, clinical judgment refers to the method by which judgments and decisions are made by mental health professionals. Statistical prediction refers to the method by which judgments and decisions are made by using mathematical equations (most often linear regression equations). These mathematical equations are usually empirically based; that is, the parameters and weights for these equations are usually derived from empirical data. However, some statistical prediction rules (e.g., unit weight linear rules) are not derived using empirical data. The terms statistical prediction and actuarial prediction are close in meaning: They can be used interchangeably to describe rules that are derived from empirical data. Statistical and actuarial prediction can be distinguished from automated assessment. Automated assessment computer programs consist of a series of if-then statements. These statements are written by expert clinicians based on their clinical experiences and their knowledge of the research literature and clinical lore. Computer-based test interpretation programs are examples of automated assessment programs. They have been enormously popular, for example, for the interpretation of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989). They will be described in detail in the chapter by Butcher in this volume. Finally, the term mechanical prediction also needs to be defined. As defined by Grove, Zald, Lebow, Snitz, and Nelson (2000), mechanical prediction is "statistical prediction (using explicit equations), actuarial prediction (as with insurance companies' actuarial tables), and what we may call algorithmic prediction (e.g., a computer program emulating expert judges). . . . Mechanical predictions are 100% reproducible" (p. 19).
In other words, mechanical prediction is a global term that subsumes statistical prediction, actuarial prediction, and automated assessment, but not clinical judgment.
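The distinctions drawn above can be made concrete with a minimal sketch. It is an illustration only, not a rule taken from any published instrument: the predictor names, the weights, and the cutoff of 65 are all hypothetical. The sketch shows a unit weight linear rule, a weighted linear rule of the kind that would ordinarily be estimated from empirical data, and an if-then statement of the kind found in automated assessment programs.

```python
from typing import Mapping


def unit_weight_score(z_scores: Mapping[str, float]) -> float:
    """Unit weight linear rule: every standardized predictor gets a weight of 1."""
    return sum(z_scores.values())


def weighted_linear_score(
    z_scores: Mapping[str, float],
    weights: Mapping[str, float],
    intercept: float = 0.0,
) -> float:
    """Linear rule with explicit weights.

    In an empirically based rule the weights would be estimated from data
    (e.g., by linear regression); here they are simply passed in.
    """
    return intercept + sum(weights[name] * z for name, z in z_scores.items())


def automated_statement(scale_score: float, cutoff: float = 65.0) -> str:
    """If-then rule of the kind an expert might write (the cutoff is invented)."""
    if scale_score >= cutoff:
        return "elevated"
    return "within normal limits"


# Hypothetical patient with three invented, standardized predictors.
patient = {"predictor_a": 1.0, "predictor_b": -0.5, "predictor_c": 2.0}
weights = {"predictor_a": 0.5, "predictor_b": 0.2, "predictor_c": 0.1}

print(unit_weight_score(patient))
print(weighted_linear_score(patient, weights))
print(automated_statement(70.0))
```

All three functions are mechanical in the sense used by Grove et al. (2000): given the same input, each returns the same output every time it is applied.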
To clarify how mechanical prediction rules can be used in personality assessment, it will be helpful to describe a model study. In a study conducted at Western Psychiatric Institute and Clinic at the University of Pittsburgh (Gardner, Lidz, Mulvey, & Shaw, 1996), the judgment task was to predict whether patients would become violent in the next 6 months. Clinicians were psychiatrists, psychiatric residents, and nurse-clinicians who had seen the patients in the emergency (admissions) department and who had conferred on the cases together. Clinical and statistical predictions were made for 784 patients. To obtain outcome scores, patients and significant others were interviewed over the following 6 months. Additional information was also used to learn if a patient had become violent: commitment, hospital, and police records were searched for reports of violent incidents. Patients were said to be violent if they had "laid hands on another person with violent intent or threatened someone with a weapon" (Lidz, Mulvey, & Gardner, 1993, p. 1008). One of the strengths of the study is that the data were analyzed using receiver operating characteristic (ROC) analysis. ROC methods form an important part of signal detection theory. When ROC methods are used, measures of validity are unaffected by base rates or by clinicians' biases for or against Type I or Type II errors (McFall & Treat, 1999; Mossman, 1994; Rice & Harris, 1995). For both clinical prediction and statistical prediction, the average area under the ROC curve (AUC) was reported. For this task, the AUC is equal to the probability that a randomly selected violent patient received a higher prediction of violence than a randomly selected nonviolent patient. The greater the AUC, the greater the accuracy of predictions. A value of .5 represents the chance level of prediction, and a value of 1.0 represents perfect discrimination. With regard to the results, the AUC for statistical prediction was .74 and the AUC for clinical prediction was only .62.
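The probabilistic interpretation of the AUC can be illustrated with a short sketch. It is not the analysis performed by Gardner et al. (1996). The function name and the prediction scores below are invented for illustration, and ties between a violent and a nonviolent patient are counted as one half, which is a common convention and is assumed here.

```python
from itertools import product
from typing import Sequence


def pairwise_auc(violent: Sequence[float], nonviolent: Sequence[float]) -> float:
    """AUC as the proportion of (violent, nonviolent) pairs ranked correctly.

    Each argument holds the predicted risk scores for one outcome group.
    A tie within a pair counts as half a correct ranking (an assumed convention).
    """
    if not violent or not nonviolent:
        raise ValueError("both groups must contain at least one score")
    correct = 0.0
    for v, n in product(violent, nonviolent):
        if v > n:
            correct += 1.0
        elif v == n:
            correct += 0.5
    return correct / (len(violent) * len(nonviolent))


# Invented prediction scores for patients who later were or were not violent.
violent_scores = [0.9, 0.7, 0.4]
nonviolent_scores = [0.8, 0.3, 0.2, 0.1]

# 10 of the 12 possible pairs are ranked correctly, so the AUC is 10/12.
print(round(pairwise_auc(violent_scores, nonviolent_scores), 3))  # 0.833
```

Because the calculation is made within pairs formed across the two outcome groups, changing the proportion of violent patients in the sample does not by itself change the expected value, which is one way to see why the AUC is unaffected by base rates.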
Historically, the issue of clinical versus statistical prediction has been the subject of intense debate. The issue first drew a great deal of attention in 1954 when Paul Meehl published his classic book, Clinical versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence. This is a book that for many years was read by nearly all graduate students in clinical and counseling psychology programs. In his book, Meehl noted that in almost every comparison between clinical and statistical prediction, the statistical method was equal or superior to informal clinical judgment. This conclusion has generally been supported in subsequent reviews (e.g., Dawes, Faust, & Meehl, 1989, 1993; Garb, 1994; Goldberg, 1991; Grove et al., 2000; Grove & Meehl, 1996; Kleinmuntz, 1990; Marchese, 1992; Meehl, 1986; Wiggins, 1981). Meehl is one of the most highly regarded psychologists in the history of clinical psychology, and late in his career he bemoaned the fact that psychologists were neglecting the research on statistical prediction. According to Meehl (1986):
There is no controversy in social science that shows such a large body of qualitatively diverse studies coming out so uniformly in the same direction as this one. When you are pushing 90 investigations, predicting everything from the outcome of football games to the diagnosis of liver disease and when you can hardly come up with a half dozen studies showing even a weak tendency in favor of the clinician, it is time to draw a practical conclusion, whatever theoretical differences may still be disputed. (pp. 373-374)
According to Meehl and other advocates of statistical prediction, mental health professionals should be using statistical rules to make diagnoses, descriptions of traits and symptoms, behavioral predictions, and other types of judgments and decisions. Yet, clinicians rarely do this. One is left wondering why.
The following topics will be covered in this chapter: (a) results on clinical versus mechanical prediction, (b) the strengths and limitations of clinical judgment, (c) the strengths and limitations of automated assessment, and (d) the strengths and limitations of statistical prediction. Recommendations will be made for improving the way that judgments and decisions are made in clinical practice.