Classifiers

Once a comprehensive set of significant features has been obtained, a classifier can be designed to classify breast lesions as malignant or benign. Common classifiers include linear discriminant analysis (LDA) and the artificial neural network (ANN). The issues involved in designing classifiers to classify breast lesions are essentially the same as in other classification tasks. Here we address the issues related to the mammogram database and the presentation of the computer classification results. The principles and use of classifiers can be found in other chapters as well as in several good texts [71-73].

To design a classifier effectively, it is necessary to obtain a random sample of image data that represents reasonably well the population to be classified. However, it is often difficult to obtain a large number of mammograms with confirmed diagnoses (i.e., presence or absence of cancer) that can be accessed easily for classifier design. In practice, classifiers are often designed with 100 to 200 mammograms. While it is difficult to tell whether 200 mammograms can adequately represent the patient population, data resampling techniques such as the jackknife and leave-one-out methods are often employed to use the available mammograms efficiently [71,74]. The jackknife method divides a set of mammograms into a training set for designing the classifier and a test set for assessing its accuracy. Since a set of mammograms can be partitioned in more than one way, the classifier performance is typically assessed by averaging results from several training and test partitions. The leave-one-out method is a special case of the jackknife in which the multiple partitioning and result averaging are taken to the extreme: the set of mammograms is divided so that only the mammograms from one patient, possibly more than one view, are used for assessing classifier accuracy, with the rest of the mammograms being used to train the classifier. The classifier accuracy is then obtained by summarizing the results over all possible ways of partitioning the mammogram sample. As in other classification tasks, the sample-size issue manifests itself not only in training and testing, but also in feature selection and in optimizing various parameters of a classifier, e.g., the number of hidden units in an ANN. These effects have been illustrated [71,74] and have been studied more recently in the context of computer-aided diagnosis [75,76].
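
As a concrete illustration, the following minimal sketch performs leave-one-out resampling at the patient level using scikit-learn, so that all views of a patient are held out together as described above. The feature matrix, labels, patient identifiers, and file names are hypothetical placeholders, and an LDA classifier stands in for whatever classifier is being designed.

```python
# Minimal sketch of leave-one-out resampling at the patient level,
# assuming feature vectors have already been extracted from the mammograms.
# File and variable names (lesion_features, lesion_labels, patient_ids) are illustrative.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

features = np.load("lesion_features.npy")     # shape: (n_views, n_features)
labels = np.load("lesion_labels.npy")         # 1 = malignant, 0 = benign
patient_ids = np.load("patient_ids.npy")      # one patient id per view

scores = np.empty(labels.shape, dtype=float)
logo = LeaveOneGroupOut()                     # leaves out all views of one patient at a time
for train_idx, test_idx in logo.split(features, labels, groups=patient_ids):
    clf = LinearDiscriminantAnalysis()
    clf.fit(features[train_idx], labels[train_idx])
    # Score the held-out patient's views with a classifier trained on the rest.
    scores[test_idx] = clf.predict_proba(features[test_idx])[:, 1]

# 'scores' now holds a test score for every view, obtained without ever
# training on the patient being scored; these scores can be used for ROC analysis.
```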

For classification results to be useful to radiologists, the results must be presented in a clear and convenient manner. Presenting classification results in terms of the likelihood of malignancy is one way to communicate them to radiologists, since many radiologists advocate estimating the likelihood of malignancy in their own interpretations of mammograms as a means of improving consistency in deciding between biopsy and follow-up [29]. However, unless the native output of the classifier represents a probability, which is typically not the case, the classification results must be converted into an understandable format before they can be reviewed by radiologists.

Jiang et al. transformed ANN output to likelihood of malignancy via the maximum-likelihood (ML) estimated univariate binormal receiver operating characteristic (ROC) model [59,77] illustrated in Fig. 7. Let M(x) be the probability density function of a latent decision variable x for actually malignant cases; let B(x) be the analogous probability density function for actually benign cases; and let $\eta$ be the prevalence of malignant cases in the population studied. The likelihood of malignancy, as a function of the latent decision variable x, can be written as

$$LM(x) = \frac{\eta\, M(x)}{\eta\, M(x) + (1 - \eta)\, B(x)}.$$

FIGURE 7 Illustration of the binormal model upon which the transformation from ANN output to likelihood of malignancy was based. Reprinted from [59] with permission.
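
A minimal sketch of this computation is given below, under the usual binormal convention that B(x) is a standard normal density and M(x) is a normal density with mean a/b and standard deviation 1/b, where a and b are the binormal ROC parameters; the prevalence $\eta$ must be supplied by the designer. This is an assumption-laden illustration, not the original implementation.

```python
# Sketch of the likelihood-of-malignancy computation under the binormal model.
# Assumption: B(x) is a standard normal density and M(x) is normal with mean a/b
# and standard deviation 1/b, where (a, b) are the usual binormal ROC parameters;
# eta is the assumed prevalence of malignancy in the population studied.
from scipy.stats import norm

def likelihood_of_malignancy(x, a, b, eta):
    m = norm.pdf(x, loc=a / b, scale=1.0 / b)   # M(x): density for malignant cases
    bgn = norm.pdf(x, loc=0.0, scale=1.0)       # B(x): density for benign cases
    return eta * m / (eta * m + (1.0 - eta) * bgn)
```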

LM(x) must then be expressed as a function of the ANN output rather than of the latent decision variable x. This can be done with a polynomial fit relating the ANN output to LM(x); the data required to estimate this relationship (ANN output, false-positive fraction, and true-positive fraction) are provided as part of the output from Metz's LABROC4 program [78]. (ROC analysis is described in the next section.)
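
One possible realization of this conversion, reusing the likelihood_of_malignancy function sketched above, is shown below. It assumes that the ANN-output category boundaries and the corresponding fitted false-positive fractions are available from an ROC-fitting program such as LABROC4, recovers the latent x at each boundary from the standard-normal benign distribution, and fits a low-order polynomial. The function and variable names are illustrative, not those of the original implementation.

```python
# Hypothetical sketch of mapping raw ANN output to likelihood of malignancy.
# 'boundaries' are ANN-output category boundaries and 'fpf' the corresponding
# fitted false-positive fractions from an ROC-fitting program; a, b, eta are
# the binormal parameters and prevalence used in the previous sketch.
import numpy as np
from scipy.stats import norm

def fit_output_to_lm(boundaries, fpf, a, b, eta, degree=3):
    # With B(x) standard normal, FPF = 1 - Phi(x), so x = Phi^{-1}(1 - FPF).
    # (Boundaries with FPF of exactly 0 or 1 should be excluded beforehand.)
    x = norm.ppf(1.0 - np.asarray(fpf))
    lm = likelihood_of_malignancy(x, a, b, eta)
    # Low-order polynomial relating ANN output directly to likelihood of malignancy.
    coeffs = np.polyfit(np.asarray(boundaries), lm, degree)
    return np.poly1d(coeffs)

# Usage: lm_of_output = fit_output_to_lm(boundaries, fpf, a, b, eta)
#        lm_of_output(0.7)  # estimated likelihood of malignancy for ANN output 0.7
```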

2 Performance of Computer Classification

One way to assess the accuracy of computer classification is to compare its performance against that of radiologists. This is typically done in retrospective studies using a set of images with confirmed diagnoses, often obtained by means of biopsy. Use of biopsied cases provides reasonable assurance of correct diagnoses; however, lesions that are not suspicious enough to be biopsied, but otherwise are equally likely to be incorrectly diagnosed by radiologists, are excluded from the database. A group of radiologists who are not familiar with the specific cases are typically asked to read the mammograms and make a diagnosis based on the mammograms alone. To facilitate the use of ROC analysis [79], the radiologists are often asked to report their diagnosis in terms of their confidence that the lesion in question is malignant. From the confidence ratings, an ROC curve can be computed for each radiologist and for the radiologists as a group; the group curve represents their collective skill in mammogram interpretation. These ROC curves document the radiologists' diagnostic accuracy. Similarly, an ROC curve can be obtained as a measure of the accuracy of the computer classification, and diagnostic accuracy can then be compared, on the same mammograms, between the computer classification and the radiologists.

We digress from the topic of classifying breast lesions to briefly describe ROC analysis, as it is the standard metric for assessing computer classification performance. An ROC curve (see, e.g., Fig. 8) is a plot of sensitivity (defined as the fraction of cancers correctly diagnosed) vs (1 − specificity), where specificity is defined as the fraction of cancer-free cases correctly diagnosed [77,80]. Conventionally, the axes of an ROC curve are labeled, as in Fig. 8, by "true-positive fraction," which is equivalent to sensitivity, and by "false-positive fraction," which is equivalent to (1 − specificity). The area under the ROC curve, either from unfitted experimental data (trapezoidal area) or from a fit to the binormal model (Az), is the most commonly used accuracy index [81]. Another accuracy index, the partial area index ${}_{0.90}A_z'$, which represents a normalized area under the ROC curve above a sensitivity of 0.90, is used to assess classification accuracy in a more clinically meaningful way, since it is imperative to maintain high sensitivity in mammography [82]. For perfect classification accuracy, the ROC curve reaches a sensitivity of 1.0 at a specificity of 1.0 and maintains a sensitivity of 1.0 at all other specificity values.

FIGURE 8 Comparison of the average unaided and CAD ROC curves of five attending radiologists (solid lines) and of five senior radiology residents (broken lines). For attending radiologists, the Az values are 0.62 unaided and 0.76 with CAD (p = 0.006). For residents, the Az values are 0.61 unaided and 0.75 with CAD (p = 0.0006). As a reference (dashed line), the computer's Az value is 0.80. The operating points represent the biopsy performance of unaided attending radiologists (▲), attending radiologists with CAD (●), unaided residents (△), and residents with CAD (○). Reprinted from [59] with permission.


The area under the perfect-accuracy ROC curve is 1.0, and the partial area index for the perfect-accuracy ROC curve, ${}_{0.90}A_z'$, is also 1.0. At the other extreme, the ROC curve of random guessing is the positive diagonal line; the area under this curve is 0.5, whereas the partial area index, ${}_{0.90}A_z'$, is 0.05. Statistical methods and computer software are available for fitting the ROC curve and for comparison of ROC curves [83].
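
For concreteness, the two indices can be approximated directly from case-level scores as in the sketch below, which uses the empirical (trapezoidal) ROC curve rather than a fitted binormal curve; fitting Az itself requires ROC software such as that cited above. The variable names are illustrative.

```python
# Sketch of the two accuracy indices described above, computed from case-level
# scores: trapezoidal area under the empirical ROC curve, and the normalized
# partial area above a sensitivity (TPF) of 0.90. This is an approximation on
# the empirical curve, not the binormal-model fit used in the cited work.
import numpy as np
from sklearn.metrics import roc_curve, auc

def roc_indices(labels, scores, sens_threshold=0.90):
    fpr, tpr, _ = roc_curve(labels, scores)
    area = auc(fpr, tpr)                                   # trapezoidal area under ROC
    # Integrate only the part of the curve with TPF above the threshold and
    # normalize by the maximum possible partial area (1 - threshold), so a
    # perfect classifier scores 1.0 and chance performance scores about 0.05.
    partial = auc(fpr, np.clip(tpr - sens_threshold, 0.0, None)) / (1.0 - sens_threshold)
    return area, partial
```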
