Classification Based on Computer-Extracted Features

Another way to obtain features of breast lesions is to develop computer techniques that automatically extract features from digital mammograms. At present, since mammograms are recorded on film, it is necessary to convert the analog image to a digital image using a film digitizer. In the future, when full-field digital mammography (FFDM) becomes widely available, features may be extracted directly from the digital image, eliminating the manual step of film digitization. Using a computer to extract features frees the radiologist from having to provide input for computer classification before the results of the computer analysis can be considered. A radiologist will therefore have the computer classification results available as he or she interprets the mammogram. This automated approach also eliminates the subjectivity and the associated variability in radiologist-extracted features. With today's high-performance computers, it becomes possible to develop sophisticated computer feature-extraction techniques that capture the essence of an image as it is interpreted by radiologists. It also becomes possible to extract image features that are not necessarily visible to radiologists. As will be shown later in this chapter, computer-extracted features can be highly effective for accurate classification of breast lesions. However, because the computer classification is now completely independent of radiologists' interpretation of the mammogram, radiologists must decide when to trust the results of computer analysis.

A list of computer-extracted features of clustered microcalcifications developed by Jiang et al. is shown in Table 1 as an example of features that correlate with radiologists' perceptual experience [56]. These features describe both the group of microcalcifications as a cluster and the individual microcalcifications (see also Fig. 2). Because most breast cancers originate in small milk ducts, malignant microcalcifications often have the appearance of a ductal tree. The first two features capture this distinctive appearance of malignant microcalcifications by describing the shape and size of a cluster. The third feature, the number of microcalcifications in a cluster, is information readily available to radiologists and is therefore included in this and several other computer classification techniques. The size of individual calcifications is important because relatively large calcifications, those that are on average larger than a millimeter in size, are almost always benign. Smaller calcifications, known as microcalcifications, are more likely to be related to cancer. The fourth and seventh features measure the size of individual microcalcifications both in the image plane and in the direction perpendicular to the image plane (measured by means of contrast that was converted to effective thickness). Another characteristic of malignant microcalcifications is pleomorphism, which means that individual microcalcification particles tend to differ in shape and size. The fifth and sixth features identify pleomorphism by calculating the relative standard deviation of the size of individual microcalcification particles. Lastly, perhaps the most classic feature of malignant microcalcifications is a linear or branching shape that is caused by the microcalcification particles filling segments of small ducts. The eighth feature identifies these classically malignant microcalcifications.

FIGURE 4 ROC curves indicating the impact of within- and between-observer variations in radiologists' feature ratings for (a) textbook cases and (b) clinical cases. If the variation were not present, the ROC curves would have been perfect, with Az = 1. The decreases in Az from 1 indicate the impact of within- and between-observer variations. Reprinted from [54] with permission.

The links between computer-extracted features and radiologists' perceptual experience imply that feature values extracted by a computer should be correlated with feature values perceived by a radiologist. Figure 5 shows an example of this correlation. The correlation will not be perfect, however, in part because of the variability in radiologists' assessment of image features. Beyond the correlation of feature values, the links between computer-extracted features and radiologists' perceptual experience also imply that malignant lesions identified by computer and by radiologists should share similar characteristics. In the example shown in Fig. 6, which is a scatter plot of cluster circularity versus cluster area, benign microcalcification clusters tend to be small in area and round in shape, while malignant clusters tend to be larger in size and irregular in shape. These results agree with the characteristics of benign adenosis and malignant ductal microcalcifications, respectively. Although in this example the correlation is only qualitative, the correlation can be used as a priori knowledge in developing and identifying important computer-extracted features.

FIGURE 5 Quantitative correlation between computer-extracted features and an expert mammographer's assessment of the presence of spiculation in 95 masses. On both axes, larger numbers indicate stronger confidence that spiculation was present. Reprinted from [58] with permission.

FIGURE 6 Distributions of computer-extracted features of microcalcification cluster circularity versus cluster area illustrating the qualitative correlation between the characteristics of computer-identified malignancies and radiologists' perceptual experience. Reprinted from [56] with permission.

Automated techniques must be developed to extract features from digital mammograms. The techniques for extracting the microcalcification features (Table 1) are described here as an example [56]. To calculate the area and circularity of a microcalcification cluster, a cluster margin was first constructed by applying a sequence of 10 morphological dilations followed by three morphological erosions to a binary image containing only the individual microcalcifications. A single kernel was used in the morphological operations; it was constructed from a 5 × 5 pixel square with the four corner pixels removed. Microcalcifications were segmented from the mammogram using a technique based on thresholding and region growing after the parenchymal background was subtracted from the mammogram [69]. The parenchymal background was approximated by a third-degree polynomial surface fitted to a 10 × 10 mm region centered on a microcalcification. The area of a microcalcification was defined by the result of the segmentation.
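The cluster-margin construction can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in [56]: the binary image is represented as a set of (row, column) pixel coordinates on an unbounded grid, and the names `KERNEL`, `dilate`, `erode`, and `cluster_margin` are hypothetical. Only the cluster area (in pixels) is computed, because the definition of circularity is not given in this section.

```python
# 5 x 5 square kernel with the four corner pixels removed (21 offsets).
KERNEL = [(dy, dx) for dy in range(-2, 3) for dx in range(-2, 3)
          if not (abs(dy) == 2 and abs(dx) == 2)]


def dilate(pixels):
    """Binary dilation: every foreground pixel stamps the kernel around itself."""
    return {(y + dy, x + dx) for (y, x) in pixels for (dy, dx) in KERNEL}


def erode(pixels):
    """Binary erosion: keep a pixel only if the whole kernel fits in the region."""
    return {(y, x) for (y, x) in pixels
            if all((y + dy, x + dx) in pixels for (dy, dx) in KERNEL)}


def cluster_margin(calc_pixels, n_dilate=10, n_erode=3):
    """Region enclosed by the cluster margin: dilations followed by erosions."""
    region = set(calc_pixels)
    for _ in range(n_dilate):
        region = dilate(region)
    for _ in range(n_erode):
        region = erode(region)
    return region


# Hypothetical input: two single-pixel "microcalcifications" 30 pixels apart.
calcs = {(0, 0), (0, 30)}
region = cluster_margin(calcs)
cluster_area_pixels = len(region)  # cluster area in pixels
```

Because more dilations than erosions are applied, the resulting region extends beyond the outermost microcalcifications and merges neighboring particles into a single cluster outline.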

The effective thickness of a microcalcification was defined as the length of the microcalcification along the X-ray projection line. It was calculated by first converting signal contrast in pixel value to signal contrast in relative exposure and then converting signal contrast in relative exposure to physical length. The first conversion uses the characteristic curve of the film digitizer and the H&D curve of the screen-film system. The second conversion uses the principle of exponential attenuation and a "standard" model of the breast and the microcalcification. The standard model assumes (a) a 4-cm-thick compressed breast composed of 50% adipose and 50% glandular tissue; (b) a microcalcification composed of calcium hydroxyapatite with a physical density of 3.06 g/cm³; and (c) a 20-keV monoenergetic X-ray beam. Corrections in contrast were made to compensate for blurring caused by the screen-film system and by the digitization process and to compensate for X-ray scatter. The effective volume of a microcalcification was then computed as the product of area and effective thickness.
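The second conversion, from relative exposure to physical length, can be illustrated with a short Python sketch. It assumes a monoenergetic beam and a microcalcification that displaces an equal thickness of breast tissue, so that the exposure ratio equals exp(-(mu_calc - mu_tissue) * t). The function names and the attenuation coefficients in the example are hypothetical placeholders, not values from [56], and the blur and scatter corrections described above are omitted.

```python
import math


def effective_thickness(exposure_ratio, mu_calc, mu_tissue):
    """Length of the microcalcification along the X-ray path.

    exposure_ratio: exposure behind the microcalcification divided by the
        exposure behind the neighboring background (0 < ratio <= 1).
    mu_calc, mu_tissue: linear attenuation coefficients in 1/cm.
    The result is in cm.
    """
    if not 0.0 < exposure_ratio <= 1.0:
        raise ValueError("exposure_ratio must lie in (0, 1]")
    if mu_calc <= mu_tissue:
        raise ValueError("mu_calc must exceed mu_tissue")
    return -math.log(exposure_ratio) / (mu_calc - mu_tissue)


def effective_volume(area, thickness):
    """Effective volume as the product of area and effective thickness."""
    return area * thickness


# Placeholder coefficients (1/cm), chosen for illustration only.
MU_CALC, MU_TISSUE = 19.0, 0.6
thickness_cm = effective_thickness(0.6, MU_CALC, MU_TISSUE)
volume_cm3 = effective_volume(0.0009, thickness_cm)  # area given in cm^2
```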

A microcalcification's shape-irregularity measure was calculated as the relative standard deviation of 12 shape indices, which were defined as follows. Four shape indices represented the distances between the center-of-mass pixel and the edges of the smallest rectangular box (drawn to the pixel grid) enclosing the microcalcification. The remaining eight shape indices were the maximum lengths of line segments drawn between the center-of-mass pixel and other pixels within the microcalcification along the directions of 0, 45, 90, 135, 180, 225, 270, and 315 degrees.
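A rough Python sketch of this measure follows. Several details are assumptions rather than facts from [56]: the center-of-mass pixel is taken as the rounded mean coordinate, lengths are measured in pixel units (diagonal steps count as √2), the "maximum length" is taken to be the distance to the farthest microcalcification pixel lying on each ray, and the population standard deviation is used. The names `shape_indices` and `shape_irregularity` are hypothetical.

```python
import math
from statistics import mean, pstdev

# Unit steps for 0, 45, ..., 315 degrees, with rows increasing downward.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def shape_indices(pixels):
    """Return the 12 shape indices of a set of (row, column) pixels."""
    pixels = set(pixels)
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    cy, cx = round(mean(ys)), round(mean(xs))  # center-of-mass pixel
    # Four distances from the center pixel to the bounding-box edges.
    indices = [cy - min(ys), max(ys) - cy, cx - min(xs), max(xs) - cx]
    # Eight maximum segment lengths along the fixed directions.
    reach = max(max(ys) - min(ys), max(xs) - min(xs))
    for dy, dx in DIRECTIONS:
        steps = max((k for k in range(1, reach + 1)
                     if (cy + k * dy, cx + k * dx) in pixels), default=0)
        indices.append(steps * math.hypot(dy, dx))
    return indices


def shape_irregularity(pixels):
    """Relative standard deviation (std / mean) of the 12 shape indices."""
    idx = shape_indices(pixels)
    m = mean(idx)
    return pstdev(idx) / m if m > 0 else 0.0


# Hypothetical shapes: a compact 3 x 3 blob and a 7-pixel linear particle.
blob = {(y, x) for y in range(3) for x in range(3)}
line = {(0, x) for x in range(7)}
blob_score = shape_irregularity(blob)
line_score = shape_irregularity(line)
```

Under these assumptions the elongated particle scores higher than the compact one, which is the behavior the measure is meant to capture.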

An example of computer-extracted image features that do not necessarily correlate with radiologists' perceptual experience is the approach of Chan et al., who have developed computer techniques to extract texture features that are present in mammograms but are not explicitly used by radiologists as a source of diagnostic information [57]. Their approach is based on the co-occurrence matrices described in Chapter 14, entitled "Two-Dimensional Shape and Texture Quantification." Co-occurrence matrices, also known as spatial gray-level dependence (SGLD) matrices, capture second-order statistics of the digital image data. Each matrix element provides the joint probability that a pair of pixels with a given relative position will have specified gray-level values. Because a matrix can be formed for each orientation and distance, a very large number of features can be computed. From a total of 260 features computed in [57], a stepwise feature selection technique selected the seven best features, which included correlation, difference average, difference entropy, inertia, and inverse difference moment, whose definitions can be found in Chapter 14. In such techniques, the number of mammograms influences the outcome of the analysis, since the best features determined on a very large number of mammograms may not be the same as those derived from a small number of mammograms [70].
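To make the matrix construction concrete, the following Python sketch builds a co-occurrence matrix for a single pixel offset and computes two of the features named above using their standard textbook definitions (inertia as the sum of (i - j)² p(i, j), and inverse difference moment as the sum of p(i, j) / (1 + (i - j)²)). It is not the implementation of [57]: the image is assumed to be already quantized to a small number of gray levels, pairs are counted in one direction only, and the function names are hypothetical.

```python
def cooccurrence(image, dy, dx, levels):
    """Joint probability p[i][j] that a pixel with gray level i has a
    neighbor at offset (dy, dx) with gray level j."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset is larger than the image")
    return [[c / total for c in row] for row in counts]


def inertia(p):
    """Sum of (i - j)^2 * p(i, j); large when neighboring levels differ."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


def inverse_difference_moment(p):
    """Sum of p(i, j) / (1 + (i - j)^2); large for locally uniform texture."""
    return sum(v / (1 + (i - j) ** 2)
               for i, row in enumerate(p) for j, v in enumerate(row))


# Hypothetical 4 x 4 image quantized to 4 gray levels (0..3).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [2, 2, 3, 3],
         [2, 2, 3, 3]]
p_horizontal = cooccurrence(image, 0, 1, levels=4)  # distance 1, 0 degrees
p_vertical = cooccurrence(image, 1, 0, levels=4)    # distance 1, one row down
features = [inertia(p_horizontal), inverse_difference_moment(p_horizontal)]
```

Repeating the computation over several distances and orientations, and over several features per matrix, is what produces the large feature counts mentioned above.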
