Permutation and Progressive Roundoff Approach

Once the topology of a network and the database are determined, the most reliable method for optimizing feature selection is a complete permutation search. In this method, all possible combinations of features are tested in the network, and the set of features that yields the best testing performance is defined as the optimal feature set. For example, in one study [31], a complete permutation method was applied to search for an optimal feature set in a BBN used for classification of mammographic masses, as

FIGURE 5 A simple structure of a BBN for classification of mammographic masses. The number of feature nodes (varied from 1 to 20) is determined by the "exhaustive" permutation search. Reprinted with permission from Zheng, B., Chang, Y.H., Wang, X.H., Good, W.F., and Gur, D. (1999). Feature selection for computerized mass detection in digitized mammograms using a genetic algorithm. Acad. Radiol. 6, 327–332.

shown in Fig. 5. In this experiment, the training database included 288 positive mass regions and 2204 negative regions, while the testing database included 304 positive mass regions and 1586 negative regions. For each region, 20 image-based features were extracted. An "exhaustive" (complete permutation) search was then used to test all possible feature combinations, with the number of features used in the BBN ranging from 1 to 20. For each feature set, a BBN was defined, trained, and tested. The highest Az values computed for each number of features are shown in Fig. 6. As the number of features increases, the maximum Az value initially increases as well, reaching its highest value of 0.876 ± 0.008 at 11 features. Beyond 11 features, the Az values of the BBN decrease monotonically. This experiment illustrates the general shape expected for the optimal performance curve as a function of the number of features, reflecting the trade-off between the information contributed and the noise added as features are incorporated into a classification system.
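As a sketch of the idea (not the authors' implementation), the exhaustive permutation search can be written as a loop over every non-empty feature subset, scoring each subset with whatever train-and-test procedure the classifier provides. Here the BBN training and Az computation are replaced by a hypothetical toy scoring function; the subset count of 2^N − 1 is what yields the 1,048,575 combinations reported below for N = 20.

```python
from itertools import combinations

def exhaustive_search(features, score):
    """Test every non-empty subset of features; return the best one."""
    best_set, best_score = None, float("-inf")
    n_tested = 0
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            n_tested += 1
            s = score(subset)
            if s > best_score:
                best_set, best_score = subset, s
    return best_set, best_score, n_tested

# Hypothetical stand-in for training/testing a BBN and measuring Az:
# the score peaks when exactly the "informative" features are chosen
# and is penalized for each noise feature added.
INFORMATIVE = {0, 3, 5}
def toy_score(subset):
    s = set(subset)
    return len(s & INFORMATIVE) - 0.1 * len(s - INFORMATIVE)

best, score_val, n = exhaustive_search(range(10), toy_score)
print(best, n)  # n = 2**10 - 1 = 1023 subsets for 10 features
```

With N = 10 toy features, 1023 subsets are scored; at N = 20 the same loop would already visit 1,048,575 subsets, which is why the real search took about 48 hours per the text.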

Although the permutation method is guaranteed to find the optimal feature set for the inputs of a classifier, it is very inefficient. In the preceding example, the exhaustive permutation search over a total of 1,048,575 feature combinations took about 48 hours of processing on a SUN-Sparc-10 workstation [31]. As the number of features increases, or as the network topology becomes more elaborate, the complete permutation method quickly becomes computationally infeasible. For instance, if the number of features is increased from 20 to 25 in the simple BBN structure shown in Fig. 5, the assessment would take more than 2 months to complete on the same SUN-Sparc-10 workstation. To solve this problem, we can use the progressive roundoff approach [17] in place of the exhaustive permutation search.

FIGURE 6 The highest performance (Az value) achieved as a function of the number of image-based features used in the BBN. Reprinted with permission from Zheng, B., Chang, Y.H., Wang, X.H., Good, W.F., and Gur, D. (1999). Feature selection for computerized mass detection in digitized mammograms using a genetic algorithm. Acad. Radiol. 6, 327–332.

Suppose that we want to search for an optimal set of features among N extracted features, and that we already know m features (m < N) belong to the optimal set. The classifier using these m features achieves a performance of Max[Az(m)]. We fix these m features and then add one of the remaining features to the classifier. After finding the highest performance, Max[Az(m + 1)], using m + 1 features, we fix these m + 1 features and search for the next feature to add. This process can be repeated for the remaining unfixed features, but it is usually stopped once Max[Az(n + 1)] < Max[Az(n)], where n < N. The n fixed (selected) features are then used as the input features to the classifier. In medical image processing, although a large number of features may be extracted initially, only a small number of them are needed to build a robust classifier. In this situation, the progressive roundoff method can be a practical and efficient way to search for the optimal feature set. To use this method, we can first use the permutation method to build a growth core (seeds) by searching for a small number m of features (e.g., m = 5). The number of computation iterations (the possible feature combinations) for this stage is

    Σ(i = 1, ..., m) C(N, i),    where C(N, i) = N!/[i!(N − i)!].
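The two stages described above can be sketched as follows. This is a minimal illustration, not the published implementation: the exhaustive seed search and the greedy growth are as described in the text, but the classifier's Az value is replaced by a hypothetical toy scoring function, and ties are treated as no improvement.

```python
from itertools import combinations

def progressive_roundoff(all_features, score, m=2):
    """Greedy forward selection seeded by a small exhaustive 'growth core'."""
    # Stage 1: exhaustively search all subsets of size 1..m for the seed core.
    core, best = (), float("-inf")
    for k in range(1, m + 1):
        for subset in combinations(all_features, k):
            s = score(subset)
            if s > best:
                core, best = subset, s
    selected = list(core)
    remaining = [f for f in all_features if f not in selected]
    # Stage 2: fix the chosen features and add one feature per round;
    # stop when no candidate strictly improves the score, i.e. when
    # Max[Az(n + 1)] would fall below (or only tie) Max[Az(n)].
    while remaining:
        s, f = max((score(selected + [f]), f) for f in remaining)
        if s <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = s
    return selected, best

# Hypothetical stand-in for training/testing a classifier and reading off Az.
INFORMATIVE = {1, 4, 7}
def toy_az(subset):
    s = set(subset)
    return len(s & INFORMATIVE) - 0.1 * len(s - INFORMATIVE)

selected, az = progressive_roundoff(range(10), toy_az, m=2)
print(sorted(selected), az)
```

With m = 2 and 10 features, stage 1 scores only C(10, 1) + C(10, 2) = 55 subsets instead of 1023, and stage 2 adds at most a handful of score evaluations per round, which is the efficiency gain the text describes.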
