Stepwise Feature Selection

The stepwise feature selection method is another well-established statistical approach for searching for features that can enhance the performance of a classifier. Although the stepwise feature selection method was initially associated with linear discriminant analysis, it can be applied with a variety of classifiers and statistical selection criteria.

Another common criterion is minimization of Wilks' lambda [22]. When an ANN or a BBN is used in medical image processing, the statistical criterion can also be the Az value (the area under the ROC curve).
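For two classes, Wilks' lambda for a single feature is the ratio of the within-class sum of squares to the total sum of squares; smaller values indicate better class separation. The sketch below (the function name and data are illustrative, not taken from the cited studies) shows how such a per-feature screening criterion might be computed:

```python
import numpy as np

def wilks_lambda(feature, labels):
    """Wilks' lambda for a single feature over two (or more) classes.

    Lambda = within-class sum of squares / total sum of squares;
    smaller values indicate better class separation.
    """
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    grand_mean = feature.mean()
    sst = ((feature - grand_mean) ** 2).sum()  # total scatter
    ssw = sum(((feature[labels == c] - feature[labels == c].mean()) ** 2).sum()
              for c in np.unique(labels))      # pooled within-class scatter
    return ssw / sst

# A feature that separates the two classes well yields a small lambda
x = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
y = [0, 0, 0, 1, 1, 1]
print(wilks_lambda(x, y))  # small lambda: strong separation
```

Computing this value for every candidate feature and picking the smallest one corresponds to the initial-model step described above.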

The first step in the stepwise feature selection method is to define a small number of features as the initial feature model. This step is very similar to the initial step of the progressive roundoff method. The difference is that in the progressive roundoff method, once a feature has been selected it remains in the optimal feature set throughout the selection process, whereas in the stepwise method a feature selected in an earlier step can be removed from the optimal feature set in a later step.

There are many ways to define the initial feature model. One research group reported using a professional statistical program, SPSS [22], to select features for the computerized detection of masses in mammograms [4]. Since there are only two classes in this problem (true-positive and false-positive mass regions), the program calculates the Wilks' lambda value between the two classes when each feature is used individually. The feature that yields the smallest Wilks' lambda is selected into the search model first.

Once the initial feature model is defined, the number of features selected in the following steps is controlled by two parameters based on F statistics, called F-to-enter and F-to-remove. The feature entry step and the feature removal step are performed alternately. In a feature entry step, each feature not yet in the model is added to the model one at a time, and the Wilks' lambda (or other statistical criterion) of each candidate model is tested with an F statistic. The feature that provides the smallest Wilks' lambda (or the most significant performance improvement) is entered into the feature model if its F-to-enter value is larger than the F-to-enter threshold. In the feature removal step, a new set of tests evaluates the performance of the classifier when each feature inside the model is removed one at a time.
If removing a feature causes no significant change in the performance of the classifier (i.e., its F-to-remove value is smaller than the F-to-remove threshold), that feature is permanently removed from the feature model. The stepwise feature selection procedure terminates when the F-to-enter values of all features outside the model are smaller than the F-to-enter threshold, and the F-to-remove values of all features in the model are greater than the F-to-remove threshold.
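The alternating entry/removal loop can be sketched as follows. This is a minimal illustration assuming Wilks' lambda as the criterion and the standard partial-F statistic F = ((n − g − p)/(g − 1)) · (Λ_p/Λ_{p+1} − 1) for adding one feature to a p-feature model with g classes; the function names, thresholds, and data are illustrative, not from the cited studies:

```python
import numpy as np

def wilks_lambda(X, y):
    """Multivariate Wilks' lambda det(W)/det(T) for an (n, p) feature matrix."""
    if X.shape[1] == 0:
        return 1.0  # empty model: no separation
    d = X - X.mean(axis=0)
    T = d.T @ d                       # total scatter matrix
    W = np.zeros_like(T)
    for c in np.unique(y):
        g = X[y == c] - X[y == c].mean(axis=0)
        W += g.T @ g                  # pooled within-class scatter matrix
    return np.linalg.det(W) / np.linalg.det(T)

def partial_f(lam_small, lam_big, n, g, p):
    """Partial F statistic for adding one feature to a p-feature model."""
    return (n - g - p) / (g - 1) * (lam_small / lam_big - 1.0)

def stepwise_select(X, y, f_enter=4.0, f_remove=3.9):
    """Alternate feature entry and removal until neither threshold is met."""
    n, n_feat = X.shape
    g = len(np.unique(y))
    model = []
    for _ in range(20 * n_feat):      # guard against enter/remove cycles
        changed = False
        # Entry step: test each feature not yet in the model.
        lam = wilks_lambda(X[:, model], y)
        best_f, best_j = -np.inf, None
        for j in set(range(n_feat)) - set(model):
            f = partial_f(lam, wilks_lambda(X[:, model + [j]], y),
                          n, g, len(model))
            if f > best_f:
                best_f, best_j = f, j
        if best_j is not None and best_f > f_enter:
            model.append(best_j)
            changed = True
        # Removal step: test dropping each feature in the model.
        lam = wilks_lambda(X[:, model], y)
        worst_f, worst_j = np.inf, None
        for j in model:
            rest = [k for k in model if k != j]
            f = partial_f(wilks_lambda(X[:, rest], y), lam, n, g, len(rest))
            if f < worst_f:
                worst_f, worst_j = f, j
        if worst_j is not None and worst_f < f_remove:
            model.remove(worst_j)
            changed = True
        if not changed:
            break
    return model

# Synthetic example: only feature 0 carries class information.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 3))
X[:, 0] += 3.0 * y
print(stepwise_select(X, y))
```

Note that F-to-remove is chosen slightly smaller than F-to-enter, as is conventional, so that a feature that just entered the model is not immediately removed.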

The feature selection results of the stepwise method depend on the choice of the F-to-enter and F-to-remove parameters. Their optimal threshold values cannot be known in advance, so one has to experiment with these parameters, increasing or decreasing the number of features selected into the model, to obtain the best performance of the classifier. Chan et al. have extensively tested and applied this stepwise method to optimize feature selection for linear discriminant and ANN classifiers in computer-assisted diagnosis schemes for digital mammography [4]. The performance improvement of different classifiers after applying stepwise feature selection has also been demonstrated [23].
