AdaBoost Procedure

Adaptive Boosting (AdaBoost) is an arcing method that allows the designer to keep adding "weak" classifiers until a desired low training error has been achieved [40,41]. A weight is assigned to each feature point; these weights measure how accurately the feature point is being classified. If a point is classified accurately, its probability of being used in subsequent learners is reduced; otherwise it is emphasized. In this way, AdaBoost focuses on difficult training points.

Figure 2.16 shows a diagram of the general boosting process. The input data is resampled according to the weight of each feature point: the higher the weight, the more probable it is that the feature point will be used by the next classifier. The new set of feature points is the input to the new classifier added to the process. At the end of the process, the responses of all the classifiers are combined to form the "strong" classifier.
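The resampling step can be sketched as follows. This is only an illustration under stated assumptions: the function name `resample_by_weight` and the toy data are hypothetical, and the text does not say which sampling scheme was actually used; sampling with replacement is assumed here.

```python
import numpy as np

def resample_by_weight(X, c, w, rng):
    """Draw a sample of the same size in which point i is picked with probability w[i]."""
    p = np.asarray(w, dtype=float)
    p = p / p.sum()                       # make the weights a probability distribution
    idx = rng.choice(len(p), size=len(p), replace=True, p=p)
    return X[idx], c[idx], idx

rng = np.random.default_rng(0)            # fixed seed, only for repeatability
X = np.arange(10, dtype=float).reshape(5, 2)
c = np.array([-1, -1, 1, 1, 1])
w = np.array([0.0, 0.0, 0.1, 0.1, 0.8])   # the last point is the "difficult" one
Xs, cs, idx = resample_by_weight(X, c, w, rng)
```

Points with zero weight are never drawn, while the heavily weighted point is expected to appear several times in the resampled set.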

AdaBoost is capable of performing feature selection while training. To perform both tasks, feature selection and classification, a weak learning algorithm is designed to select the single feature that best separates the different classes. For each feature, the weak learner determines the optimal classification function, so that the minimum number of feature points is misclassified. The algorithm is described as follows:

• Determine a supervised set of feature points $\{x_i, c_i\}$, where $c_i \in \{-1, 1\}$ is the class associated with each feature point.

Figure 2.16: Block diagram of the AdaBoost procedure.

• Initialize the weights $w_{1,i} = \frac{1}{m}, \frac{1}{l}$ for $c_i = -1, 1$ respectively, where $m$ and $l$ are the number of feature points in each class.

- Normalize the weights:

$$w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j} w_{t,j}}$$

so that $w_t$ is a probability distribution.

- For each feature $j$, train a classifier $h_j$ that is restricted to using a single feature. The error is evaluated with respect to $w_t$: $\epsilon_j = \sum_i w_{t,i} \, |h_j(x_i) - c_i|$.

- Choose the classifier $h_t$ with the lowest error $\epsilon_t$.

- Update the weights:

$$w_{t+1,i} = w_{t,i} \, \beta_t^{e_i}$$

where $e_i = 1$ for each well-classified feature point and $e_i = 0$ otherwise, and $\beta_t = \frac{\epsilon_t}{1 - \epsilon_t}$. Calculate the parameter $\alpha_t = -\log(\beta_t)$.
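The steps above can be sketched in Python. This is a minimal illustration, not the original implementation. It assumes that the single-feature weak learner is a threshold classifier (a decision stump), that the error is the weighted misclassification rate (with labels in {-1, 1}, half of the sum of w|h - c|), and that a zero error is clamped to a small value to avoid log(0). All function names and the toy data are hypothetical.

```python
import numpy as np

def stump_predict(X, j, thr, pol):
    """Single-feature threshold classifier with outputs in {-1, 1}."""
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def train_stump(X, c, w):
    """Search every feature, threshold and polarity for the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != c].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, c, T=10):
    # Initial weights: 1/m for class -1 and 1/l for class 1.
    w = np.where(c == -1, 1.0 / np.sum(c == -1), 1.0 / np.sum(c == 1))
    ensemble = []
    for _ in range(T):
        w = w / w.sum()                          # normalize to a distribution
        err, j, thr, pol = train_stump(X, c, w)  # best single-feature classifier
        if err >= 0.5:                           # no better than chance: stop
            break
        err = max(err, 1e-10)                    # assumption: clamp to avoid log(0)
        beta = err / (1.0 - err)
        alpha = -np.log(beta)
        ensemble.append((alpha, j, thr, pol))
        correct = stump_predict(X, j, thr, pol) == c
        w = w * beta ** correct.astype(float)    # e_i = 1 for well-classified points
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted sum of the weak classifiers."""
    score = sum(a * stump_predict(X, j, thr, pol) for a, j, thr, pol in ensemble)
    return np.where(score > 0, 1, -1)

# Toy data: feature 0 separates the classes, feature 1 is uninformative.
X = np.array([[0., 5.], [1., 3.], [2., 8.], [3., 1.], [4., 7.], [5., 2.]])
c = np.array([-1, -1, -1, 1, 1, 1])
ensemble = adaboost(X, c, T=5)
pred = predict(ensemble, X)
```

Each entry of `ensemble` records which feature was chosen, so the list of selected feature indices doubles as the feature selection result. On the toy data every round picks feature 0.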

Figure 2.17: Error rates associated to the AdaBoost process. (a) Weak single classification error. (b) Strong classification error on the training data. (c) Test error rate.

Therefore, the strong classifier is the ensemble of a series of simple classifiers, $h_t(x)$, called "weaks". The parameter $\alpha_t$ is the weighting factor of each of the classifiers. The loop ends when the classification error of a weak classifier is over 0.5, when the estimated error of the whole strong classifier is lower than a given error rate, or when the desired number of weaks is reached. The final classification is the result of the weighted classifications of the weaks, $h(x) = \sum_t \alpha_t h_t(x)$. The process is designed so that if $h(x) > 0$, then pixel $x$ is assigned to the class labelled 1, and to the class labelled -1 otherwise.
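As a small illustration of the weighted vote, the sketch below combines the outputs of three weak classifiers on three points. The function name `strong_classify` and the numbers are made up for the example.

```python
import numpy as np

def strong_classify(alphas, weak_outputs):
    """alphas: shape (T,); weak_outputs: shape (T, n) with entries in {-1, 1}."""
    score = np.asarray(alphas) @ np.asarray(weak_outputs)  # sum_t alpha_t * h_t(x)
    return np.where(score > 0, 1, -1), score

alphas = np.array([1.5, 0.5, 0.75])
weak_outputs = np.array([[ 1, -1, 1],    # h_1 on the three points
                         [-1, -1, 1],    # h_2
                         [-1,  1, 1]])   # h_3
labels, score = strong_classify(alphas, weak_outputs)
```

On the first point, the single weak with the largest weighting factor outvotes the other two (1.5 - 0.5 - 0.75 = 0.25 > 0), which is the intended effect of weighting by accuracy.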

Figure 2.17 shows the evolution of the error rates for the training and the test feature points. Figure 2.17(a) shows the error of each individual weak classifier. The abscissa is the index of the weak classifier, and the ordinate is the error percentage of that single weak. The figure illustrates how the error increases as more weak classifiers are added. This is because each new weak classifier focuses on the data misclassified by the overall system. Figure 2.17(b) shows the error rate of the system response on the training data. The abscissa represents the number of iterations, that is, the number of classifiers added to the ensemble. As expected, the error rate decreases to very low values. This, however, does not guarantee an equally low classification error on test data. Figure 2.17(c) shows the test error rate. One can observe that the overall error tends to decrease as more weak classifiers are added to the process.

Therefore, the weak classifier plays a very important role in the procedure. Different approaches can be used; however, it is worthwhile to focus on classifiers with a low computational cost.

The first and most straightforward approach to a weak is the perceptron. The perceptron consists of a weighted sum of the inputs followed by an adaptive threshold function. This scheme is easy to embed in the AdaBoost process since it relies on the weights to make the classification.
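One possible way to make a perceptron sensitive to the boosting weights is to scale each update by the weight of the misclassified point. The sketch below assumes that scheme; the text does not specify how the perceptron was actually trained, and the function names, learning rate and number of epochs are hypothetical.

```python
import numpy as np

def train_weighted_perceptron(X, c, w, epochs=20, lr=1.0):
    """Perceptron whose update for point i is scaled by its boosting weight w[i]."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a constant bias input
    v = np.zeros(Xb.shape[1])                   # perceptron coefficients
    for _ in range(epochs):
        for xi, ci, wi in zip(Xb, c, w):
            if ci * (v @ xi) <= 0:              # misclassified or on the boundary
                v += lr * wi * ci * xi          # heavier points move v further
    return v

def perceptron_predict(v, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.where(Xb @ v > 0, 1, -1)

X = np.array([[2., 2.], [3., 1.], [-2., -1.], [-1., -3.]])
c = np.array([1, 1, -1, -1])
w = np.full(4, 0.25)                            # uniform boosting weights
v = train_weighted_perceptron(X, c, w)
pred = perceptron_predict(v, X)
```

With this choice, points that AdaBoost has emphasized pull the decision boundary more strongly than points that are already well classified.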

Another approach to take into consideration is to model the feature points as Gaussian distributions. This allows us to define a simple scheme by calculating the weighted mean and weighted covariance of the classes at each step $t$ of the process:

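$$\mu_j^t = \frac{\sum_i w_i^j x_i^j}{\sum_i w_i^j}, \qquad \Sigma_j^t = \frac{\sum_i w_i^j (x_i^j - \mu_j^t)(x_i^j - \mu_j^t)^T}{\sum_i w_i^j}$$

for each point $x_i^j$ in class $C_j$, where $w_i^j$ are the weights for each data point.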
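A minimal sketch of such a Gaussian weak classifier is given below. It assumes that the covariance is normalized by the sum of the weights and that a point is assigned to the class with the higher Gaussian log-density; neither detail is confirmed by the text, and all names and data are illustrative.

```python
import numpy as np

def weighted_gaussian(X, w):
    """Weighted mean and covariance of the rows of X."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                    # assumption: normalize by the sum of weights
    mu = w @ X
    d = X - mu
    cov = (w[:, None] * d).T @ d
    return mu, cov

def log_density(x, mu, cov):
    """Log of the multivariate Gaussian density at x."""
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ np.linalg.solve(cov, d) + logdet + len(mu) * np.log(2 * np.pi))

Xn = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])   # class -1
Xp = np.array([[5., 5.], [6., 5.], [5., 6.], [6., 6.]])   # class 1
wn = wp = np.full(4, 0.25)                                # uniform weights
mu_n, cov_n = weighted_gaussian(Xn, wn)
mu_p, cov_p = weighted_gaussian(Xp, wp)

x = np.array([5.0, 5.2])
label = 1 if log_density(x, mu_p, cov_p) > log_density(x, mu_n, cov_n) else -1
```

At each boosting step only the weights change, so refitting the weak classifier amounts to recomputing these two statistics per class.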

If feature selection is desired, this scheme is highly constrained to the N features of the N-dimensional feature space. If N is not large enough, the procedure may not improve its performance.

Both the feature extraction and the classification processes are the central parts of the tissue characterization framework. The next section explains the different frameworks in which these processes are applied to tissue characterization of IVUS images, and provides quantitative results of their performance.
