Classification Process

Once the feature extraction process is completed, we have a set of features arranged in feature vectors. Each feature vector is composed of all the feature measures computed at a given pixel. Therefore, for each pixel we have an n-dimensional point in the feature space, where n is the number of features. This set of data is the input to the classification process. Classification techniques fall into two main categories: supervised and unsupervised learning. While supervised learning relies on a set of labeled examples of each class to train the classifier, unsupervised learning is based on the geometric arrangement of the data in the feature space and on whether the data can be grouped into clusters.
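As a minimal sketch of this arrangement (in Python, which is an assumption since the chapter provides no code), suppose the feature extraction stage produced n feature images of the same size as the input image; stacking them and reshaping yields one n-dimensional feature vector per pixel. The function name build_feature_vectors and the array sizes are purely illustrative.

    import numpy as np

    def build_feature_vectors(feature_images):
        """feature_images: list of n arrays of shape (H, W), one per feature measure."""
        stack = np.stack(feature_images, axis=-1)   # shape (H, W, n)
        h, w, n = stack.shape
        return stack.reshape(h * w, n)              # one row (feature vector) per pixel

    # Illustrative example: 3 feature measures computed on a 4x4 image.
    rng = np.random.default_rng(0)
    features = [rng.random((4, 4)) for _ in range(3)]
    X = build_feature_vectors(features)
    print(X.shape)  # (16, 3): 16 pixels, each a 3-dimensional point in feature space

Each row of the resulting matrix is the n-dimensional point associated with one pixel, and the matrix as a whole is the input to the classifiers described next.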

In this chapter we are mainly concerned with supervised learning and classification, since we know exactly which classes we are seeking. Supervised classification techniques are usually divided into parametric and nonparametric methods. Parametric techniques rely on knowledge of the probability density function of each class. Nonparametric classification, on the contrary, does not need the probability density function and is based on the geometric arrangement of the points in the input space. We begin by describing a nonparametric technique, k-nearest neighbors, which will serve as a ground truth to verify the discriminability of the different feature spaces. Since nonparametric techniques have a high computational cost, we then make some assumptions that lead to maximum likelihood classification techniques. However, the latter techniques are very sensitive to the dimension of the input space. As shown in the previous section, some feature spaces cast the two-dimensional image data into high-dimensional spaces. In order to deal with high-dimensional data, dimensionality reduction is needed. Dimensionality reduction techniques are useful for creating a meaningful set of data because the feature space is usually large in comparison with the number of samples retrieved. The best-known technique for dimensionality reduction is principal component analysis (PCA) [38]. However, PCA is susceptible to errors depending on the arrangement of the data points in the training space, because it does not consider the different distributions of the data clusters. To address this deficiency of PCA in discrimination, Fisher linear discriminant analysis is introduced [38,39]. Finally, to improve the classification rate of simple classifiers, combinations of classifiers are proposed. One of the most important classifier combination processes is boosting. The last part of this section is devoted to a particular class of boosting techniques, Adaptive Boosting (AdaBoost) [40,41].
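The roles of these techniques can be illustrated with the short Python sketch below, which uses scikit-learn on synthetic data standing in for labeled per-pixel feature vectors. This is not the chapter's own implementation; the data, the parameter choices (5 neighbors, 10 principal components, 100 boosting rounds) and the use of scikit-learn are all illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.ensemble import AdaBoostClassifier

    # Synthetic stand-in for per-pixel feature vectors with known class labels.
    X, y = make_classification(n_samples=2000, n_features=40,
                               n_informative=10, n_classes=2, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Nonparametric baseline: k-nearest neighbors on the raw feature space.
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    print("k-NN accuracy:", knn.score(X_te, y_te))

    # Dimensionality reduction before a simpler classifier: PCA keeps the
    # directions of largest variance, Fisher LDA keeps the most discriminant ones.
    pca = PCA(n_components=10).fit(X_tr)
    lda = LinearDiscriminantAnalysis().fit(pca.transform(X_tr), y_tr)
    print("PCA + LDA accuracy:", lda.score(pca.transform(X_te), y_te))

    # Boosting: AdaBoost combines many weak classifiers into a stronger one.
    ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print("AdaBoost accuracy:", ada.score(X_te, y_te))

In this sketch the k-NN score plays the role of the ground-truth check on the discriminability of the feature space, while the PCA-plus-LDA pipeline and the boosted ensemble correspond to the dimensionality reduction and classifier combination stages discussed in the remainder of the section.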
