
A classical approach to finding a linear transformation that discriminates the clusters in an optimal way is discriminant analysis. Fisher linear discriminant

Figure 2.15: Example of the resulting direction using principal component analysis (PCA) and Fisher linear discriminant (FLD).

analysis [38, 39] seeks a transformation matrix W such that the ratio of the between-class scatter to the within-class scatter is maximized. Let the between-class scatter $S_B$ be defined as follows:

$$S_B = \sum_{i=1}^{c} N_i \,(\mu_i - \mu)(\mu_i - \mu)^T$$

where $\mu_i$ is the mean value of class $X_i$, $\mu$ is the mean value of the whole data, $c$ is the number of classes, and $N_i$ is the number of samples in class $X_i$. Let the within-class scatter be

$$S_W = \sum_{i=1}^{c} \sum_{x_k \in X_i} (x_k - \mu_i)(x_k - \mu_i)^T.$$

If $S_W$ is not singular, the optimal projection matrix $W_{opt}$ is chosen as the matrix that maximizes the ratio of the determinant of the between-class scatter matrix of the projected samples to the determinant of the within-class scatter matrix of the projected samples:
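Both scatter matrices follow directly from their definitions. The sketch below computes them with NumPy; the function name and array layout are our own, not from the text.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (S_B) and within-class (S_W) scatter matrices.

    X: (N, d) data matrix, y: (N,) integer class labels.
    """
    mu = X.mean(axis=0)                      # mean of the whole data
    d = X.shape[1]
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]                       # samples of class c
        mu_c = Xc.mean(axis=0)               # class mean
        diff = (mu_c - mu).reshape(-1, 1)
        S_B += Xc.shape[0] * diff @ diff.T   # N_i (mu_i - mu)(mu_i - mu)^T
        S_W += (Xc - mu_c).T @ (Xc - mu_c)   # sum over x_k in X_i
    return S_B, S_W
```

A useful sanity check is the standard identity $S_B + S_W = S_T$, where $S_T$ is the total scatter about the overall mean.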

$$W_{opt} = \arg\max_W \frac{\left|W^T S_B W\right|}{\left|W^T S_W W\right|} = [w_1, w_2, \ldots, w_m] \qquad (2.5)$$

where $w_i$, $i = 1, \ldots, m$, are the generalized eigenvectors of $S_B$ and $S_W$ corresponding to the $m$ largest generalized eigenvalues.
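Maximizing Eq. (2.5) reduces to the generalized eigenproblem $S_B w = \lambda S_W w$. A minimal sketch, assuming $S_W$ is invertible as the text requires (the function name is illustrative, not from the book):

```python
import numpy as np

def fisher_projection(S_B, S_W, m):
    """Columns of W_opt: generalized eigenvectors of (S_B, S_W)
    for the m largest generalized eigenvalues (Eq. 2.5)."""
    # With S_W non-singular, S_B w = lambda S_W w is equivalent to the
    # standard eigenproblem of inv(S_W) @ S_B.
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]   # largest eigenvalues first
    return eigvecs[:, order[:m]].real
```

For two classes, $S_B$ has rank one, so the single useful direction is proportional to $S_W^{-1}(\mu_1 - \mu_0)$ — the classical Fisher direction.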

In contrast to PCA, Fisher linear discriminant (FLD) emphasizes the direction along which the classes can best be discriminated. FLD uses more information about the problem, since the number of classes and the class membership of each sample must be known a priori. In Fig. 2.15 the projections onto the FLD subspace are well separated.
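The contrast between the two directions can be reproduced on synthetic data resembling the situation in Fig. 2.15: two identically shaped clusters whose main spread is orthogonal to the direction that separates them. The data and setup below are our own assumption, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two elongated clusters: large spread along x, separated along y.
X0 = np.column_stack([rng.normal(0, 5.0, 200), rng.normal(0, 0.3, 200)])
X1 = X0 + np.array([0.0, 3.0])          # same shape, shifted in y

# PCA: leading eigenvector of the total covariance follows the large
# within-class spread (the x-axis) and therefore mixes the classes.
X = np.vstack([X0, X1])
_, vecs = np.linalg.eigh(np.cov(X.T))
w_pca = vecs[:, -1]

# Two-class FLD: w proportional to inv(S_W) (mu1 - mu0), along y.
mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
S_W = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
w_fld = np.linalg.inv(S_W) @ (mu1 - mu0)
```

Projecting onto `w_fld` separates the two classes cleanly, while the projections onto `w_pca` overlap almost entirely.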

In real problems, it may not be possible to find a single optimal classifier. One solution is to combine different classifiers into an ensemble.
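As a trivial illustration of combining classifiers, majority voting assigns each sample the label that most base classifiers agree on. This is a hypothetical sketch of one simple combination scheme; the book's ensemble method may differ.

```python
import numpy as np

def majority_vote(predictions):
    """predictions: (n_classifiers, n_samples) array of integer labels.

    Returns, per sample, the label chosen by the most classifiers.
    """
    votes = np.asarray(predictions)
    # For each column (sample), count label occurrences and keep the mode.
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```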
