## Estimating Histogram Basis Function Parameters

This section describes parameter-estimation procedures for fitting histogram basis functions to a histogram of an entire dataset. First, the histogram, $h^{\mathrm{all}}(v)$, is calculated over the entire dataset. The second step combines an interactive process, in which the user specifies the number of materials and their approximate feature-space locations, with an automated optimization [21] that refines the parameter estimates. Under some circumstances, users may wish to group materials with similar measurements into a single "material," whereas in other cases they may wish to keep the materials separate.

The result of this process is a set of parameterized histogram basis functions, together with values for their parameters. The parameters describe the various materials and mixtures of interest in the dataset. Figure 9 shows the results of fitting a histogram. Each colored region represents one distribution, with the labeled spot-shaped regions representing pure materials and the connecting shapes representing mixtures.
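The first step, computing a histogram over the whole dataset, can be sketched for scalar data as follows. The function name, the fixed bin count, and the restriction to a single feature dimension are assumptions made for illustration; this section does not say how the histogram is binned.

```python
def dataset_histogram(values, lo, hi, bins):
    """Normalized 1-D histogram of all values in [lo, hi].

    Returns bin centers and bin heights; the heights integrate to 1
    so they can be compared with normalized basis functions.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in values:
        if lo <= x <= hi:
            # A value equal to hi is placed in the last bin.
            k = min(int((x - lo) / width), bins - 1)
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside [lo, hi]")
    centers = [lo + (k + 0.5) * width for k in range(bins)]
    heights = [n / (total * width) for n in counts]
    return centers, heights

# Hypothetical measurements standing in for a dataset.
centers, heights = dataset_histogram([0, 1, 1, 2, 3, 4], 0.0, 4.0, 4)
```

Normalizing by the total count and the bin width keeps the histogram on the same scale as basis functions that each integrate to 1.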

To fit a group of histogram basis functions to a histogram, as in Fig. 9, the optimization process estimates the relative volume of each pure material or mixture (vector $\alpha^{\mathrm{all}}$) and the mean value (vector $c$) and standard deviation (vector $s$) of measurements of each material. The process is derived from the assumption that all values were produced by pure materials and two-material mixtures. Here $n_m$ is the number of pure materials in a dataset, and $n_f$ is the number of histogram basis functions. Note that $n_f \geq n_m$, since $n_f$ includes any basis functions for mixtures, as well as those for pure materials.
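One way to picture the relationship between the number of pure materials and the number of basis functions is to build the list of center parameters explicitly. This is only a sketch: the function name, the scalar means, and the choice of which material pairs mix are hypothetical stand-ins for what a user would specify interactively.

```python
def build_basis_centers(material_means, mixture_pairs):
    """Return one center parameter per basis function.

    A pure material contributes a single feature-space value; a
    two-material mixture contributes a pair of values.
    """
    centers = list(material_means)
    for a, b in mixture_pairs:
        centers.append((material_means[a], material_means[b]))
    return centers

material_means = [20.0, 55.0, 90.0]   # hypothetical initial guesses
mixture_pairs = [(0, 1), (1, 2)]      # hypothetical pairs that mix
centers = build_basis_centers(material_means, mixture_pairs)

n_m = len(material_means)             # number of pure materials
n_f = len(centers)                    # number of basis functions
```

With no mixture pairs the two counts are equal; every pair added increases the number of basis functions by one.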

FIGURE 9 Basis functions fit to histogram of entire dataset. This figure illustrates the results of fitting basis functions to the histogram of the hand dataset. The five labeled circular regions represent the distribution of data values for pure materials, while the colored regions connecting them represent the distribution of data values for mixtures. The mixture between muscle (red) and fat (white), for example, is a salmon-colored streak. The green streak between the red and yellow dots is a mixture of skin and muscle. These fitted basis functions were used to produce the classified data used in Fig. 12. See also Plate 18.

The optimization minimizes the objective function, Eq. (4), with respect to $\alpha^{\mathrm{all}}$, $c$, and $s$, where

$$q(v; \alpha^{\mathrm{all}}, c, s) = h^{\mathrm{all}}(v) - \sum_{j=1}^{n_f} \alpha_j^{\mathrm{all}} f_j(v; c_j, s_j). \tag{5}$$

Note that $f_j$ may be a pure or a mixture basis function and that its parameter $c_j$ will be a single feature-space point for a pure material or a pair for a mixture. The function $w(v)$ is analogous to a standard deviation at each point, $v$, in feature space, and gives the expected value of $|q(v)|$. $w(v)$ can be approximated as a constant; it is discussed further in Section 10.

Equations (4) and (5) are derived in Section 9 using Bayesian probability theory with estimates of prior and conditional probabilities.
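As a rough illustration of Eq. (5), the sketch below evaluates the residual between a histogram and a weighted sum of basis functions for scalar data, then sums its weighted squares. Everything beyond the symbols used in the text is an assumption made for illustration: the Gaussian shape of the pure-material basis function, the mixture basis function built as an average of Gaussians whose mean slides between two pure-material means, the squared-residual objective with constant $w$, and all function names.

```python
import math

def gaussian(v, mean, std):
    """Normal density; assumed shape of a pure-material basis function."""
    z = (v - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture(v, mean_a, mean_b, std, steps=64):
    """Assumed mixture basis function: the average of Gaussians whose mean
    slides linearly from mean_a to mean_b (midpoint rule over the mix ratio)."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        total += gaussian(v, (1.0 - t) * mean_a + t * mean_b, std)
    return total / steps

def basis(v, c_j, s_j):
    """c_j is one value for a pure material, a pair for a mixture."""
    if isinstance(c_j, tuple):
        return mixture(v, c_j[0], c_j[1], s_j)
    return gaussian(v, c_j, s_j)

def residual(v, h_all_v, alpha, c, s):
    """Histogram value minus the weighted sum of basis functions at v."""
    model = sum(a * basis(v, c_j, s_j) for a, c_j, s_j in zip(alpha, c, s))
    return h_all_v - model

def objective(grid, h_all, alpha, c, s, w=1.0):
    """Assumed objective: half the integral of (residual / w)^2 on a uniform grid."""
    dv = grid[1] - grid[0]
    return 0.5 * dv * sum((residual(v, h, alpha, c, s) / w) ** 2
                          for v, h in zip(grid, h_all))

# Hypothetical example: two pure materials and the mixture between them.
grid = [0.5 * i for i in range(201)]        # feature values 0 .. 100
alpha_true = [0.4, 0.4, 0.2]                # relative volumes
c_true = [30.0, 70.0, (30.0, 70.0)]         # means; a pair for the mixture
s_true = [4.0, 4.0, 4.0]                    # standard deviations
h_all = [sum(a * basis(v, c_j, s_j)
             for a, c_j, s_j in zip(alpha_true, c_true, s_true))
         for v in grid]                     # synthetic, noise-free histogram

exact = objective(grid, h_all, alpha_true, c_true, s_true)
shifted = objective(grid, h_all, alpha_true,
                    [35.0, 70.0, (35.0, 70.0)], s_true)
```

Here `exact` is evaluated at the parameters that generated the synthetic histogram, and `shifted` at a deliberately wrong mean, so the second value is larger. An optimizer would start from the user's approximate locations and adjust the volumes, means, and standard deviations to reduce this value; the text cites [21] for the optimization actually used.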
