## Weighted GMM-MRF Model of Segmentation

A finite mixture model (FMM) [23, 27, 32] is defined as a linear combination of $M$ component conditional densities $f(x \mid m, \theta_m)$, for $m \in \{1, \ldots, M\}$, and $M$ mixing coefficients $f(m)$, of the form

$$f(x \mid \Theta) = \sum_{m=1}^{M} f(m)\, f(x \mid m, \theta_m) \qquad (11.13)$$

such that the mixing coefficients $f(m)$ satisfy the following constraints:

$$\sum_{m=1}^{M} f(m) = 1, \qquad 0 \le f(m) \le 1$$
The WGMM framework comprises $l \in \{1, \ldots, L\}$ class densities, each modeled independently using a GMM of the form given in Eq. (11.14), together with a set of mixing coefficients $p(\omega_l)$:

$$p(x \mid \Theta) = \sum_{l=1}^{L} p(\omega_l)\, p(x \mid \omega_l, \Theta_l) \qquad (11.15)$$
The $l$th GMM estimates the class-conditional pdf $p(x \mid \omega_l, \Theta_l)$, which is itself another mixture model, for each data point and each class $\{\omega_l\}_{l=1}^{L}$. The vector $\Theta_l$ is defined as the $M$-component Gaussian parameters of the $l$th GMM, $\Theta_l = \{p_l(m), \mu_{lm}, \Sigma_{lm}\}$, $\forall m \in \{1, \ldots, M\}$. Each estimate of the class-conditional pdf is mixed to model the overall unconditional density $p(x)$ using a mixing coefficient $p(\omega_l)$, which identifies the contribution of the $l$th class density to the unconditional pdf.
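To make the mixture-of-mixtures structure concrete, the unconditional density $p(x) = \sum_l p(\omega_l)\, p(x \mid \omega_l, \Theta_l)$ can be evaluated directly. The sketch below assumes 1-D data and scalar variances; all function names and the parameter layout are illustrative, not the authors' implementation:

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """1-D Gaussian density N(x; mean, var)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def gmm_pdf(x, weights, means, vars_):
    """Class-conditional pdf p(x | omega_l, Theta_l): an M-component GMM."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, vars_))

def wgmm_pdf(x, class_weights, gmm_params):
    """Unconditional pdf p(x) = sum_l p(omega_l) * p(x | omega_l, Theta_l)."""
    return sum(p_l * gmm_pdf(x, *theta_l)
               for p_l, theta_l in zip(class_weights, gmm_params))
```

Because each class density and the class weights are normalised, the resulting mixture of mixtures again integrates to one.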

If it is assumed that a complete dataset $X = \{x_1, \ldots, x_N\}$ of points $x_n$ is drawn independently from the distribution $f(x \mid \Theta)$, then the joint occurrence of the whole dataset can be conveniently expressed as the log-likelihood:

$$\log \mathcal{L}(\Theta) = \sum_{n=1}^{N} \log p(x_n \mid \Theta) = \sum_{n=1}^{N} \log \sum_{l=1}^{L} p(\omega_l)\, p(x_n \mid \omega_l, \Theta_l) \qquad (11.16)$$
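The log-likelihood of Eq. (11.16) is straightforward to compute numerically. The sketch below assumes 1-D data and scalar variances, with the class weights, per-class component weights, means, and variances held in arrays; all names and shapes are illustrative:

```python
import numpy as np

def normal_pdf(x, mu, var):
    """N(x; mu, var) for 1-D data."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def wgmm_log_likelihood(X, p_class, p_comp, mu, var):
    """log L(Theta) = sum_n log sum_l p(omega_l) * GMM_l(x_n).

    X       : (N,)   data points
    p_class : (L,)   mixing coefficients p(omega_l)
    p_comp  : (L, M) within-class weights p_l(m)
    mu, var : (L, M) component means and variances
    """
    # class-conditional densities p(x_n | omega_l, Theta_l): shape (N, L)
    dens = np.array([
        sum(p_comp[l, m] * normal_pdf(X, mu[l, m], var[l, m])
            for m in range(mu.shape[1]))
        for l in range(mu.shape[0])
    ]).T
    return float(np.sum(np.log(dens @ p_class)))
```

With $L = M = 1$ this reduces to the ordinary Gaussian log-likelihood, which is a useful sanity check.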

Using a modified version of the expectation-maximisation (EM) algorithm, as described below, we derive an ML estimate of the parameter values of each of the $L$ GMMs $\{\Theta_l\}_{l=1}^{L}$.

The general framework for parameter estimation in a GMM can be used to learn the parameters of the WGMM. Here the component conditional densities appearing in Eq. (11.13) are themselves mixture models. In the EM algorithm, the update equations for the mixing coefficients do not depend on the functional particulars of the component densities. Hence, the mixing coefficients of the WGMM are updated according to

$$p^{new}(\omega_l) = \frac{1}{N} \sum_{n=1}^{N} p^{old}(\omega_l \mid x_n, \Theta^{old}) \qquad (11.17)$$
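This is the standard EM re-estimate of a mixing weight applied at the class level: the new $p(\omega_l)$ is the average over the data of the old posterior $p^{old}(\omega_l \mid x_n, \Theta^{old})$. A small sketch, with assumed array shapes and illustrative names:

```python
import numpy as np

def update_mixing_coefficients(class_cond, p_class):
    """E-step responsibilities and class-level mixing-weight update.

    class_cond : (N, L) p(x_n | omega_l, Theta_l) under the current parameters
    p_class    : (L,)   current mixing coefficients p(omega_l)
    Returns (gamma, p_class_new), where gamma[n, l] = p(omega_l | x_n, Theta)
    and p_class_new[l] = (1/N) * sum_n gamma[n, l].
    """
    joint = class_cond * p_class                      # p(omega_l) p(x_n | omega_l)
    gamma = joint / joint.sum(axis=1, keepdims=True)  # normalise over classes
    return gamma, gamma.mean(axis=0)
```

By construction each row of `gamma` sums to one, so the updated weights remain a valid probability distribution.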

The M-step involves maximizing the auxiliary function with respect to the parameters $\{\Theta_l\}_{l=1}^{L}$. The auxiliary function can be written as

$$Q(\Theta^{new}, \Theta^{old}) = \sum_{n=1}^{N} \sum_{l=1}^{L} p^{old}(\omega_l \mid x_n, \Theta^{old}) \log \left[ p^{new}(\omega_l)\, p^{new}(x_n \mid \omega_l, \Theta_l^{new}) \right] \qquad (11.18)$$

where

$$p^{old}(\omega_l \mid x_n, \Theta^{old}) = \frac{p^{old}(\omega_l)\, p(x_n \mid \omega_l, \Theta_l^{old})}{\sum_{l'=1}^{L} p^{old}(\omega_{l'})\, p(x_n \mid \omega_{l'}, \Theta_{l'}^{old})}$$
Writing $\gamma_{nl} = p^{old}(\omega_l \mid x_n, \Theta^{old})$, the auxiliary function can be written as the sum of $L$ auxiliary functions, one for each mixture model:

$$Q(\Theta^{new}, \Theta^{old}) = \sum_{l=1}^{L} Q_l(\Theta_l^{new}, \Theta^{old})$$

and

$$Q_l(\Theta_l^{new}, \Theta^{old}) = \sum_{n=1}^{N} \gamma_{nl} \log \left[ p^{new}(\omega_l)\, p^{new}(x_n \mid \omega_l, \Theta_l^{new}) \right] \qquad (11.23)$$
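The per-class decomposition of the auxiliary function is easy to verify numerically. In the sketch below (names and array shapes are illustrative), the overall auxiliary function equals the sum of the $Q_l$:

```python
import numpy as np

def auxiliary_q(gamma, p_class_new, class_cond_new):
    """Q(Theta_new, Theta_old) of Eq. (11.18).

    gamma          : (N, L) responsibilities gamma_nl under the old parameters
    p_class_new    : (L,)   new mixing coefficients p_new(omega_l)
    class_cond_new : (N, L) p_new(x_n | omega_l, Theta_l_new)
    """
    return float(np.sum(gamma * np.log(p_class_new * class_cond_new)))

def auxiliary_q_l(gamma, p_class_new, class_cond_new, l):
    """Q_l of Eq. (11.23): the lth class's share of the auxiliary function."""
    return float(np.sum(gamma[:, l] * np.log(p_class_new[l] * class_cond_new[:, l])))
```

Since the logarithm is taken inside the double sum, splitting the sum over $l$ is exact, which is what licenses optimising each class's GMM independently in the inner loops.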

The procedure for maximising the overall likelihood of a WGMM is outlined in Algorithm 1. It consists of an outer EM loop in which $L$ inner EM loops are nested. Each time the outer loop is traversed, the mixing weights $p(\omega_l)$ are updated according to Eq. (11.17), and the $L$ inner loops are iterated to update the mixing weights $p_l(m)$, means $\mu_{lm}$, and covariances $\Sigma_{lm}$ of each component GMM. It should be noted that it is not necessary to iterate the inner loops to convergence on each outer EM step, since it is only necessary to increase the auxiliary function to ensure convergence of the overall likelihood to a local maximum.

Algorithm 1: WGMM Algorithm

1. Make an initial estimate of all GMM parameter values $\{\Theta_l\}_{l=1}^{L}$ and of the mixing coefficients $p(\omega_l)$.

2. Outer E-step: compute the responsibilities $\gamma_{nl} = p^{old}(\omega_l \mid x_n, \Theta^{old})$ under the current parameter estimates.

3. Inner EM steps: for each GMM modeling the class-conditional pdf of a class $\omega_l$, $l = 1, \ldots, L$, update the parameter values of that GMM using the GMM-EM algorithm until convergence.

4. Outer M-step: find new values of the WGMM mixing coefficients $p^{new}(\omega_l)$ that maximize the auxiliary function given in Eq. (11.18), i.e., apply the update of Eq. (11.17).

5. Iterate steps 2-4 until the change in the auxiliary function (Eq. 11.18) between iterations is less than some convergence threshold.
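Algorithm 1 can be sketched end to end. The version below assumes 1-D data, scalar variances, and a fixed number of inner iterations per outer step (which, as noted above, still increases the auxiliary function); all names are illustrative, not the authors' implementation:

```python
import numpy as np

def gauss(x, mu, var):
    """N(x; mu, var), broadcast over components."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def wgmm_em(X, p_class, p_comp, mu, var, outer_iters=20, inner_iters=3, eps=1e-12):
    """Sketch of the WGMM EM loop.

    X: (N,) data; p_class: (L,); p_comp, mu, var: (L, M) arrays.
    """
    L, M = mu.shape
    for _ in range(outer_iters):
        # class-conditional densities p(x_n | omega_l, Theta_l): (N, L)
        dens = np.stack([(p_comp[l] * gauss(X[:, None], mu[l], var[l])).sum(axis=1)
                         for l in range(L)], axis=1)
        # outer E-step: gamma[n, l] = p(omega_l | x_n, Theta_old)
        joint = dens * p_class + eps
        gamma = joint / joint.sum(axis=1, keepdims=True)
        # inner EM loops: weighted GMM-EM per class, points weighted by gamma_nl
        for l in range(L):
            w = gamma[:, l]
            for _ in range(inner_iters):
                comp = p_comp[l] * gauss(X[:, None], mu[l], var[l]) + eps  # (N, M)
                r = comp / comp.sum(axis=1, keepdims=True)   # within-class resp.
                wr = r * w[:, None]                          # weight by gamma_nl
                Nk = wr.sum(axis=0) + eps
                p_comp[l] = Nk / w.sum()
                mu[l] = (wr * X[:, None]).sum(axis=0) / Nk
                var[l] = (wr * (X[:, None] - mu[l]) ** 2).sum(axis=0) / Nk + 1e-6
        # outer M-step: update the class mixing coefficients (Eq. 11.17)
        p_class = gamma.mean(axis=0)
    return p_class, p_comp, mu, var
```

For simplicity the sketch runs a fixed number of outer iterations rather than monitoring the change in the auxiliary function, and omits the WGMM convergence threshold of step 5.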

Finally, we combine our WGMM model with an MRF in the same manner as Zhang et al. [24] combined a GMM with an MRF. The WGMM-MRF model is based on Eq. (11.15), except that the mixing coefficients $p(\omega_l)$ are replaced by an MRF-MAP estimate $p(y_n = \omega_l \mid x_n)$ computed using the ICM algorithm [29]. The auxiliary function given in Eq. (11.18) is rewritten to include the MRF hidden model as follows: