## Energy Minimization Method

The discussion in section 8.2.2.2 shows that in an MRF the optimal segmentation solution comes from finding the maximum of the a posteriori probability in Eq. (8.14), which is equivalent to the minimization problem of the energy function of Eq. (8.16). However, because of the large size of high-dimensional random variable X and the possible existence of local minima, it is fairly difficult to find the global optimal solution. Given the image size with I by J and the gray level for each pixel is Ng, the total size of the random field solution space is Ngx J that usually requires a huge amount of computation to find the optimal solution. For example, the size of MR image on carotid artery is usually 256 by 256, the gray level of each pixel is 212, then the number elements of the solution set is (212)256 x 256 = 2786432, it is a prohibitive to be implemented in most interactive applications.

Some algorithms have been proposed to solve this problem in literature. Generally speaking, they can be classified into two categories. One category is stochastic based, such as simulated annealing [56-58], Geman and Geman's Gibbs sampler [22], etc. They are all stochastic relaxation methods and theoretically can reach global minima by reducing the temperature slowly enough. Unfortunately, these algorithms normally require a huge amount of time and are intolerable for practical applications due to the extensive computation. The other category are deterministic methods, e.g., Besag's iterative conditional mode (ICM) [20] and highest confidence first (HCF) [21] proposed by Chou and Brown. These numerical approximation algorithms have much faster convergence speed and are often used as computationally efficient alternatives. ICM can be regarded as a special case of Gibbs sampler with an annealing temperature (T = 0). It may converge only to a local minimum and the results depend strongly on the initial estimate and order of updates, while the HCF algorithm, rather than updating the segmentation results via a raster scan order as that in ICM, selects the pixel with maximum confidence for updating. This overcomes the updating order problem in ICM and hence has the inherent advantage of faster convergence. In this study, we will only consider the algorithms on the second category because we promise to find practical solutions for segmentation requirements.

8.2.2.3.1 Simulated Annealing. Simulated annealing (SA), first introduced independently by Cerny [57] and Kirkpatric [58] as a stochastic solution for combinatorial optimization, simulates the physical annealing procedure in which a physical substance is melted and then slowly cooled in the process of constructing a lower energy configuration. The temperature control in the cooling process must be slow enough to allow all the substance to reach equilibrium state so as to avoid the defects. Metropolis et al. [59] proposed an iterative algorithm to simulate this annealing process. The searching of global minimum energy is controlled under a sequence of decreasing temperature T that satisfies the following rules:

(i) At higher T, a large increase in energy is accepted;

(ii) At lower T, a small increase may be accepted;

(iii) Near the freezing temperature, no increase is permitted.

Based on this, there are occasional energy ascents in the "cooling" process so as help the algorithm escape from local minima.

Suppose that the temperature parameter is at certain value T and the random field X starts from any arbitrary initial state, X(0). By applying a perturbation randomly, a new realization of the random field can be generated as X(n+1\ The implementation of this perturbation varies in different optimization scheme. For example, in Gibber sampler, only one pixel is changed in each scan, while all the other pixels are kept unchanged. Generally speaking, the perturbation is usually very small so that X(n+1) is in the neighborhood of its previous state, X(n). The probability of accepting this perturbation is decided by two factors:

(i) Total energy change, A E, due to this perturbation.

(ii) Temperature T.

The definition of acceptance probability is defined as

It is obvious that the perturbations that lower the energy will be definitely accepted. However, when there is an increase of energy, the temperature parameter T controls the accepting probability in that given the same energy change AE, when T is with relative high value, the accepting probability is more than when T is relatively lower. Since this probability is based on the overall energy change, it has no dependency on the scanning sequence as long as all the pixels have been visited. In each iteration, this perturbing-accepting process will go on until the equilibrium is assumed as being approached (this is generally controlled by the maximum times of iteration). Then the temperature parameter T is reduced according to an annealing schedule and the algorithm will repeat the iterations for equilibrium searching as discussed above with the newly reduced temperature.

This annealing process will keep on going until the temperature is below the minimum temperature defined. Then the system is frozen and the state with the lowest energy is reached.

The annealing schedule is usually application-dependent since it is very crucial to the amount of computation in the stochastic relaxation process and the accuracy of the final result. Gemen and Geman proposed a temperature-reducing schedule that is expressed as a function of the iteration numbers:

where k is the iteration cycle and t is the constant. Even though this schedule can guarantee a global minimum solution, unfortunately, it is normally too slow for practical applications. Some other annealing methods [58] have been proposed to reduce the computation burden; unfortunately, it is no longer guaranteed to reach global minimum.

8.2.2.3.2 Iterated Conditional Modes. The goal of the ICM algorithm, proposed by Besag [33] in 1974, is to find a numerical approximation to further reduce the amount of computation produced by using stochastic relaxation. In recent publications, Pappas introduced an adaptive method [20] based on ICM and Chang et al. [46] extended it to color image segmentation.

In ICM algorithm, two assumptions are applied to the Markov random field:

(i) Given random field X, the observation components, Yi, are modeled as the independent and identical distributed (i.i.d.) white Gaussian noise, with zero mean and variance s2, and they are with the same known conditional density function p(Ys | Xs), dependent only on Xs. The conditional probability can be expressed as

all pixel s

V2na2

(ii) The labeling of pixel s, Xs, depends only on the labels of its local neighborhood as p(Xs | Y, Xr, all r = s) = p(Xs | Y, Xr, all r e Ns)

This is actually the Markovian property.

Under these assumptions, the conditional a posterior probability can be written as

where Ns is the neighborhood of pixel s and Cs is the set containing all the cliques within Ns. This equation shows that the local conditional probability depends only on Xs, Ys, and Ns.

Based on these relations, the ICM iteratively decreases the energy by visiting and updating the pixels in a raster-scan order. For each pixel s, given the observed image Y and the current label of all the other pixels (actually only the neighborhood of pixel s), the label of Xs is replaced with one that can maximize the conditional probability as

X(n+1) = arg max p(X(n) | Y, X(n), allr = s). (8.23)

all labels

Starting from the initial state, this algorithm will keep on running based on the procedure introduced above until either the predefined number of iterations is reached or when the labels of X do not change any more. Then it is regarded that a local minimum is reached.

Compared with the acceptance probability in SA method, only decrease of energy change is accepted in the ICM algorithm. This can be regarded as a spatial case when T = 0 because SA never accept any positive energy change when T is at zero temperature. This is why ICM is often referred as the "instant freezing" case of simulated annealing.

Even though ICM provides a much faster convergence than stochastic relaxation based methods, the solutions from ICM are likely to reach only local minima and there is no guarantee that a global minimum of energy function can be obtained. Also, the initial state and the pixel visiting sequence can also have effects on the searching result.

8.2.2.3.3 Highest Confidence First. Highest confidence first (HCF) algorithm, proposed by Chou and Brown [21], is another deterministic method for combinatorial minimization. Similar to the ICM algorithm, the two assumptions in Eqs. (8.19)-(8.21) also hold for HCF, and the minimal energy estimation, equivalent to maximizing the conditional a posterior probability of p(Xs | Y, Xr, all r = s) for each pixel s, is also implemented iteratively.

The core feature of the HCF algorithm is the design of sites' label updating scheme. Assume L, with size NL, is the set of the labels that are assigned to each site in the segmentation label matrix, which includes a special label 0 e L to indicate the initial state of all sites. In the HCF algorithm, the site labeled with 0 is called "uncommitted site"; otherwise, it is called "committed site."

In the optimization process, once a label is assigned to an uncommitted site, the site becomes committed and cannot return to uncommitted state any more. However, a committed site can update its label through another assignment.

Instead of considering the energy change on an individual site through a raster-scan sequence as that in ICM, the HCF algorithm has the energy changes on all sites of the image within consideration via a measurement, called confidence, and updates the one with the highest confidence. The definition of confidence is derived from the local conditional probability in Eq. (8.22). Assume that the local energy function at site s is Es (xs):

ceCs

So the probability expression in Eq. (8.22) can be rewritten as

where l e L, is the segmentation label. The confidence measure cs (x) at site s is defined as max{Es(l) — Es(xi)} when s is committed with label l, xi e {xn | n = 1, ...,Nl , xn = 0, and xn = l}; as(x) = max{Es(0) — Es(Lmjn)} when s is uncommitted,

L min = argmin{Es (xi)}, x.e<p and 0 = {xn | n = 1,..., NL, xn = 0};

In Eq. (8.26), Lmjn is the one that can make the energy function at site s minimum among all the elements except 0 in label set L. When a site is uncommitted, it is obvious that cs (x) is always nonnegative. Label of the site with the maximum amount of confidence will be changed to Lmjn. The exact purpose of this label updating process is to actually pick up the site where an appropriate label assignment can lower the energy of the whole image the most. In the meantime, the new label of this site influences the energy and confidence of its neighbors whose confidences need to be updated before the start of the next iteration. We can also regard the confidence as an indication that how stable the segmentation will be with the changing of the label at s. Obviously, the more stable is the label-updated image due to the change, the more confident we are that the label at s should be updated.

The HCF algorithm initially assigns all the sites with label 0 in the segmentation matrix. In each iteration, the algorithm will update only the site with the maximum confidence to the label that minimizes Es the most. In the calculation of cs(x), only neighbors that are committed can affect the clique potentials. Therefore, once the label of a site changes, either from uncommitted state to committed state or from one committed label to another committed label, the confidence of its neighbors will have to be recalculated. The HCF algorithm stops when all the sites are committed and their confidence becomes negative.

Although it is claimed [46] that there is no initial estimate needed in HCF, some parameters, such as the mean value of each site, have to be provided in advance in order to get the confidence calculated when all the sites are uncommitted. For implementation, a heap structure is suggested in Chou's design that creates the highest confidence in the searching process faster.

Even though both the ICM and the HCF algorithms can only reach local minima of energy function, the results obtained with HCF are generally better than that with ICM [19, 46] because of its relatively optimal label-updating scheme. In HCF, the optimization result is independent of the scanning order of the sites. Even though this may lead to the increase of the computation amount and implementation complexity, the HCF algorithm is still regarded as a very practical solution with the fast growing of the processors' power.

### 8.2.3 QHCF Method 8.2.3.1 Algorithm

In the optimization procedure of the original HCF algorithm proposed by Chou and Brown [21], all the pixels are uncommitted and labeled 0 at the very beginning. A (prechosen) number of classes K are required in advance and the whole image is then presegmented (usually by K-means clustering) into K regions. The purpose is to estimate the mean of region that each pixel belongs to so as to initialize the energy of the whole image for HCF updating, and as the segmentation result is very sensitive to this initialization, the choice of K becomes very critical.

To overcome this problem, in the proposed QHCF algorithm, we first apply a Quad-Tree procedure to presegment the image instead of using K-means. The advantages are as follows:

(i) There is no need to predefine the number of classes because the QuadTree algorithm can dynamically decide the partitions based on its splitting criterion.

(ii) ^-means is a clustering method, in which the grouping is done in the measurement domain and has no spatial connectivity constraint to those pixels with the same label during the iteration process, while in the QuadTree splitting process, the grouping of pixels is done in the image spatial domain so that each region will not share labels with others.

The Quad-Tree procedure initially divides the whole image into four equal-size blocks. By checking the value of region criterion (RC) Vrc, each block will be evaluated whether it is qualified to be an individual region. The RC for each block Bi is defined as

If Vrc is smaller than a predefined threshold Trc, the block is regarded as a region. Otherwise, it will further be divided into smaller blocks by the same splitting method until every small block satisfies. The choice of Trc is application-dependent and defines how uniform each region will be.

Compared with the other systematic initialization schemes, such as uniform grid initialization presented in [19], the Quad-Tree initialization is more accurate and efficient in adaptive detection of the initial regions. The reasons are listed as follows:

(i) The selection of initial sites is based on the consideration of pixels in the surrounding region due to the spatial constraint.

(ii) The points within the same region are not used for initialization repeatedly so that unnecessary computations can be avoided.

(iii) The iterative comparisons can be simplified during the HCF labeling process and the problem of "unlabeled small region" in [19] canbe solved.

Moreover, by applying the Quad-Tree preestimation, the segmentation result can be more consistent than the uniform grid method when the initialization parameter changes in certain situations (this will be shown by experimental results at the end of this section).

After the Quad-Tree initialization, in each region, the pixel with closest value to the mean of region intensity is selected as the representative and assigned a unique label; the others are all uncommitted and labeled 0. The algorithm will then start to update labels according to the procedures given in Fig. 8.3 until the energy of the whole image becomes stable. In this label updating process, the pixels are permitted to change only within the committed states or from uncommitted states to committed states. They are not allowed to become uncommitted from committed states.

To simplify the calculation, we assume variance with a2 = 2. The local energy is normalized as

Es(x) = (ys - Ms)2 + Us(x), and the confidence becomes:

max{Es(xk) - Es(xi)} when s is committed with labelk,

Xi e{xm | m = 1,...,Ns, xm = 0}; max{Es(0) - Es(Lmin)}, when s is uncommitted, (8.29)

L min = arg min{ Es(Xi)}, and 0 = {xxn | n = 1,..., Nl , X = 0};

representing the degree of confidence with which we can update the label of pixel s. Different from the definition in Eq. (8.26), for a committed site, the range for label searching is reduced to those existing within its neighborhood to decrease the confidence computation. For those uncommitted pixels at each Quad-Tree region, once any of their neighboring pixels is labeled, their energy and confidence will be affected. The confidence updating process needs to be conducted by applying Eq. (8.29).

After getting the new confidence of these sites, we search the whole image and select the one with the largest confidence as the next candidate to be updated. Any candidate site has one of three relations with its neighboring pixels: isolated, surrounded, or partially surrounded. When isolated (all neighboring pixels are uncommitted), the candidate pixel is given a new label different from existing labels. When surrounded (all neighboring pixels are committed), a unique label for this pixel becomes unlikely. We therefore select one label for this pixel from the labels owned by its neighbors according to xs — min{Es(xi)} LN is the set of neighboring labels of site s, (8.30)

ieLN

making the energy of the candidate pixel become minimal. When the situation of partially surrounded (neighboring pixels are partially committed) occurs, we first consider it as a surrounded case and if the selected label xk cannot satisfy Es(xk) < max{Es(0) — Es(Lmin)}, then a new label is assigned to this pixel. With this updating strategy, each region is entitled to have a unique label. This is different from the original HCF in which disjointed regions may share the same label. The advantage of the proposed QHCF is that the estimation of each region's mean value, ms, is decided solely by the sites in this region, resulting in more precise estimates during the label updating process.

As shown in Fig. 8.3, the segmented procedure stops when all the pixels possess negative confidence, where any change of a single pixel's label will increase the image's overall energy. However this does not guarantee that the image energy will converge to a global minimum. The tradeoff here is the processing speed because the adopted HCF algorithm can always finish in a finite time [21], providing a feasible solution to practical applications.

## Relaxation Audio Sounds Babbling Brook

This is an audio all about guiding you to relaxation. This is a Relaxation Audio Sounds with sounds from the Babbling Brooks.

## Post a comment