## Clustering Based Method

Even though the mMRF may provide a robust solution to the segmentation problem of multiple spectral data, the intrinsic limitations of this model discussed in section 8.4.2.3 may hinder its further application. Therefore, there is a motivation to pursue a more relaxed and practical approach that can satisfy the following conditions:

(i) No presumption on the distribution function of the multiple dimension data.

(ii) Fast processing speed that is suitable for interactive applications.

In this section, a nonparametric clustering algorithm was developed to fulfill the above requirements and employed to process the multiple contrast weighting MRI images.

### 8.4.3.1 Introduction

Clustering is generally regarded as an effective technique for automatic data grouping based on a given similarity measurement. In the clustering process, discrete objects can be assigned to groups that have similar characteristics. The object can be single value, such as pixel's intensity in gray level image, or a vector in multidimensional data space. Generally, the clustering approaches usually consist of the following two steps:

(i) Cluster distribution or center searching.

(ii) Classification of dataset.

In the first step, the cluster center or the expression of density function is usually estimated so that the distribution of dataset can be clearly described. The second step is mainly on the dispatch of elements from the dataset based on certain criteria such as biggest similarity, shortest distance, maximum likelihood (ML), and K-nearest neighbor, etc.

8.4.3.1.1 Definition of Problem. First, we establish a clear definition of the studied problem. Suppose the number of MR imaging contrast weightings involved in the study is D, and images, Id, d = 1,---, D, are obtained from the same location of a patient's carotid artery. Assuming that there is no motion between these acquisitions and all the images have the same dimensions, K = R x C, where R and C are the number of rows and columns of image. The data for each location k, k = 1,...,K, can be expressed as vk = [v1k,..., VDk]T, vdk e Id, d = 1,..., D. vdk is the intensity value at pixel location k in the contrast weighting image d. Based on this, a d-dimensional space with dataset V ={vk, k = 1,...,K} is created. Our goal is to construct a new segmentation image (same as the label matrix defined in HCF study), which contains uniform regions, some of which will be the plaque tissues of interest in the cross sectional image of the carotid artery.

8.4.3.1.2 Data Cluster Center Searching. Based on the above definition of MCW MR image segmentation problem, the two steps in the clustering approaches can be described more specifically as (i) to estimate the cluster centers of the data vectors in the d-dimensional space and (ii) to partition the K data vectors into clusters by mapping them to the "nearest" cluster center under certain rules so as to generate the composite segmentation image.

Cluster center searching in multidimensional space is also called the multivariate location problem, and numerous nonparametric methods have been proposed [85, 86]. Among them, the minimum volume ellipsoid (MVE) estimator by Rousseeuw  is one of the most well-known solutions. It is defined as the ellipsoid that satisfies the following two conditions: (i) covering at least h elements of dataset V in d-dimensional space; (ii) the minimum volume. Then the center of this ellipsoid is regarded as the multivariate location estimate, the region with highest density. The MVE searching methods normally use randomly selected elements in V as the ellipsoids' initial centers. After inflating each of these ellipsoids until h elements are covered, the one with minimum volume is selected as a searched mode. Then we can remove the elements associated with this mode from dataset Vand a new search is repeated until all the cluster centers are found. Although many approaches based on this multivariate locations estimator have shown its success in various applications , experimental results indicate that the performance of MVE decreases when the number of modes is greater than 10 . The reason for this phenomenon is that the density definition in MVE presumes the multivariate normal distribution. Therefore, in the case of multiple modes, where no mixture of Gaussian distribution appears, MVE model will not be able to give an accurate description.

Another type of cluster searching techniques is called nonparametric estimation. The advantage is that they require no prior knowledge of the form of the density function in the search process. They can be applied to arbitrary distribution dataset. There are two main categories of methods for nonparametric density estimation, Parzen window and k-nearest neighbors. For the Parzen window approach, the kernel type needs to be given before it is applied. In the k-nearest neighbor method, the number of neighbors must be assigned in advance. Therefore, both require additional prior information. In addition, they are hard to optimally initialize.

To overcome these problems and provide more robust cluster estimation, a framework was proposed by Dorin . It is based on the mean-shift algorithm, a nonparametric procedure to estimate density gradients. Although this method claims to avoid the drawbacks of most existing approaches, it still has the following weaknesses: (1) The initialization scheme cannot guarantee that all the cluster centers are under consideration in the search process because of its random tessellation selection and (2) because of the static size of the search sphere, the approaching speed may be slowed and the accuracy of cluster center estimation may be affected.

In this study, the data to be processed is MCW MR image, in which the distribution of data can vary arbitrarily between subjects. Since it is impossible to obtain the forms of the underlying density function, a nonparametric technique has to be employed for multivariate location estimation. To overcome the problems in Dorin's method, we first apply a preestimation of cluster distribution to guarantee that all the typical cluster centers are considered in the initial center set. In addition, a dynamic sphere size is introduced to improve the searching performance.

8.4.3.1.3 Mean Shift Density Estimator. The mean shift algorithm was proposed by Fukunaga and Hosterler in 1975 as a "very intuitive"  estimator for data density. In 1995, Cheng generalized this algorithm and conducted a more rigorous study . In this section, we will review its estimation of the density gradient in a uni-modal situation.

Assume f (v) is the probability density function of a d-dimensional variable v. Suppose that centered with vector r, a sphere, Sv, with radius r is established. For any given vector within this sphere, y, the expected distance to sphere center v is

E[(y - v)\Sv] = f (y - v)f (y \ Sv)dy = f (y - v) f (y) dy (8.52)

With Taylor expansion, f (v) can be expressed as f (y) = f (v) + (y - v)T V f (v) (8.53)

For f (y e Sv), when Sv is sufficiently small, it can be approximated as f (y e Sv) = f (v)VSv, (8.54)

where Vsv = c ■ rD represents the sphere volume. Based onEqs. (8.53) and (8.54), Eq. (8.52) becomes [83, 89]:

Expand the LHS of Eq. (8.55), the expected center of the sphere E[v \ v e Sv] and v have the following relation:

For a given sphere, Eq. (8.56) shows that the difference vector between the local estimation of cluster center and sphere center is proportional to V f(v), the gradient of the density function at v. It is also reciprocal to f(v). When it approaches the mode, f(v) is generally large in value and V f(v) is small due to the slow increase, so a small mean-shift vector is applied. Compared to traditional density gradient searching techniques , in which only V f (v) is considered, the dynamically adjusted step size used in this method is more accurate. Additionally, the mean-shift vector is always guaranteed to point to the direction of the maximum density. However, in the case when v is far away from the mode, the density change is often very small (close to uniform) within the small sphere under consideration. This may lead to the mean-shift algorithm failing to predict the correct direction and step size. Therefore, some measures need to be taken if the initial x is not at a location close to the mode.

### 8.4.3.2 Dynamic Mean-Shift Density Estimation

In the density estimation process, the mean-shift vector is very important to the search speed and accuracy of result. In previous methods , the size of the local estimation sphere is always fixed. This may not work well in the following two situations: (1) In a location which is far from the mode where the density distribution is relatively uniform, and if the sphere size is not large enough, the mean-shift vector may be misled and (2) when the search progress is close to the mode, the mean-shift vector needs to be sensitive enough to catch the local change. Therefore, if the sphere volume is too large, the local information cannot play the determinant role in the vector calculation and hence may affect the accuracy of center searching.

To solve these problems, we propose a dynamic search range with q levels: r1,..., rq, where ri < ri+1. The values of r1 and rq come from prior analysis of MCW MR image data. In the search process, the initial radius starts from rq (the largest radius). If the mean-shift vector is over the stopping threshold Tstop, it moves to the next location with the same sphere radius as the previous position. If the mean-shift vector is lower than Tstop, it uses the next smaller radius to calculate the mean-shift vector again. Once the mode is found, a small perturbation is applied  and the procedure is repeated to avoid a local maximum.

Given the initial center of a sphere at location v and starting with sphere radius index i = q, the proposed dynamic mean-shift algorithm is implemented in the following procedures:

1. Based on x and r = ri, compute the mean-shift vector vms.

2. If vms > Tstop, v = v + vms and repeat step 1; else if i = 1, i = i — 1 and repeat step 1 until convergence.

3. If perturbation has not been applied, add a small vector upert to the converged result and repeat steps 1 and 2, where | upert | = |r | and its direction is randomly selected.

8.4.3.3 DMC-Based Segmentation

8.4.3.3.1 Cluster Center Searching. Based on the enhanced means shift density estimation, the steps for cluster searching are as follows:

1. Apply the Quad-Tree procedure to partition each contrast weighting image and assign a vector at the center of each region as a member of the initial center set X = {xi, i = 1,..., n}. In addition to the two constraints in previous method , the minimum neighboring points' distance and sphere density—a more strict constraint is applied on the pixel location in each contrast weighting image—the distance of the corresponding pixels' locations to any two elements xi and xj, cannot be too close.

2. For each point in the center set, apply the dynamic mean-shift search described in section 8.4.3.2 to find the candidate cluster centers in the d-dimensional data space.

3. Partition the center set elements into subsets in which the distances between points are within Tsub. Merging each subset by calculating the mean of elements in it, we arrive at a new center set Y = {yi, i = 1,..., p}.

4. To validate the cluster centers, the constraint on the valley between every two elements in the center set, yp and yq, is applied . Each point, for example yy after a fixed interval along the line linking yp and yq is checked and the corresponding density f (x) is estimated with Epanechnikov kernel :

0 otherwise

a valley is detected. If no valley is detected between yp and yq, the one with lower density will be removed.

600 500

128 x 128

256 x 256 Image Resolution

### 512 x 512

Figure 8.18: Average segmentation time comparison for multiple contrast weighting images and single contrast weighting images using MRF model based algorithm.

5. Using the elements in the center set as the cluster centers, the data in the d-dimensional space can be decomposed with the k-nearest neighbor approach.

8.4.3.3.2 Spatial Constrained Data Classification. The segmentation of MCW MR images is based on the cluster searching method described in section 8.4.3.3.1. After decomposing the d-dimensional vector space into p clusters, the segmented image can be derived by mapping each vector v in dataset V to a cluster center ye Y. In addition, some image domain constraints can be further enforced so as to fine-tune the results. Figure 8.19 shows the segmentation procedure.

Note that the spatial constraint step is used to apply the spatial information in the image domain to the dataset dispatching process. Those isolated pixels that are unlikely to be independent regions are often merged with their neighbors. This decision is made based on the a posterior probability:

p(vks = Li | Vks) a exp j -(vk - /j,Li)2 - ^ [UN(vs = Li) + Ue(vs = Li)] 1, I s^Sim > Figure 8.19: Processing flow chart of the proposed MCW MR image segmentation algorithm.

in which Li and ¡xLi are assumed to be the label and mean of neighboring region. UN and UE are the clique energy functions . A MAP criterion is applied to find the most reliable neighboring region to merge into.

### 8.4.3.4 MCW MR Images Segmentation

Two categories of experiments are designed to apply the multispectral segmentation technique on MCW MR image. One compares the effectiveness of multiple contrast weightings in the MR image segmentation, and the other verifies the partition accuracy of typical plaque tissues.

To show the advantage of using multiple contrast weighting techniques over single contrast weighting, 20 cases of ex vivo MR images were obtained using T1W, PDW, and TOF. In each case, the described MCW segmentation approach was applied to the three contrast-weighted images with the result IMCW. The single contrast weighting image in each case was also segmented using the same approach of assigning the dimension of vector space to one; the results are /SCW1, /SCW2, and/SCW3. All other parameters remained the same. Preliminary work had indicated that composite segmentation might disclose information not available with single segmentation, this again proved to be the case. Figure 8.19 compares MCW to single contrast weighting segmentation results, whereby (a), (c), and (e) are the original images obtained with T1W, PD2, and TOF, (a) T1W (b) (c) PDW (d) (e) TOF (f)

Figure 8.20: An example of MCW MR image segmentation. (a), (c), (e) The original contrast weighting images at the same location. (b), (d), (f) The corresponding single contrast weighing segmentations. (g) (color) Result with the proposed multiple contrast weighting segmentation algorithm.

respectively, and (b), (d), and (f) are the corresponding segmentation results using single contrast weighting only. Three distinctive differences are labeled in the images. Arrow 1 points to a region poorly segmented by the single T1W and TOF images. Arrow 2 shows a region that loses all detail in the TOF segmentation. Arrow 3 points to an area that by PDW segmentation shows no detail. However, the MCW-segmented image (Fig. 8.20(g)) retrieves and reveals these details by considering all contrast weightings in the segmentation process. Of 20 cases analyzed, 17 showed distinct differences occurring at more than two locations between the MCW and SCW methods. Lower image quality (blurring, poor contrast, and imaging artifacts) was responsible for the poor segmentation result in the three remaining cases.

The second category of experiments was used to verify the partitioning of typical tissues of interest of the atherosclerotic plaque. Again we applied the above algorithm to ex vivo MR images of endarterectomy specimens. Using histology as the gold standard, we compared each segmented MR image to the best corresponding histology section. The carotid bifurcation was used as a landmark for location registration. On sections distant from the bifurcation, lumen size and shape and distinctive regions of calcification were used for matching. A coordinate system of eight segments was generated and applied to the matched histology and MR images. Figure 8.20 contains a pair of sample images. Figure 8.21: An example of MCW MR images verification (CA: calcium, LM: loose matrix, NEC: necrotic debris). (a) (color) Segmentation of MCW MR images. (b) (color) Outlined corresponding histology section.

Figure 8.21: An example of MCW MR images verification (CA: calcium, LM: loose matrix, NEC: necrotic debris). (a) (color) Segmentation of MCW MR images. (b) (color) Outlined corresponding histology section.

 Existence* Tissue type Yes No Misdetect rate (%) Calcium 69 2 2.8 Calcium (speckled) 18 2 10.0 Necrotic core 16 2 11.1 Necrotic core (mixed) 41 3 6.8 Foam cells 18 2 10.0 Fibrous plaque (dense) 165 8 4.6

Figure 8.21(a) is a MCW segmented MR image overlaid with the eight sector coordinate system. Figure 8.21(b) is the corresponding histology section stained with Mallory's Trichrome. Tissues of interest were outlined and labeled prior to matching.

A preliminary study of 22 matched MRI histology sections from eight patients were analyzed with results shown in Table 8.5. Typical tissues of the atherosclerotic plaque such as calcification, fibrous matrix, and mixed necrotic cores appear to have good agreement with histology. For those improperly matched cases, besides the inaccuracy of segmentation algorithm, the following may also be part of the reasons that affect the comparison results: (i) low-image quality (noise involved in the imaging process) and (ii) the deformation of plaques in the making process of histology section, including shrinkage. These are beyond the study of our research. However, further refinement of our technique may allow for better detection of the less discrete tissues such as loose matrix, speckled calcification, and intraplaque hemorrhage.