In the area of deformable models, defining the initial estimate (see Eq. (7.9)) from which the model evolution starts (the initialization step) is a difficult and important task. Problems associated with fitting the model to the data could be reduced if a better starting point for the search were available. In this section, we present a set of methods used to find the initial curve (or surface).

We start with methods that use image statistics and morphological techniques, and later we present modern approaches, such as neural nets.

7.3.1 Region-Based Approaches

The simplest way to initialize deformable models is through a preprocessing step in which the structures of interest are enhanced.

Figure 7.3: Original grayscale image of a human torso.

This can be done by image statistics extracted from image histograms or pattern recognition techniques  (see  for a recent review). These statistics can be represented by the mean μ and variance σ of the image field I or of any other field defined over the image domain (fuzzy fields [33,76], for example). The aim is to find a statistical representation of the objects, which means:

where k is a user-defined parameter.

In some applications, a threshold T could be enough to characterize the object(s). Such a threshold can be obtained by simple inspection or by iterative and entropy-based methods.
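The two selection rules above can be illustrated with a minimal sketch. It assumes that the statistical criterion takes the common form |I(p) - μ| <= kσ and that the object is darker than the threshold T; both are assumptions made for illustration, and the function names, the tiny image, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def statistical_mask(image, sample, k=2.0):
    """Mark pixels whose intensity lies within k standard deviations of
    the mean estimated from `sample` (intensities taken from the object).
    The form |I(p) - mu| <= k * sigma is an assumed criterion."""
    mu = mean(sample)
    sigma = pstdev(sample)
    return [[abs(v - mu) <= k * sigma for v in row] for row in image]


def threshold_mask(image, T):
    """Mark pixels darker than the threshold T (assumed polarity)."""
    return [[v < T for v in row] for row in image]


# Hypothetical 4 x 4 image: a dark object on a bright background.
image = [
    [200, 190, 185, 210],
    [195,  12,  18, 205],
    [188,  15,  22, 198],
    [202, 207, 193, 199],
]
sample = [12, 18, 15, 22]  # hypothetical intensities sampled inside the object

stat = statistical_mask(image, sample, k=2.0)
thr = threshold_mask(image, T=30)
```

On this toy image both rules select the same four central pixels; on real data the statistical rule depends strongly on how representative the sample is.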

As an illustrative example, Fig. 7.3 shows a cross-sectional slice of a human torso, where we can see several interesting regions such as arteries, bones, and lungs (the two largest central black regions).

Suppose we are interested in extracting the boundary of the right lung. First of all, we should isolate, in each slice, the region of interest.

Applying Eq. (7.22) with a threshold of, e.g., T = 30, we obtain the result shown in Fig. 7.4(a). An isoline extraction method can then be used to get a rough approximation of the target boundary. Figure 7.4(b) shows the resulting curve over the original data.
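As a rough illustration of this step, the sketch below binarizes a small image with T = 30 and collects the object pixels that touch the background. This boundary-pixel test is a simpler stand-in for isoline extraction, not a full isoline algorithm such as marching squares; the function name and the toy image are hypothetical.

```python
def boundary_pixels(mask):
    """Return the object pixels that touch the background (4-neighborhood).

    `mask` is a list of rows of booleans. Pixels outside the image count
    as background, so objects touching the border are closed there too.
    """
    rows, cols = len(mask), len(mask[0])

    def inside(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c]

    boundary = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if not all(inside(nr, nc) for nr, nc in neighbors):
                boundary.append((r, c))
    return boundary


# Hypothetical 5 x 5 image with a dark 3 x 3 object in the middle.
image = [[200] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 10

mask = [[v < 30 for v in row] for row in image]  # threshold T = 30
contour = boundary_pixels(mask)
```

The result is an unordered set of boundary pixels; a deformable model needs them ordered into a polygonal curve, which is what a proper isoline method provides.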

Figure 7.4: (a) Result of applying a threshold T = 30 over image of Fig. 7.3. (b) Initialization through isoline extraction.

We can observe that the curve is not smooth: there are protrusions and concavities due to inhomogeneities of the image field. Besides, some regions of interest may be merged (or even split) after binarization. Such difficulties arise even when the images are preprocessed with more robust segmentation approaches, such as image foresting transformation  or other fuzzy techniques [70, 76]. These problems make threshold-based methods poorly suited to the initialization of deformable models.

In the following section, we discuss an approach to improve the automatic detection of an initial curve.

7.3.2 Mathematical Morphology for Initialization

The use of mathematical morphology to initialize deformable models is a subject with few references in the literature [59,76].

For the particular case of medical images, the general idea is to isolate objects of interest (such as lungs, arteries, heart, bones, etc.) in the scene and to work with them individually, avoiding neighboring interference of other objects, noise, spurious artifacts, or background.

Mathematical morphology is a well-known set of mathematical tools used in digital image processing to perform nonlinear transformations on the shapes of image regions. There are two basic morphological operations: erosion and dilation. They are defined next to make this text self-contained.

Let us take the image X and a template B, the structuring element. They will be represented as sets in two-dimensional Euclidean space. Let $B_x$ denote the translation of B so that its origin is located at x. Then the erosion of X by B is defined as the set of all points x such that $B_x$ is included in X, that is,

$$X \ominus B = \{x : B_x \subset X\}. \tag{7.23}$$

Similarly, the dilation of X by B is defined as the set of all points x such that $B_x$ hits X, that is, they have a nonempty intersection:

$$X \oplus B = \{x : B_x \cap X \neq \emptyset\}. \tag{7.24}$$

These two operations are the basis of all more complex transformations in mathematical morphology. For example, we can use an opening, which consists of an erosion followed by a dilation of the result. This operation allows one to disconnect two different regions so that they can be treated separately. The dual of opening is the closing operation, which consists of a dilation followed by an erosion of the result. The effect of closing an image is exactly the opposite of opening: it connects weakly separated regions (see  for a review of other useful operations).
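The set definitions above translate almost literally into code. The following is a minimal sketch that represents X and B as Python sets of (row, column) pairs; the function names and the toy regions are hypothetical, and a symmetric 3 x 3 square structuring element is assumed so that opening and closing behave as described.

```python
def translate(B, x):
    """B_x: the structuring element B with its origin moved to x."""
    return {(x[0] + b[0], x[1] + b[1]) for b in B}


def dilation(X, B):
    """All points x such that B_x hits X, i.e. x = p - b for p in X, b in B."""
    return {(p[0] - b[0], p[1] - b[1]) for p in X for b in B}


def erosion(X, B):
    """All points x such that B_x is included in X.

    Any such x is also hit by X, so the dilation is a valid candidate set.
    """
    return {x for x in dilation(X, B) if translate(B, x) <= X}


def opening(X, B):
    """Erosion followed by dilation (B assumed symmetric)."""
    return dilation(erosion(X, B), B)


def closing(X, B):
    """Dilation followed by erosion (B assumed symmetric)."""
    return erosion(dilation(X, B), B)


def block(r0, c0, height, width):
    """Helper: a filled rectangle of pixels."""
    return {(r, c) for r in range(r0, r0 + height)
            for c in range(c0, c0 + width)}


SQUARE = {(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}

left, right = block(0, 0, 3, 3), block(0, 4, 3, 3)
bridged = left | right | {(1, 3)}          # two squares joined by one pixel

opened = opening(bridged, SQUARE)          # removes the one-pixel bridge
closed = closing(left | right, SQUARE)     # fills the one-pixel gap
```

Here the opening separates the two squares again, while the closing of the two separated squares merges them into a single 3 x 7 rectangle.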

Figure 7.5: (a) Edge map after applying the Canny algorithm to the image of Fig. 7.3. (b) Erosion result over the Canny algorithm output. (c) Isolated region of interest. (d) Final result after dilation.

In this section, we are interested in applying morphological chains (sequences of morphological operations) to isolate specific regions in medical images. These extracted regions will be used for initializing deformable models.

We begin with a grayscale image such as the one in Fig. 7.3. First, an edge detection filter is applied. The Canny edge detector was used, although there are many other possibilities [13,35,40]. Figure 7.5(a) gives the result of applying the Canny method to the image in Fig. 7.3.

In Fig. 7.5(a), note that the two predominant white regions at the center of the image are the two lungs, which are the regions of interest. For convenience, the black and white pixels of this image were inverted before starting the morphological process. In this case, applying the erosion operation (Eq. (7.23)) to the image in Fig. 7.5(a) eliminates artifacts and weak edges and separates weakly connected regions. The net effect is to attenuate or eliminate high-frequency components. In the example of Fig. 7.5(a), we used a cross-shaped structuring element. The result can be seen in Fig. 7.5(b).

Now, the two bigger regions are detached from the other ones, and we can separate and treat them individually. Figure 7.5(c) shows this result.

To restore the original size of the lung, we can apply the dilation operation (Eq. (7.24)). The result can be seen in Fig. 7.5(d).

Finally, an algorithm for isoline extraction gives the polygonal curve pictured in Fig. 7.5(d). This curve is an approximation of the desired boundary. It can be used as the initial curve for a deformable model.

Figure 7.6: Original image with the outlined initial contour.

The obtained contour was plotted over the original image for comparison (Fig. 7.6). Compared with Fig. 7.4(b), we observe an improvement in the obtained initialization.
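The morphological chain of this section (erosion, isolation of a region, dilation) can be sketched on a binary pixel set as follows. This is a minimal illustration, not the implementation used for the figures: the edge detection step is skipped, the region of interest is assumed to be the largest 4-connected component after erosion, and all names and the toy data are hypothetical.

```python
from collections import deque

# Cross-shaped structuring element (contains the origin and is symmetric).
CROSS = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}


def erode(X, B):
    """Points x of X whose translated element B_x fits inside X.
    Restricting candidates to X is valid because B contains the origin."""
    return {x for x in X
            if all((x[0] + b[0], x[1] + b[1]) in X for b in B)}


def dilate(X, B):
    """Points x whose translated element B_x hits X."""
    return {(p[0] - b[0], p[1] - b[1]) for p in X for b in B}


def components(X):
    """Split X into 4-connected components with a breadth-first search."""
    remaining, parts = set(X), []
    while remaining:
        seed = remaining.pop()
        part, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if n in remaining:
                    remaining.remove(n)
                    part.add(n)
                    queue.append(n)
        parts.append(part)
    return parts


def isolate_largest(X, B=CROSS):
    """Erode, keep the largest component, then dilate to regain its size."""
    eroded = erode(X, B)
    largest = max(components(eroded), key=len)
    return dilate(largest, B)


# Toy scene: a 5 x 5 region weakly connected to a 3 x 3 region.
big = {(r, c) for r in range(5) for c in range(5)}
small = {(r, c) for r in range(1, 4) for c in range(8, 11)}
bridge = {(2, 5), (2, 6), (2, 7)}
scene = big | bridge | small

region = isolate_largest(scene)
```

The recovered region stays inside the original scene and no longer contains the smaller object; with a cross-shaped element the corners of the large region are not restored, which is one reason the resulting curve is only an approximation of the boundary.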

7.3.3 Neural Nets

Neural networks have been used for instantiating deformable models for face detection  and handwritten digit recognition tasks  (see also  and references therein). To the best of our knowledge, there are no references using neural nets to initialize deformable models for medical images. However, the network system proposed in , which segments MR images of the thorax, may be closer to this proposal.

In this method, each slice is a gray-level image composed of 256 × 256 pixel values and is accompanied by a corresponding (target) image containing just the outline of the region. Target images were obtained using a semiautomatic technique based on a region growing algorithm. The general idea is to use a multilayer perceptron (MLP) to classify each pixel of each slice as either contour boundary or non-contour boundary.

The inputs to the MLP are the intensity values of the pixels in a 7 × 7 window centered on the pixel to be classified. This window size was found to be the smallest that enabled the contour boundary to be distinguished from other image artifacts. The output is a single node trained to have an activation of 1.0 for an input window centered on a contour-boundary pixel, and 0.0 otherwise. The network has a single hidden layer of 30 nodes.
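The architecture just described (49 inputs, 30 hidden nodes, one output) can be sketched as a forward pass. This is only an illustration with untrained random weights: the sigmoid activation, the scaling of intensities to [0, 1], the zero padding at the image border, and all function names are assumptions, not details taken from the cited work.

```python
import math
import random

WINDOW = 7    # 7 x 7 input window, as in the text
HIDDEN = 30   # single hidden layer of 30 nodes, as in the text


def extract_window(image, r, c, size=WINDOW):
    """Flatten the size x size neighborhood centered on pixel (r, c).
    Pixels outside the image are filled with 0 (assumed padding)."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    values = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            values.append(image[rr][cc] / 255.0 if inside else 0.0)
    return values


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_mlp(n_in=WINDOW * WINDOW, n_hidden=HIDDEN, seed=0):
    """Random initial weights; the last entry of each row is the bias."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(x, w1, w2):
    """Output activation in (0, 1): values near 1.0 mean contour boundary."""
    hidden = [sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))
              for w in w1]
    return sigmoid(w2[-1] + sum(wi * hi for wi, hi in zip(w2, hidden)))


# Hypothetical 16 x 16 slice with a vertical intensity step.
image = [[0] * 8 + [255] * 8 for _ in range(16)]
w1, w2 = init_mlp()
window = extract_window(image, 8, 8)
activation = forward(window, w1, w2)
```

Training the weights by backpropagation, as described next, is what would make the activation meaningful; with random weights it only demonstrates the data flow.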

The network was trained using error backpropagation [12,55] with weight elimination  to improve the network's generalization ability. The training data should be constructed interactively: a proportion of the misclassified examples should be added to the training set and used for retraining. The process is initiated from a small random selection of contour-boundary and non-contour-boundary examples and should be terminated when a reasonable classification (on a given slice) is achieved.

The MLP classifies each pixel independently of the others, and therefore has no notion of a closed contour. Consequently, the contour boundaries it produces are often fragmented and noisy (false negatives and false positives, respectively). Then, with this initial set of points classified as contour boundaries, a deformable model is used to link the boundary segments together, while attempting to ignore noise.
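One deformable model that can perform this linking is the elastic net. The following is a minimal single-slice sketch, assuming an update of the form Δu_j = α Σ_i G_ij (p_i - u_j) + Kβ (u_{j+1} - 2u_j + u_{j-1}), where G_ij is a Gaussian of width K normalized over the net points. The interslice smoothing force is omitted, and the parameter values, the annealing schedule, and the synthetic edge points are illustrative assumptions.

```python
import math


def elastic_net_step(net, edges, K, alpha=0.2, beta=2.0):
    """One update of a closed elastic net given as a list of (x, y) points."""
    n = len(net)
    pull = [[0.0, 0.0] for _ in range(n)]
    for px, py in edges:
        d2 = [(px - ux) ** 2 + (py - uy) ** 2 for ux, uy in net]
        d2_min = min(d2)  # subtracted for numerical stability only
        w = [math.exp(-(d - d2_min) / (2.0 * K * K)) for d in d2]
        total = sum(w)
        for j, (ux, uy) in enumerate(net):
            g = w[j] / total  # normalized Gaussian weight G_ij
            pull[j][0] += g * (px - ux)
            pull[j][1] += g * (py - uy)
    updated = []
    for j, (ux, uy) in enumerate(net):
        (ax, ay), (bx, by) = net[j - 1], net[(j + 1) % n]
        dx = alpha * pull[j][0] + K * beta * (bx - 2.0 * ux + ax)
        dy = alpha * pull[j][1] + K * beta * (by - 2.0 * uy + ay)
        updated.append((ux + dx, uy + dy))
    return updated


def circle(center, radius, count):
    """Helper: `count` points evenly spaced on a circle."""
    cx, cy = center
    return [(cx + radius * math.cos(2.0 * math.pi * k / count),
             cy + radius * math.sin(2.0 * math.pi * k / count))
            for k in range(count)]


def mean_gap(net, edges):
    """Mean distance from each edge point to its nearest net point."""
    return sum(min(math.hypot(px - ux, py - uy) for ux, uy in net)
               for px, py in edges) / len(edges)


edges = circle((0.5, 0.5), 0.25, 20)  # synthetic contour-boundary pixels
net = circle((0.5, 0.5), 0.45, 40)    # large initial circle around them
gap_before = mean_gap(net, edges)

K = 0.2
for _ in range(60):
    net = elastic_net_step(net, edges, K)
    K *= 0.95                         # simple annealing schedule (assumed)
gap_after = mean_gap(net, edges)
```

As K decreases, each edge point attracts mainly its nearest net points, so the net tightens around the edge points while the second-difference term keeps it smooth.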

In  the elastic net algorithm is used. This technique is based on the following equations:

$$\Delta u_{j,l}^{t} = \alpha \sum_{i} G_{ij}\,(p_{i,l} - u_{j,l}^{t}) + K\beta\,(u_{j+1,l}^{t} - 2u_{j,l}^{t} + u_{j-1,l}^{t}), \tag{7.25}$$

where K is a simulated annealing term, α, β, γ are predefined parameters, and $G_{ij}$ is a normalized Gaussian that weights the force acting on the net point $u_{j,l}$ due to the edge point $p_{i,l}$ (l is the slice index). An interslice smoothing force also acts on the net.

The deformable model initialization is performed by using a large circle encompassing the lung boundary in each slice. This process can be improved by using the training set.

As an example, let us consider the work  in handwritten digit recognition. In this reference, each digit is modeled by a cubic B-spline whose shape is determined by the positions of the control points in the object-based frame. The models have eight control points, except for the model of the digit one, which has three, and the model of the digit seven, which has five. A model is transformed from the object-based frame to the image-based frame by an affine transformation, which allows translation, rotation, dilation, elongation, and shearing. The model initialization is done by determining the corresponding parameters. Next, model deformations are produced by perturbing the control points away from their initial locations.

There are ten classes of handwritten digits. A feedforward neural network is trained to predict the position of the control points in a normalized 16 × 16 gray-level image. The network uses a standard three-layer architecture. The outputs are the locations of the control points in the normalized image. By inverting the normalization process, the positions of the control points in the unnormalized image are determined. The affine transformation corresponding to this image can then be determined by running a special search procedure.
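The two geometric steps mentioned above can be sketched briefly. The code assumes, for illustration only, that normalization rescales the bounding box of the digit to a 16 × 16 image, and it uses a generic six-parameter affine map; the function names, the parameterization, and the sample coordinates are hypothetical.

```python
def affine(points, a, b, c, d, tx, ty):
    """Apply x' = a*x + b*y + tx, y' = c*x + d*y + ty to each point.
    The 2 x 2 part covers rotation, dilation, elongation, and shearing;
    (tx, ty) is the translation."""
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def unnormalize(points, box, size=16):
    """Invert a normalization that scaled the bounding box `box`
    (x0, y0, width, height) of the digit to a size x size image."""
    x0, y0, width, height = box
    return [(x0 + x * width / size, y0 + y * height / size)
            for x, y in points]


# Hypothetical network outputs: control points in the 16 x 16 frame.
predicted = [(4, 2), (8, 14), (12, 2)]
# Hypothetical bounding box of the digit in the original image.
box = (100, 50, 32, 48)
in_image = unnormalize(predicted, box)

# A rotation by 90 degrees as a simple instance of the affine map.
rotated = affine([(1, 0), (0, 1)], a=0, b=-1, c=1, d=0, tx=0, ty=0)
```

Given control points in both frames, the affine parameters can then be searched for or fitted so that the transformed model points match the predicted ones.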