
Image Segmentation by Fuzzy Clustering: Methods and Issues

Melanie A. Sutton, James C. Bezdek, and Tobias C. Cahoon

University of West Florida

1 Introduction

2 The Quantitative Basis of Fuzzy Image Segmentation

2.1 Fuzzy Models: What Are They, and Why? • 2.2 Numerical Pattern Recognition • 2.3 Feature Extraction • 2.4 2D Image Segmentation • 2.5 Segmentation in Medical Applications

3 Qualitative Discussion of a Few Fuzzy Image Segmentation Methods

3.1 Unsupervised Segmentation: Track USA • 3.2 Unsupervised Segmentation: Track USB • 3.3 Supervised Segmentation: Track Su • 3.4 Three-Dimensional Applications

4 Conclusions and Discussion

4.1 Track USA • 4.2 Track USB • 4.3 Track Su • 4.4 Final Comments

References

### 1 Introduction

This chapter is about segmenting medical images with fuzzy models. Probably 80% of the methods described are based on some form of clustering or classifier design, so Section 2 contains a brief introduction to the basic ideas underlying fuzzy pattern recognition. If nothing else, we hope this piques your interest in fuzzy models, which were an amusing and controversial, but usually disregarded novelty in science as little as 10 years ago. Today, fuzzy models are an important tool in many scientific studies and fielded engineering applications. The impetus for using fuzzy models came from control theory. Many important applications based on fuzzy controllers have made their way into the marketplace in recent years [1,2]. Fuzzy pattern recognition also counts some important successes as we approach the millennium, and this chapter is an all too short account of the impact of fuzzy models in medical image segmentation [3,4,38].

Section 3 contains a few case studies of applications of algorithms that are mentioned in Section 2 in medical image segmentation. Current research efforts involving unsupervised and supervised segmentation of medical images with two spatial dimensions (2D images) in applications such as brain tissue analysis (e.g., to detect pathology) and mammography (to detect tumors) are presented. We also discuss fuzzy models for problems that, at least in principle, involve images with three spatial dimensions (3D images).

Most of these models are aimed toward just two 3D applications: visualization and (segmentation for) volume estimation, both of which can be used for surgical planning and therapy. Finally, some conclusions and possible topics for future research are discussed in Section 4.

### 2 The Quantitative Basis of Fuzzy Image Segmentation

### 2.1 Fuzzy Models: What Are They, and Why?

This section is based on material first published in [5]. Fuzzy sets are a generalization of conventional set theory that were introduced by Zadeh in 1965 as a mathematical way to represent vagueness in everyday life [6]. The basic idea of fuzzy sets is easy and natural. Suppose, as you approach a red light, you must advise a driving student when to apply the brakes. Would you say, "Begin braking 74 feet from the crosswalk"? Or would your advice be more like, "Apply the brakes pretty soon"? The latter, of course; the former instruction is too precise to be implemented. This illustrates that crisp precision may be quite useless, while vague directions can be interpreted and acted upon. Moreover, this particular type of vagueness does not exhibit an element of chance; i.e., it is not probabilistic. Many other situations, of course (a coin flip is a nice example), clearly involve an element of randomness, or chance. Accordingly, computational models of real systems should also be able to recognize, represent, manipulate, interpret, and use (act on) both fuzzy and statistical uncertainties.

Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved.

Fuzzy interpretations of data structures are a very natural and intuitively plausible way to formulate and solve various problems. Conventional (crisp) sets contain objects that satisfy precise properties required for membership. The set of numbers H from 6 to 8 is crisp; we write $H = \{r \in \Re : 6 \le r \le 8\}$. Equivalently, H is described by its membership (or characteristic, or indicator) function, $m_H : \Re \to \{0, 1\}$, defined as

$$m_H(r) = \begin{cases} 1; & 6 \le r \le 8 \\ 0; & \text{otherwise.} \end{cases}$$

The crisp set H and the graph of $m_H$ are shown in the left half of Fig. 1. Every real number (r) either is in H, or is not. Since $m_H$ maps all real numbers $r \in \Re$ onto the two points {0,1}, crisp sets correspond to two-valued logic: is or isn't, on or off, black or white, 1 or 0.
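As a minimal sketch of the crisp case, the indicator function of H can be written in a few lines of Python. The function name `m_H` simply mirrors the notation in the text, and the closed endpoints (6 and 8 both belong to H) are an assumption of this sketch.

```python
def m_H(r: float) -> int:
    """Crisp membership (indicator) function of H, the numbers from 6 to 8.

    Assumption of this sketch: the endpoints 6 and 8 are included in H.
    """
    return 1 if 6 <= r <= 8 else 0


# Every real number is either in H (1) or not in H (0); nothing in between.
samples = [5.9, 6.0, 7.0, 8.0, 8.1]
memberships = [m_H(r) for r in samples]
print(memberships)
```

The function can only ever return 0 or 1, which is the two-valued logic that crisp sets correspond to.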

Consider next the set F of real numbers that are close to seven. Since the property "close to 7" is fuzzy, there is not a unique membership function for F. Rather, the modeler must decide, based on the potential application and properties desired for F, what $m_F$ should be. Properties that might seem plausible for this F include: (i) Normality ($m_F(7) = 1$); (ii) Monotonicity (the closer r is to 7, the closer $m_F(r)$ is to 1, and conversely); and (iii) Symmetry (numbers equally far left and right of 7 should have equal memberships). Given these intuitive constraints, either of the functions shown in the right half of Fig. 1 might be a useful representation of F. $m_{F_1}$ is discrete (the staircase graph), while $m_{F_2}$ is continuous but not smooth (the triangle graph). You can easily construct a membership function for F so that every number has some positive membership in F, but we wouldn't expect numbers "far from 7," 20,000,987 for example, to have much! One of the biggest differences between crisp and fuzzy sets is that the former always have unique membership functions, whereas every fuzzy set has an infinite number of membership functions that may represent it. This is at once both a weakness and a strength; uniqueness is sacrificed, but this gives a concomitant gain in terms of flexibility, enabling fuzzy models to be "adjusted" for maximum utility in a given situation.
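The non-uniqueness of a fuzzy membership function can be illustrated with a short Python sketch. The three functions below are hypothetical choices, not the exact curves of Fig. 1: the step widths, the triangle half-width, and the rational form of the third function are all assumptions made for illustration. Each one satisfies normality, symmetry, and (weak) monotonicity.

```python
def m_F1(r: float) -> float:
    """Staircase membership for "close to 7" (step widths are assumed)."""
    d = abs(r - 7.0)
    if d <= 0.5:
        return 1.0
    if d <= 1.0:
        return 0.5
    return 0.0


def m_F2(r: float, half_width: float = 1.0) -> float:
    """Triangular membership: continuous but not smooth (half-width assumed)."""
    return max(0.0, 1.0 - abs(r - 7.0) / half_width)


def m_F3(r: float) -> float:
    """Membership that is positive for every real number, yet tiny far from 7."""
    return 1.0 / (1.0 + (r - 7.0) ** 2)


for r in (6.5, 7.0, 7.5, 9.0, 20_000_987.0):
    print(r, m_F1(r), m_F2(r), m_F3(r))
```

All three are defensible models of the same fuzzy set F; which one to use is a modeling decision that depends on the application.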

In conventional set theory, sets of real objects such as the numbers in H are equivalent to, and isomorphically described by, a unique membership function such as $m_H$. However, there is no set-theoretic equivalent of "real objects" corresponding to $m_F$. Fuzzy sets are always (and only) functions, from a "universe of objects," say X, into [0,1]. This is depicted in Fig. 2, which illustrates that the fuzzy set is the function m that carries X into [0,1]. The value of m at x, m(x), is an estimate of the similarity of x to objects that closely match the properties represented by the semantics of m.

One of the first questions asked about fuzzy models, and the one that is still asked most often, concerns the relationship of fuzziness to probability. Are fuzzy sets just a clever disguise for statistical models? Well, in a word, NO. Perhaps an example will help—this one is reprinted from the inaugural issue of the IEEE Transactions on Fuzzy Systems [5].

Let the set of all liquids be the universe of objects, and let fuzzy subset L = {all potable (= "suitable for drinking") liquids}. Suppose you had been in the desert for a week without a drink and came upon two bottles A and B, marked as in the left half of Fig. 3 (memb = "membership", and prob = "probability").

Confronted with this pair of bottles, which would you choose to drink from first? Most readers familiar with the basic ideas of fuzzy sets, when presented with this experiment, immediately see that while A could contain, say, swamp water, it would not (discounting the possibility of a Machiavellian fuzzy modeler) contain liquids such as hydrochloric acid. That is, membership of 0.91 in L means that the contents of A are "fairly similar" to perfectly potable liquids (pure water, perhaps). On the other hand, the probability that B is potable = 0.91 means that over a long run of experiments, the contents of B are expected to be potable in about 91% of the trials. And the other 9%? In these cases the contents will be unsavory (indeed, possibly deadly). Thus, your odds for selecting a nonpotable liquid are about 1 chance in 10. Thus, most subjects will opt for a chance to drink swamp water, and will choose bottle A. Suppose that we examine the contents of A and B, and discover them to be as shown in the right half of Fig. 3; that is, A contains beer, while B contains hydrochloric acid. After observation, then, the membership value for A will be unchanged, while the probability value for B clearly drops from 0.91 to 0.0.

Finally, what would be the effect of changing the numerical information in this example? Suppose that the membership and probability values were both 0.5; would this influence your choice? Almost certainly it would. In this case many observers would switch to bottle B, since it offers a 50% chance of being drinkable, whereas a membership value this low would presumably indicate a liquid unsuitable for drinking (this depends, of course, entirely on the membership function of the fuzzy set L).

FIGURE 2 Fuzzy sets are membership functions.

FIGURE 3 Bottles for the weary traveler: disguised and unmasked! Left (disguised): memb(A ∈ L) = 0.91, prob(B ∈ L) = 0.91. Right (unmasked): memb(A ∈ L) = 0.91, prob(B ∈ L) = 0.00.

We think this shows that these two types of models possess philosophically different kinds of information: fuzzy memberships, which represent similarities of objects to imprecisely defined properties; and probabilities, which convey information about relative frequencies. Moreover, interpretations about and decisions based on these values also depend on the actual numerical magnitudes assigned to particular objects and events. See [7] for an amusing contrary view, [8] for a statistician's objection to the bottles example, and [9] for a reply to [8]. The point is, fuzzy models aren't really that different from more familiar ones. Sometimes they work better, and sometimes not. This is really the only criterion that should be used to judge any model, and there is much evidence nowadays that fuzzy approaches to real problems are often a good alternative to more familiar schemes. References [1-4] give you a start on accessing the maze of literature on this topic, and the entire issue in which [8,9] appear is devoted to the topic of "fuzziness vs probability." It's a silly argument that will probably (1) never disappear; the proof is in the pudding, and (2), as you will see in this chapter, fuzzy sets can, and often do, deliver.

### 2.2 Numerical Pattern Recognition

There are two types of pattern recognition models (numerical and syntactic) and three basic approaches for each type (deterministic, statistical, and fuzzy). Rather than take you into this maze, we will steer directly to the parts of this topic that help us segment images with fuzzy clustering. Object data are represented as $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\} \subset \Re^p$, a set of n feature vectors in feature space $\Re^p$. The jth object is a physical entity such as a fish, guitar, motorcycle, or cigar. Column vector $\mathbf{x}_j$ is the numerical representation of object j and $x_{kj}$ is the kth feature or attribute value associated with it. Boldface means vector, plainface means scalar. Features can be either continuously or discretely valued, or can be a mixture of both. $X = \{(1, 1)^T, (0, 3.1)^T, (1, -1.2)^T\}$ is a set of n = 3 feature vectors in p = two-dimensional (2D) feature space.
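The object-data notation can be made concrete with a small Python sketch that stores the three example feature vectors and reads off n, p, and individual feature values. The container layout (a list of tuples) and the helper name `feature` are assumptions of this sketch, chosen only to mirror the 1-based indices used in the text.

```python
# Object data X: one tuple per object, one entry per feature.
X = [
    (1.0, 1.0),   # x_1
    (0.0, 3.1),   # x_2
    (1.0, -1.2),  # x_3
]

n = len(X)     # number of objects (feature vectors)
p = len(X[0])  # dimension of the feature space


def feature(k: int, j: int) -> float:
    """Return x_kj, the k-th feature of object j (both indices 1-based)."""
    return X[j - 1][k - 1]


print(n, p, feature(2, 3))
```

In an image segmentation setting each object would typically be a pixel, but that mapping is not fixed by the passage and is left open here.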

One of the most basic structures in pattern recognition is the label vector. There are four types of class labels: crisp, fuzzy, probabilistic, and possibilistic. Letting integer c denote the number of classes, 1 < c < n, define three sets of label vectors in $\Re^c$:

$$N_{pc} = \{\mathbf{y} \in \Re^c : y_i \in [0,1]\ \forall i,\ y_i > 0\ \exists i\} = [0,1]^c - \{\mathbf{0}\} \tag{1}$$
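Equation (1) can be read as a simple membership test: a vector is a possibilistic label if every entry lies in [0,1] and at least one entry is positive. The sketch below encodes just that test; the function name `is_possibilistic_label` is hypothetical and is not taken from the chapter.

```python
from typing import Sequence


def is_possibilistic_label(y: Sequence[float]) -> bool:
    """Check whether y lies in N_pc = [0,1]^c - {0}, following Eq. (1)."""
    in_unit_cube = all(0.0 <= y_i <= 1.0 for y_i in y)
    not_zero_vector = any(y_i > 0.0 for y_i in y)
    return len(y) > 0 and in_unit_cube and not_zero_vector


print(is_possibilistic_label([0.2, 0.9, 0.0]))  # entries in [0,1], one positive
print(is_possibilistic_label([0.0, 0.0, 0.0]))  # the zero vector is excluded
print(is_possibilistic_label([1.2, 0.0, 0.3]))  # 1.2 lies outside [0,1]
```

Note that the entries of a possibilistic label need not sum to 1; the only constraints checked are the two stated in Eq. (1).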
