Classification Algorithm

Classification was performed with a three-layer, feed-forward artificial neural network (ANN) consisting of an input layer, one hidden layer, and an output layer. The NevProp4 backpropagation software, a general backpropagation implementation developed by Philip H. Goodman at the University of Nevada, Reno, was used in this study [60]. Figure 13.15 shows a diagram of the network structure.

The feature vector of the input layer consisted of 14 elements (features) defined in the previous stage (Table 13.2) and a bias element [60]. The hidden layer consisted of 12 nodes and the output layer had one node. For each cluster, the network was given the set of shape features at its input layer, merged these inputs internally through the hidden and output layers, and assigned a value in the range 0-1, where 0 was the target output for the benign cases and 1 was the target output for the cancer cases. This value could be interpreted as the percent likelihood that a cluster is malignant.

Figure 13.15: Diagram of the NevProp4 artificial neural network (ANN) used for cluster classification. This is a standard three-layer, feed-forward ANN, where F1-F14 are the input features, I1-I14 are the input units, H1-H12 are the hidden units, and O is the output node [20, 59, 60].
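
NevProp4's internals are not reproduced here, but a minimal sketch of a comparable network may help fix the structure described above: 14 shape features plus a bias element feed 12 hidden units and a single output unit whose value lies in 0-1. The sigmoid activations, weight initialization, and class interface below are illustrative assumptions, not the NevProp4 implementation.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

class ThreeLayerANN:
    """Minimal three-layer, feed-forward network: 14 inputs plus a bias
    element, 12 hidden units, and a single output unit whose value in
    (0, 1) is read as the likelihood of malignancy."""

    def __init__(self, n_inputs=14, n_hidden=12, seed=0):
        rng = np.random.default_rng(seed)
        # +1 weight per unit for the bias element mentioned in the text.
        self.w_hidden = rng.normal(0.0, 0.1, size=(n_hidden, n_inputs + 1))
        self.w_output = rng.normal(0.0, 0.1, size=n_hidden + 1)

    def forward(self, features):
        """Propagate one 14-element shape-feature vector to the output."""
        x = np.append(features, 1.0)        # append the bias element
        h = sigmoid(self.w_hidden @ x)      # hidden-layer activations
        h = np.append(h, 1.0)               # bias feeding the output unit
        return float(sigmoid(self.w_output @ h))

# Example: score one cluster's (placeholder) feature vector.
net = ThreeLayerANN()
print(f"Likelihood of malignancy: {net.forward(np.zeros(14)):.2f}")
```

Backpropagation training would then adjust w_hidden and w_output to drive the output toward 0 for benign clusters and 1 for cancer clusters.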

The generalization error of the ANN classifier was estimated by the "leave-one-out" resampling method [61, 62]. Leave-one-out is generally recommended for validating pattern recognition algorithms on small datasets. This approach usually yields a more realistic index of performance and mitigates database problems such as small size and not fully representative contents, as well as problems associated with mixing the training and testing datasets [61, 63]. In the leave-one-out validation process, the network was trained on all but one of the cases in the set for a fixed number of iterations and then tested on the one excluded case. The excluded case was then replaced, the network weights were reinitialized, and the training was repeated with a different case excluded, until every case had been excluded once. For N cases, each exclusion of one case resulted in N-1 training cases, 1 testing case, and a unique set of network weights. As the process was repeated over all N cases, there were N(N-1) training outputs and N testing outputs, from which the training and testing mean square errors (MSE) were determined, respectively.
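
As an illustration, the leave-one-out loop described above can be sketched as follows; the train and predict callables stand in for the NevProp4 training and scoring steps and are hypothetical placeholders, not the original code.

```python
import numpy as np

def leave_one_out_mse(features, targets, train, predict):
    """Leave-one-out resampling: for each of the N cases, train on the
    other N-1 cases with freshly initialized weights, then test on the
    single held-out case. Returns training and testing MSE.
    `features` is an (N, 14) array; `targets` is a length-N 0/1 array."""
    n = len(targets)
    train_sq_errors, test_sq_errors = [], []
    for i in range(n):
        keep = np.arange(n) != i                        # exclude case i
        weights = train(features[keep], targets[keep])  # weights reinitialized each round
        # N-1 training outputs per round -> N(N-1) in total.
        for x, t in zip(features[keep], targets[keep]):
            train_sq_errors.append((predict(weights, x) - t) ** 2)
        # One testing output per round -> N in total.
        test_sq_errors.append((predict(weights, features[i]) - targets[i]) ** 2)
    return np.mean(train_sq_errors), np.mean(test_sq_errors)
```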

In addition to the leave-one-out method, other resampling approaches have been proposed for CADiagnosis algorithm training that could yield unbiased results and provide meaningful, realistic estimates of performance. A preference toward the bootstrap technique is found in the literature, although this depends strongly on the application and the available resources [64]. There is considerable work reported in this field, and we will not elaborate further in this chapter. The reader should, however, be aware of the bias issues associated with large feature sets and small sample sizes, and of the possible methods of training and testing an algorithm. In general, an approach should be selected that does not overestimate performance and avoids training bias.
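
For comparison with the leave-one-out sketch, the basic bootstrap idea can be outlined as follows: each round trains on a sample of cases drawn with replacement and tests on the out-of-bag (unsampled) cases. The train/predict interface is the same hypothetical placeholder used above; refinements such as the .632 correction are omitted.

```python
import numpy as np

def bootstrap_test_error(features, targets, train, predict,
                         n_rounds=200, seed=0):
    """Plain bootstrap estimate of test error: each round trains on an
    N-case sample drawn with replacement and tests on the cases that
    were not sampled (the out-of-bag cases)."""
    rng = np.random.default_rng(seed)
    n = len(targets)
    errors = []
    for _ in range(n_rounds):
        boot = rng.integers(0, n, size=n)          # indices drawn with replacement
        oob = np.setdiff1d(np.arange(n), boot)     # out-of-bag (unsampled) cases
        if oob.size == 0:
            continue                               # rare: every case was sampled
        weights = train(features[boot], targets[boot])
        preds = np.array([predict(weights, x) for x in features[oob]])
        errors.append(np.mean((preds - targets[oob]) ** 2))
    return float(np.mean(errors))
```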

The clinical value of CADiagnosis methods is usually assessed in two stages. First, computer performance is evaluated based on truth files defined by experts and on biopsy information, using computer-generated receiver operating characteristic (ROC) curves [65, 66]. Computer ROC analysis is applied in the evaluation of classification algorithms, where sensitivity and specificity indices are generated by adjusting the algorithms' parameters. Classification algorithms usually differentiate between benign vs. malignant lesions, disease vs. no disease, disease type 1 vs. disease type 2, etc. The pairs of sensitivity and specificity generated by these algorithms can be plotted as true positive fraction (TPF) vs. false positive fraction (FPF) to form an ROC curve [65]. Publicly available software tools, e.g., ROCKIT, developed by Metz at the University of Chicago [67], may be used to fit the data and estimate performance parameters such as the area under the curve, Az, its standard error (SE), confidence intervals, and statistical significance.
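
ROCKIT's maximum-likelihood curve fitting is not reproduced here, but as a sketch, an empirical ROC curve and a trapezoidal estimate of the area under it can be computed directly from the classifier's 0-1 outputs by sweeping a decision threshold:

```python
import numpy as np

def empirical_roc(scores, labels):
    """Empirical ROC: sweep a decision threshold over the classifier's
    0-1 outputs and record (FPF, TPF) pairs; labels: 1 = malignant,
    0 = benign."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    fpf, tpf = [0.0], [0.0]
    for t in np.unique(scores)[::-1]:                     # thresholds, high to low
        called_positive = scores >= t
        tpf.append(called_positive[labels == 1].mean())   # sensitivity (TPF)
        fpf.append(called_positive[labels == 0].mean())   # 1 - specificity (FPF)
    fpf, tpf = np.array(fpf), np.array(tpf)
    # Trapezoidal area under the empirical curve, an estimate of Az.
    az = float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))
    return fpf, tpf, az

# Toy example: four clusters, three malignant and one benign.
fpf, tpf, az = empirical_roc([0.9, 0.8, 0.35, 0.2], [1, 1, 0, 1])
print(f"Empirical Az = {az:.2f}")
```

Fitting tools such as ROCKIT go further, fitting a smooth binormal curve to such data and reporting the standard error, confidence intervals, and significance tests mentioned above.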

Following the laboratory evaluation, a true ROC experiment is usually performed that involves a relatively large number of cases and human observers [68]. The cost and time requirements of an observer ROC study are significant impediments to its implementation, and such analysis is usually reserved for fully optimized techniques, namely techniques that have been through rigorous computer ROC evaluation. Computer ROC evaluation poses specific requirements on database size and contents, and on the criteria used for estimating TPF and FPF values at the detection or the classification level. We will not belabor these issues in this chapter; guidelines may be found in several publications in the field of CAD and elsewhere [65, 69, 70]. We will only mention that a sufficiently large set should be selected for CADiagnosis validation to meet the requirements of the classification scheme, and that the contents of the dataset should address the specific clinical goals of the methodology. In addition, performance criteria should follow clinical guidelines and be applied consistently and uniformly throughout the validation process. In the CADiagnosis algorithm applications presented below, equal numbers of benign and malignant cases with calcification clusters were used, and almost all cluster shapes described in the BI-RADS Lexicon [71] were represented in the sets. Performance parameters, such as the number of TP and FP clusters at the segmentation output or the TPF and FPF at the classification output, were estimated based on well-defined criteria that were applied consistently across all experiments.
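
As a small illustration of applying such criteria uniformly, the classification-level fractions reduce to simple counts once each cluster has been labeled TP, FP, TN, or FN under the chosen criterion; the counts in the example below are hypothetical.

```python
def classification_fractions(tp, fp, tn, fn):
    """True positive fraction (sensitivity) and false positive fraction
    (1 - specificity) from per-cluster outcome counts."""
    tpf = tp / (tp + fn)   # fraction of malignant clusters called malignant
    fpf = fp / (fp + tn)   # fraction of benign clusters called malignant
    return tpf, fpf

# Example: 45 of 50 malignant clusters detected; 10 of 50 benign miscalled.
print(classification_fractions(tp=45, fp=10, tn=40, fn=5))  # (0.9, 0.2)
```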
