However, selecting a large training database in medical image processing is not an easy task, and it may be infeasible in many applications. In reality, the databases used in most studies reported to date are quite small. Thus, different cross-validation methods have been widely used to evaluate the performance of an ANN or a BBN. Although there are many theoretically sound techniques for validating computer-assisted diagnosis or classification schemes for medical images, most are based on the assumption that the training database covers the entire sample space sufficiently well. When the case domain is adequately sampled, and the investigator takes great care not to overtrain the classifier, these are valid approaches. This is typically true in fields where the feature domain is reasonably limited and well defined, such as recognition of optical characters or of mechanical parts on an assembly line. Unfortunately, this is not the case in many clinical applications. Diagnostic problems in medical images are often complex, and the available data are often limited and rather noisy. The noise is largely inherent, arising from the poor specificity of the various clinical symptoms and findings presented in medical images.

Tourassi et al. presented a study that used three different statistical methods, namely cross-validation with various training/testing partition ratios, round robin (the leave-one-out strategy), and the bootstrap, to evaluate the diagnostic performance of two ANNs in diagnosing pulmonary embolism and breast cancer. The experimental results demonstrated that the predictive assessment of both ANNs varied substantially depending on the training sample size and the training stopping criterion. The study concluded that it was difficult to identify a single best statistical method for estimating the predictive accuracy of an ANN.
The choice of a validation method depends on the complexity of the diagnostic problem and on the number and variability of the available sample cases. To reduce bias, an ANN should be validated by several different methods. If different validation methods yield similar estimates of an ANN's performance, users can be more confident about its true diagnostic performance. The same reasoning applies to a BBN, because different combinations of learning samples may change the conditional probabilities inside the BBN, which in turn could change its predictive values when testing new cases.
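The comparison described above can be sketched in code. The following minimal example (not from the original study; the nearest-centroid classifier and synthetic two-class data are stand-ins for a real ANN or BBN and real image features) estimates a classifier's accuracy with the three methods mentioned: k-fold cross-validation, leave-one-out (round robin), and the bootstrap. When the three estimates agree closely, one can be more confident in the estimated performance.

```python
import random

def fit_nearest_centroid(X, y):
    """Train a toy classifier: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        s = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            s[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(centroids, xi):
    """Assign xi to the class whose centroid is nearest (squared distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], xi)))

def accuracy(centroids, X, y):
    return sum(predict(centroids, xi) == yi for xi, yi in zip(X, y)) / len(y)

def kfold_estimate(X, y, k=5, seed=0):
    """k-fold cross-validation: average test accuracy over k held-out folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        train = [i for i in idx if i not in set(fold)]
        model = fit_nearest_centroid([X[i] for i in train], [y[i] for i in train])
        accs.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(accs) / k

def loo_estimate(X, y):
    """Round robin (leave-one-out): hold out each case once."""
    hits = 0
    for i in range(len(y)):
        train = [j for j in range(len(y)) if j != i]
        model = fit_nearest_centroid([X[j] for j in train], [y[j] for j in train])
        hits += predict(model, X[i]) == y[i]
    return hits / len(y)

def bootstrap_estimate(X, y, rounds=50, seed=0):
    """Bootstrap: train on resamples, test on the out-of-bag cases."""
    rng, n, accs = random.Random(seed), len(y), []
    for _ in range(rounds):
        boot = [rng.randrange(n) for _ in range(n)]
        oob = [i for i in range(n) if i not in set(boot)]
        if not oob:
            continue
        model = fit_nearest_centroid([X[i] for i in boot], [y[i] for i in boot])
        accs.append(accuracy(model, [X[i] for i in oob], [y[i] for i in oob]))
    return sum(accs) / len(accs)

# Synthetic two-class data (hypothetical stand-in for real image features).
rng = random.Random(42)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
     + [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(40)])
y = [0] * 40 + [1] * 40

est = {"5-fold": kfold_estimate(X, y),
       "leave-one-out": loo_estimate(X, y),
       "bootstrap": bootstrap_estimate(X, y)}
print(est)
```

With a small, noisy clinical database the three estimates may diverge noticeably, which is itself a useful warning that the database does not sample the case domain well.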