D

10 log10-

A common alternative normalization when the input is itself an r-bit discrete variable is to replace the variance or energy by the maximum input symbol energy $(2^r - 1)^2$, yielding the so-called peak signal-to-noise ratio (PSNR).
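As a minimal sketch of this normalization, the following Python function computes PSNR in decibels. It assumes the distortion is mean squared error and that the images are NumPy arrays; the function name `psnr` and the parameter `bits` are illustrative choices, not notation from the text.

```python
import numpy as np

def psnr(original, reproduction, bits):
    """Peak signal-to-noise ratio in dB for r-bit data (here r = bits).

    Assumption: the distortion is the mean squared error between the
    original and its reproduction.
    """
    original = np.asarray(original, dtype=np.float64)
    reproduction = np.asarray(reproduction, dtype=np.float64)
    mse = np.mean((original - reproduction) ** 2)
    if mse == 0:
        # Identical images: zero distortion, hence infinite PSNR.
        return float("inf")
    peak_energy = (2 ** bits - 1) ** 2  # maximum input symbol energy
    return 10.0 * np.log10(peak_energy / mse)
```

Because the numerator depends only on the bit depth, PSNR values are comparable across images of the same depth, whereas a variance-normalized SNR also depends on the content of each image.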

A key attribute of useful distortion measures is ease of computation, but other properties are also important. Ideally, a distortion measure should reflect perceptual quality or usefulness in a particular application. No easily computable distortion measure, squared error included, is generally agreed to have this property. Squared error has two common faults: a slight spatial shift of an image causes a large numerical distortion but no visual distortion and, conversely, a small average distortion can correspond to a damaging visual artifact if all the error is concentrated in a small but important region. Because of such shortcomings, many other quality measures have been studied. The pioneering work of Budrikis [10], Stockham [60], and Mannos and Sakrison [36] was aimed at developing computable measures of distortion that emphasize perceptually important attributes of an image by incorporating knowledge of human vision. Their work and subsequent work have provided a bewildering variety of candidate measures of image quality or distortion [3-5, 7, 16, 17, 19, 20, 25, 27, 29, 32-34, 37, 40-44, 53, 55, 58, 63, 67]. Similar studies have been carried out for speech compression and other digital speech processing [49]. Examples are general $l_p$ norms such as the absolute error ($l_1$), the cube root of the sum of the cubed absolute errors ($l_3$), and the maximum error ($l_\infty$), as well as variations on such error measures that incorporate linear weighting. A popular form is weighted quadratic distortion, which attempts to incorporate properties of the human visual system such as sensitivity to edges, insensitivity to textures, and other masking effects. The reproduced image and the original can be transformed prior to computing distortion, providing a wide family of spectral distortions, which can also incorporate weighting in the transform domain to reflect perceptual importance. Alternatively, one can capture the perceptual aspects by linearly filtering the original and reproduction images prior to forming a distortion, which is equivalent to weighting the distortion in the transform domain. A simple variation of SNR that has proved popular in the speech and audio field is the segmental SNR, which is an average of local SNRs in a log scale [28, 49], effectively replacing the arithmetic average of distortion by a geometric average.
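The segmental SNR described above can be sketched as follows. This is an illustrative version under stated assumptions: the signal is split into non-overlapping frames of a fixed length, the local distortion is squared error, and trailing samples that do not fill a frame are dropped. The name `segmental_snr` and the parameters `frame_len` and `eps` are hypothetical, and practical variants often also clamp each frame's SNR to a fixed range, which is not done here.

```python
import numpy as np

def segmental_snr(signal, reproduction, frame_len=256, eps=1e-12):
    """Average of per-frame SNRs in dB over non-overlapping frames.

    Assumptions: squared-error distortion, frames of `frame_len` samples,
    and a small `eps` to avoid division by zero or log of zero.
    """
    signal = np.asarray(signal, dtype=np.float64)
    reproduction = np.asarray(reproduction, dtype=np.float64)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")

    local_snrs = []
    for k in range(n_frames):
        frame = slice(k * frame_len, (k + 1) * frame_len)
        signal_energy = np.sum(signal[frame] ** 2)
        error_energy = np.sum((signal[frame] - reproduction[frame]) ** 2)
        local_snrs.append(
            10.0 * np.log10((signal_energy + eps) / (error_energy + eps))
        )
    # Averaging in the log domain is a geometric average of the
    # per-frame energy ratios.
    return float(np.mean(local_snrs))
```

Because the average is taken after the logarithm, a quiet frame with poor local SNR pulls the result down more than it would in a conventional SNR, where loud frames dominate the total energy.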

In addition to easing computation and reflecting perceptual quality, a third desirable property of a distortion measure is tractability in analysis. The popularity of squared error is due in part to the wealth of theory and numerical methods available for the analysis and synthesis of systems that are optimal in the sense of minimizing mean squared error. One might design a system to minimize mean squared error because it is a straightforward optimization, but then use a different, more complicated measure to evaluate quality because it does better at predicting subjective quality. Ideally, one would like to have a subjectively meaningful distortion measure that could be incorporated into the system design. There are techniques for incorporating subjective criteria into compression system design, but these tend to be somewhat indirect. For example, one can transform the image and assign bits to transform coefficients according to their perceptual importance, or use prefiltering to emphasize important subbands before compression [51, 52, 60].

The traditional manner for comparing the performance of different lossy compression systems is to plot distortion-rate or SNR versus bit rate curves. Figure 5a shows a scatter plot of the rate-SNR pairs for 24 images in the lung CT study. Only the compressed images can be shown on this plot, as the original images have by definition no noise and therefore infinite SNR. The plot includes a quadratic spline fit with a single knot at 1.5 bpp. Regression splines [48] are simple and flexible models for tracking data that can be fit by least squares. The fitting tends to be "local" in that the fitted average value at a particular bit rate is influenced primarily by observed data at nearby bit rates. The curve has four unknown parameters and can be expressed as

$$y = a_0 + a_1 x + a_2 x^2 + b_2 \left(\max(0, x - 1.5)\right)^2. \tag{2}$$
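Since the model in (2) is linear in its four parameters, it can be fit by ordinary least squares. The sketch below shows one way to do this with NumPy; the function names and the synthetic data in the usage example are illustrative assumptions and do not reproduce the study's data or the authors' software.

```python
import numpy as np

def spline_design(rate, knot=1.5):
    """Design matrix with columns 1, x, x^2, and max(0, x - knot)^2."""
    rate = np.asarray(rate, dtype=np.float64)
    return np.column_stack([
        np.ones_like(rate),
        rate,
        rate ** 2,
        np.maximum(0.0, rate - knot) ** 2,
    ])

def fit_quadratic_spline(rate, snr, knot=1.5):
    """Least-squares estimates of (a0, a1, a2, b2) in the model (2)."""
    snr = np.asarray(snr, dtype=np.float64)
    coeffs, *_ = np.linalg.lstsq(spline_design(rate, knot), snr, rcond=None)
    return coeffs

def eval_quadratic_spline(coeffs, rate, knot=1.5):
    """Evaluate the fitted curve at the given bit rates."""
    return spline_design(rate, knot) @ coeffs

# Usage with synthetic (not measured) rate-SNR pairs:
rng = np.random.default_rng(0)
rates = np.linspace(0.3, 3.0, 24)            # bit rates in bpp
true_coeffs = np.array([5.0, 10.0, -1.5, 0.8])
snrs = eval_quadratic_spline(true_coeffs, rates) + rng.normal(0.0, 0.3, rates.size)
fitted = fit_quadratic_spline(rates, snrs)
```

The knot is passed as a parameter so the same code covers a fit with a different knot location, but it is held fixed during the fit. Letting the data choose the knot would make the problem nonlinear in that parameter.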

It is quadratic "by region" and is continuous with a continuous first derivative across the knot, where the functional form of the quadratic changes. Quadratic spline fits provide good indications of the overall distortion-rate performance of the code family on the test data. In this case, the location of the knot was chosen arbitrarily to be near the center of the data set. It would have been possible to allow the data themselves to guide the choice of knot location. The SNR results for the CT mediastinal images were very similar to those for the lung task. For the MR study, Fig. 5b shows SNR versus bit rate for the 30 test images compressed to the five bit rates. The knot is at 1.0 bpp.
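The smoothness of (2) across the knot at 1.5 bpp can be checked directly by writing out the two quadratic pieces and their derivatives. The following is a short worked derivation using only the symbols of (2):

```latex
\begin{align*}
x < 1.5:\quad
  & y = a_0 + a_1 x + a_2 x^2,
  && y' = a_1 + 2 a_2 x,
  && y'' = 2 a_2, \\
x \ge 1.5:\quad
  & y = a_0 + a_1 x + a_2 x^2 + b_2 (x - 1.5)^2,
  && y' = a_1 + 2 a_2 x + 2 b_2 (x - 1.5),
  && y'' = 2 a_2 + 2 b_2 .
\end{align*}
```

At $x = 1.5$ both $b_2 (x - 1.5)^2$ and $2 b_2 (x - 1.5)$ vanish, so $y$ and $y'$ agree from either side. The second derivative jumps by $2 b_2$, so $b_2$ measures how much the curvature changes at the knot.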

For the mammography study, the SNRs are summarized in Tables 2 and 3. The overall averages are reported as well as the
