## Relationships Between Quality Measures

As image quality can be quantified by diagnostic accuracy, subjective ratings, or computable measures such as signal-to-noise ratio (SNR), one key question concerns the degree to which these different measures agree. Verifications of medical image quality by perceptual measures require the detailed, time-consuming, and expensive efforts of human observers, typically highly trained radiologists. Therefore, it is desirable to find computable measures that strongly correlate with or predict the perceptual measures.

In previous sections we have studied how certain parameters such as percent measurement error and subjective scores appear to change with bit rate. It is assumed that bit rate has a direct effect on the likely measurement error or subjective score and therefore the variables are correlated. In this sense, bit rate can also be viewed as a "predictor." For instance, a low bit rate of 0.36 bits per pixel (bpp) may "predict" a high percent measurement error or a low subjective score. If the goal is to produce images that lead to low measurement error, parameters that are good predictors of measurement error are useful for evaluating images as well as for evaluating the effect of image processing techniques. A "good" predictor is a combination of an algorithm and predictor variable that estimates the measurement error within a narrow confidence interval.

Percent measurement error can be predicted from other variables besides bit rate. The following graphs give an indication of whether subjective scores, SNR, or image distortion are good predictors of measurement error. For instance, does a high subjective score or high SNR generally lead to low percent measurement error? We plot percent measurement error against each predictor variable of interest in Figs. 1, 2, and 3. Subjective scores and SNR are as defined in previous chapters, and MSE distortion is taken to be the average nonnormalized squared distortion between the original and compressed image.
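As a rough illustration of the computable measures named above, the following sketch computes MSE as the average squared pixel difference and an SNR in decibels. The SNR formula used here (mean signal energy divided by MSE) is an assumption, since the definition is given in earlier chapters and not in this section; the function names and the pixel values are hypothetical.

```python
import math

def mse(original, compressed):
    """Average (non-normalized) squared difference between two equal-size images."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of pixels")
    return sum((o - c) ** 2 for o, c in zip(original, compressed)) / len(original)

def snr_db(original, compressed):
    """ASSUMED definition: 10 * log10(mean signal energy / MSE), in dB."""
    distortion = mse(original, compressed)
    if distortion == 0:
        return math.inf  # identical images: no distortion
    energy = sum(o * o for o in original) / len(original)
    return 10.0 * math.log10(energy / distortion)

# Hypothetical 2x2 image, flattened to a list of pixel intensities.
original = [100, 102, 98, 101]
compressed = [101, 100, 98, 103]

print(mse(original, compressed))     # 2.25
print(snr_db(original, compressed))  # SNR in dB under the assumed definition
```

Both quantities need only the original and compressed pixel values, which is what makes them cheap compared with observer studies.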

How does one quantify whether or not some variable is a good predictor? In the remainder of this section, we examine the usefulness of SNR as a predictor of subjective quality for the MR data set. Our work suggests that cross-validated fits to the data using generalized linear models can be used to examine the usefulness of computable measures as predictors for human-derived quality measures. In the example studied later, the computable measure is SNR, and the human-derived measure is the subjective rating, but the method presented is applicable to other types of prediction problems.

In the classical linear regression model, the "predictor" $x$ is related to the outcome $y$ by

$$y = \beta^T x + \varepsilon,$$

where $\beta$ is a vector of unknown coefficients, and the error $\varepsilon$ at least has mean zero and constant variance, or may even be normally distributed. In the regression problem of using SNR to predict subjective quality scores, the response variable $y$ takes on integer values between 1 and 5, and so the assumption of constant variance is inappropriate because the variance of $y$ depends on its mean. Furthermore, $y$ takes on values only in a limited range, and the linear model does not follow that constraint without additional untenable assumptions. We turn to a generalized linear model that is designed for modeling binary and, more generally, multinomial data [6].

A generalized linear model requires two functions: a link function that specifies how the mean depends on the linear predictors, and a variance function that describes how the variance of the response variable depends on its mean. If $X_1, X_2, \ldots, X_n$ are independent Poisson variables, then conditional upon their sum, their joint distribution is multinomial. Thus, the regression can be carried out with the Poisson link and variance functions:

$$\beta^T x = \ln \mu \quad \text{and} \quad \operatorname{var}(y) = \mu,$$

in which case the mean of the response variable is $\mu = e^{\beta^T x}$.

FIGURE 2 Percent measurement error vs SNR for the MR study.
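To make the log link and the variance function concrete, here is a minimal sketch of fitting a Poisson generalized linear model by iteratively reweighted least squares (IRLS) in plain Python. It is not the authors' S code: the fitting routine, the helper names `solve` and `fit_poisson_glm`, and the five (SNR, score) pairs are all hypothetical, and the predictor is simply linear in centered SNR.

```python
import math

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_poisson_glm(X, y, iterations=25):
    """IRLS for the log link with var(y) = mu; column 0 of X must be the intercept."""
    p = len(X[0])
    beta = [0.0] * p
    beta[0] = math.log(sum(y) / len(y))  # start at the intercept-only fit
    for _ in range(iterations):
        eta = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
        mu = [math.exp(e) for e in eta]
        # Working response; the IRLS weights for this model are w_i = mu_i.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        lhs = [[sum(m * row[j] * row[k] for m, row in zip(mu, X))
                for k in range(p)] for j in range(p)]
        rhs = [sum(m * row[j] * zi for m, row, zi in zip(mu, X, z))
               for j in range(p)]
        beta = solve(lhs, rhs)
    return beta

# Hypothetical data: SNR values and subjective scores on the 1-5 scale.
snr = [18.0, 20.0, 22.0, 24.0, 26.0]
scores = [2, 3, 3, 4, 5]
X = [[1.0, s - 22.0] for s in snr]  # intercept plus centered SNR

beta = fit_poisson_glm(X, scores)
fitted = [math.exp(beta[0] + beta[1] * (s - 22.0)) for s in snr]
print(beta)
print(fitted)
```

Because the model has an intercept and uses the log link, the fitted means sum to the observed scores at convergence, which gives a quick sanity check on the fit.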

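The results of this approach are shown in Fig. 4. The predictors are a quadratic spline in SNR:

$$x = \left(1,\ \mathrm{snr},\ \mathrm{snr}^2,\ \left(\max(\mathrm{snr} - \mathrm{snr}_0,\ 0)\right)^2\right),$$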

where the spline knot $\mathrm{snr}_0$ was chosen to be 22.0 (the average SNR value of the data set). In Fig. 4, the x symbols denote the raw data pairs (subjective score, SNR) pooled over the judges, and the curve is the regression fit. The o symbols denote the 95% confidence intervals obtained from the bootstrap $\mathrm{BC}_a$ method [4,5]. This method is outlined next. The null deviance (a measure of goodness of fit) of the data set is 229 on 449 degrees of freedom, and the residual deviance of the fit is 118 on 446 degrees of freedom, indicating a useful fit. The model parameters were estimated using the statistical software S,

FIGURE 4 Expected subjective score (y-axis) vs SNR (x-axis) (Permission for reprint, courtesy Society for Information Display).
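The quadratic spline predictor with a single knot at 22.0 can be sketched as a small basis function. The truncated-power form used here is an assumption about how the spline was parameterized, and the function name `quadratic_spline_basis` is hypothetical.

```python
def quadratic_spline_basis(snr, knot=22.0):
    """Return [1, snr, snr^2, max(snr - knot, 0)^2] (assumed truncated-power form)."""
    hinge = max(snr - knot, 0.0)
    return [1.0, snr, snr ** 2, hinge ** 2]

print(quadratic_spline_basis(20.0))  # [1.0, 20.0, 400.0, 0.0]
print(quadratic_spline_basis(25.0))  # [1.0, 25.0, 625.0, 9.0]

# Deviance bookkeeping from the reported fit.
deviance_drop = 229 - 118  # null deviance minus residual deviance
df_used = 449 - 446        # degrees of freedom spent on non-constant terms
print(deviance_drop, df_used)  # 111 3
```

Below the knot the last term vanishes, so the curve is an ordinary quadratic there, and above the knot its curvature is allowed to change. The three degrees of freedom used by the fit are consistent with three non-constant basis terms of this kind.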
