## Statistical Size and Power

The size of a test is the probability of incorrectly rejecting the null hypothesis if it is true. The power of a test is the probability of correctly rejecting the null hypothesis if it is false. For a given hypothesis and test statistic, one constrains the size of the test to be small and attempts to make the power of the test as large as possible.
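In symbols, these two definitions can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the original text: $T$ is the test statistic, $R$ the rejection region, $\Theta_0$ and $\Theta_1$ the parameter sets under the null and alternative hypotheses, and $\beta(\theta)$ the probability of failing to reject when the true parameter is $\theta$.

$$
\alpha = \sup_{\theta \in \Theta_0} P_\theta(T \in R),
\qquad
\text{power}(\theta) = P_\theta(T \in R) = 1 - \beta(\theta), \quad \theta \in \Theta_1 .
$$

For a simple null hypothesis with a single parameter value, the supremum reduces to one probability. Constraining the size then means choosing $R$ so that $\alpha$ does not exceed a fixed level, and among such regions one prefers those with larger power.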

Given a specified size, test statistic, null hypothesis, and alternative, statistical power can be estimated using the common (but sometimes inappropriate) assumption that the data are Gaussian. As data are gathered, however, improved estimates can be obtained by modern computer-intensive statistical methods. For example, the power and size can be computed for each test statistic described earlier to test the hypothesis that digital mammography of a specified bit rate is equal or superior to film screen mammography, with the alternative hypothesis suggested by the data. In the absence of data, we can only guess at the behavior of the data to be collected in order to approximate the power and size. We consider a one-sided test with the "null hypothesis" that, whatever the criterion [management or detection sensitivity, specificity, or predictive value positive (PVP)], the digitally acquired mammograms or lossy compressed mammograms of a particular rate are worse than analog. The "alternative" is that they are better. In accordance with standard practice, we take our tests to have size 0.05. We focus here on sensitivity and specificity of management decisions, but the general approach can be extended to other tests and tasks.
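As a rough illustration of the Gaussian approach mentioned above, the following minimal sketch computes the power of a one-sided test of size 0.05 and checks it by simulation. It is not the procedure used in this study: the z-test on a sample mean, the function names, and the parameters `effect`, `sigma`, and `n` are all hypothetical choices made for the example.

```python
import random
from statistics import NormalDist


def gaussian_power(effect, sigma, n, alpha=0.05):
    """Analytic power of a one-sided z-test of H0: mean <= 0 vs H1: mean > 0.

    Assumes n independent Gaussian observations with known standard
    deviation sigma and true mean `effect` (all hypothetical inputs).
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # critical value for size alpha
    shift = effect * n ** 0.5 / sigma           # mean of the z statistic
    return 1.0 - NormalDist().cdf(z_crit - shift)


def simulated_rejection_rate(effect, sigma, n, alpha=0.05, trials=20000, seed=0):
    """Monte Carlo estimate of the rejection probability of the same test.

    With effect = 0 this estimates the size; with effect > 0, the power.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(effect, sigma) for _ in range(n)) / n
        z = sample_mean * n ** 0.5 / sigma
        if z > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the study.
    print("size  (analytic): ", gaussian_power(0.0, 1.0, 25))
    print("size  (simulated):", simulated_rejection_rate(0.0, 1.0, 25))
    print("power (analytic): ", gaussian_power(0.5, 1.0, 25))
    print("power (simulated):", simulated_rejection_rate(0.5, 1.0, 25))
```

With a zero effect the rejection rate is the size of the test, so both estimates should sit near 0.05. Once real data are available, the simulation step could draw from resampled observations instead of a Gaussian generator, which is the spirit of the computer-intensive methods referred to above.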

Approximate computations of power derive from 2 by 2 agreement tables of the form of Table 1. In this table, the rows correspond to one technology (for example, analog) and