1) UOQ    9) outer/lateral     whole breast
2) UIQ    10) inner/medial
3) LOQ    11) upper/cranial    axillary tail
4) LIQ    12) lower/inferior
17) both breasts/bilateral

View(s) in which finding is seen: CC    MLO    CC and MLO

Associated findings include: (p = possible,

3) nipple retraction (p, d)
5) lymphadenopathy (p, d)
6) trabecular thickening (p, d)
8) architectural distortion
9) calcs associated with mass
10) multiple similar masses
11) dilated veins
12) asymmetric density
13) none


C/B: call back for more information, additional assessment needed

BX: immediate biopsy

These categories are formed by combining categories from the basic form of Fig. 13: RTS is any study that had assessment = 1 or 2, F/U is assessment = 3, C/B is assessment = indeterminate/incomplete with best guess either unsure it exists, 2 or 3, and BX is assessment = indeterminate/incomplete with best guess either 4L, 4M, 4H or 5, or assessment = 4L, 4M, 4H or 5.
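The combination rule above can be sketched in a few lines of Python. This is only an illustration: the string codes (`"A"` for indeterminate/incomplete, `"unsure"` for "unsure it exists") and the function name are assumptions made for the example, not part of the study's form. The text does not say what happens for an indeterminate finding with a best guess of 1, so the sketch rejects that case rather than guessing.

```python
from typing import Optional

# Assessments (or best guesses) that lead to biopsy, per the rule in the text.
SUSPICIOUS = {"4L", "4M", "4H", "5"}


def management_category(assessment: str, best_guess: Optional[str] = None) -> str:
    """Map a Fig. 13 assessment to RTS, F/U, C/B or BX.

    `assessment` is one of "1", "2", "3", "4L", "4M", "4H", "5", or "A"
    (hypothetical code for indeterminate/incomplete). `best_guess` is only
    used when assessment == "A"; "unsure" is a hypothetical code for
    "unsure the finding exists".
    """
    if assessment in {"1", "2"}:
        return "RTS"
    if assessment == "3":
        return "F/U"
    if assessment in SUSPICIOUS:
        return "BX"
    if assessment == "A":
        if best_guess in SUSPICIOUS:
            return "BX"
        if best_guess in {"unsure", "2", "3"}:
            return "C/B"
        # Not covered by the rule as stated (e.g. best guess "1").
        raise ValueError(f"no rule for best guess {best_guess!r}")
    raise ValueError(f"unknown assessment {assessment!r}")
```

For example, `management_category("A", "4M")` gives `"BX"`, while `management_category("A", "unsure")` gives `"C/B"`.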

We also consider the binarization of these four categories into two groups: Normal and Not Normal. But there is controversy as to where the F/U category belongs, so we make its placement optional with either group. The point is to see if lossy compression makes any difference to the fundamental decision made in screening: Does the patient return to ordinary screening as normal, or is there suspicion of a problem and hence the demand for further work?
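A minimal sketch of this binarization follows, with the placement of F/U left as a switch. The function and parameter names are hypothetical, and placing C/B and BX in the Not Normal group is our reading of "demand for further work" rather than a rule quoted from the text.

```python
def binarize(category: str, follow_up_is_normal: bool) -> str:
    """Collapse RTS, F/U, C/B, BX into "Normal" or "Not Normal".

    `follow_up_is_normal` decides which group the F/U category joins,
    since its placement is left optional.
    """
    if category == "RTS":
        return "Normal"
    if category in {"C/B", "BX"}:
        return "Not Normal"
    if category == "F/U":
        return "Normal" if follow_up_is_normal else "Not Normal"
    raise ValueError(f"unknown management category {category!r}")
```

Running an analysis once with `follow_up_is_normal=True` and once with `False` shows whether a conclusion depends on where F/U is placed.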

Assessment: The finding is

(A) indeterminate/incomplete, additional assessment needed
    What? 1) spot mag 2) extra views 3) U/S 4) old films 5) mag
    What is your best guess as to the finding's 1-5 assessment? _ or are you uncertain if the finding exists? Y
(1) (N) negative - return to screening
(2) (B) benign (also negative but with benign findings) - return to screening
(3) (P) probably benign finding requiring 6-month followup
(4L) (S) suspicion of malignancy (low), biopsy
(4M) (S) suspicion of malignancy (medium), biopsy
(4H) (S) suspicion of malignancy (high), biopsy
(5) radiographic malignancy, biopsy

Measurements:

CC View Size: _ cm long axis by _ cm short axis
Distance from center of finding to: nipple _ cm, left edge _ cm, top edge _ cm

MLO View Size: _ cm long axis by _ cm short axis
Distance from center of finding to: nipple _ cm, left edge _ cm, top edge _ cm

FIGURE 13 Observer form for mammograms: This assessment portion is completed for each finding in a case.

Truth is determined by agreement with a gold standard. The raw results are plotted as a collection of 2 x 2 tables, one for each category or group of categories of interest and for each radiologist. As will be discussed, the differences among radiologists prove to be so large an effect that extreme care must be taken when doing any pooling or averaging of results across radiologists. A typical table is shown in Table 2.

The columns correspond to image modality or method I and the rows to image modality or method II; I could be original analog and II original digitized, or I could be original digitized and II compressed digitized. "R" and "W" stand for "right" (agreement with the gold standard) and "wrong" (disagreement with the gold standard). The particular statistic could be, for example, the decision of "normal", i.e., return to ordinary screening. Regardless of the statistic, the goal is to quantify the degree, if any, to which differences exist.
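As an illustration of how such a table could be tallied for one radiologist, the sketch below counts paired right/wrong outcomes for the same studies read under methods I and II. The function name and the input encoding (`True` for agreement with the gold standard) are assumptions for the example, as is the ordering of "R" before "W" in rows and columns.

```python
from typing import List, Sequence


def agreement_table(right_I: Sequence[bool], right_II: Sequence[bool]) -> List[List[int]]:
    """Build a 2 x 2 table of counts from paired readings of the same studies.

    right_I[k] / right_II[k] say whether the reading of study k under
    method I / method II agreed with the gold standard.
    Columns index method I (R, W); rows index method II (R, W).
    """
    if len(right_I) != len(right_II):
        raise ValueError("both methods must be read on the same studies")
    table = [[0, 0], [0, 0]]
    for r1, r2 in zip(right_I, right_II):
        row = 0 if r2 else 1  # method II: right -> first row
        col = 0 if r1 else 1  # method I: right -> first column
        table[row][col] += 1
    return table


# Five studies: the methods agree on three and disagree on two.
table = agreement_table(
    right_I=[True, True, False, False, True],
    right_II=[True, False, True, False, True],
)
```

The off-diagonal entries, `table[0][1]` and `table[1][0]`, count the studies on which one method was right and the other wrong; these are the cases that carry information about a difference between the methods.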

One way to quantify the existence of statistically significant differences is by an exact McNemar test, which is based on the following argument. If there are N(1,2) entries in the (1,2) place and N(2,1) in the (2,1) place, and the technologies are

TABLE 2 Agreement 2x2 table


