Within epidemiology, scan statistics are a widely used and thriving analytic method. As a result, they have been incorporated into several experimental biosurveillance systems, such as those described by Heffernan et al. (2004), Lombardo et al. (2003), and Yih et al. (2004). We recommend caution when using scan statistics on new kinds of data. For example, in our early experience of applying scan statistics to over-the-counter retail pharmacy data, it was immediately clear that simplistic assumptions in the underlying model can lead to false alarms: there are dozens of non-disease-related reasons for clusters of over-the-counter medication purchases to occur. Conversely, with the wrong data, even a sophisticated model will fail. For example, if home ZIP codes are the only location data in an emergency department's records, then an attack on a downtown office location might not appear as a spatial cluster, although appropriate use of commuting statistics may help in this case (Buckeridge et al., 2003; Duczmal and Buckeridge, 2005). We believe that careful modeling is needed to overcome these effects on novel sources of data, and there is considerable ongoing work in the area, such as Kleinman et al. (2004, 2005).