In this subsection, we briefly consider extensions of the spatial scan statistic to spatio-temporal cluster detection. The space-time scan statistic was first proposed by Kulldorff et al. (1998), and a variant was applied to prospective disease surveillance by Kulldorff (2001). The goal of the space-time scan statistic is a straightforward extension of the purely spatial scan: to detect regions of space-time where the counts are significantly higher than expected. Let us assume that we have a discrete set of time steps t = 1...T (e.g., daily observations for T days), and that for each spatial location s_i we have counts c_i^t representing the observed number of cases in the given area on each time step t.

There are two very simple ways of extending the spatial scan to space-time: run a separate spatial scan for each time step t, or treat time as an extra dimension and run a single multidimensional spatial scan in space-time (for example, we could search over three-dimensional "hyper-rectangles" which represent a given rectangular region of space during a given time interval). The problem with the first method is that, by examining only one day of data at a time, we may fail to detect more slowly emerging outbreaks. The problem with the second method is that we tend to find less relevant clusters: for prospective disease surveillance, we want to detect newly emerging clusters of disease, not those that have persisted for a long time. Thus, in order to achieve better methods for space-time cluster detection, we must consider the question, "How is the time dimension different from space?" In Neill et al. (2005b), we argue that there are three main distinctions:

1. The concept of "now." In the time dimension, the present is an important point of reference: we are typically only interested in disease clusters that are still "active" at the present time, and that have emerged within the recent past (e.g., within a few days or a week). We do not want to detect clusters that have persisted for months or years, and we are also not interested in clusters which have already come and gone. The exception, of course, is a retrospective analysis, which attempts to detect all space-time clusters regardless of how long ago they occurred. The space-time scan statistic for retrospective analysis was first presented in Kulldorff et al. (1998), and the space-time scan statistic for prospective analysis was first presented in Kulldorff (2001). In brief, the retrospective statistic searches over time intervals t_min...t_max, where 1 ≤ t_min ≤ t_max ≤ T, while the prospective statistic searches over time intervals t_min...T, where 1 ≤ t_min ≤ T, adjusting correctly for multiple hypothesis testing in each case. We focus here on prospective analysis, since this is more relevant for our typical disease surveillance task.

2. "Learning from the past." In the space-time cluster detection task, we often do not have reliable denominator data (i.e., populations), so we must infer the expected counts b_i^t of recent days from the time series of previous counts c_i^t, taking into account effects such as seasonality and day of week. Some methods for inferring these expected counts were discussed in the previous section; see Neill et al. (2005b) for further discussion.

3. The "arrow of time." Time has a fixed directionality, moving from the past, through the present, to the future. We typically expect disease clusters to emerge in time: for example, a disease may start out having only minor impact on the affected population, then increase its impact (and thus the observed symptom counts) either gradually or rapidly until it peaks. Based on this observation, we propose a variant of the scan statistic designed for more rapid detection of emerging outbreaks (Neill et al., 2005b). The idea is that rather than assuming (as in the standard, "persistent" space-time scan statistic) that the disease rate q remains constant over the course of an epidemic, we expect the disease rate to increase over time, and thus we fit a model which assumes a monotonically increasing sequence of disease rates q_t at each affected time step t in the affected region. In Neill et al. (2005b), we show that this "emerging cluster" space-time scan statistic often outperforms the standard "persistent cluster" approach. We note that Iyengar (2005) accounts for a different aspect of the arrow of time: this method searches over truncated pyramid shapes in space-time, allowing detection of spatial clusters that move, grow, or shrink linearly with time.
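
The first two distinctions can be made concrete with a short Python sketch. The interval enumerations follow the definitions given above for the retrospective and prospective statistics. The baseline function is only an illustrative assumption: it averages past counts that fall on the same day of the week, which is one simple way to account for day-of-week effects and is not claimed to be the method of Neill et al. (2005b). All function names and the `period` parameter are hypothetical.

```python
import numpy as np

def retrospective_intervals(T):
    """All intervals (t_min, t_max) with 1 <= t_min <= t_max <= T."""
    return [(a, b) for a in range(1, T + 1) for b in range(a, T + 1)]

def prospective_intervals(T, W=None):
    """Intervals (t_min, T) that end at the present day T.

    If a temporal window size W is given, only start days within the
    last W days (T - W < t_min <= T) are kept.
    """
    first = 1 if W is None else max(1, T - W + 1)
    return [(a, T) for a in range(first, T + 1)]

def same_weekday_baseline(history, horizon, period=7):
    """Illustrative expected counts for the next `horizon` days.

    history: array of shape (n_past_days, n_locations) of past counts.
    Each predicted day is the mean of past days at the same position
    in the weekly cycle. Assumes n_past_days >= period.
    """
    history = np.asarray(history, dtype=float)
    n_past = history.shape[0]
    rows = []
    for k in range(horizon):
        day = n_past + k  # 0-based index of the day being predicted
        same_phase = np.arange(day % period, n_past, period)
        rows.append(history[same_phase].mean(axis=0))
    return np.vstack(rows)
```

For T days the retrospective search considers T(T + 1)/2 intervals, while the prospective search considers at most T (or W, when a window is used), which is one reason the two analyses need different adjustments for multiple testing.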

Taking these factors into account, the prospective space-time scan statistic has two main parts: inferring (based on past counts) what we expect the recent counts to be, and finding regions where the observed recent counts are significantly higher than expected. More precisely, given a "temporal window size" W, we wish to know whether any space-time cluster within the last W days has counts c_i^t higher than expected. To do so, we first infer the expected counts b_i^t = E[c_i^t] for all spatial locations s_i on each recent day t, T - W < t ≤ T. See Neill et al. (2005b), Kulldorff et al. (2005), and Kleinman et al. (2005) for methods of inferring these expected counts; earlier methods such as Kulldorff et al. (1998) and Kulldorff (2001) instead use at-risk populations determined from census data. Next, we choose the models H0 and H1(S, t_min), where the null hypothesis H0 assumes no clusters and the alternative hypothesis H1(S, t_min) represents a cluster in spatial region S starting at time t_min and continuing to the present time T. Neill et al. (2005b) gives two such models, one for persistent clusters and one for emerging clusters. From our model, we can derive the corresponding score function D(S, t_min) using the likelihood ratio statistic, and then find the space-time cluster (S*, t_min*) which maximizes the score function D. Finally, we can compute the statistical significance (p-value) of this space-time cluster by randomization testing, as above. More details of the space-time method described here, as well as empirical tests on several semi-synthetic outbreak data sets, are given in Neill et al. (2005b). We also refer the reader to Kulldorff et al. (1998, 2005), Kulldorff (2001), and Kleinman et al. (2005) for other useful perspectives on space-time cluster detection.
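
The steps just described (score each candidate cluster, maximize, then test by randomization) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from Neill et al. (2005b): it assumes a persistent-cluster Poisson model in which counts are Poisson with mean b_i^t under H0 and mean q·b_i^t (q > 1) inside the cluster under H1, it takes the candidate spatial regions as a user-supplied list, and it generates null replicas by drawing Poisson counts from the baselines. The function names and the `regions` argument are hypothetical.

```python
import numpy as np

def poisson_score(C, B):
    """Log-likelihood ratio of Poisson(q*B), q > 1, versus Poisson(B).

    C is the total observed count and B the total expected count of a
    candidate cluster; the score is zero unless C exceeds B.
    """
    if B <= 0 or C <= B:
        return 0.0
    return C * np.log(C / B) + B - C

def best_cluster(counts, baselines, regions, W):
    """Maximize the score over regions S and start days t_min.

    counts, baselines: arrays of shape (T, n_locations).
    regions: list of lists of location indices (assumed candidate set).
    Every interval runs from t_min to the last row (the present day),
    and t_min is a 0-based row index within the last W days.
    """
    T = counts.shape[0]
    best = (0.0, None, None)
    for S in regions:
        c = counts[:, S].sum(axis=1)     # daily counts aggregated over S
        b = baselines[:, S].sum(axis=1)  # daily expected counts over S
        for t_min in range(T - W, T):
            score = poisson_score(c[t_min:].sum(), b[t_min:].sum())
            if score > best[0]:
                best = (score, S, t_min)
    return best

def randomization_p_value(counts, baselines, regions, W,
                          n_replicas=99, seed=0):
    """Compare the observed maximum score with maxima from null replicas."""
    rng = np.random.default_rng(seed)
    observed = best_cluster(counts, baselines, regions, W)
    exceed = 0
    for _ in range(n_replicas):
        replica = rng.poisson(baselines)  # counts simulated under H0
        if best_cluster(replica, baselines, regions, W)[0] >= observed[0]:
            exceed += 1
    return observed, (exceed + 1) / (n_replicas + 1)
```

Because each replica is scored by its own maximum over all regions and start days, the resulting p-value accounts for the multiple candidate clusters searched. An emerging-cluster variant would replace `poisson_score` with a score fitted under non-decreasing rates q_t; that fitting step is not shown here.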
