What's Strange About Recent Events

The basic question asked by anomaly detection systems is whether anything strange has occurred in recent events. Answering it requires defining what it means to be recent and what it means to be strange. The WSARE algorithm (Wong and Moore, 2002; Wong et al., 2002; Wong et al., 2003) is one example of an algorithm that attempts to do this. It defines all patient records falling on the current day under evaluation to be recent events. This definition of recent is not restrictive: the approach is fully general, and "recent" can be defined to include all events within some other time period, such as the last six hours.

In order to define an anomaly, we need to establish the concept of something being normal. In WSARE version 2.0, baseline behavior is assumed to be captured by raw historical data from the same day of the week, in order to avoid environmental effects such as weekend versus weekday differences in the number of emergency department (ED) cases. The baseline period must be similar to the current day, which means it must be close enough to the current day to capture any seasonal or recent trends. On the other hand, the baseline period must also be sufficiently distant from the current day. This distance is required in case an outbreak begins on the current day but remains undetected: if the baseline period is too close to the current day, it will quickly incorporate the outbreak cases as time progresses.

In the description of WSARE 2.0 below, we assume that baseline behavior is captured by records that are 35, 42, 49, and 56 days prior to the day under consideration. We would like to emphasize that this baseline period is only used as an example; it can be easily modified to another time period without major changes to the algorithm. In a later section, we will illustrate how version 3.0 of WSARE automatically generates the baseline using a Bayesian network.
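The example baseline of 35, 42, 49, and 56 days can be illustrated with a short Python sketch. The function name `baseline_dates` and the chosen current day are hypothetical and used only for illustration; the offsets are the ones given in the text. Because every offset is a multiple of seven, each baseline date falls on the same day of the week as the current day.

```python
from datetime import date, timedelta

# Day offsets taken from the example baseline described in the text.
BASELINE_OFFSETS = (35, 42, 49, 56)


def baseline_dates(current_day, offsets=BASELINE_OFFSETS):
    """Return the baseline dates lying `offsets` days before `current_day`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return [current_day - timedelta(days=k) for k in offsets]


# Arbitrary illustrative current day (a Tuesday), not a date from the text.
current = date(2024, 1, 9)
for d in baseline_dates(current):
    # Every line printed shows the same weekday as `current`.
    print(d.isoformat(), d.strftime("%A"))
```

Switching to a different baseline period only requires passing another tuple of offsets, which matches the remark that the period can be changed without major changes to the algorithm.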

We will refer to the number of cases that match a certain rule on the current day as C_recent. Similarly, the number of cases matching the same rule from the baseline period will be called C_baseline. As an example, suppose the current day is Tuesday

November 2003
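The two counts can be sketched in Python as follows. This is a minimal illustration, not the WSARE implementation: it assumes each patient record is a dictionary with a `date` field, that a rule is a single attribute-value test, and that the baseline uses the example offsets of 35, 42, 49, and 56 days. The record layout, the field names, and the functions `count_matching` and `rule_counts` are all hypothetical.

```python
from datetime import date, timedelta


def count_matching(records, days, attribute, value):
    """Count records dated on one of `days` whose `attribute` equals `value`."""
    day_set = set(days)
    return sum(
        1 for r in records
        if r["date"] in day_set and r.get(attribute) == value
    )


def rule_counts(records, current_day, attribute, value,
                offsets=(35, 42, 49, 56)):
    """Return (c_recent, c_baseline) for the rule `attribute == value`.

    Assumption: the rule is a single attribute-value test and the baseline
    consists of the days `offsets` days before `current_day`.
    """
    baseline_days = [current_day - timedelta(days=k) for k in offsets]
    c_recent = count_matching(records, [current_day], attribute, value)
    c_baseline = count_matching(records, baseline_days, attribute, value)
    return c_recent, c_baseline


# Toy records with made-up values, purely for illustration.
today = date(2024, 1, 9)
records = [
    {"date": today, "gender": "male"},
    {"date": today, "gender": "female"},
    {"date": today, "gender": "male"},
    {"date": today - timedelta(days=35), "gender": "male"},
    {"date": today - timedelta(days=42), "gender": "female"},
    {"date": today - timedelta(days=49), "gender": "male"},
    {"date": today - timedelta(days=56), "gender": "male"},
    {"date": today - timedelta(days=7), "gender": "male"},  # outside the baseline
]

c_recent, c_baseline = rule_counts(records, today, "gender", "male")
print(c_recent, c_baseline)  # prints "2 3"
```

The record dated seven days back is ignored because it belongs neither to the current day nor to the baseline days, which reflects the requirement that the baseline stay sufficiently distant from the current day.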
