When measuring the epidemic detection performance of an NLP application in biosurveillance, the question being addressed is: How well does the NLP application contribute to detection of an outbreak? Evaluating epidemic detection is difficult. The first requirement for an epidemic detection study is reference standard identification of an outbreak. Outbreaks of respiratory and GI illnesses, such as influenza, pneumonia, and gastroenteritis, occur yearly throughout the country. Outbreaks of other infectious or otherwise concerning diseases, such as anthrax, West Nile virus, hemorrhagic fever, or SARS, rarely occur in the United States. Once an outbreak is identified, the next requirement is access to textual data for an adequate sample of patients living in the geographical area of the outbreak.
One example of an evaluation of epidemic detection involving NLP technology was performed by Ivanov et al. (2003). The evaluation used ICD-9 discharge diagnoses to define retrospective outbreaks of pediatric respiratory and gastrointestinal syndromes over a five-year period (1998-2001) in four contiguous counties in Utah. Two outcome measures were reported: the correlation between chief complaint classifications and ICD-9 classifications, and the timeliness of detection. Figure 17.7, from the Ivanov publication, shows the time series plot of respiratory illness admissions (the reference standard) and chief complaints. The plot shows that chief complaints generated the same type of signal as the reference standard. Chief complaint classification detected three respiratory outbreaks with 100% sensitivity and specificity, and the time series of chief complaints correlated with hospital admissions and preceded them by an average of 10.3 days.
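The correlation and timeliness measures described above can be illustrated with a lagged correlation between two daily count series. The following is only a minimal sketch and is not the method used by Ivanov et al.: the function names (`pearson`, `lagged_correlation`, `best_lead`) are hypothetical, the counts are synthetic, and a real study would likely use detrended series and a formal detection algorithm.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlation(early, late, max_lag):
    """Correlate early[t] with late[t + lag] for each lag in 0..max_lag.

    Assumes both series cover the same days and have the same length.
    """
    result = {}
    for lag in range(max_lag + 1):
        n = len(early) - lag  # number of overlapping days at this lag
        result[lag] = pearson(early[:n], late[lag:lag + n])
    return result


def best_lead(early, late, max_lag):
    """Return (lag, correlation) for the lag with the highest correlation."""
    corr = lagged_correlation(early, late, max_lag)
    lag = max(corr, key=corr.get)
    return lag, corr[lag]


if __name__ == "__main__":
    # Synthetic daily counts (assumption, not study data): admissions
    # repeat the chief complaint curve three days later.
    complaints = [0, 0, 1, 3, 7, 12, 7, 3, 1, 0, 0, 0, 0, 0, 0, 0]
    admissions = [0, 0, 0] + complaints[:-3]
    lag, r = best_lead(complaints, admissions, max_lag=7)
    print(f"chief complaints lead admissions by {lag} days (r = {r:.2f})")
```

The lag with the highest correlation gives a rough estimate of how many days the chief complaint signal precedes the reference standard signal.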
A study by Irvin et al. (2003) showed that numeric chief complaints could correctly detect an influenza outbreak between 1999 and 2000 with one false positive alarm. Although the chief complaints were numeric instead of textual, the same study design could be applied to free-text chief complaint classification for known outbreaks.
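Outbreak-level outcomes of this kind (outbreaks detected, false positive alarms) can be tallied by comparing alarm days against reference standard outbreak periods. The sketch below is illustrative only: `score_alarms`, the day indices, and the outbreak intervals are assumptions for the example and are not taken from either study.

```python
def score_alarms(alarm_days, outbreaks):
    """Score detector alarms against reference standard outbreak periods.

    alarm_days: iterable of integer day indices on which an alarm fired.
    outbreaks:  list of (start_day, end_day) inclusive intervals.
    """
    alarm_days = sorted(set(alarm_days))

    # An outbreak counts as detected if at least one alarm falls inside it.
    detected = sum(
        1 for start, end in outbreaks
        if any(start <= day <= end for day in alarm_days)
    )

    # An alarm is a false positive if it falls outside every outbreak period.
    false_alarms = sum(
        1 for day in alarm_days
        if not any(start <= day <= end for start, end in outbreaks)
    )

    sensitivity = detected / len(outbreaks) if outbreaks else None
    return {
        "detected": detected,
        "sensitivity": sensitivity,
        "false_alarms": false_alarms,
    }


if __name__ == "__main__":
    # Hypothetical reference outbreaks and alarm days.
    reference = [(10, 20), (50, 60)]
    alarms = [12, 13, 35, 55]
    print(score_alarms(alarms, reference))
```

Counting false alarms per day is one possible convention; a study might instead merge consecutive alarm days into a single alarm episode before counting.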
Evaluating feature detection is an important first step in evaluation of NLP techniques to ensure that the technology is working as expected. However, to truly understand the impact of NLP in outbreak and disease surveillance, evaluations of case detection and epidemic detection must also be performed.