This section describes studies that address Hypotheses 2 and 3, which we reproduce here for convenient reference:

Hypothesis 2: ICD code sets can discriminate between whether there is an outbreak of type Y or not.

Hypothesis 3: Algorithmic monitoring of ICD code sets can detect an outbreak of type Y earlier than current best practice (or some other reference method).

As with chief complaints, challenges in studying the outbreak detection performance of ICD code sets include collecting data from multiple healthcare facilities in the region affected by an outbreak and achieving an adequate sample size of outbreaks for study. To emphasize the importance of sample size, we divide this section into studies of multiple outbreaks (N>1) and studies of single outbreaks (N=1). In contrast to chief complaints, no prospective studies of ICD code sets have been published.

12 They computed a PPV of 97%. We analyzed the data reported in the paper to compute this specificity result.

13 They computed a PPV of 65%. We analyzed the data reported in the paper to compute this specificity result.

N>1 Studies. Yih et al. (2005) used the detection algorithm method to study retrospectively 110 gastrointestinal disease outbreaks in Minnesota. They studied daily counts of the CDC/DoD Gastrointestinal, All code set. The detection algorithm they used was a space-time scan statistic algorithm (Kleinman et al., 2005). The ICD code data they studied included 8% of the population. The outbreaks ranged in size from 2 to 720 cases with a median outbreak size of 7. Half of the outbreaks were foodborne and approximately half were caused by viruses. They defined a true alarm as a cluster of disease identified by the detection algorithm that was within 5 km of the outbreak, and that occurred any time from one week prior to the start of the outbreak to one week after the outbreak investigation began. The sensitivity of Gastrointestinal, All for the detection of small gastrointestinal outbreaks at a false alarm rate of 1 every 2 years was 1%. They also measured sensitivity at false alarm rates of 1 per year (sensitivity = 2%), 1 per 6 months (3%), 1 per 2 months (5%), 1 per month (8%), and 1 per 2 weeks (13%). They did not measure timeliness of detection. The low percentage of population covered may partly explain the low sensitivity.
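The true-alarm definition above combines a distance criterion with a time window, and sensitivity is then the fraction of outbreaks matched by at least one true alarm. The following is a minimal sketch of that bookkeeping, not the code used by Yih et al. The names `Outbreak`, `is_true_alarm`, and `sensitivity` are hypothetical, and treating "within 5 km" as an inclusive bound is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Outbreak:
    start: date                # first day of the outbreak
    investigation_start: date  # day the outbreak investigation began


def is_true_alarm(distance_km, signal_day, outbreak,
                  max_km=5.0, window=timedelta(weeks=1)):
    """Apply the spatial and temporal criteria for a true alarm.

    Assumption: both bounds are inclusive; the source does not say.
    """
    near = distance_km <= max_km
    earliest = outbreak.start - window
    latest = outbreak.investigation_start + window
    return near and earliest <= signal_day <= latest


def sensitivity(detected_flags):
    """Fraction of outbreaks matched by at least one true alarm."""
    flags = list(detected_flags)
    return sum(flags) / len(flags)


# Illustrative (made-up) outbreak: began 10 March, investigated from 20 March.
example = Outbreak(start=date(2003, 3, 10), investigation_start=date(2003, 3, 20))
print(is_true_alarm(3.0, date(2003, 3, 5), example))   # inside both criteria
print(is_true_alarm(6.0, date(2003, 3, 12), example))  # too far away
```

Sensitivity at a given false alarm rate would then be computed by fixing the alarm threshold that yields that rate and applying `sensitivity` to the per-outbreak detection flags.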

N=1 Studies. Tsui et al. (2001) conducted the first study of outbreak detection accuracy and timeliness of ICD codes available soon after a patient visit. It is also the only study to compare the outbreak-detection performance of two ICD code sets for the same outbreak. They used correlation analysis and the detection algorithm method to study the accuracy and timeliness of an ILI and a respiratory ICD code set for the detection of an influenza outbreak.14 Pneumonia and influenza (P&I) deaths were used as the gold-standard determination of the outbreak. The correlation analysis showed a two-week lead time for both ICD code sets relative to P&I deaths.15 The detection algorithm analysis used the Serfling algorithm. Both ICD code sets had a sensitivity of 100% (1/1). The false alarm rate for the ILI code set was one per year and for the respiratory code set was one per three months. At these false alarm rates and sensitivities, both code sets detected the influenza outbreak one week earlier than P&I deaths did.

Miller et al. (2004) used correlation analysis and the detection algorithm method to study the ability of ICD data obtained from a large healthcare organization in Minnesota to detect an influenza outbreak. Using P&I deaths as a gold standard, Miller et al. demonstrated a significant Pearson correlation of 0.41 between an ILI ICD code set and P&I deaths. When they lagged the ILI time series by one week, the Pearson correlation remained 0.41. For the detection algorithm method, they used a CUSUM algorithm. CUSUM detected the outbreak from ILI (sensitivity of 1/1) one day earlier than the date the health department confirmed the first positive influenza isolate. They did not report the false alarm rate of the CUSUM algorithm for this detection sensitivity and timeliness.
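To make the CUSUM step concrete, the following is a minimal sketch of a one-sided (upper) CUSUM applied to daily counts. It is a generic textbook form, not the configuration used by Miller et al., whose parameters are not given here. The function name `cusum_alarms`, the reference value `k`, the decision threshold `h`, the reset-after-alarm rule, and the example counts are all illustrative assumptions.

```python
def cusum_alarms(counts, baseline_mean, baseline_sd, k=0.5, h=4.0):
    """Return the indices at which a one-sided upper CUSUM exceeds h.

    counts        -- sequence of daily visit counts for a code set
    baseline_mean -- expected daily count in the absence of an outbreak
    baseline_sd   -- standard deviation of the baseline counts (must be > 0)
    k, h          -- reference value and threshold in standard-deviation
                     units (illustrative defaults, not from the study)
    """
    s = 0.0
    alarms = []
    for i, count in enumerate(counts):
        z = (count - baseline_mean) / baseline_sd
        # Accumulate only deviations that exceed the reference value k.
        s = max(0.0, s + z - k)
        if s > h:
            alarms.append(i)
            s = 0.0  # assumption: restart accumulation after each alarm
    return alarms


# Made-up series: flat around 10 visits per day, then a sustained rise.
daily_ili = [10, 11, 9, 10, 12, 10, 16, 18, 20, 22]
print(cusum_alarms(daily_ili, baseline_mean=10, baseline_sd=2))
```

Raising `h` lowers the false alarm rate at the cost of later detection, which is why sensitivity and timeliness figures are interpretable only alongside the false alarm rate at which they were obtained.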

Lazarus et al. (2002) used correlation analysis to study the ability of ICD codes to detect an influenza outbreak. They obtained the ICD codes from a point-of-care system used by a large multispecialty physician group practice. Clinicians assigned ICD-coded diagnoses at the time of either an in-person or phone consultation. They used hospital discharge data as a gold standard. They created two time series using the same lower respiratory ICD code set (based on the DoD-GEIS respiratory code set): one from their point-of-care data and one from the hospital discharge data. The maximum correlation of these time series was 0.92 when point-of-care ICD codes preceded hospital-discharge ICD codes by two weeks.
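The lagged correlation analysis used in these studies can be sketched as follows: shift the earlier series forward by 0, 1, 2, ... periods, compute the Pearson correlation with the gold-standard series at each shift, and report the shift with the largest correlation as the lead time. This is a minimal illustration under stated assumptions, not the authors' code. The function names `pearson` and `best_lead` are hypothetical, the two series are assumed to be equally long and sampled at the same interval, and the example values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_lead(early, reference, max_lag):
    """Return (lag, r) for the lag at which `early` best predicts `reference`.

    A lag of L pairs early[t] with reference[t + L], i.e. the early
    series precedes the reference series by L periods.
    """
    results = []
    for lag in range(max_lag + 1):
        x = early[:len(early) - lag]
        y = reference[lag:]
        results.append((lag, pearson(x, y)))
    return max(results, key=lambda pair: pair[1])


# Made-up weekly counts: the reference series is the early series delayed by 2 weeks.
point_of_care = [1, 2, 5, 9, 5, 2, 1, 1, 1, 1]
discharge = [1, 1, 1, 2, 5, 9, 5, 2, 1, 1]
lag, r = best_lead(point_of_care, discharge, max_lag=4)
print(lag, round(r, 2))
```

Each additional lag shortens the overlapping portion of the two series, so correlations at large lags rest on fewer data points and should be read with more caution.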

Lewis et al. (2002) used correlation analysis to study the ability of ICD codes to detect an outbreak of influenza. They used CDC sentinel physician data for influenza as a gold standard. Physicians assigned ICD codes at outpatient DoD facilities. The correlation of the DoD-GEIS respiratory code set with the CDC sentinel physician data was 0.7. They did not measure the correlation at time lags other than zero.

Suyama et al. (2003) used correlation analysis to study whether ICD codes contain a signal of notifiable diseases reported to the health department. They used health department data about notifiable diseases as a gold standard. Billing administrators assigned ICD codes to ED visits. They created time series from both data sets using the same ICD code sets: gastrointestinal, pulmonary, central nervous system, skin, fever, and viral. They found statistically significant correlations between the gastrointestinal, pulmonary, and central nervous system code sets and notifiable diseases. In contrast, the skin, fever, and viral code sets did not have a statistically significant correlation with notifiable diseases.
