Case Detection, Outbreak Detection, and Outbreak Characterization

RODS Laboratory, Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
San Diego County, Health and Human Services Agency and Graduate School of Public Health, Epidemiology and Biostatistics, San Diego State University, San Diego, California
Center for Public Health Practice, University of Pittsburgh, Graduate School of Public Health, Pittsburgh, Pennsylvania

In Chapter 2, we saw how some well-known outbreaks first came to the attention of investigators and how...

Timeliness

The postmortem medical examination, conducted only rarely, produces data about an event that is inherently late (death), using methods that are time consuming (establishing with certainty the cause of death). This implies an inherent limitation to the timeliness of medical examiner data. The medical examiner can make available the results of the physical examination soon after its completion, thanks to the use of electronic record systems. A biosurveillance system could access the data...

Info

Figure 21.1 Weekly counts for several types of routinely collected data for various time periods around the December 1999 influenza outbreak in Pittsburgh. Each data type is plotted on a normalized scale. Lab, influenza cultures from the University of Pittsburgh Medical Center Health System; WebMD, counts of queries to a national web health site using words such as cold and flu; Cough and cold and cough syrup, grocery chain point of purchase counts...

Feature Detection

The first type of NLP evaluation should measure the application's ability to detect features from text. The question being addressed when quantifying the performance of feature detection for the domain of biosurveillance is: How well does the NLP application determine the values of the variables of interest from text? For our SARS detector, examples of feature detection evaluations include how well the NLP application can determine whether a patient has a respiratory-related chief complaint,...

Summary of Linguistic Issues

We have described some of the linguistic characteristics of the sublanguage of patient medical records, including linguistic variation, polysemy, negation, contextual information, finding validation, implication, and coreference. If we want to automatically determine an individual patient's values for the variables used in our expert system, we must address these linguistic characteristics, using the types of information a physician uses to understand the meaning of the words and sentences in...

Employer And Military Attendance Reporting

In contrast to the situation in schools, business and financial needs drive the design and operation of employer attendance systems. Further, successively higher levels of the business hierarchy are concerned primarily with operational status. They ask, "Is our business unit functioning properly today?" and not "Who is absent today and why are they absent?" In general, tracking attendance serves human resource and compensation needs, and departments that track attendance design their work processes...

Determining Whether Bayesian Algorithms Are Well Calibrated

For those algorithms that compute posterior probabilities (of a case, an outbreak, or an outbreak characteristic) from surveillance data, it is important to know whether the posterior probabilities are accurate; that is, do they in the long run reflect the actual frequency of cases or outbreaks as determined by a gold standard? We refer to an algorithm that satisfies this requirement as well calibrated. For example, if an evaluator runs a case detection algorithm on a sample of 100 patients of...
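The calibration check described above can be sketched in Python. This is a minimal illustration, not the chapter's procedure: the function name `calibration_table`, the equal-width binning, and the choice of ten bins are all assumptions made for the example. It groups cases by their posterior probability and compares the mean posterior in each group with the frequency of true cases according to the gold standard.

```python
from collections import defaultdict


def calibration_table(posteriors, outcomes, n_bins=10):
    """Compare predicted probabilities with observed frequencies, bin by bin.

    posteriors: posterior probabilities in [0, 1] produced by the algorithm.
    outcomes:   gold-standard labels (1 = true case or outbreak, 0 = not).
    Returns rows of (bin index, count, mean posterior, observed frequency).
    The equal-width binning is an illustrative assumption.
    """
    bins = defaultdict(list)
    for p, y in zip(posteriors, outcomes):
        # A posterior of exactly 1.0 is placed in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for idx in sorted(bins):
        members = bins[idx]
        mean_p = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((idx, len(members), mean_p, observed))
    return rows


if __name__ == "__main__":
    # Hypothetical data: 10 patients scored 0.15 and 10 patients scored 0.85.
    posteriors = [0.15] * 10 + [0.85] * 10
    outcomes = [1] + [0] * 9 + [1] * 9 + [0]
    for idx, n, mean_p, observed in calibration_table(posteriors, outcomes):
        print(f"bin {idx}: n={n} mean posterior={mean_p:.2f} observed={observed:.2f}")
```

For a well-calibrated algorithm, the mean posterior and the observed frequency should be close in every bin once each bin holds enough cases.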

CBBS And CBBS Projects

A computer-based biosurveillance system (CBBS) collects and analyzes surveillance data. A CBBS project manager must understand not only generic project management but also the information technology (IT) underlying a project and how such a project differs from a typical IT project. Common IT elements found in a CBBS include the following: a data warehouse as a data repository for the often very large databases (VLDBs); data mining and statistical tools to analyze the data in the data warehouse....

Technologies For Natural Language Processing

NLP techniques fall into two broad classes: statistical and symbolic. Statistical techniques use information from the frequency distribution of words within a text to classify or extract information. Symbolic techniques use information from the structure of the language (syntax) and the domain of interest (semantics) to interpret the text to the extent necessary for encoding the text into targeted categories. Although some NLP applications exclusively use one or the other technique, many...

Introduction

The Internet is revolutionizing biosurveillance. It is already the electronic network over which people and biosurveillance systems located anywhere on the planet communicate and exchange data. Modern biosurveillance systems (e.g., Real-time Outbreak and Disease Surveillance [RODS], National Retail Data Monitor [NRDM], web-based disease reporting systems, BOSSS, ESSENCE, and BioSense) would not be possible without the Internet. Hospitals, laboratories, schools, retailers, and individuals...

Background

Early, reliable detection of outbreaks of disease, whether natural (e.g., West Nile virus and SARS) or bioterrorist induced (e.g., anthrax and smallpox), is a critically important problem today. We need to detect outbreaks as early as possible in order to provide the best response and treatment, as well as improve the chances of identifying the source. Outbreaks often present signals that are weak and noisy early in the event. If we hope to achieve rapid and reliable detection, it likely will...

Internet As Sentinel I: ProMED-mail

ProMED is the Program for Monitoring Emerging Diseases. As described on its website, ProMED-mail is "an Internet-based reporting system dedicated to rapid global dissemination of information on outbreaks of infectious diseases and acute exposures to toxins that affect human health, including those in animals and in plants grown for food or animal feed." ProMED-mail is developed and maintained by the International Society for Infectious Diseases and the Harvard School of Public Health, and is...

Case Detection

The question being addressed when measuring the case detection ability of an NLP application for the domain of biosurveillance is: How well does the NLP application identify relevant patients from textual data? For our SARS detector, examples of case detection evaluations include how well the NLP application can determine whether a patient has a respiratory syndrome, whether a patient has a fever, whether a patient has radiological evidence of pneumonia, or whether a patient has SARS. Figure 17.6...

Governmental Oversight

Government regulation of potable water use falls into two categories. The Clean Water Act (CWA), originally enacted in 1972 and subsequently amended, covers discharge of wastewater. It established the basic framework for regulation of pollutant release into U.S. waters and provided the authority to execute plans to control pollution (EPA, n.d.). This act established water pollution guidelines, particularly water quality standards for industrial wastewater. The Safe Drinking Water Act (SDWA),...

Additional Resources

National Atmospheric Release Advisory Center (NARAC) website: http://narac.llnl.gov/index.html. This is the NARAC website, which is discussed in this chapter.
Environmental Protection Agency Support Center for Regulatory Air Models website: www.epa.gov/ttn/scram/. This website lists several models of atmospheric dispersion that the EPA recommends for use in regulatory applications.
Hazard Prediction and Assessment Capability (HPAC) website: hpac.cfm. This is the HPAC website, which is discussed in this...

Elsevier

AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Academic Press is an imprint of Elsevier 30 Corporate Drive, Suite 400, Burlington, MA 01803, USA 525 B Street, Suite 1900, San Diego, California 92101-4495, USA 84 Theobald's Road, London WC1X 8RR, UK This book is printed on acid-free paper. Copyright 2006, Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic...

Calculation of Populations/Baselines

In the discussion of the spatial scan methods above, we have paid relatively little attention to the question of how the underlying populations or baselines are obtained. In the population-based methods, we often start from census data, which gives an unadjusted population p_i corresponding to each spatial location s_i. This population can then be adjusted for covariates such as the distribution of patient age and gender, giving an estimated "at-risk population" for each spatial location. In a...

References

Campbell, M., Li, C.-S., Aggarwal, C., et al. (2004). An Evaluation of Over-the-Counter Medication Sales for Syndromic Surveillance. IBM Research Technical Report. IBM Research Division, Thomas J. Watson Research Center, Yorktown Heights, New York. Retrieved from
Chapman, W., Haug, P. (1999). Comparing expert systems for identifying chest x-ray reports that support pneumonia. In Proceedings of American Medical Informatics Association Symposium, 216-20.
Chapman, W., Wagner, M., Cooper, G., et...

Implications for Biosurveillance

A 911 system is actually a set of components that work together to provide communications and data storage functions. The component with the most immediate potential for biosurveillance is the CAD database, which collects, in real time, in addition to the information listed above (the problem, previous call history from the location), information from emergency responders at the location. This information includes clinical information regarding patients and sick animals, contaminants and human...

He promulgated the use of standards for the naming of laboratory tests and the reporting of their results (discussed in detail in Chapter 32). Until these standards are more universally used in health care, however, the construction of biosurveillance systems that collect laboratory data from hospitals will be time-consuming and expensive (the monitoring of laboratory test results and orders from national laboratory companies is far more feasible at present, as we...

Molecular Methods

Direct culturing is the standard assay for analysis of biological agents in water supplies, except for Cryptosporidium, which cannot be grown. The time required for organisms to grow on culture media limits the usefulness of this method when an agency must respond quickly to a crisis. The recent progress in developing genomic sequence databases has led to the development of a number of molecular methods that are capable of more rapid, highly specific analyses. These assays use biomolecules,...

Once we have the sample mean and standard deviation, we can calculate the upper control limit as shown in Eq. 3. We can signal an alarm if the actual count exceeds the upper control limit, and the larger the amount by which we exceed the upper control limit, the greater the severity of the alarm. The alarm level is defined as the probability of seeing the count X_i or a lesser count under the assumption that the observations are distributed according to a normal distribution with mean μ and...
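The control-chart logic in this passage can be sketched as follows. Eq. 3 is not reproduced in this excerpt, so the form used here (sample mean plus k sample standard deviations, with k = 3 as a default) is an assumption, as are the function and parameter names. The alarm level follows the definition in the text: the probability of the observed count or a lesser count under a normal distribution fitted to the historical counts.

```python
from statistics import NormalDist, mean, stdev


def control_chart_alarm(history, today, k=3.0):
    """Return (upper control limit, alarm flag, alarm level) for today's count.

    history: historical counts used to estimate the mean and standard deviation.
    today:   the count being tested.
    k:       number of standard deviations above the mean (assumed form of Eq. 3).
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; must be > 0 below
    upper_control_limit = mu + k * sigma
    # Alarm level: P(X <= today) under Normal(mu, sigma), as defined in the text.
    alarm_level = NormalDist(mu, sigma).cdf(today)
    return upper_control_limit, today > upper_control_limit, alarm_level


if __name__ == "__main__":
    # Hypothetical daily counts from a quiet baseline period.
    baseline = [10, 12, 11, 9, 13, 10, 12, 11]
    ucl, alarm, level = control_chart_alarm(baseline, today=20)
    print(f"UCL={ucl:.2f} alarm={alarm} alarm level={level:.6f}")
```

A history with no variation gives a standard deviation of zero, in which case `NormalDist.cdf` raises an error; a production system would need to handle that case explicitly.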

Summary

Clinical data collected by the healthcare system are a rich source of data for biosurveillance. They include the data needed for earlier detection of cases and outbreaks and for more rapid characterization of outbreaks. Clinical data in the United States at present, however, are not highly available for biosurveillance (other than that practiced by hospital infection control). The barriers include the use of paper records, multiple departmental information systems, and nonstandard data formats....

Web pages

Australian Biosecurity Cooperative Research Centre for Emerging Infectious Disease is a collection of resources and organizations in Australia. The mission is to protect Australia's public health, livestock, wildlife, and economic resources through research and education that strengthens the national capability to detect, diagnose, identify, monitor, assess, predict, and respond to emerging infectious disease threats that impact on national and regional biosecurity. This organization provides...

EVR Use: Private Companion Animal Veterinary Practice

The degree of acceptance of EVRs in veterinary hospitals is about the same as the market penetration of point-of-care systems in medicine. In 2001, no more than 10% of veterinary hospitals had them installed; by the end of 2005, 20 of veterinary schools in the United States expect to have deployed EVRs. Very few private practices use them; budget constraints and a lack of familiarity with these systems have limited acceptance. These systems range from complete electronic veterinary systems with...

Agribusiness

The aggregated livestock farming systems provide a large interface between humans and animals, and farm animals provide a large interface between themselves and wildlife (especially via insect vectors and rodents). Thus, the health of farm animals is of importance. All commercial livestock systems follow the same template: the cost-effective production of sufficient high-quality animal product in a regular, sustainable, and reliable manner. Profit from livestock farming is determined from the...

Estimating Downwind Contamination

Once an outbreak has been detected, responders can use dispersion models to further characterize the outbreak. A key aspect of characterization is identification of exposed populations who are not yet symptomatic so that prophylactic treatment can be directed toward them. Using estimates of release location and time (perhaps those obtained from the use of dispersion models in the analysis of biosurveillance data), responders can "simulate" the outbreak to inform decisions about prophylactic...

Governmental Agencies

Government agencies at the federal, state, and local level play a role in food inspection and regulation. We discussed the government regulation of food animals in Chapter 7, including the roles of the USDA and state departments of agriculture. In this chapter, we discuss the Food and Drug Administration (FDA), which has the responsibility for safety of food and drugs, and provide additional information on the role of state departments of agriculture relevant to monitoring of the food supply.

ICD Codes

The International Classification of Diseases, 9th Revision, Clinical Modification (ICD) is a standard vocabulary for diagnoses, symptoms, and syndromes (see Chapter 32). ICD has a code for each class of diagnoses, syndromes, and symptoms that it covers. For example, the ICD code 034.0 Streptococcal sore throat includes tonsillitis, pharyngitis, and laryngitis caused by any species of Streptococcus bacteria. There are more than

Figure 23.5 Daily counts of constitutional and respiratory syndrome....

The Scope Of Biosurveillance

The word biosurveillance is of recent origin. Biosurveillance overlaps with two existing terms: disease surveillance and public health surveillance. These terms are defined as systematic methods for the collection and analysis of data for the purpose of detecting disease (Thacker and Berkelman, 1988; Halperin and Baker, 1992; Teutsch and Churchill, 2000). As with any new word, we could speculate whether its invention and growing usage signals the appearance of a new field or simply reflects an...

Description Of OTC Data

The retail industry has built significant infrastructure to capture data about product sales for business purposes. Retailers analyze sales data for supply chain management and to make decisions, such as which product lines to carry in which stores, how and when to promote products, which stores to close, and where to open new stores. The vast majority of retailers in the United States collect sales data using optical cash registers. Clerks scan the familiar barcode imprinted on every product...

The Physical Internet

The physical Internet is the "network of networks," a set of linked electronic networks (much like the telephone system) that computers worldwide use to communicate with each other. Computers transmit information across the Internet using several computer languages, known as protocols. The two most important protocols are IP (Internet protocol) and TCP (transmission control protocol). IP specifies a standard form of packets of data transmitted across the Internet through a network of intelligent...

Governmental Public Health

University of California Los Angeles, School of Public Health, Department of Epidemiology, Los Angeles, California
Division of Infectious Disease Epidemiology, Pennsylvania Department of Health, Harrisburg, Pennsylvania
RODS Laboratory, Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania

In the report The Future of Public Health, the Institute of Medicine defines public health as what we, as a society, do collectively to assure the conditions for people to be...

Networks Of Laboratories

The organization of laboratories into collaborative networks has been ongoing for more than a decade. The concept of laboratory networks arose from the need to ensure that critical laboratory services were available throughout the country (Gilchrist, 2000). The CDC, FDA, USDA, and state governmental laboratories formed many of the original partnerships that grew into the laboratory networks that exist today. Early networks included the National Laboratory System (NLS) of clinical, public health...

Atmospheric Dispersion Models

A model of atmospheric dispersion is an algorithm that predicts the downwind concentrations of a substance that result when a given quantity of the substance is released into the air. Atmospheric dispersion models require as input (at a minimum) the quantity of substance released, the release location, and weather conditions. Many atmospheric dispersion models exist. They have been developed for pollution control and modeling the spread of radioactive material from accidents at nuclear power...
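To make the inputs and outputs of such a model concrete, the sketch below implements a steady-state Gaussian plume, a common textbook dispersion model. The passage does not say which model the authors have in mind, so the choice of model is an assumption made for illustration. The function name and parameters are hypothetical, and the spread coefficients follow a commonly cited open-country form for neutral conditions; treat them as placeholders rather than validated values.

```python
import math


def gaussian_plume(q, u, x, y, z, h):
    """Estimate the downwind concentration from a continuous point release.

    q: release rate (mass per second)
    u: wind speed along the x axis (m/s), standing in for weather conditions
    x: downwind distance from the release location (m)
    y: crosswind distance from the plume centerline (m)
    z: receptor height above ground (m)
    h: effective release height (m)
    Returns concentration in mass per cubic meter.
    """
    if x <= 0:
        return 0.0  # this simple model predicts nothing upwind of the source

    # Plume spread grows with downwind distance (illustrative coefficients).
    sigma_y = 0.08 * x / math.sqrt(1 + 0.0001 * x)
    sigma_z = 0.06 * x / math.sqrt(1 + 0.0015 * x)

    lateral = math.exp(-(y ** 2) / (2 * sigma_y ** 2))
    # The second term models reflection of the plume off the ground.
    vertical = (math.exp(-((z - h) ** 2) / (2 * sigma_z ** 2))
                + math.exp(-((z + h) ** 2) / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


if __name__ == "__main__":
    # Hypothetical release: 1 g/s at 10 m height, 3 m/s wind, receptor 1 km downwind.
    for crosswind in (0, 100, 200):
        c = gaussian_plume(q=1.0, u=3.0, x=1000, y=crosswind, z=0, h=10)
        print(f"y={crosswind:>3} m  concentration={c:.3e} g/m^3")
```

The three minimum inputs named in the text map onto the parameters directly: quantity released (`q`), release location (the origin of `x`, `y`, and `h`), and weather (`u` and the spread coefficients).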

Other Organizations That Conduct Biosurveillance

RODS Laboratory, Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania
Lieutenant Colonel, US Army, Uniformed Services, University of the Health Sciences, Washington, DC
Colonel, US Air Force, Force Health Readiness, Tricare Management Activity Deployment Health Support Directorate, Washington, DC
Biodefense Laboratory, Wadsworth Center, New York State Department of Health, Albany, NY

This chapter completes our review of the organizations listed in Table 1.2...

Semisynthetic Test Data

Evaluators can generate semisynthetic data by injecting geometrically shaped spikes into real surveillance data collected during non-outbreak periods (Goldenberg et al., 2002; Zhang et al., 2003; Reis et al., 2003; Reis and Mandl, 2003). This technique was used for illustration in Chapter 14. The advantage of the semisynthetic approach is that it allows the evaluator to manipulate the spike size to find the smallest spike that the algorithm can detect above the background noise in real...
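The semisynthetic approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of the cited studies: the linear-ramp spike shape, the function names, and the `detector` callable (any function that returns True when it signals on a series) are all hypothetical.

```python
def inject_spike(baseline, start, peak, duration):
    """Return a copy of baseline with a linear-ramp spike added.

    baseline: real counts collected during a non-outbreak period.
    start:    index of the first day of the injected spike.
    peak:     extra cases added on the final day of the ramp.
    duration: number of days over which the spike rises to its peak.
    """
    series = list(baseline)
    for offset in range(duration):
        day = start + offset
        if day >= len(series):
            break  # the spike runs past the end of the available data
        series[day] += round(peak * (offset + 1) / duration)
    return series


def smallest_detectable_peak(baseline, detector, start, duration, max_peak):
    """Find the smallest spike size that the detector flags, or None."""
    for peak in range(1, max_peak + 1):
        if detector(inject_spike(baseline, start, peak, duration)):
            return peak
    return None


if __name__ == "__main__":
    quiet_week = [10, 10, 10, 10, 10, 10, 10]   # hypothetical baseline counts
    print(inject_spike(quiet_week, start=2, peak=6, duration=3))

    def toy_detector(series):
        # Stand-in for a real detection algorithm: signal above 15 cases.
        return max(series) > 15

    print(smallest_detectable_peak(quiet_week, toy_detector, 2, 3, max_peak=20))
```

Because the background comes from real data, the search in `smallest_detectable_peak` reflects how the algorithm behaves against realistic noise, which is the advantage the text describes.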

Wildlife Databases

ZIMS will replace existing ISIS database systems such as MedArks and will provide a more accurate and comprehensive system. ZIMS will be Web based, thus allowing users to see collections of animal data from multiple institutions in real time from any authorized computer anywhere in the world. Data will include information on animal health, laboratory accessions, genetics, disease investigations, and animal resource management. ZIMS will enhance local care and international conservation efforts...

Office International des Epizooties

Handbook of Biosurveillance ISBN 0-12-369378-0

The OIE (http://www.oie.int/eng/en_index.htm) is the World Organization for Animal Health. It sets guidelines and provides recommendations to minimize the risk of spreading animal diseases and pests while facilitating trade between nations. The OIE lists its missions as the following:

1. To guarantee the transparency of animal disease status worldwide
2. To collect, analyze, and disseminate veterinary scientific information
3. To provide expertise and...

EVR Use: Institutional Veterinary Practice

The care of these animals is not subject to the same privacy and confidentiality regulations as are privately owned pets; moreover, public zoos and parks are subject to the Freedom of Information Act, wherein interested parties may demand to inspect documents related to the care of animals at a given facility. However, this relative openness is not limitless. Zoos are understandably sensitive to disclosures, which may affect their business adversely if the public were to avoid the zoo for fear...

Definitive and Confirmatory Tests

A definitive or confirmatory test is a test that will identify with a very high degree of certainty the true identity of an agent. These tests have a very low likelihood of providing a false-positive result. Many of the definitive and confirmatory assays used today are molecular assays, which detect genetic material that is specific to a bacterium, virus, protozoan, or other organism. Nucleic-acid-based assays rely on the unique differences found in the structure of single strands of DNA and...

Pediatric Electrolytes

Hogan et al. (2003) used correlation analysis and the detection algorithm method to study the sales of pediatric electrolyte products during outbreaks in children. Pediatric electrolyte products are solutions of salts and water that are indicated for the treatment of dehydration in children ages five years and under. They studied the sales of pediatric electrolytes during annual winter outbreaks due to diseases such as rotavirus gastroenteritis and influenza in children ages five and under....

Laboratory Infrastructure

Because the release of a biological agent is intended to have devastating effects on the population, laboratory networks have been established to standardize the issues associated with bioanalysis, including sampling, testing, reporting, and sample disposal. These networks provide a hierarchical approach to identification and confirmation of biological agents, allowing water suppliers to become involved in the testing process and obviating the need for trained responders to take action in every...

Interoperating With Other Organizations

As we have discussed, disease surveillance by health departments involves daily interactions between health departments and many other organizations. During outbreak investigations, health departments interact with an even larger number of organizations, potentially every type of organization listed in Table 1.2 (see Chapter 1). In this section, we discuss interactions not discussed elsewhere in this book, including relationships among adjacent health departments and with law enforcement. We...

Chief Complaints

The concept of a chief complaint is important in medicine. It is a statement of the reason that a patient seeks medical care. Medical and nursing schools teach future clinicians to begin their verbal presentations of patient cases with a statement of the chief complaint. They teach them to record the chief complaint using the patient's words and to avoid replacing the patient's words with their diagnostic interpretation. It is considered bad form to proffer a diagnostic impression in a chief...

Commercial Assistance Call Centers

Commercial transport companies, such as truckers and railroads, do not participate in 911; however, they operate control centers that stay in contact with their trucks and trains, and these systems, along with advanced logistics systems central to their business, collect information that may be of value for biosurveillance. Larger enterprises use GPS to track the location of their trains and trucks. Their data systems contain information about what substances their trains and trucks are carrying...

Case Studies of the National Retail Data Monitor

There is a systematic effort to study the effects of outbreaks on OTC sales data that is part of the National Retail Data Monitor (NRDM) project. (We discuss the project itself in greater detail in Section 5 of this chapter.) Each case study describes the effect of a single outbreak or other public health event, such as low air quality due to forest fires, on OTC sales (and other types of data available for the event). At present, these case studies are available only to authorized public...

Limitations

The biggest limitation, bar none, to the effective use of satellite data is twofold: an astronomical amount of data flows to us from the skies, and trained analysts available to sift through it are few. Moreover, the number of analysts with domain knowledge to perform biosurveillance constitutes a subset of these analysts. Currently, the probability that space-based surveillance would miss the early signs of an epidemic or bioterrorism incident is high, if only because a small hillock of...

Prescription Drugs

Prescriptions (either filled or presented to pharmacists) are often included in lists of data with potential value in biosurveillance. There have been studies of the use of prescriptions to detect cases of disease (especially tuberculosis), and one study of the effect of a public health intervention for a pertussis outbreak on Medicaid claims for macrolide antibiotics, but none of outbreaks. It is anecdotally reported that at the time that AIDS was first uncovered in 1981, the CDC's...

Presumptive Diagnostic Tests

Presumptive diagnostic tests are procedures that, when properly performed, may indicate the presence of a particular agent or closely related agents. Presumptive diagnostic tests for biologic and chemical agents are more complex and require more time to perform than do the simple tests described above. Most bacterial and viral agents can readily be grown in culture media or cell culture if the appropriate conditions are met. These conditions include the appropriate temperature, pH, and a source...

Standard Evaluation Method for an Outbreak Detection Algorithm

Biosurveillance systems use outbreak detection algorithms to analyze surveillance data and search for signs of an outbreak. As discussed in previous chapters, the surveillance data are typically formed into time series of daily or weekly counts prior to analysis. For each unit in the time series, an outbreak detection algorithm calculates a value that is a measure of how unusual one or more recent counts are, when compared to historical counts. If the degree of anomaly is above some threshold,...

Results: Success of the Program

Subjective user response to the system was good. In terms of the perceived value of the system to the user, 56.0% of the users responding believed the system to be very valuable. The SHARE system itself was widely praised in comments as user-friendly, and 66.7% of users reported that they need

Table 24.1 Submission of Reports During the Pilot Study

Attendance Office                       Elementary School   Middle/High School   Countywide Total
Mean length of time to submit report    1.3 days            5.1 days             2.7 days
Reports...

Model Inversion

We can invert either the dispersion model alone or a combined model that includes the dispersion model and a model of the effects of an aerosol cloud of a biological agent (Figure 19.5). We refer to the former inversion as two-stage inversion because we must invert the aerosol-effects model and the dispersion model separately. The inverted aerosol-effects model, using biosurveillance data as input, estimates downwind concentrations for input into the inverted dispersion model. We refer to the...

Overview Of Spatial Cluster Detection

In this chapter, we focus on the task of spatial cluster detection: finding spatial areas where some monitored quantity is significantly higher than expected. For example, we may want to monitor the observed number of cases of influenza, or some other specific type of disease, and find any regions where the number of cases is abnormally high. The spatial cluster detection techniques that we describe are disease independent; that is, they are capable of detecting clusters of any type of disease...

High Fidelity Injection

The high-fidelity detectability experiments (HiFIDE) technique extends the semisynthetic method by forming injections whose shapes and noise levels are derived from surveillance data collected during an actual outbreak (Wallstrom et al., 2004, Wallstrom et al., 2005). The technique also scales the height of the inject in a way that preserves the known relationship between the magnitude of the real outbreak and the strength of the signal in the surveillance data collected during the real...

Incorporating the Spatial Distribution of Cases into the Model

In this section, we describe changes to the PANDA model to account for the situation in which an anthrax release infects people in more than one zip code. In particular, we added a new interface node, called Angle of Release, which describes the orientation of the airborne anthrax release and takes on the eight possible values of N, NE, E, SE, S, SW, W, or NW, as shown in Figure 18.8. Figure 18.9 depicts the modified person model, which is based on the original nonoptimized model shown in...

Queries to Search Engines

About half of people who use the Internet to access health information online do so via a search engine; thus, monitoring the queries received by search engines is a potential biosurveillance strategy. A rapid increase in the number of Google searches containing the word "fever" would be of concern in the absence of a known outbreak or other explanation. In contrast to website monitoring, monitoring of queries to the three most popular search engines would catch nearly 80% of the health-related...

Military Call Center/Dispatch

Although military systems serve only a small portion of the population, they are relevant to biosurveillance because outbreaks can be centered in military facilities. Most military bases have their own 911 call centers; some are tied into civilian systems. For example, Fort Belvoir, Virginia, is served by Fairfax County's 911 system. A caller who is located on the base dials 911 and reaches a Fairfax County PSAP operator. That operator acts as the base's dispatcher, alerting military police, fire...

Modeling

Our methodology uses Bayesian networks to explicitly model an entire population of individuals. Since, in this chapter, we are specifically interested in disease outbreak detection from syndromic information, we will refer to models of these individuals as person models, although obviously the same ideas could be applied to model other entities that might provide information about disease outbreaks, such as biosensors and livestock. We explicitly model each person in the population, and thus in...

Studies of Informational Value of ICD Codes

There have been several studies that have measured the informational value of ICD codes for chief complaints and diagnoses. As with free-text chief complaints, these studies utilized experimental methods that we discussed in Chapter 21 and address the three hypotheses of interest (restated for ICD codes):

Hypothesis 1: An individual ICD code or a code set can discriminate between whether a patient has syndrome or disease X or not.
Hypothesis 2: ICD code sets can discriminate between whether there...

Monitoring the Body's Vital Functions

Monitoring such parameters of bodily functions as heart rate, respiration, temperature, blood glucose, and even the heart's electrical activity (an EKG rhythm strip) does not require a very large bandwidth. A transmission capability as low as 8 bits per second and an 8-bit microprocessor can accomplish this. FitSense created a low-power (microwatt) local area network (LAN) to handle multiple wireless sensors attached to clothing or jewelry (FitSense Technology Corporation Inc., 2005). There...

Organizations That Use Animal Health Data

A large amount of animal health data is collected, but this is generally to satisfy specific purposes, such as determining the prevalence of a disease within a target species. Organizations that use animal health data include farming enterprises, processors (such as abattoirs), farming service providers (such as veterinarians), pharmaceutical companies, and state and federal departments of agriculture. Organizations that use domestic animal health data include individual veterinary hospitals,...

Search Engines

An Internet search engine is a computer system that (1) locates and indexes web pages, and (2) processes queries from users who are searching for information on the web. The most common way people find information on the Internet is through a search engine (PEW Internet & American Life Project, 2004). A search engine comprises three components: a web spider, a database, and one or more information retrieval algorithms. The web spider (also known as a "web crawler") searches the Internet for...
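
The database and retrieval components described above can be illustrated with a minimal sketch. It assumes the spider has already fetched the pages, so a plain dictionary stands in for the crawl; the URLs, page text, and function names are hypothetical, and a real engine would rank results rather than simply intersect term matches.

```python
from collections import defaultdict

def build_index(pages):
    """Build an inverted index mapping each word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every term in the query (simple AND retrieval)."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical pages standing in for what a web spider would have collected.
pages = {
    "example.org/flu": "flu symptoms include fever and cough",
    "example.org/cold": "cold symptoms include cough and sneezing",
    "example.org/recipes": "chicken soup recipe",
}
index = build_index(pages)
hits = search(index, "fever cough")
```

Counting how often queries such as "fever cough" are issued over time, rather than what they return, is the kind of signal the surrounding chapter is concerned with.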

RODS Data Security and Confidentiality

RODS provides a secure file server (fftn.rods.DiH.edu) that allows data providers to upload their batch data through secure file transfer protocol (SFTP) or secure socket layer (SSL) connections. The file server is located in a demilitarized zone (DMZ) and is protected by a firewall that consists of rules, maintained by the network security group at the University of Pittsburgh, allowing only specific remote hosts (IPs) to reach it. The firewall plays the role of a security guard that blocks...

Combining Fields In Event Data: The What's Strange About Recent Events Approach

The what's strange about recent events (WSARE) algorithm (Wong et al., 2002; Wong and Moore, 2002; Wong et al., 2003) is a rule-based anomaly pattern detector that operates on discrete, multidimensional data sets with a temporal component. This algorithm compares recent data against a baseline distribution with the aim of finding rules that summarize significant patterns of anomalies. Each rule is made up of components of the form X_i = V_i^j, where X_i is the ith feature and V_i^j is the jth value of that...
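
The idea of comparing recent records against a baseline, one rule at a time, can be sketched as follows. This is an illustration only and not the published WSARE procedure: it scores single-component rules with a one-sided Fisher's exact test, which is an assumption made here, and the records, feature names, and function names are hypothetical.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a = recent records matching the rule, b = recent records not matching,
    c = baseline records matching,       d = baseline records not matching.
    Returns the probability of seeing a or more matching recent records
    if recent and baseline data shared the same match rate.
    """
    recent_total = a + b
    match_total = a + c
    n = a + b + c + d
    upper = min(recent_total, match_total)
    tail = sum(
        comb(match_total, k) * comb(n - match_total, recent_total - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, recent_total)

def best_rule(recent, baseline):
    """Score every rule of the form feature == value; return (p, feature, value) with smallest p."""
    best = None
    for feature in recent[0].keys():
        for value in {record[feature] for record in recent}:
            a = sum(1 for r in recent if r[feature] == value)
            c = sum(1 for r in baseline if r[feature] == value)
            p = fisher_one_sided(a, len(recent) - a, c, len(baseline) - c)
            if best is None or p < best[0]:
                best = (p, feature, value)
    return best

# Hypothetical emergency department records (illustrative data only).
recent = [{"syndrome": "respiratory", "zip": "15213"}] * 6 + \
         [{"syndrome": "gi", "zip": "15217"}] * 4
baseline = [{"syndrome": "respiratory", "zip": "15213"}] * 4 + \
           [{"syndrome": "gi", "zip": "15213"}] * 16 + \
           [{"syndrome": "gi", "zip": "15217"}] * 20
p_value, feature, value = best_rule(recent, baseline)
```

Because many rules are scored and only the best one is kept, the raw p-value overstates significance; any practical use would need a correction for that multiple testing.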

Surveys

A method that does not require any surveillance data whatsoever is a survey of sick (or recently sick) individuals to determine when they "emitted" behaviors that may show up in surveillance data. A recent survey of pediatric patients is an example of a survey designed specifically to understand the informational value of biosurveillance data (Johnson et al., 2005). In this study, Johnson interviewed caregivers of children who were being seen in emergency rooms for fevers, respiratory ailments,...

Control Charts

Many of the univariate algorithms used in biosurveillance are techniques taken from statistical quality control, which is a field concerned with monitoring the quality of a production process. One of the simplest statistical quality control methods that we can apply to surveillance is the control chart. A control chart sets an upper control limit (UCL) and a lower control limit (LCL) on the time series being monitored. If the daily counts remain between the UCL and the LCL, then the process is...
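
A minimal sketch of a control chart applied to daily counts is given below. It assumes the limits are set at the baseline mean plus or minus three standard deviations, a common convention that the text above does not specify; the counts and function names are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (LCL, UCL) as the baseline mean minus/plus k sample standard deviations."""
    center = mean(baseline)
    spread = stdev(baseline)
    lcl = max(0.0, center - k * spread)  # daily counts cannot be negative
    ucl = center + k * spread
    return lcl, ucl

def out_of_control_days(counts, lcl, ucl):
    """Return the indices of days whose count falls outside [lcl, ucl]."""
    return [i for i, count in enumerate(counts) if count < lcl or count > ucl]

# Hypothetical daily visit counts: a quiet baseline period, then recent days.
baseline = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20]
recent = [21, 19, 24, 35, 22]

lcl, ucl = control_limits(baseline)
alarms = out_of_control_days(recent, lcl, ucl)
```

Widening `k` reduces false alarms at the cost of detecting smaller increases later, which is the trade-off an evaluator would tune.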

Diagnostic Precision And The Question Of How Good Is Good Enough

If a study compares the combination of algorithm plus data with "current best practice," the results of the study will directly answer the question of whether the algorithm is "good enough." Otherwise, the question of whether the algorithm's sensitivity, timeliness, and false alarm rate are good enough must be addressed indirectly, based on consideration of whether the algorithm's users will take actions based on anomalies identified by the algorithm. When discussing the significance of the...

State Departments of Agriculture

Each state has a department of agriculture. All have purposes and objectives similar to those of the USDA; however, their focus is on state issues. Some of the issues include the provision of local animal disease diagnostic laboratories, licensing of farming premises, food safety, meat inspection, enforcement of state agricultural laws, state-based quarantine and disease control, animal movement control, animal welfare, pest management, quality assurance programs, and state-specific farm...

Availability and Time Latencies of ICD-coded Diagnoses

The availability of ICD codes, by which we mean the proportion of patients being seen in a region for which ICD codes are available, in general is poorly understood. ICD coding of chief complaints is uncommon. ICD coding of ED diagnoses by clinicians using clinical information systems is not universal. Nationwide, only 17% of physician practices and 31% of EDs have even adopted point-of-care systems (Burt and Hing, 2005). ICD codes from healthcare-insurance claims data are widely available, but...

Expressions of Uncertainty

Unfortunately, differential diagnosis is not a clear-cut science in which physicians are completely confident in what findings or diseases a patient has, and the language used in patient reports expresses the dictating physician's uncertainty on a continuum ranging from certain absence to certain presence. Consider the implications of sentences (5) to (12). The first sentence expresses certainty that pneumonia is absent, whereas the last sentence expresses certainty that pneumonia is present....

Obtaining Surveillance Data for Evaluation

The biggest challenge for an evaluator who wishes to study an outbreak detection algorithm is obtaining surveillance data for a sufficiently large number of outbreaks with which to measure sensitivity and timeliness. The exact number of outbreaks required depends on the tightness of the statistical error bounds on these measurements that he desires, but as a rough approximation, 10 outbreaks are required as a bare minimum. To have the greatest validity, an evaluator tests an algorithm using...
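
To see why the number of outbreaks matters, the sketch below computes a confidence interval for sensitivity from a count of detected outbreaks. The choice of the Wilson score interval and the 95% level are assumptions made for illustration; the text above does not prescribe a particular method, and the outbreak counts are hypothetical.

```python
from math import sqrt

def wilson_interval(detected, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = detected / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical results: the same observed sensitivity (80%) from 10 and from 100 outbreaks.
low_10, high_10 = wilson_interval(8, 10)
low_100, high_100 = wilson_interval(80, 100)
width_10 = high_10 - low_10
width_100 = high_100 - low_100
```

With only 10 outbreaks the interval spans a large part of the possible range, which is consistent with treating 10 as a bare minimum rather than a comfortable sample size.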

Food Supply and Food Monitoring

In the United States, food arrives on a consumer's table via many paths of production and distribution, but the most common and typical path is as follows: growers (crop and animal cooperatives, farms, ranches, importers) sell food crops to food processors (packers, canners, or other manufacturing/processing/packaging operations). For example, grain farmers (e.g., those who grow wheat, corn, soybeans, barley, rice, sorghum, oats) bring their crops to grain elevators, which in turn ship them to...

Internet As Sentinel III: Monitoring Usage Of Health Websites And Health-Related Queries To Search Engines

The goals of ProMED-e-mail and GPHIN are similar: to detect outbreaks or unusual events close to the time that astute observers or news media report them. In this section, we explore the potential of analyzing patterns of Internet utilization by sick individuals (or their caregivers) to detect events even earlier. This area of research is predicated on the assumption that sick people (or their caregivers or doctors) turn to the Internet early during the course of an illness. At present, it is...

Model for Outbreak Detection

In our empirical tests, we use a model similar to the example model shown in Figure 18.2, with two primary differences: (1) we do not use the Terror Alert Level node, and (2) we use a more complex person model. Figure 18.4 shows the person model we use. The meanings of the nodes are listed below. For each underlined variable, its conditional probability table was estimated from a training set consisting of one year's worth of ED patient data from the year 2000. The variables in boldface were...

Water Surveillance

Federal and state regulators require utilities to monitor water supplies continuously. Water is an important potential mode of transmission for many pathogens, including bacteria, viruses, and parasites (Table 9.1). There are two complementary approaches to monitoring these biological contaminants. Ongoing comprehensive surveillance is intended to signal the occurrence of contaminated water before its distribution to the general public. The underlying intent of this approach is to detect...

Laboratory Networks

National networks that support biological analysis include the LRN and Emergency Laboratory Response Network (eLRN). These networks are composed of clinical hospital and public health laboratories, public and commercially owned environmental laboratories, and federal laboratories such as the CDC and U.S. Army Medical Research Institute of Infectious Diseases (USAMRIID). The LRN is a joint project of the CDC, the Association of Public Health Laboratories, and the FBI, coordinating the response to...

State Laboratories

Approximately 200 of the more than 186,000 laboratories in the United States are classified as state public health laboratories. Included in this number are about 150 regional or branch laboratories that are administered as part of the state public health laboratory. More than 6,500 laboratory professionals are employed by state public health laboratories. Each state and five territories operate a state public health laboratory. One major function of these laboratories is to provide diagnostic...

Environmental Laboratories

Environmental testing laboratories perform physical, chemical, and microbiological analysis of specimens collected in the environment. For example, a water sample may undergo physical testing (temperature, turbidity, odor, color), chemical testing (nitrates, sulfates, pesticides, metals), and microbiological testing (total plate counts, coliforms, Giardia, Cryptosporidium). Environmental testing laboratories provide a wide range of testing that is in many ways similar to the testing performed in...

Internet Primer

People use the word "Internet" loosely to refer to both the physical Internet as well as the software applications that run on it. The physical Internet comprises the wires, optical fiber, satellites, protocols, and routing computers (what a technologist would consider the Internet). Examples of the software applications include e-mail programs, web servers, search engines, instant messaging programs, and file transfer programs. In this chapter, we use the term "Internet" to refer to both the...

ICD-9 Hospital Discharge Diagnoses

Many state health departments (approximately 60%) compile diagnoses of patients discharged from hospitals located within the state. These hospital discharge data sets include diagnoses encoded using the ICD-9-CM coding system (discussed in Chapter 32), dates of admission and discharge, home zip code, hospital zip code, and patient age. Evaluators can use the diagnoses to construct reference epidemiological curves for diseases such as influenza, salmonella, respiratory syncytial virus, and...

Limitations On And Of Governmental Public Health

In the words of historian Lord Acton, "Power corrupts, and absolute power corrupts absolutely" (Powell, 2001). Therefore, the founding fathers of the United States divided power among three branches of government to create a system of checks and balances. However, these checks, balances, and other institutional mechanisms may slow or distort the development, implementation, and use of biosurveillance systems. Funding decisions perhaps have the greatest effect. Biosurveillance funding competes...

Image Recognition Technology

Image recognition technology is relatively young. To quote a timeworn cliché, our true state of the art in image recognition is the "Mark One eyeball," a joking reference to military weapon naming conventions. Researchers are pursuing automated recognition via a number of techniques, such as pattern matching, use of "primitives," and others. The military has demonstrated its utility: in 1991, cruise missiles fired from aircraft and naval vessels employed image recognition technology during the...

Poison Information Centers

Dissemination of information to the public and clinicians is the primary role of poison information centers. The information is provided by specialists in poison information (nurses and pharmacists) and clinical toxicologists. The poison control specialists document each patient interaction electronically; these interactions form the basis for the poison center medical record. Additionally, if a physician treating a patient calls the center, he or she records the center's advice on the patient's...

Testing Infrastructure

Laboratory testing on drinking water is performed for analysis of chemical, radiological, and biological contaminants. Although biological testing by definition focuses on microorganisms, there is overlap with chemical analysis because many toxins occur naturally. The existing infrastructure for chemical analysis is more established than that for biological monitoring, because water contamination owing to pesticides, pollutants, or even metals has been monitored for decades and because chemical...

Symbolic NLP Techniques

Linguistics is the study of the nature and structure of language, including the pronunciation of words (phonetics), the way words are built up from smaller units (morphology), the way words are arranged together (syntax), the meaning of linguistic utterances (semantics), and the relation between language and context of use (pragmatics) including the relationships of groups of sentences (discourse). As humans, we combine all of this linguistic knowledge with knowledge of the world to understand...

Signal-to-Noise Ratio

If the surveillance data can be formed into a time series, as illustrated by Figure 21.1, evaluators can use a variety of methods to understand their informational value for detecting outbreaks of disease. The evaluator will first examine the time series visually to determine whether an outbreak effect is present. The evaluator...

Specifying Biosurveillance Data

As we discuss in Parts II and IV, a system designer has a large selection of surveillance data from which to construct a biosurveillance system. This abundance is a result of the increasing amounts of data collected electronically about the health of individuals and their purchasing, travel, attendance, and other behaviors (Sweeney, 2001). For a designer, the task of data selection may be simple or complex. It is simple if the organization planning a system requests that the designer automate...

Surveillance Data

Health departments collect a variety of biosurveillance data. We begin this section by discussing vital statistics, which serve many functions such as monitoring increases in unexplained deaths. We also discuss notifiable disease reports. Both vital statistics and disease reports are among the earliest types of data collected systematically by governments for the purpose of biosurveillance. We also discuss sentinel surveillance. A vital statistic is a record of a birth, death, or changes in...

The Future Of Biosurveillance Research

Although biosurveillance will benefit immensely from cross-pollination by the fields just discussed, there will remain areas of both applied and basic research that represent the core science of biosurveillance. Applied research in biosurveillance will translate techniques from other fields and address engineering and organizational issues related to the construction of biosurveillance systems. Basic research in biosurveillance will continue to address questions such as (1) which...

Organization Of The Book

We have organized this book into six parts. Part I, The Problem of Biosurveillance, comprises this introductory chapter and Chapters 2 through 4. Chapter 2 ("Outbreaks and Investigations") provides examples of outbreaks that have been investigated by governmental public health, hospital infection control, and the animal healthcare system. Chapter 3 ("Case Detection, Outbreak Detection, and Outbreak Characterization") provides an overview of the basic tasks of biosurveillance, explaining in detail...

Agricultural Databases

Much information above farm level has been aggregated, often because it is essential to organize payments and to monitor for quality and ensure traceability of products. For example, slaughterhouses, or beef processing plants, accept animals from feedlots for slaughtering and manufacturing of meat and meat-derived products. In the United States, three large beef processors dominate this part of the meat industry: IBP, a subsidiary of Tyson Foods; ConAgra; and Excel Corp., a subsidiary of Cargill....

H

It is also possible to add local terms, such as the mean count over the previous seven days as an additional feature to allow the method to adapt better to recent local events. These additions can be very helpful, and as we shall see at the end of the chapter, regression methods that include these extra terms perform better than moving average (which had been our favorite method) on our data.
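
The local term described above can be sketched as a feature computed from the previous seven days and fed to a regression. For brevity this illustration fits a line on that single feature by ordinary least squares; a fuller regression of the kind the chapter discusses would include other terms alongside it. The counts and function names are hypothetical.

```python
def trailing_means(counts, window=7):
    """Mean of the `window` days preceding each day, for days that have a full window."""
    return [sum(counts[i - window:i]) / window for i in range(window, len(counts))]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has no variation; slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical daily counts with a steady upward drift.
counts = list(range(10, 31))
window = 7

features = trailing_means(counts, window)   # local term for each day from day 7 onward
targets = counts[window:]                   # the counts those features should predict
intercept, slope = fit_line(features, targets)

# Forecast tomorrow's count from the mean of the most recent seven days.
latest_feature = sum(counts[-window:]) / window
forecast = intercept + slope * latest_feature
```

An observed count well above `forecast` would then be the candidate anomaly, with the threshold chosen from the size of past residuals.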

Evaluation Methods For NLP In Biosurveillance

The first step in evaluating an NLP application is to validate its ability to classify, extract, or encode features from text (feature detection). Most evaluations of NLP technology in the biomedical domain have focused on this phase of evaluation. Once we validate feature detection performance, we can evaluate the ability of the encoded features to diagnose individual cases of interest (case detection). Finally, we can perform summative evaluations addressing the ability to detect epidemics...

Reliability And Utility

The reliability and accuracy of data generated by medical examiners are equivalent to those of any other medical provider. Since medical examiners are licensed physicians, they are ethically and legally obligated to maintain accurate records, the same as any other physician. In general, the reliability of ancillary data received by the medical examiner, such as clinical laboratory data, copies of the deceased's previous medical charts, and radiographs, would be expected to be high. Moreover, a...

CDCTM Severe Acute Respiratory Syndrome

Public Health Guidance for Community-Level Preparedness and Response to Severe Acute Respiratory Syndrome (SARS), Version 2. Appendix B1: Revised CSTE SARS Surveillance Case Definition. Presence of two or more of the following features: fever (might be subjective), chills, rigors, myalgia, headache, diarrhea, sore throat, rhinorrhea. Mild-to-moderate respiratory illness: temperature of > 100.4 °F (> 38 °C) and one or more clinical findings of lower respiratory illness (e.g., cough, shortness of...
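
The two criteria quoted above lend themselves to a rule-based check, sketched below. Only the portion of the definition visible in this excerpt is encoded; the list of lower respiratory findings is truncated in the source, so the caller supplies those findings rather than the code enumerating them. The function names and patient data are hypothetical, and the sketch is not a complete implementation of the case definition.

```python
# Features listed in the excerpt for the "two or more of the following" criterion.
LISTED_FEATURES = {
    "fever", "chills", "rigors", "myalgia",
    "headache", "diarrhea", "sore throat", "rhinorrhea",
}

def has_two_or_more_features(findings):
    """True if at least two of the listed features appear among the patient's findings."""
    return len(LISTED_FEATURES & set(findings)) >= 2

def is_mild_to_moderate_respiratory(temp_f, lower_respiratory_findings):
    """True if temperature exceeds 100.4 F and at least one lower respiratory finding is present.

    `lower_respiratory_findings` is supplied by the caller because the excerpt
    truncates the list of qualifying findings.
    """
    return temp_f > 100.4 and len(lower_respiratory_findings) >= 1

# Hypothetical patient record.
findings = ["fever", "headache", "cough"]
early = has_two_or_more_features(findings)
mild = is_mild_to_moderate_respiratory(101.2, ["cough"])
```

In an NLP pipeline, the `findings` list would be the output of feature detection on the free-text report, which is why errors at that stage propagate directly into case classification.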

Discourse Relationships Among Sentences

Sentences in a patient report are not meant to stand alone; they often convey a story about the differential diagnosis and treatment process for a patient. Some of the variables our example SARS expert system would need cannot be obtained without integrating and disambiguating information from the entire report. Once the individual variables have been located in a report, some type of discourse processing must integrate values for the variables to answer questions such as (1) Were the relevant...

Sources of Variability in OTC Sales Data

There are many non-disease factors that influence the level of sales of a given OTC product or product category. Sources of variability include day of week, season, holidays, severe weather, and promotions. For many products and product categories, the daily sales exhibit day-of-week effects; that is, they vary by day of the week (Figure 22.3). For example, abuse of dextromethorphan-containing cough syrups is a well-known phenomenon (Murray and Brewerton, 1993), and purchases for the purpose of...
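
A day-of-week effect can be estimated and removed with a few lines of code. The sketch below averages sales by weekday and divides each day's sales by its weekday's relative level; this simple ratio adjustment is an assumption made for illustration, and the sales figures and function names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def mean_by_weekday(daily_sales):
    """Average units sold for each weekday (0 = Monday ... 6 = Sunday)."""
    by_day = defaultdict(list)
    for day, units in daily_sales.items():
        by_day[day.weekday()].append(units)
    return {weekday: sum(v) / len(v) for weekday, v in by_day.items()}

def weekday_adjusted(daily_sales):
    """Divide each day's sales by its weekday's level relative to the overall mean."""
    weekday_means = mean_by_weekday(daily_sales)
    overall = sum(daily_sales.values()) / len(daily_sales)
    return {
        day: units / (weekday_means[day.weekday()] / overall)
        for day, units in daily_sales.items()
    }

# Hypothetical two weeks of unit sales with higher weekend volume.
start = date(2024, 1, 1)
pattern = [100, 100, 100, 100, 100, 150, 150]
sales = {start + timedelta(days=i): pattern[i % 7] for i in range(14)}

weekday_means = mean_by_weekday(sales)
adjusted = weekday_adjusted(sales)
```

After adjustment, a remaining spike is less likely to be a routine weekend peak, although seasonal, holiday, weather, and promotion effects would still need separate handling.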

Federal Laboratories

The Department of Health and Human Services (DHHS), USDA, Department of Energy (DOE), DoD, and Departments of Commerce and Justice, and the EPA operate or fund clinical, environmental, forensic, and research laboratories. Federal laboratories provide reference testing and are often involved with the development of new technologies, as well as the transfer of these technologies to other laboratories. Many federal laboratories collaborate with international partners and serve as reference centers...

Impact of Foodborne Illnesses

Food-borne illness is not uncommon in the United States and has significant economic impact. Food contamination causes 76 million illnesses, 325,000 hospitalizations, and 5,000 deaths every year, according to a Centers for Disease Control and Prevention (CDC) study published in 1999. The majority of deaths occur owing to unidentified agents, but 1,500 deaths per year can be attributed to Listeria, Salmonella, and Toxoplasma species (Mead et al., 1999). More than 200 different diseases may be...