The 2009 (H1N1) influenza pandemic highlighted the need for forecasting models to assist policy and planning for effective health system responses to epidemics. Adequate forewarning of impending influenza epidemics would allow hospitals to adjust staff rosters, assign dedicated wards for infection control, and establish separate influenza clinics collocated with emergency departments (EDs).
We describe the use of surveillance and forecasting to predict and track influenza outbreaks. Although this work was instigated in response to the 2009 (H1N1) influenza outbreak, the methods apply to future outbreaks of other infectious diseases. There is extensive related research in this area using temporal, spatial, or space–time detection algorithms (eg, regression, smoothing, hidden Markov, and wavelet models), with successful performance reported for both sudden and gradual outbreaks.1-9 The objective of much of this modelling is to complement traditional public health monitoring with routine automated data analysis.1 Characteristics of outbreaks that have influenced detection include the magnitude of the signal, the shape of the signal, and the timing of the outbreak.2 Notably, since 2000, prospective health surveillance has moved from weekly or monthly data to daily or more frequent data.1 In general, for detecting changes in recent data behaviour, methods incorporating recent data history have advantages over those producing estimates from longer baselines.3
Emergency department data have frequently been used to identify disease patterns,10-15 including assessment of the timing and magnitude of daily influenza counts against laboratory-confirmed influenza.10 We describe three approaches to predicting future epidemics based on emergency department data: surveillance monitoring of influenza presentations using adaptive cumulative sum (CUSUM) plan analysis to signal unusual activity; generating forecasts of expected numbers of presentations for influenza, based on historical data; and using internet search data as an indicator of outbreaks in a population. All three are considered applicable to health facilities that routinely collect and code patient arrival data.
We collected 5 years of historical data (2005–2009) on presentations and hospital admissions for influenza-like illnesses from the Emergency Department Information System (EDIS) database of 27 Queensland public hospitals. The influenza season assessed each year comprised June, July, August and September. Waiting times and triage categories are routinely reported for these 27 public hospitals.
Patients with influenza-like symptoms were identified by the following International Classification of Diseases (ICD-10-AM) codes within EDIS:
A08.4 – Viral gastroenteritis
B34.9 – Viral infection
J10.8 – (Influenza old code)
J11.1 – (Influenza new code)
J11.1S – H1N1 influenza (Human swine influenza) suspected
J18.0 – Bronchopneumonia
J18.1 – Lobar pneumonia unspecified
J18.2 – Hypostatic pneumonia unspecified
J18.8 – Pneumonia, atypical
J18.9 – Pneumonia, unspecified
Z04.8 – Medical review
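The case definition above amounts to a lookup against a fixed set of codes. The following is a minimal sketch of that step, assuming each presentation is available as a (date, ICD code) pair; the record layout and the function names are hypothetical and are not taken from EDIS.

```python
# Codes copied from the list above; the record layout is a hypothetical example.
INFLUENZA_LIKE_CODES = frozenset({
    "A08.4", "B34.9", "J10.8", "J11.1", "J11.1S",
    "J18.0", "J18.1", "J18.2", "J18.8", "J18.9", "Z04.8",
})


def is_influenza_like(icd_code):
    """Return True if the code is one of the influenza-like codes listed above."""
    return icd_code.strip().upper() in INFLUENZA_LIKE_CODES


def count_influenza_like(records):
    """Count influenza-like presentations per day.

    `records` is an iterable of (date_string, icd_code) pairs.
    """
    counts = {}
    for day, code in records:
        if is_influenza_like(code):
            counts[day] = counts.get(day, 0) + 1
    return counts
```

The resulting daily counts are the kind of series that the surveillance and forecasting steps described later take as input.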
Data collected were stored and collated using Microsoft Office Excel, version 2007 (Microsoft Corporation, Redmond, Wash, USA). SPSS, version 17.0 (SPSS Inc, Chicago, Ill, USA), was used for descriptive statistics. Analysis of variance (ANOVA) tests were performed using MATLAB, version 7.2.0.232 (R2006a) (The MathWorks Inc, Natick, Mass, USA) for comparisons of influenza presentations and admissions across multiple years. The level of significance was set at P < 0.05, with 95% confidence limits. R, version 2.6.2 2008 (The R Foundation for Statistical Computing), was used for CUSUM plan analysis and additional statistical analysis, and a proprietary algorithm written in Java, version 1.5 (Oracle Corp, Redwood Shores, Calif, USA), and Scala, version 2.8 (EPFL, Lausanne, Switzerland), was used for generating forecasts from historical data.
We used an adaptive CUSUM plan for monitoring non-homogeneous negative binomial counts of ED presentations and hospital admissions for influenza to flag unusual outbreaks.16 A transitional negative binomial regression model with a moving window of 730 days was used to construct a 1-day-ahead forecast for counts across the influenza seasons of 2005–2009. The regression model included factors such as day of the week, school and public holidays, harmonics to account for seasonal trends, and functions of lag counts to account for changes in infection rates.
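To illustrate the kind of covariates such a regression uses, the sketch below builds one row of a design matrix for a given day and lists the dates in a 730-day moving window. The number of harmonics, the choice of lags, the log(count + 1) transform and all function names are assumptions made for illustration; the text does not specify them, and fitting the negative binomial model itself is left out.

```python
import math
from datetime import date


def design_row(day, counts_by_day, holidays, n_harmonics=2, lags=(1, 7)):
    """Covariates for one day: intercept, day-of-week indicators,
    holiday flag, seasonal harmonics and lagged log counts."""
    row = [1.0]
    # Six indicator variables; Monday (weekday() == 0) is the reference level.
    row += [1.0 if day.weekday() == d else 0.0 for d in range(1, 7)]
    # School or public holiday flag (the holiday set is supplied by the caller).
    row.append(1.0 if day in holidays else 0.0)
    # Sine and cosine pairs with an annual period capture seasonal trends.
    t = day.timetuple().tm_yday
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * t / 365.25
        row += [math.sin(angle), math.cos(angle)]
    # Lagged counts let the forecast respond to recent changes in infection rates.
    for lag in lags:
        previous = date.fromordinal(day.toordinal() - lag)
        row.append(math.log(counts_by_day.get(previous, 0) + 1.0))
    return row


def training_window(day, window_days=730):
    """Dates in the moving window that ends the day before `day`."""
    end = day.toordinal()
    return [date.fromordinal(o) for o in range(end - window_days, end)]
```

Refitting the model each day on `training_window(day)` and predicting the count for `day` would give the 1-day-ahead forecast against which the observed count is compared.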
Forecasts for ED presentations were generated using historical data. Based on experience, hospital bed managers can identify busy periods (ie, certain days of the week, holiday periods, mass-gathering events) reflecting larger admission numbers. Most ED forecasting models allow for this seasonality by including variables for day of the week, month of the year and holidays, and identifying repeated patterns in the time series data. Historical days with matching characteristics are treated as the most “similar” to the day of interest in terms of patient demand, and are therefore expected to provide the best basis for predicting the number of presentations on that day.
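The idea of forecasting from “similar” days can be sketched as follows. This is not the proprietary algorithm used in the study, which is not described in detail; it is a simplified illustration that assumes similarity means sharing the same day of the week, month and holiday status, and that the forecast is the mean count over matching past days. All names are hypothetical.

```python
from datetime import date
from statistics import mean


def day_signature(day, holidays):
    """Characteristics used to decide whether two days are 'similar'."""
    return (day.weekday(), day.month, day in holidays)


def forecast_similar_days(target, history, holidays):
    """Mean count over past days that share the target day's signature.

    `history` maps dates to observed presentation counts.
    Returns None when no earlier day matches.
    """
    signature = day_signature(target, holidays)
    matches = [
        count
        for day, count in history.items()
        if day < target and day_signature(day, holidays) == signature
    ]
    return mean(matches) if matches else None
```

Because such a forecast only averages past behaviour, it describes a typical season and cannot by itself anticipate an unusual one.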
We compared our data with Google’s flu trends website (http://www.google.org/flutrends), which uses aggregated Google search data to estimate influenza “activity” in a selection of countries. Data for Queensland are available for download from 2006 onwards. We also used data downloaded from Google’s Insights for Search website (http://www.google.com/insights/search), which allows comparison of search volume patterns. We used the search terms “swine flu” and “flu” to check the consistency of internet search results.
Other influenza tracking websites, such as that of the United States Centers for Disease Control and Prevention (http://www.cdc.gov/flu/) and HealthMap (http://healthmap.org/swineflu/), do not provide raw search data for Queensland.
During the 2009 influenza season, there were 380 000 ED presentations to Queensland public hospitals, equating to 87 ED presentations to public hospitals for every 1000 people in Queensland. Of these presentations, 9% were patients with influenza-like symptoms (influenza-like cases). This represented a significant increase over previous years (P < 0.001) (Box 1, A). Across the 2009 winter influenza season, admission rates for patients with and without influenza-like diagnoses were 18% and 22%, respectively (Box 1, B).
Examples of surveillance plans for three of the hospitals assessed in our study are shown in Box 2, with the letters (denoting months) on the x-axis showing the start of the month. Each shows a time-series plot of the counts being monitored (upper panel) and the associated CUSUM plan (lower panel). If the adaptive CUSUM statistic crosses the red line (the control limit), an unusual outbreak is signalled and the CUSUM statistic is reset to zero. A further crossing of the control limit soon afterwards indicates that the outbreak continues to be significantly worse than expected, despite the model having been updated with the recent outbreak data.
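The signal-and-reset behaviour described here can be sketched with a simple one-sided CUSUM applied to standardised 1-day-ahead forecast errors. This is a simplified stand-in rather than the adaptive negative binomial plan used in the study: the fixed reference value `k`, the control limit `h`, the dispersion parameter and the function names are all assumptions made for illustration.

```python
import math


def pearson_residual(observed, expected, dispersion):
    """Standardise a count against its forecast, using the negative
    binomial variance mu + mu**2 / dispersion."""
    variance = expected + expected * expected / dispersion
    return (observed - expected) / math.sqrt(variance)


def cusum_with_reset(residuals, k=0.5, h=4.0):
    """One-sided upper CUSUM over standardised residuals.

    Returns the CUSUM path and the indices at which the control
    limit h was exceeded. The statistic restarts from zero after
    each signal, as in the plans described above.
    """
    statistic = 0.0
    path, signals = [], []
    for i, z in enumerate(residuals):
        # Accumulate only evidence of counts above forecast (less allowance k).
        statistic = max(0.0, statistic + z - k)
        path.append(statistic)
        if statistic > h:
            signals.append(i)
            statistic = 0.0  # reset after signalling
    return path, signals
```

For example, `cusum_with_reset([0.0, 2.5, 3.0, 3.0, 3.0, 0.0])` signals at indices 2 and 4: the second signal follows shortly after the reset, which corresponds to an outbreak that remains worse than expected.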
For example, the CUSUM plan for Hospital C in Box 2 first signals an unusual outbreak in December 2008, but despite the models being updated with this outbreak data, the CUSUM continues to signal that the outbreak remains unusual early in 2009 prior to the commencement of the usual winter influenza season.
Comparing forecasts for ED presentations for influenza-like symptoms with actual presentations in 2009 shows that, at all sites, the historically based forecasts underestimated the observed numbers. However, these forecasts represent a baseline: the level of presentations that would normally be expected given the preceding years. When the accuracy of the prediction algorithms was validated across statewide data for the 2009 influenza season, forecasting accuracy was significantly worse for influenza presentations than for all ED presentations in 2009, and accuracy in July 2009 was significantly worse than in the surrounding months (Box 3).
The coherence of Google Trends data with influenza presentation data is shown in Box 4. The correlation coefficients between Google search data for Queensland and statewide ED influenza presentations (ρ values) indicate an increase in correlation since 2006 when weekly influenza search data became available. While we might expect the peak in influenza ED presentations to lag behind the peak in internet searches related to influenza, the lag was more pronounced in 2009. This may indicate high public and media interest associated with an epidemic.
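A lag between search activity and presentations can be examined by correlating the two weekly series at several offsets. The sketch below assumes Pearson correlation and equally spaced weekly series of the same length; the text does not state which correlation coefficient was used, and the function names and the maximum lag are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_lag(search, presentations, max_lag=4):
    """Return (lag, r) for the lag, in weeks, at which presentations
    trailing the search series by `lag` are most strongly correlated."""
    results = []
    for lag in range(max_lag + 1):
        # Pair search[t] with presentations[t + lag].
        x = search[: len(search) - lag]
        y = presentations[lag:]
        results.append((lag, pearson(x, y)))
    return max(results, key=lambda pair: pair[1])
```

A best lag greater than zero would be consistent with search interest peaking before ED presentations.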
With routinely collected health data becoming ubiquitous and relatively inexpensive to obtain, the potential for utilising forecasting and surveillance to support the decisions of hospital policymakers is evident.
Many hospitals have seen the merit of generating forecasts from historical data to manage elective and emergency admissions in a proactive manner.17-22 The ability to accurately predict patient flow on a weekly or seasonal basis could enable more informed decisions about staffing levels, which, in turn, could result in cost savings and improved patient care. Forecasting is a widely applicable, multidisciplinary science, drawing on statistics, economics and operations research, that guides decision making in many areas of economic, industrial and scientific planning, but it has gained little traction in the health care industry.22 We advocate that all health facilities that routinely collect data relating to workload and bed demand use this information proactively to improve management and patient outcomes. Our analysis shows, however, that routine forecasting methods need to be coupled with an adaptive methodology, such as the CUSUM approach described here, that can detect unusual occurrences.
The value of using internet search data in pandemic planning has been reported previously,23-25 with Google publishing correlations between search data and epidemics.23 These search data are now publicly accessible and were used to construct the plots in Box 4. Additionally, Google’s Insights for Search site can report search interest relating to other infectious diseases, allowing state health departments to survey trends for diseases other than influenza. According to Google, raw search volume is not reported, as the size of some regions may skew the data, and normalised data are used instead. The coherence of geographically specific search data with influenza presentations may enable information-poor regions to track the progression of pandemics in the community.
To determine unusual events and predict expected presentation rates, the best system among those assessed was a combination of routine forecasting methods coupled with an adaptive CUSUM methodology. Widely accessible internet search data can also assist by reflecting the timing of epidemics. The use of these forewarning and forecasting models can assist clinicians and hospital managers in clinical and infection control decision making.
1 Presentations (A) and admissions (B) of patients with influenza-like symptoms to public hospitals in Queensland in the influenza seasons of 2005–2009; the 2009 influenza season differed significantly from those of preceding years

2 CUSUM plans for signalling unusual presentations and admissions for influenza at three Queensland hospitals

Abstract
Objective: To describe the use of surveillance and forecasting models to predict and track epidemics (and, potentially, pandemics) of influenza.
Methods: We collected 5 years of historical data (2005–2009) on emergency department presentations and hospital admissions for influenza-like illnesses (International Classification of Diseases [ICD-10-AM] coding) from the Emergency Department Information System (EDIS) database of 27 Queensland public hospitals. The historical data were used to generate prediction and surveillance models, which were assessed across the 2009 southern hemisphere influenza season (June–September) for their potential usefulness in informing response policy. Three models are described: (i) surveillance monitoring of influenza presentations using adaptive cumulative sum (CUSUM) plan analysis to signal unusual activity; (ii) generating forecasts of expected numbers of presentations for influenza, based on historical data; and (iii) using Google search data as outbreak notification among a population.
Results: All hospitals, apart from one, had more than the expected number of presentations for influenza starting in late 2008 and continuing into 2009. (i) The CUSUM plan signalled an unusual outbreak in December 2008, which continued in early 2009 before the winter influenza season commenced. (ii) Predictions based on historical data alone underestimated the actual influenza presentations, with 2009 differing significantly from previous years, but represent a baseline for normal ED influenza presentations. (iii) The correlation coefficients between internet search data for Queensland and statewide ED influenza presentations indicated an increase in correlation since 2006 when weekly influenza search data became available.
Conclusion: This analysis highlights the value of health departments performing surveillance monitoring to forewarn of disease outbreaks. The best system among the three assessed was a combination of routine forecasting methods coupled with an adaptive CUSUM method.