
Comparing malaria early detection methods in a declining transmission setting in northwestern Ethiopia



Despite remarkable progress in the reduction of malaria incidence, this disease remains a public health threat to a significant portion of the world’s population. Surveillance, combined with early detection algorithms, can be an effective intervention strategy to inform timely public health responses to potential outbreaks. Our main objective was to compare the potential for detecting malaria outbreaks by selected event detection methods.


We used historical surveillance data with weekly counts of confirmed Plasmodium falciparum (including mixed) cases from the Amhara region of Ethiopia, where there was a resurgence of malaria in 2019 following several years of declining cases. We evaluated three types of methods for early detection of the 2019 malaria events: 1) the Centers for Disease Control and Prevention (CDC) Early Aberration Reporting System (EARS), 2) methods based on weekly statistical thresholds, including the WHO and Cullen methods, and 3) the Farrington methods.


All of the methods evaluated performed better than a naïve random alarm generator. We also found distinct trade-offs between the percent of events detected and the percent of true positive alarms. CDC EARS and weekly statistical threshold methods had high event sensitivities (80–100% CDC; 57–100% weekly statistical) and low to moderate alarm specificities (25–40% CDC; 16–61% weekly statistical). Farrington variants had a wide range of scores (20–100% sensitivities; 16–100% specificities) and could achieve various balances between sensitivity and specificity.


Of the methods tested, we found that the Farrington improved method was most effective at maximizing both the percent of events detected and true positive alarms for our dataset (> 70% sensitivity and > 70% specificity). This method uses statistical models to establish thresholds while controlling for seasonality and multi-year trends, and we suggest that it and other model-based approaches should be considered more broadly for malaria early detection.



Over the last few decades, incredible progress has been made worldwide in reducing malaria cases and deaths. From 2010 to 2018, the incidence of malaria cases declined globally from 71 to 57 cases per 1000 population at risk, and malaria deaths fell by 31% during the same period [1]. However, between 2014 and 2018, the decreasing trend in incidence flattened, with some regions seeing an increase in cases, indicating stalled progress. In 2018, there were an estimated 228 million cases worldwide, which was 3 million fewer than in 2017, but 1 million more than in 2016 [1]. Of the total cases in 2018, 93% (213 million) occurred in the World Health Organization (WHO) African region [1]. In addition, due to population growth, the absolute number of people at risk for malaria globally is increasing, with the sharpest increase seen in the WHO African region [2]. As a result, there is a continuing need for improved strategies and tools to support malaria prevention, control, and elimination.

Malaria surveillance as a core intervention strategy is one of the pillars of the Global Technical Strategy for malaria [3, 4]. Information from surveillance systems can be used to optimize interventions to interrupt disease transmission and ultimately accelerate elimination. Timely detection allows officials to intensify control measures as needed to manage epidemics [4,5,6,7,8,9,10]. Many early-detection algorithms exist, and there is a need to quantitatively evaluate the performance of these algorithms for different diseases and locations [11,12,13,14,15,16,17]. The central idea behind outbreak detection is to identify when the case volume exceeds a baseline threshold, and to use this information in a prospective (not retrospective) manner to identify epidemics in their early stages [4, 15]. Various algorithms are used to calculate these thresholds, with different assumptions about the pattern of disease transmission, including the speed of outbreak development, seasonality, and trends [10, 12,13,14,15,16, 18,19,20,21,22,23,24,25,26,27,28,29,30].

Early detection algorithms that have been proposed for malaria include the Cullen, WHO quartile, and cumulative sum (CUSUM) methods [4, 5, 10, 12, 17, 19]. These techniques define thresholds based on statistical summaries of historical data. For the Cullen and quartile methods, WHO recommends at least five years of past data to generate reliable threshold estimates [5, 12]. The Cullen method calculates the mean value over the past five years for the current time period (e.g., week or month of the year), excluding values from any past outbreak periods. Case volumes that exceed the mean plus two standard deviations are considered outbreaks [5, 12, 20]. The WHO quartile method calculates quartile values for the current seasonal time period (usually the week or month of the year) over the past five years, and an outbreak is identified when cases exceed the third quartile (75th percentile). This approach may be sensitive to slight increases in case volume during time periods when there have never been spikes or outbreaks of cases, but it is less affected by abnormal years than the Cullen method [5, 12]. Several variations of these statistical methods have been evaluated using data from selected health centers in Ethiopia, and weekly percentile measures performed as well as methods with more complex threshold calculations [17]. There are many variations of the CUSUM approach, a type of control chart that tracks cumulative differences between observed and expected values and indicates an outbreak when these cumulative differences exceed a threshold [5, 12, 18, 21,22,23].
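The two threshold rules described above can be illustrated with a minimal Python sketch. The function names and the case counts are hypothetical, the sketch does not exclude past outbreak periods as the full Cullen method does, and the choice of quantile interpolation is an assumption rather than something specified by WHO.

```python
import statistics

def cullen_threshold(past_counts):
    """Mean plus two sample standard deviations of the same week in past years."""
    return statistics.mean(past_counts) + 2 * statistics.stdev(past_counts)

def who_quartile_threshold(past_counts):
    """Third quartile (75th percentile) of the same week in past years."""
    # "inclusive" interpolation is an assumption made for this sketch
    return statistics.quantiles(past_counts, n=4, method="inclusive")[2]

# Hypothetical counts for one week of the year over the past five years;
# the fourth year contains an unusually high value
past = [40, 55, 38, 120, 47]
current = 90

cullen = cullen_threshold(past)
who = who_quartile_threshold(past)
print(f"Cullen threshold: {cullen:.1f}, alarm: {current > cullen}")
print(f"WHO quartile threshold: {who:.1f}, alarm: {current > who}")
```

In this made-up series the single abnormal year inflates the mean and standard deviation, so the Cullen-style threshold is much higher than the quartile threshold, which is consistent with the point that the quartile method is less affected by abnormal years.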

In many situations, sufficient historical data may not be available to implement these approaches. Even when historical data are available, older data may be more representative of past malaria transmission cycles than the current malaria situation [4, 10]. In places undergoing intensive malaria intervention efforts, incidence in recent years may be significantly reduced compared to only a few years in the past or may exhibit different seasonal patterns [25]. Thresholds based on previous years may then fail to capture the new patterns and intensities of current outbreaks. However, surveillance and outbreak detection are still crucial in areas of low or unstable transmission. Immunity to malaria declines as the intensity of transmission declines, so the population in these areas could be highly vulnerable to malaria outbreaks [5].

Other early detection algorithms use different approaches for the calculation of the thresholds and may be more applicable in regions undergoing rapid change in malaria transmission patterns. The CDC Early Aberration Reporting System (EARS) has been used as a drop-in technique for syndromic surveillance after major incidents that could precipitate disease outbreaks [16, 22, 26, 27, 31]. This suite of methods is similar to quality control charts and relies on only very recent data to create a baseline, and is therefore useful when long-term data are not available or not relevant to the current situation. The EARS system is actively used by U.S. state and local public health offices [26]. Syndromic surveillance using school-based absenteeism has been investigated for potentially identifying localized malaria outbreaks in Ethiopia [32]. A family of methods developed by Farrington, and later extended by Noufaily, has been implemented at several European infectious disease control centers [24, 28, 33,34,35]. Farrington methods are based on quasi-Poisson regression and can take advantage of historical information while accounting for seasonality, long-term trends, and previous outbreaks [24, 28, 29].

While previous research has compared various detection algorithms, many of these studies have used simulated datasets (e.g. [16, 22, 30]), and it is unclear to what extent these would be representative of real-world outbreaks, especially in the context of public health interventions. Therefore, in this article, we used a 7.5-year weekly surveillance dataset of malaria cases to test the suitability of 1) CDC EARS methods, 2) methods based on weekly statistical thresholds (including WHO and Cullen methods), and 3) Farrington methods to detect malaria outbreaks. To develop a baseline dataset of malaria outbreaks, we applied a novel method to identify malaria events of interest to use as retrospective test cases. This research was conducted in the Amhara region of Ethiopia, which has been the subject of intense malaria interventions and experienced a general decline in malaria cases [36]. In 2019 there was a resurgence of malaria cases in the region, and we used this year as the basis for testing the outbreak detection algorithms. Our main objectives were to compare sensitivity and true positive rates of the event detection methods applied to malaria outbreak detection and to assess their potential for detecting outbreaks.


Study area and data

The Amhara region is located in northwest Ethiopia (Fig. 1). Most of the terrain is mountainous, with lowlands along the northwestern edge of the region. Rainfall is highly seasonal, with the heaviest rains from June through September. There are two major seasons for malaria transmission: the main transmission season after the end of the rainy season between September and December, and a secondary peak at the beginning of the rainy season in May through August [37, 38]. The population of the Amhara region is over 21 million, and the people primarily live in rural areas and practice subsistence farming [39]. There is widespread transmission of Plasmodium falciparum and P. vivax malaria, with a ratio of 1.2 of P. falciparum to P. vivax as seen in blood film tests from a cross-sectional survey [40]. A national malaria control program targets the Ethiopian population at risk, including the Amhara region. The program includes four main interventions: distribution of free long-lasting insecticidal nets (LLINs), indoor residual spraying (IRS), rapid diagnostic tests (RDTs) available at all health facilities, and treatment with artemisinin-based combination therapy (ACT) [39, 40]. Areas with low transmission rates due to declining malaria incidence and unstable transmission patterns are being targeted for elimination [39, 41, 42].

Fig. 1

Amhara region of Ethiopia. Zones are labeled and the 47 woredas included in this study are marked in darker shades

Administratively, the region is divided into twelve zones and three administered towns, which are further divided into between four and 24 woredas, or districts (Fig. 1). Woredas are subdivided into kebeles (villages). In the Amhara region, there are 162 woredas (containing 3543 kebeles), and 47 of the most malaria-prone woredas were included in the Epidemic Prognosis Incorporating Disease and Environmental Monitoring for Integrated Assessment (EPIDEMIA) pilot project [39]. The health care system is organized into three tiers: primary, secondary, and tertiary levels [43]. The primary level in rural areas includes health posts, health centers, and a primary hospital. The primary health care units (PHCUs) contain five health posts (satellite facilities located in kebeles) and one referral health center. Secondary and tertiary levels are referral general and specialized hospitals, respectively.

Public health surveillance data on patients seeking care at health posts or health centers are collected and aggregated by the Amhara Regional State Health Bureau (ARHB). Among the data collected are the numbers of malaria cases confirmed by rapid-diagnostic tests (RDT) or blood film screening, and these counts are grouped as Plasmodium falciparum (including mixed infections) and P. vivax (only) malaria. These data are summarized by the week of the year (based on the ISO 8601 standard used by WHO) and reported to the woreda health office. This office aggregates a complete woreda report before sending the summarized data to the zonal health office, which compiles all the woreda reports within the zone and sends the reports to the regional ARHB office, where they were uploaded into the EPIDEMIA system [39].

This study analyzed data from the 47 EPIDEMIA pilot woredas, which included weekly case counts of P. falciparum (or mixed) and P. vivax malaria starting from ISO week 28 of 2012 through week 52 of 2019. These woredas have seen great public health successes in reducing the malaria burden from 2012 through 2018, but experienced a resurgence in 2019 (Fig. 2).

Fig. 2

Time series graph of malaria case counts by species. The graph of Plasmodium falciparum (and mixed species) and P. vivax case counts shows the patterns of seasonality, long-term declining trends, and resurgence in 2019

Between 2013 and 2018, there was a steady decrease from 349,523 P. falciparum or mixed malaria cases to 104,947 cases, a 70% reduction. However, in 2019 there were 210,194 cases, a volume that had not been seen since 2015 (Table 1). We focused our analysis on P. falciparum (including mixed infections with P. vivax) which is the predominant parasite species, is of greatest concern from a public health standpoint, and had the strongest resurgence in 2019.

Table 1 Confirmed malaria case counts from 47 pilot woredas in the Amhara region by species

Identification of baseline events via trend weighted seasonal thresholds (TWST)

Prior to evaluating event detection algorithms, specific events of interest must be defined for each woreda to be the baseline testing dataset. Here, for research purposes, we developed an objective approach named trend weighted seasonal thresholds (TWST) for identifying events as anomalous increases in the number of reported malaria cases. The approach was designed to identify events retrospectively in the context of seasonal patterns and decreasing long-term trends in disease transmission, while allowing for variation in patterns across woredas as well as slight time-shifts in seasonal peaks between years.

The TWST approach identified two thresholds, weekly and yearly, for each woreda. This combination of weekly and yearly thresholds has been used in other work for defining malaria epidemics [13]. In preparation, the raw weekly time-series were smoothed using a centered 5-week triangular moving average, which used a sliding window of five weeks with the week being calculated in the center. The moving average was weighted with the center week the most important, and the weeks on either side having decreasing weight. The yearly threshold was calculated as the harmonic mean of the entire year plus a multiplier based on the standard deviation (1.5 for P. falciparum and mixed species).
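The smoothing step can be sketched in Python as follows. The weights 1-2-3-2-1 and the handling of the series edges (renormalizing over the weights that fit) are assumptions made for illustration; the text specifies only that the center week has the most weight and that weights decrease on either side.

```python
def triangular_smooth(counts, weights=(1, 2, 3, 2, 1)):
    """Centered weighted moving average over a sliding window.

    Near the start and end of the series, only the weights that fall
    inside the series are used, and they are renormalized.
    """
    half = len(weights) // 2
    smoothed = []
    for i in range(len(counts)):
        total = 0.0
        weight_sum = 0.0
        for offset, w in enumerate(weights, start=-half):
            j = i + offset
            if 0 <= j < len(counts):
                total += w * counts[j]
                weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed

# Hypothetical weekly counts with a one-week spike
weekly = [10, 12, 50, 14, 11, 9, 13]
smooth = triangular_smooth(weekly)
print([round(v, 1) for v in smooth])
```

The one-week spike of 50 is pulled down to roughly 25 in the smoothed series, which shows how the smoothing dampens isolated reporting spikes before thresholds are applied.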

The weekly threshold was calculated in a three-step process. In the first step, the raw threshold value for a given week was the harmonic mean of that week in the year, over the five years of data, plus a multiplier based on the standard deviation (1 for P. falciparum and mixed). In the second step, the raw thresholds were optionally trend weighted based on the year harmonic mean. If there was a declining trend (from the year previous), then the weekly threshold values were weighted proportional to the difference between the current year harmonic mean and the highest (max) harmonic mean using a weighting factor, 0.5 for P. falciparum and mixed: (max – weighting factor * (max – current)) / max. If there was no declining trend, the weekly thresholds were weighted based on the previous year mean instead of the current year. In the third and final step, allowances were made to prevent minor time shifts in increasing and decreasing case counts between years from triggering alerts [32, 44], by inflating weekly thresholds that were not near in time to peaks. Peak areas were identified using a percentile cut-off per year (85% for P. falciparum and mixed), plus short stretches (up to 8 weeks) between these high rates. The inflation was based on the average of the year and week harmonic means multiplied by an expansion factor (1.2 for P. falciparum and mixed), which was then added to the trend weighted threshold of the previous step to arrive at the final TWST week threshold. Anomalies were identified if cases exceeded both the yearly and weekly thresholds, and events were identified if anomalies lasted for four or more weeks consecutively. Events that were separated by only one or two weeks were merged into one event.
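Two pieces of this procedure lend themselves to a short Python sketch: the trend weighting expression quoted above, and the rule that turns weekly anomalies into events. The function names and the input values are hypothetical, and the sketch omits the threshold calculations and the time-shift inflation step.

```python
def trend_weight(max_mean, current_mean, factor=0.5):
    """Weight from the expression in the text: (max - factor * (max - current)) / max."""
    return (max_mean - factor * (max_mean - current_mean)) / max_mean

def find_events(anomalous, min_length=4, max_gap=2):
    """Return (start, end) week indices of events from weekly anomaly flags."""
    # 1) Collect runs of consecutive anomalous weeks
    runs = []
    start = None
    for i, flag in enumerate(anomalous):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(anomalous) - 1))
    # 2) Keep only runs lasting four or more consecutive weeks
    events = [r for r in runs if r[1] - r[0] + 1 >= min_length]
    # 3) Merge events separated by only one or two weeks
    merged = []
    for ev in events:
        if merged and ev[0] - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], ev[1])
        else:
            merged.append(ev)
    return merged

# Hypothetical flags: two 4-week runs split by a 1-week gap, then a short run
flags = [0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0]
events = find_events([bool(f) for f in flags])
weight = trend_weight(max_mean=100.0, current_mean=40.0)
print(events, weight)
```

With a weighting factor of 0.5, a year whose mean is 40% of the historical maximum gets a weight of 0.7, so its weekly thresholds are lowered but not all the way to the current level. In the event example, the two long runs merge into one event and the two-week run is discarded.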

Event detection

Detection algorithms

The previous step, event identification, was based on a retrospective analysis with full knowledge of the entire 7.5-year time span, yielding specific spikes or abnormal increases in malaria case counts to be used as a baseline testing set for the detection algorithms. In contrast, event detection algorithms were forward-looking, running in-step with the data and only using values up to a given week, which mimics real time surveillance efforts to detect outbreaks as early as possible to mount timely public health responses. For this study, three types of event detection algorithms were used: 1) CDC EARS methods, 2) weekly statistical summaries that included the commonly-used WHO and Cullen methods, and 3) Farrington methods [4, 27,28,29].

For EARS, the three variations C1-Mild, C2-Medium, and C3-High were tested using the default alpha values (0.001 for C1 and C2, 0.025 for C3) with four different baseline periods: the default 7 periods (weeks, here), plus 14, 28, and 56 weeks.
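The general idea behind the EARS C1 and C2 variants can be sketched in Python as a moving baseline of recent weeks. This is a simplified illustration and not the implementation in the R surveillance package used in the study: the fixed multiplier of 3 standard deviations, the function name, and the counts are assumptions, and the C3 variant is not shown.

```python
import statistics

def ears_alarms(counts, baseline=7, lag=0, z=3.0):
    """Flag weeks whose count exceeds the baseline mean plus z standard deviations.

    lag=0 resembles C1 (baseline ends the week before the current week);
    lag=2 resembles C2 (baseline ends two weeks earlier).
    Weeks without a full baseline are never flagged.
    """
    alarms = []
    for t in range(len(counts)):
        end = t - lag
        begin = end - baseline
        if begin < 0:
            alarms.append(False)
            continue
        window = counts[begin:end]
        threshold = statistics.mean(window) + z * statistics.stdev(window)
        alarms.append(counts[t] > threshold)
    return alarms

# Hypothetical weekly counts with a sharp rise in the last two weeks
counts = [10, 12, 11, 9, 10, 13, 11, 12, 30, 45]
c1_like = ears_alarms(counts, baseline=7, lag=0)
c2_like = ears_alarms(counts, baseline=7, lag=2)
print(c1_like)
print(c2_like)
```

Longer baselines, such as the 14, 28, and 56 weeks tested here, can be tried by changing the `baseline` argument.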

For the weekly statistical summaries, thresholds were calculated from the week of the year median, mean plus two standard deviations (Cullen method without removing past outbreaks), and 75th and 85th percentiles (WHO method) for three historical time periods: 5 years, 6 years, or weekly maximum of 6 or 7 years depending on the week of year.

The Farrington algorithm offers parameters to control various model settings, of which we focused on five: 1) the number of weeks before and after the week in question to use as a window for calculations (‘window half-size’, w), 2) the number of years of historical data to use (b), 3) the inclusion of an optional long-term trend, 4) the number of periods to account for seasonality, and 5) the number of weeks to exclude at the beginning of the evaluation period (for events that may already be in progress). We ran 204 variations of the Farrington algorithm in a parameter sensitivity analysis. The first four runs used basic settings without population offsets: 1) original method with no seasonality (one period), 2) original method with four periods for seasonality, 3) improved method with no seasonality, and 4) improved method with four periods of seasonality. The original method was first proposed by Farrington et al. [28], and the improved method incorporates the changes proposed by Noufaily et al. [29]. The improved method aims to reduce the number of false positives through changes in the calculations of the trend component, reweighting of past events, seasonality, and error structures [29]. The remaining 200 runs used the improved method with a population offset option and an exhaustive set of combinations of the five selected parameters: window half size (3, 5), years of historical data to include (3, 4, 5, 6, or maximum adaptive), long-term trend inclusion (trend or no trend), seasonality periods (1, 2, 4, 8, 12), and past weeks to exclude at the beginning for spin-up time (26 or set equal to window half size). All parameter combinations can be found in the supplemental materials (Additional File 2, Supplemental Tables S1 and S2) and a subset of relevant parameter combinations in the Results (Table 4). All methods were implemented in R and the surveillance package was used for the EARS and Farrington methods [27, 45].
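The size of the exhaustive parameter set follows directly from the listed values, as the short Python sketch below shows. The dictionary keys are illustrative labels chosen for this sketch and are not argument names from the surveillance package.

```python
from itertools import product

# Parameter values as listed in the text; key names are illustrative only
grid = {
    "window_half_size": [3, 5],
    "years_history": [3, 4, 5, 6, "max"],
    "trend": [True, False],
    "seasonal_periods": [1, 2, 4, 8, 12],
    "exclude_weeks": [26, "window_half_size"],
}

# Every combination of one value per parameter
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

n_basic_runs = 4  # original and improved methods, with and without seasonality
print(len(combos), len(combos) + n_basic_runs)
```

The grid yields 2 × 5 × 2 × 5 × 2 = 200 combinations, which together with the four basic runs gives the 204 Farrington variants.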

Skill comparison test

As a skill comparison test for the real detection algorithms, six sets of random alarms were also generated. Any algorithm that produces alarms will, by chance, occasionally produce one during an event, and the more alarms triggered, the more likely events will seem to be detected. This skill test checked that the event detection methods performed better than a null model and provided context for the comparison between the methods. The random algorithm produced alarms between one and five weeks long, with a minimum buffer of four weeks between alarms. The probability per week of an alarm was varied to create different total numbers of alarms.
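A random alarm generator with these properties could look like the following Python sketch. The function name, the seed, the uniform choice of alarm length, and the truncation of alarms at the end of the series are assumptions; the study's own generator may differ in these details.

```python
import random

def random_alarms(n_weeks, p_start, seed=0, min_len=1, max_len=5, buffer=4):
    """Return weekly alarm flags from a null model.

    Each eligible week starts an alarm with probability p_start. An alarm
    lasts between min_len and max_len weeks, and at least `buffer` quiet
    weeks follow before another alarm can start.
    """
    rng = random.Random(seed)
    flags = [False] * n_weeks
    week = 0
    while week < n_weeks:
        if rng.random() < p_start:
            length = rng.randint(min_len, max_len)
            end = min(week + length, n_weeks)  # truncate at the series end
            for w in range(week, end):
                flags[w] = True
            week = end + buffer
        else:
            week += 1
    return flags

# One hypothetical woreda-year of weekly flags
flags = random_alarms(n_weeks=52, p_start=0.1, seed=1)
print(sum(flags), "alarm weeks out of", len(flags))
```

Raising `p_start` produces more alarms, which is how different total alarm counts can be generated for comparison with the real algorithms.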


Metrics of detection effectiveness were event based, because using events as the unit of analysis is relevant to how these algorithms would be used in public health surveillance to find outbreaks before or as they are occurring. Two main indicators were used: the percent of events that were caught, and the percent of alarms that were associated with events (true positive alarms). An alarm and event were considered associated if the alarm was triggered any week during or up to two weeks prior to the event. Percent of events caught was an indicator of how well the algorithm detected events, with a higher percentage meaning that fewer events were missed. Percent of alarms associated with events was the true positive rate of the alarms (the percentage of alarms that overlapped with an event or occurred up to two weeks prior to it). A high value of this metric demonstrated that the algorithm was more likely to trigger alarms when an event was actually happening and less likely to generate false alarms. Ideally, event detection algorithms would trigger alarms for all events (100% events detected) and never when there was not an event (100% alarms true positive). In addition to events caught, we also considered whether the alarm for the event was timely, which was defined as an alarm between two weeks prior to and including the start week of an event.
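The three event-based metrics can be expressed in a few lines of Python. This is a minimal sketch under stated assumptions: events and alarms are represented as hypothetical (start week, end week) spans, both lists are assumed to be non-empty, and the function names are invented for illustration.

```python
def overlaps(span, lo, hi):
    """True if the inclusive span (start, end) intersects the weeks lo..hi."""
    return span[0] <= hi and span[1] >= lo

def score(events, alarms, lead=2):
    """Percent of events caught, caught timely, and alarms that are true positives."""
    caught = 0
    timely = 0
    for ev_start, ev_end in events:
        # Caught: any alarm during the event or up to `lead` weeks before it
        if any(overlaps(a, ev_start - lead, ev_end) for a in alarms):
            caught += 1
        # Timely: any alarm from `lead` weeks before through the start week
        if any(overlaps(a, ev_start - lead, ev_start) for a in alarms):
            timely += 1
    true_pos = sum(
        any(overlaps(a, s - lead, e) for s, e in events) for a in alarms
    )
    return {
        "pct_events_caught": 100 * caught / len(events),
        "pct_events_timely": 100 * timely / len(events),
        "pct_alarms_true_positive": 100 * true_pos / len(alarms),
    }

# Hypothetical events and alarms as (start_week, end_week) index pairs
events = [(10, 15), (30, 36)]
alarms = [(9, 9), (20, 21), (33, 34), (50, 50)]
result = score(events, alarms)
print(result)
```

In this made-up example both events are caught, but only the first is caught in a timely way, and two of the four alarms are false, which illustrates how an algorithm can score well on events caught while scoring poorly on true positive alarms.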


Identified events

The TWST algorithm, developed to identify time periods of excess malaria case counts that were considered of potential public health interest, found a total of 255 events for P. falciparum and mixed species. The number of events declined from 2013 to 2018; however, in 2019 the number of events greatly increased. Also, during 2019, the average number of cases in events was the highest since 2012 (Table 2; all events are shown over time in Additional File 1, Supplemental Fig. 1).

Table 2 Events and malaria case statistics for P. falciparum and mixed malaria

The TWST algorithm was specifically designed to account for seasonality and not identify every seasonal peak as being an event in the context of overall declining trends in malaria transmission. However, different woredas in the region exhibited various patterns in incidence, including decreasing trends, increases in the middle or end of the time period, clear single seasonal peaks, dual seasonal peaks, and various combinations of these patterns. The TWST algorithm was flexible enough to appropriately identify events in these patterns (Fig. 3).

Fig. 3

Malaria incidence and TWST-identified events in selected woredas. The examples show various patterns in seasonal and long-term trends in the incidence of malaria and the TWST events: (a) Mecha, (b) Baso Liben, (c) Jawi. Mecha and Baso Liben both had decreasing incidence and a resurgence in 2019, but Baso Liben had maintained seasonal peaks while Mecha did not. Seasonal patterns also vary between clear single or dual peaks to more irregular patterns such as in Jawi. Observed incidence is marked in light grey and the smoothed incidence in black. Week and year thresholds from the TWST algorithm are shown as dot-dashed lines in green and blue, respectively. Any identified events are marked with red circles at the appropriate weeks at the top of the graphs

The algorithm was able to identify peaks that would have been overshadowed by peaks in much earlier years but are important relative to more recent patterns. For example, the woreda Abargelie had high peaks in 2013 and to a lesser extent in 2014. During 2015, however, the season was very quiet with no large peaks. In the fall of 2016, a moderate seasonal peak returned and larger fall peaks occurred in 2017 and 2018, but if the thresholds had not considered the 2015 season (trend-weighting), the 2017 and 2018 peaks would not have been identified as events (Fig. 4). The time-shift allowance in TWST was also needed to prevent notifications of events where the peak simply declined more slowly than in other years (Fig. 4).

Fig. 4

Example TWST results versus unadjusted week thresholds. Malaria incidence, unadjusted and adjusted week thresholds, and TWST events are shown for the selected woredas: (a) Abergele, (b) Borena / Debresina, and (c) Artuma Fursi. In Abargelie, the fall 2017 and 2018 peaks would have been below an unadjusted week threshold (orange dotted line) but were identified using the final TWST thresholds (green dot-dashed line) that had been trend-weighted. Borena / Debresina and Artuma Fursi show multiple instances where the non-peak expansion and time-shift allowances prevented inappropriately identified events on the edges of incidence peaks or in the seasonal troughs

Event detection

A total of 234 event detection algorithms and variations were tested on the 30 TWST-identified events in the 2019 evaluation period (selected entries in Table 3; for all results see Additional File 2, Supplemental Tables S1 and S2). Of the 234, there were 12 CDC EARS variants, 12 WHO and statistical variants, 204 Farrington variants, and six random alarm generators (Fig. 5). The six random alarm generators were run with various probabilities of generating an alarm per week: 0.2, 0.1, 0.05, 0.025, 0.012, and 0.006, which yielded 233, 151, 92, 61, 33, and 18 alarms respectively, a range similar to the number of alarms from the other event detection algorithms.

Table 3 Results for selected event detection algorithms for P. falciparum and mixed malaria events in 2019
Fig. 5

Scatter plot of the percent of events caught versus percent of true positive alarms. Results are shown from all event detection algorithms and variants. Each category is marked in a different shape and color combination. The size of the marker shows the percent of events with a timely alarm

As expected, random alarms performed poorly and had the lowest percentages of true positive alarms across the variants (Table 3, Fig. 5). Variants with higher probabilities created more alarms and had higher percentages of events caught, because the more alarms are present, the more likely they are to overlap with an event by chance.

The CDC EARS methods generated large numbers of alarms (98 to 152), with high percentages of events caught (80 to 100%) and a wide range of percent timely alarms (43 to 87%), but also had low to midrange percentages (25 to 40%) of true positive alarms (selected items in Table 3, full listing in Supplemental materials). Of the weekly statistical summaries, the Cullen mean plus two standard deviations variant produced the highest true positive rates (51 to 61%, depending on the number of years of historical data included), but the lowest percentages of events caught (57 to 80%) and the lowest percentages of timely alarms (13 to 37%). The WHO 75th percentile with 5 years of data, a commonly used algorithm, produced 200 alarms with 97% of events caught (93% timely) but only 26% true positive detections (Table 3). The 85th percentile variants produced somewhat fewer alarms with higher true positive rates, and with similar or slightly reduced events caught and timely alarms.

Examining the Farrington results (orange hollow circles in Fig. 5), there was a trade-off between events caught and true positives. The Farrington variants were based on the sensitivity analysis of five parameters: window half size (w), years of historical data included (b), number of periods for seasonality, long-term trend inclusion, and the exclusion period for spin up time. Not all parameters influenced the outcomes; window half size and the exclusion period did not greatly affect the results, although the 26-week exclusion period was slightly preferable. The parameters for number of historical years of data, number of periods for seasonality, and trend inclusion had the greatest impacts on the outcome metrics. Of the 200 improved Farrington variants with population offset, the event caught rate was highest when the trend was included and there were 4 to 12 periods for seasonality (Fig. 6). The event caught rate fell as more years of historical data were included, especially in variants that did not include a trend.

Fig. 6

Plot of event caught percentages from the Farrington event detection variants. Scores were higher when a long-term trend was included (filled circles) than when no trend was included (hollow triangles). Event caught rate fell as more years of historical data were included (x-axis), especially in variants that did not include a trend. Within each trend set, scores were higher with 4 to 12 periods for seasonality (blue to green colors), and lowest with one period, i.e. no seasonality (dark purple). The number of alarms generated is indicated by the size of the marker and decreases as more years of historical data are included

Of the 200 improved Farrington variants with population offsets, the true positive percentages were highest when no trend was included and two to four periods for seasonality were included, and they increased as more years of historical data were included (Fig. 7). The number of alarms generated decreased with additional years of historical data (size of the marker in Figs. 6 and 7).

Fig. 7

Plot of true positive alarm percentages from the Farrington event detection variants. Scores were higher when the long-term trend was not included (hollow triangles) as compared to variants where trend was included (filled circles). More historical data (x-axis) increased the alarm true positive score and decreased the total number of alarms generated (size of marker). Scores were highest with two to four periods for seasonality (blues), and lowest with no seasonality (one period, dark purple). The number of alarms generated is indicated by the size of the marker and decreases as more years of historical data are included

The Farrington original and improved methods with default values (and without seasonality) and no population offset were compared against the 200 parameter sensitivity runs using the improved method with population offsets (original A1 and A2, base improved B1 and B2 in Tables 3 and 4). As seen in Figs. 6 and 7, there were large trade-offs in the 200 variant set between events caught and true positive rates. Some Farrington runs reached 100% events caught, but the highest true positive rate of that set was only 26% (Farrington C1 in Table 3). Other variants reached 100% in alarm true positive rate, but the highest event caught score in that set was 40% (Farrington C2 in Table 3). Taking a balanced approach, a variant with reasonable trade-offs had a score of 73% events caught and 74% alarms true positive but only 37% events caught timely (Farrington C3 in Tables 3 and 4). Another variant, which we selected as our balanced option, had 83% events caught and 53% events caught timely with 51% alarms true positive (Farrington C4 in Tables 3 and 4).

Table 4 Details of selected Farrington variants, including parameter settings


The TWST algorithm that we developed succeeded in identifying malaria transmission events in the presence of changing expectations due to decreasing incidence trends. Using thresholds defined from time periods with high disease transmission may mask important events in less active years, events that would be considered abnormal if compared to more recent activity. Accounting for these changing expectations is essential in areas like the Amhara region, where malaria incidence is declining in many woredas because of public health interventions. With regard to malaria surveillance, the WHO specifically notes that the normal or expected patterns of malaria, from which outbreak thresholds are derived, do change over time in areas that see sharp decreases in incidence after intensive control efforts [4]. As woredas approach elimination, the sizes of malaria events become smaller, but it will still be necessary to detect and respond quickly to these outbreaks. In the context of resurgence, having dynamic thresholds that adapt to changing conditions is crucial for identifying malaria peaks that are smaller than historical outbreaks, but still significantly larger than malaria case numbers in recent years.

The operational activities of detecting and responding to outbreaks are enabled by and integral to malaria surveillance systems. Surveillance as an intervention is the third pillar of the WHO global technical strategy for malaria elimination, with differing key aspects as disease control transitions to pre-elimination, elimination, and prevention of reintroduction phases [7,8,9, 46,47,48,49,50,51]. More recent frameworks focus on transitions and the evolving approaches needed in settings with changing epidemiological patterns [7, 46]. The Amhara region, as mentioned previously, is in a transition period marked by declining and changing trends in malaria transmission due to disease interventions, plus a resurgence in 2019. By testing various algorithms on historical data from this region, we were able to assess their potential to provide early detection of resurgent malaria outbreaks. While weekly statistical methods are very commonly used for malaria surveillance [4], the CDC EARS and Farrington algorithms, to our knowledge, have not been previously assessed for use in malaria surveillance.

In the event detection comparison, the randomly generated alarms produced the worst results, indicating that all the algorithms that we tested were better than the naïve assumption of random outbreaks. CDC EARS is designed to be used even when historical data are lacking, as it creates thresholds from only recent data (ranging from the previous 7 time steps for C1 and C2 to 11 for C3, with a baseline of 7). A drawback is that this approach cannot effectively account for seasonality and tends to trigger alarms at every seasonal peak. However, the results indicate that the EARS algorithms have a high sensitivity to increases in malaria cases. Thresholds based on weekly statistical summaries also produced high events caught scores and moderately higher alarm true positive rates than the CDC EARS methods. Both EARS and WHO methods tended to generate a high total number of alarms.
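To make the short-baseline behavior concrete, the following Python sketch applies a C1-style rule in its commonly described form: an alarm is raised when the current count exceeds the mean plus a multiple of the sample standard deviation of the preceding time steps. The function name, the multiplier k = 3, and the example counts are assumptions for illustration only; this is not the implementation used in this study, and the C2 and C3 variants are not shown.

```python
from statistics import mean, stdev
from typing import List


def c1_style_alarms(counts: List[float], baseline: int = 7, k: float = 3.0) -> List[int]:
    """Return indices where a count exceeds mean + k * SD of the prior window.

    baseline: number of immediately preceding time steps used as history.
    k: assumed multiplier on the sample standard deviation.
    """
    alarms = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]           # only recent data are used
        threshold = mean(window) + k * stdev(window)
        if counts[t] > threshold:
            alarms.append(t)
    return alarms


# Hypothetical weekly case counts with one abrupt increase at index 8.
weekly_cases = [5, 6, 5, 4, 6, 5, 5, 6, 20, 7]
print(c1_style_alarms(weekly_cases))  # [8]
```

Because the window contains no information from the same season in earlier years, a regular seasonal rise is treated in the same way as an unexpected increase, which is consistent with the high sensitivity and low alarm specificity reported above.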

The suite of Farrington methods, especially the improved versions, allows adjustments for long-term trends and seasonal patterns. The Farrington algorithm, in various forms, has been implemented at public health centers and used for a variety of pathogens, particularly for gastrointestinal illnesses, in several European countries: England, Wales, and Northern Ireland [28, 33], Scotland [24], Netherlands [24, 33, 35], Lower Saxony state in Germany [33], and Sweden [33, 34]. As with the weekly statistical summaries, the Farrington algorithms require several years of historical data, which may not always be available. As expected given the highly seasonal patterns we observe in the Amhara region, including enough seasonal periods was important, as accuracy suffered when too few periods were included.

A substantial trade-off between the percent of events caught and the percentage of true positive alarms was found with the inclusion of the long-term trend. Including the long-term trend as implemented in the Farrington algorithm increased the events caught rate but also decreased the true alarm rate. In the context of declining malaria incidence, setting thresholds based on historical data tends to result in a high threshold that cannot detect smaller, more recent events. Adjusting the threshold using the recent trend of declining malaria cases therefore increases the sensitivity of outbreak detection but can result in large numbers of false alarms if the resulting threshold is too low. These results show that accounting for annual cycles and inter-annual trends is essential for calibrating malaria early detection parameters in settings characterized by seasonal transmission and declining malaria trends caused by public health interventions. In situations where comprehensive data on interventions are available, other modeling approaches that explicitly account for interventions could also be used to predict trends in malaria cases [52].

One of our motivations for comparing early detection algorithms was to guide the selection of methods for a malaria early warning system in the Amhara region as part of the EPIDEMIA project [39]. Following discussions among project partners and in consideration of the public health applications of the early detection results, we opted to give the true positive metric slightly more importance in the evaluation of algorithm performance. We did not want to generate large numbers of false alarms with an algorithm that had lower specificity, and we were cognizant that false alarms could cause ineffective, costly, and unnecessary mobilizations of resources. However, we balanced this desire to avoid false positives with the need to capture important events accurately and maintain credibility. In this analysis, we quantified the trade-off between events caught and true positive scores by testing a range of methods and parameterizations, and we found that variations of the Farrington method were usually best for maximizing both events caught and true positives.
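The trade-off between the two scores can be illustrated with a minimal Python sketch. It assumes that an event counts as caught when at least one alarm falls within its weeks, and that an alarm counts as a true positive when it falls within any event. The function name, these scoring rules, and the example data are hypothetical simplifications and may differ from the definitions applied in our evaluation; the timeliness metric is not shown.

```python
from typing import List, Tuple


def score_alarms(events: List[Tuple[int, int]], alarms: List[int]) -> Tuple[float, float]:
    """Return (percent of events caught, percent of alarms that are true positives).

    events: inclusive (start_week, end_week) index ranges of reference events.
    alarms: week indices at which a detection method raised an alarm.
    """
    def in_event(week: int, event: Tuple[int, int]) -> bool:
        start, end = event
        return start <= week <= end

    caught = sum(any(in_event(a, e) for a in alarms) for e in events)
    true_pos = sum(any(in_event(a, e) for e in events) for a in alarms)
    pct_caught = 100.0 * caught / len(events) if events else 0.0
    pct_true = 100.0 * true_pos / len(alarms) if alarms else 0.0
    return pct_caught, pct_true


# Hypothetical reference events and two contrasting alarm patterns.
events = [(10, 14), (30, 33), (52, 60)]
frequent_alarms = list(range(8, 62, 2))  # alarms every second week
sparse_alarms = [12, 13]                 # only two alarms, both in one event

print(score_alarms(events, frequent_alarms))  # all events caught, few true alarms
print(score_alarms(events, sparse_alarms))    # all alarms true, few events caught
```

In this toy example the frequent alarm pattern catches every event but most of its alarms fall outside events, whereas the sparse pattern has only true positive alarms but misses two of the three events.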

Depending on the intended public health utilization of the event detection alarms, other implementations may choose to prioritize sensitivity over specificity if identifying all potential malaria outbreaks is more important than minimizing false positives. Methods and variants with high sensitivity could be useful for generating a ‘watch list’ of places that may be seeing the beginning of an outbreak or a spike in cases. However, due to the high false alarm rate (low true positive percentage), warnings based on algorithm variants with low true positive scores run the risk of causing alert fatigue, where public health officials may be overwhelmed by alerts that are not meaningful. Alert fatigue has been observed with health care providers during health emergencies where they were inundated with public health communications and had trouble recalling specific information from the messages [53]. Health care providers tend to prefer fewer messages, from one source, and with local guidance or context [54]. Alarms from a system with high sensitivity but low specificity would not be suitable to prompt costly interventions; however, they may be useful to generate lists of places to monitor more closely.

Many of the early detection algorithms recommended for malaria use five full years to create the baseline. We tested five to 6.5 years in the weekly statistical summary methods, and three to 6.5 years in the Farrington variants. However, given continuing changes in malaria transmission environments resulting from ongoing interventions, social and demographic changes, and climate change, it may not be reasonable to expect that historical malaria data more than a few years old are suitable to provide a baseline for detecting future outbreaks [4, 25, 55,56,57,58,59]. Therefore, it is imperative to continue to explore new approaches for malaria outbreak detection that can be used with data covering shorter time periods. Future studies evaluating other algorithms will likely also prove insightful, as will investigations of the performance of the EARS and Farrington methods in other locations with different patterns of malaria incidence.
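The dependence of weekly statistical thresholds on a multi-year baseline can be sketched as follows. The example uses the commonly described forms of the two rules, namely the mean plus two standard deviations (Cullen) and the third quartile (WHO) of counts for the same week in previous years. The function name, the data layout, the quartile interpolation method, and the counts are assumptions for illustration and are not taken from our analysis.

```python
from statistics import mean, quantiles, stdev
from typing import Dict, List


def weekly_thresholds(history: Dict[int, List[float]], week: int) -> Dict[str, float]:
    """Compute thresholds for one epidemiological week from prior years.

    history: maps year -> list of weekly counts (index 0 holds week 1).
    week: 1-based week number for which thresholds are wanted.
    """
    same_week = [counts[week - 1] for counts in history.values()]
    return {
        # Cullen-style rule: mean plus two sample standard deviations.
        "cullen": mean(same_week) + 2 * stdev(same_week),
        # WHO-style rule: third quartile (inclusive interpolation assumed).
        "who_q3": quantiles(same_week, n=4, method="inclusive")[2],
    }


# Hypothetical counts for weeks 1 and 2 over a five-year baseline.
history = {
    2013: [10, 9],
    2014: [12, 14],
    2015: [8, 7],
    2016: [15, 16],
    2017: [11, 10],
}
print(weekly_thresholds(history, week=1))
```

With only five baseline values per week, a single high-transmission year can raise both thresholds substantially, which is one reason older data may mask smaller, more recent events in a declining transmission setting.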


We compared the effectiveness of three methods for malaria outbreak detection: 1) CDC EARS methods, 2) methods based on weekly statistical thresholds, including the WHO and Cullen methods, and 3) Farrington methods, using 7.5 years of malaria surveillance data from the Amhara region of Ethiopia. To our knowledge, this is the first study to assess the potential application of the EARS and Farrington methods for malaria outbreak detection. The EARS methods by design use a very short historical window that cannot account for seasonal trends in malaria occurrence. As a result, they were very sensitive to increases in cases and caught most outbreaks, but they could not effectively distinguish seasonal increases from outbreaks and generated many false positive alerts. WHO and statistical methods were also quite sensitive and detected high percentages of outbreaks with intermediate percentages of true positive alerts. Variations of the Farrington method had a wide range of trade-offs between events caught and true positive scores. Farrington variants that accounted for seasonality had much higher true positive rates than the EARS and WHO methods and could achieve a better balance between true positives and the percent of malaria events caught. We determined that, of the methods compared, the Farrington algorithm was the most flexible and useful approach for operational early detection, and we have successfully used it in a pilot implementation of the EPIDEMIA malaria early warning system in the Amhara region [39]. We suggest that this approach is more generally useful for detecting infectious disease outbreaks in transitional environments with strong seasonality and declining trends. The intended use of the early detection results will drive the choice of algorithm and parameter settings to optimize sensitivity and specificity of alarms for particular applications.

Availability of data and materials

The data that support the findings of this study are not publicly available because they were used under a data-sharing agreement with the Amhara Regional Health Bureau that does not permit their redistribution, but they are available from the Amhara Regional Health Bureau on reasonable request. The code base for this project, with synthetic data for demonstration, can be found in a publicly available GitHub repository managed by our research group:



Abbreviations

Amhara Regional State Health Bureau

CDC: Centers for Disease Control and Prevention

EARS: Early Aberration Reporting System

EPIDEMIA: Epidemic Prognosis Incorporating Disease and Environmental Monitoring for Integrated Assessment

Primary Health Care Units

TWST: Trend Weighted Seasonal Thresholds

WHO: World Health Organization


  1. World Health Organization. World Malaria Report 2018. 2018. Accessed 20 Dec 2018.

  2. Hay SI, Guerra CA, Tatem AJ, Noor AM, Snow RW. The global distribution and population at risk of malaria: past, present, and future. Lancet Infect Dis. 2004;4(6):327–36.

  3. World Health Organization. World Malaria Report 2017. 2017. Accessed 24 May 2018.

  4. World Health Organization. Malaria surveillance, monitoring & evaluation: a reference manual. 2018.

  5. World Health Organization. Malaria early warning systems: concepts, indicators and partners. 2001. Accessed 18 Oct 2018.

  6. Centers for Disease Control and Prevention. Updated guidelines for evaluating public health surveillance systems: recommendations from the guidelines working group. MMWR. 2001;50 RR-13. Accessed 18 Oct 2018.

  7. Larsen DA, Chisha Z, Winters B, Mwanza M, Kamuliwo M, Mbwili C, et al. Malaria surveillance in low-transmission areas of Zambia using reactive case detection. Malar J. 2015;14(1):465.

  8. World Health Organization. Global technical strategy for malaria 2016-2030. 2015.

  9. Landier J, Parker DM, Thu AM, Carrara VI, Lwin KM, Bonnington CA, et al. The role of early detection and treatment in malaria elimination. Malar J. 2016;15(1):363.

  10. Girond F, Randrianasolo L, Randriamampionona L, Rakotomanana F, Randrianarivelojosia M, Ratsitorahina M, et al. Analysing trends and forecasting malaria epidemics in Madagascar using a sentinel surveillance network: a web-based application. Malar J. 2017;16(1):72.

  11. Mabaso MLH, Ndlovu NC. Critical review of research literature on climate-driven malaria epidemics in sub-Saharan Africa. Public Health. 2012;126(11):909–19.

  12. Hay SI, Simba M, Busolo M, Noor AM, Guyatt HL, Ochola SA, et al. Defining and detecting malaria epidemics in the highlands of western Kenya. Emerg Infect Dis. 2002;8(6):555–62.

  13. McKelvie WR, Haghdoost AA, Raeisi A. Defining and detecting malaria epidemics in south-East Iran. Malar J. 2012;11(1):81.

  14. World Health Organization. Systems for the early detection of malaria epidemics in Africa: An analysis of current practices and future priorities. 2006. Accessed 18 Oct 2018.

  15. Sonesson C, Bock D. A review and discussion of prospective statistical surveillance in public health. J R Stat Soc Ser A Stat Soc. 2003;166(1):5–21.

  16. Yang E, Park H, Choi Y, Kim J, Munkhdalai L, Musa I, et al. A simulation-based study on the comparison of statistical and time series forecasting methods for early detection of infectious disease outbreaks. Int J Environ Res Public Health. 2018;15(5):966.

  17. Teklehaimanot HD, Schwartz J, Teklehaimanot A, Lipsitch M. Alert threshold algorithms and malaria epidemic detection. Emerg Infect Dis. 2004;10(7).

  18. Rogerson PA. Surveillance systems for monitoring the development of spatial patterns. Stat Med. 1997;16(18):2081–93.

  19. Hay SI, Were EC, Renshaw M, Noor AM, Ochola SA, Olusanmi I, et al. Forecasting, warning, and detection of malaria epidemics: a case study. Lancet. 2003;361(9370):1705–6.

  20. Cullen JR, Chitprarop U, Doberstyn EB, Sombatwattanangkul K. An epidemiological early warning system for malaria control in northern Thailand. Bull World Health Organ. 1984;62(1):107–14.

  21. Rossi G, Lampugnani L, Marchi M. An approximate CUSUM procedure for surveillance of health events. Stat Med. 1999;18(16):2111–22.

  22. Fricker RD, Hegler BL, Dunfee DA. Comparing syndromic surveillance detection methods: EARS’ versus a CUSUM-based methodology. Stat Med. 2008;27(17):3407–29.

  23. Robertson C, Nelson TA, MacNab YC, Lawson AB. Review of methods for space–time disease surveillance. Spat Spatio-Temporal Epidemiol. 2010;1(2-3):105–16.

  24. Unkel S, Farrington C, Garthwaite PH, Robertson C, Andrews N. Statistical methods for the prospective detection of infectious disease outbreaks: a review. J R Stat Soc Ser A Stat Soc. 2012;175(1):49–82.

  25. Cotter C, Sturrock HJ, Hsiang MS, Liu J, Phillips AA, Hwang J, et al. The changing epidemiology of malaria elimination: new strategies for new challenges. Lancet. 2013;382(9895):900–11.

  26. Zhu Y, Wang W, Atrubin D, Wu Y. Initial evaluation of the early aberration reporting system --- Florida. MMWR Suppl. 2005;54(Suppl):123–30.

  27. Höhle M. Surveillance: an R package for the monitoring of infectious diseases. Comput Stat. 2007;22(4):571–82.

  28. Farrington CP, Andrews NJ, Beale AD, Catchpole MA. A statistical algorithm for the early detection of outbreaks of infectious disease. J R Stat Soc Ser A Stat Soc. 1996;159(3):547–63.

  29. Noufaily A, Enki DG, Farrington P, Garthwaite P, Andrews N, Charlett A. An improved algorithm for outbreak detection in multiple surveillance systems. Stat Med. 2013;32(7):1206–22.

  30. Bédubourg G, Strat YL. Evaluation and comparison of statistical methods for early temporal detection of outbreaks: a simulation-based study. PLoS One. 2017;12(7):e0181227.

  31. Hutwagner LC, Thompson WW, Seeman GM, Treadwell T. A simulation model for assessing aberration detection methods used in public health surveillance for systems with limited baselines. Stat Med. 2005;24(4):543–50.

  32. Ashton RA, Kefyalew T, Batisso E, Awano T, Kebede Z, Tesfaye G, et al. The usefulness of school-based syndromic surveillance for detecting malaria epidemics: experiences from a pilot project in Ethiopia. BMC Public Health. 2015;16(1):20.

  33. Hulth A, Andrews N, Ethelberg S, Dreesman J, Faensen D, van Pelt W, et al. Practical usage of computer-supported outbreak detection in five European countries. Eurosurveillance. 2010;15:19658.

  34. Cakici B, Hebing K, Grünewald M, Saretok P, Hulth A. CASE: a framework for computer supported outbreak detection. BMC Med Inform Decis Mak. 2010;10(1):14.

  35. Widdowson M-A, Bosman A, van Straten E, Tinga M, Chaves S, van Eerden L, et al. Automated, laboratory-based system using the internet for disease outbreak detection, the Netherlands. Emerg Infect Dis. 2003;9(9):1046–52.

  36. Yalew WG, Pal S, Bansil P, Dabbs R, Tetteh K, Guinovart C, et al. Current and cumulative malaria infections in a setting embarking on elimination: Amhara, Ethiopia. Malar J. 2017;16(1):242.

  37. Wimberly MC, Midekisa A, Semuniguse P, Teka H, Henebry GM, Chuang T-W, et al. Spatial synchrony of malaria outbreaks in a highland region of Ethiopia. Tropical Med Int Health. 2012;17(10):1192–201.

  38. Midekisa A, Beyene B, Mihretie A, Bayabil E, Wimberly MC. Seasonal associations of climatic drivers and malaria in the highlands of Ethiopia. Parasit Vectors. 2015;8(1):339.

  39. Merkord CL, Liu Y, Mihretie A, Gebrehiwot T, Awoke W, Bayabil E, et al. Integrating malaria surveillance with climate data for outbreak detection and forecasting: the EPIDEMIA system. Malar J. 2017;16(1):89.

  40. Emerson PM, Ngondi J, Biru E, Graves PM, Ejigsemahu Y, Gebre T, et al. Integrating an NTD with one of “the big three”: combined malaria and trachoma survey in Amhara region of Ethiopia. PLoS Negl Trop Dis. 2008;2(3):e197.

  41. Abeku TA, van Oortmarssen GJ, Borsboom G, de Vlas SJ, Habbema JDF. Spatial and temporal variations of malaria epidemic risk in Ethiopia: factors involved and implications. Acta Trop. 2003;87(3):331–40.

  42. Negash K, Kebede A, Medhin A, Argaw D, Babaniyi O, Guintran JO, et al. Malaria epidemics in the highlands of Ethiopia. East Afr Med J. 2005;82(4):186–92.

  43. Primary health care systems (PRIMASYS): case study from Ethiopia, abridged version. Geneva: World Health Organization; 2017. Accessed 13 Nov 2020.

  44. Jima D, Wondabeku M, Alemu A, Teferra A, Awel N, Deressa W, et al. Analysis of malaria surveillance data in Ethiopia: what can be learned from the integrated disease surveillance and response system? Malar J. 2012;11(1):330.

  45. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014. Accessed 19 May 2016.

  46. World Health Organization. A framework for malaria elimination. 2017.

  47. malERA Consultative Group on Monitoring, Evaluation, and Surveillance. A Research Agenda for Malaria Eradication: Monitoring, Evaluation, and Surveillance. PLoS Med. 2011;8.

  48. Barclay VC, Smith RA, Findeis JL. Surveillance considerations for malaria elimination. Malar J. 2012;11(1):304.

  49. Cao J, Sturrock HJW, Cotter C, Zhou S, Zhou H, Liu Y, et al. Communicating and monitoring surveillance and response activities for malaria elimination: China’s “1-3-7” strategy. PLoS Med. 2014;11(5):e1001642.

  50. Kelly GC, Hale E, Donald W, Batarii W, Bugoro H, Nausien J, et al. A high-resolution geospatial surveillance-response system for malaria elimination in Solomon Islands and Vanuatu. Malar J. 2013;12(1):108.

  51. Kelly GC, Tanner M, Vallely A, Clements A. Malaria elimination: moving forward with spatial decision support systems. Trends Parasitol. 2012;28(7):297–304.

  52. Tun STT, von Seidlein L, Pongvongsa T, Mayxay M, Saralamba S, Kyaw SS, et al. Towards malaria elimination in Savannakhet, Lao PDR: mathematical modelling driven strategy design. Malar J. 2017;16:483.

  53. Baseman JG, Revere D, Painter I, Toyoji M, Thiede H, Duchin J. Public health communications and alert fatigue. BMC Health Serv Res. 2013;13(1):295.

  54. Staes CJ, Wuthrich A, Gesteland P, Allison MA, Leecaster M, Shakib JH, et al. Public health communication with frontline clinicians during the first wave of the 2009 influenza pandemic. J Public Health Manag Pract JPHMP. 2011;17(1):36–44.

  55. Hay SI, Cox J, Rogers DJ, Randolph SE, Stern DI, Shanks GD, et al. Climate change and the resurgence of malaria in the east African highlands. Nature. 2002;415(6874):905–9.

  56. O’Meara WP, Mangeni JN, Steketee R, Greenwood B. Changes in the burden of malaria in sub-Saharan Africa. Lancet Infect Dis. 2010;10(8):545–55.

  57. Patz JA, Olson SH. Malaria risk and temperature: influences from global climate change and local land use practices. Proc Natl Acad Sci. 2006;103(15):5635–6.

  58. Ryan SJ, McNally A, Johnson LR, Mordecai EA, Ben-Horin T, Paaijmans K, et al. Mapping physiological suitability limits for malaria in Africa under climate change. Vector-Borne Zoonotic Dis. 2015;15(12):718–25.

  59. van Lieshout M, Kovats RS, Livermore MTJ, Martens P. Climate change and malaria: analysis of the SRES climate and socio-economic scenarios. Glob Environ Change. 2004;14:87–99.



We thank Chris Merkord and Yi Liu for their work on software development and data processing for the EPIDEMIA project and Aklilu Getinet for his assistance with project coordination.


This work is supported by Grant Number R01-AI079411 from the National Institute of Allergy and Infectious Diseases.

Author information

Authors and Affiliations



Conceived and designed the study: DMN, MCW. Processed surveillance data and provided critical interpretation of results: AM, TG, ML, WA. Created TWST algorithm and performed the analysis: DMN. Drafted initial manuscript: DMN, MCW. Critical revision of the article: WA, AM. All authors read and approved the manuscript.

Corresponding author

Correspondence to Michael C. Wimberly.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this research was provided by the Amhara Regional Health Bureau. The research did not involve human subjects as it used only non-identifiable data provided as aggregated summaries.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Malaria events detected using the Trend Weighted Seasonal Threshold (TWST) algorithms. The file contains one figure (S1) for Plasmodium falciparum and mixed species of all events over time detected using the TWST algorithm in each of the 47 woredas.

Additional file 2.

Variations of event detection algorithms tested. The file contains two tables for Plasmodium falciparum and mixed species of all the different parameter combinations tested. The first sheet is for the random, weekly statistical, and CDC EARS methods, the second sheet for the Farrington algorithms, and the third sheet for a description of the fields.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Nekorchuk, D.M., Gebrehiwot, T., Lake, M. et al. Comparing malaria early detection methods in a declining transmission setting in northwestern Ethiopia. BMC Public Health 21, 788 (2021).
