How has quarantine been evaluated?
Previous studies evaluating quarantine for control of SARS used two general methods: simulation studies [9, 32, 33] and case-study reports describing specific settings (e.g. [22, 34–42]).
Simulation studies serve several purposes in outbreak research, one of which is to estimate the impact of control measures in outbreak scenarios [1, 33]. A review of simulation studies on quarantine effectiveness for SARS has been presented by Bauch and colleagues . This review and other individual reports  suggest that quarantine measures are most effective when mobilized at the very start of the outbreak, when case numbers are small. While decision-makers must rely heavily on simulation studies, a persistent concern raised  is that results are driven by prior assumptions, which may be unrealistic. Simulations rely on some degree of simplification, such as considering only point-source outbreaks in non-overlapping populations. More sophisticated models incorporate heterogeneity in parameters, including variations in the size and behaviour of human contact networks, differences in transmissibility due to the host, and levels of adherence with control measures, so that the scenarios simulated better reflect real-world experience [4, 33, 44, 45]. Simulation studies typically do not present impact statistics familiar to health policy makers, and may not provide evidence as compelling to decision-makers as real-world outbreak experiences. To give an example, a recent World Health Organization review on control measures for influenza cited no simulation studies but only real-world observational data .
Quarantine-related studies from real SARS outbreaks typically present the epidemic curve, with case counts plotted against a timeline of events including the arrival of index cases and changes in public health response. In many reports [22, 34, 39, 40], the impact of control measures is implied from qualitative observation, such as visible deceleration of new cases in plots, or the eventual end of the outbreak. Some of these reports appeared well after the outbreak, with efforts made to complete data on onset dates and transmission in hindsight (e.g., ). Drawing inference from epidemic curves is suited only to point-source outbreaks and amounts to a one-group, pre-test post-test design , providing weak evidence for causation. The true value of case reports, arguably, lies in the rich contextual information on challenges faced and unexpected events, and in showing the socio-political feasibility of aggressive control across settings .
Some case reports have taken a more quantitative approach. A few studies reported on the effect of quarantine in shortening the time from onset of symptoms to isolation , or the proportion of SARS cases who developed symptoms while already under quarantine . Others report quarantine yield (the proportion of individuals quarantined who eventually develop SARS) , which reflects specificity of contact tracing versus burden from unnecessary quarantine. These are intermediate outcomes, however, evaluating processes as opposed to final outcomes.
In other studies, control interventions are examined before and after quarantine in relation to subsequent changes in R [4, 43]. Wallinga and Teunis, 2004 , present the average daily effective reproductive number, R, both prior to alert and after, for each of four SARS outbreak locations. Table 1 of their analysis reveals fairly consistent transmission numbers pre-alert, but more variable R values post-alert, highlighting less effective control in Ontario relative to elsewhere. This is also an uncommon example of R estimates presented along with confidence limits from observed (as opposed to simulated) data. Their analysis would permit reporting of differences in transmission before and after alert, but no reduction in R attributable to control was presented (with or without confidence limits). The approach also permitted no consideration of individual-level covariates.
Quantitative estimates of quarantine impact
We estimated that use of community quarantine in the 2003 Ontario SARS outbreak reduced transmission to one third, with an absolute difference of 0.13 secondary cases per index case under quarantine, relative to not quarantined by symptom onset. For discussion purposes, we present several effect measures, including Secondary Case Count Difference (SCCD) and "number needed to quarantine" (NNQ), a novel adaptation of NNT. Our point estimate of NNQ for the Ontario outbreak was 7.5 persons in quarantine to one SARS case averted using data from probable or suspected SARS cases. As a point estimate, this NNQ compares very favourably with NNTs reported for public health interventions such as chemoprophylaxis for leprosy  and meningococcal disease , or vaccination against pertussis  and influenza and pneumococcal disease , particularly with a condition like SARS with significant morbidity and a high case-fatality rate. All estimates we present for the impact of quarantine, however, are imprecise. Bootstrapped confidence intervals include values for no impact. Statistical power is a limitation to this and many analyses of real outbreak data.
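The relationship between the two effect measures can be sketched in a few lines of Python, assuming (by analogy with NNT and absolute risk reduction) that NNQ is taken as the reciprocal of SCCD. The function name is hypothetical and used only for illustration. Applied to the rounded difference of 0.13 quoted above, the reciprocal is about 7.7, close to the reported 7.5; the small gap is presumably due to rounding of the published difference, which is an assumption here and not something stated in the analysis.

```python
def number_needed_to_quarantine(sccd: float) -> float:
    """Return NNQ as the reciprocal of the Secondary Case Count Difference.

    sccd is the absolute reduction in secondary cases per index case
    (not quarantined minus quarantined). NNQ is only meaningful when
    the difference is positive, i.e. quarantine appears beneficial.
    """
    if sccd <= 0:
        raise ValueError("NNQ is undefined unless SCCD is positive")
    return 1.0 / sccd


# Rounded SCCD quoted in the text (assumed input, not the unrounded estimate).
nnq_from_rounded_sccd = number_needed_to_quarantine(0.13)
print(round(nnq_from_rounded_sccd, 1))
```

Because NNQ is a reciprocal, it is unstable when SCCD is close to zero, which is one reason to report SCCD with its confidence interval alongside NNQ.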
We also show, not surprisingly for an infection now known to be transmitted by droplet spread, a statistically significant association between the number of close contacts and number of secondary cases, per index case. Number of close contacts (level 1 in Table 1) had some overlap with the observed (non-significant) effect of quarantine, whereas the number of more distant contacts was unrelated to any apparent benefit of quarantine. Our analysis also suggests (without statistical significance) that reduction in the number of close contacts contributed to reduction in spread, and this may have implications for targeting of quarantine toward closer contacts [23, 36].
Research in quarantine effectiveness presents many challenges. One complication is the unit of analysis for the outcome relative to the intervention. Clinical decision-making looks at outcomes in the same individuals assigned to treatment. Interventions such as vaccination are more complex in that outcomes may be assessed at the individual or population level, with different implications, although the individual vaccinated is part of the same population. Number Needed to Vaccinate (NNV) has been estimated incorporating herd immunity . The case of quarantine is distinct even from vaccination, in that all potential benefit is to other persons. It is theoretically possible to study sets of index cases and their contacts as independent units of analysis, although it is often difficult to identify precisely which persons exposed which others . Here, we have worked with contacts matched to an exclusive index case . The creation of sets of cases and associated controls goes only part-way toward a complete network-based analysis , although future studies could address non-independence of networks.
A second challenge to evaluators might be unfamiliarity with regression models for count data. The generalized linear model used here to obtain an NNQ estimate, with a Poisson error term and identity link function, is less commonly used, but has long been described in biostatistics texts. The regression models used here are available in all major statistical packages (Stata, SAS, SPSS and others). The distribution of secondary cases (per index case) may be positively skewed (with a few cases generating large numbers of transmissions ). Over-dispersion may need to be addressed through means such as the use of negative binomial models in place of the Poisson models described above. Negative binomial models have an interpretation very similar to that of Poisson models .
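As a minimal illustration of the identity-link idea, consider the simplest case of a single binary quarantine indicator with no other covariates. In that special case the maximum-likelihood fit of a Poisson model with identity link reduces to the two group means, so the coefficient on quarantine is just the difference in mean secondary cases. The sketch below uses this shortcut with the Python standard library; the counts are hypothetical and are not the Ontario data, and the analysis described here also adjusted for covariates, which would require a full GLM routine in a statistical package.

```python
from statistics import mean, variance

# Hypothetical secondary-case counts per index case (illustrative only).
not_quarantined = [0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0]
quarantined = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]


def identity_link_fit(unexposed, exposed):
    """Fit E[y] = b0 + b1 * quarantined for one binary covariate.

    With one binary covariate the model is saturated in the mean, so the
    Poisson maximum-likelihood estimates are the group means:
    b0 = mean of the non-quarantined group, b1 = difference in means.
    """
    b0 = mean(unexposed)
    b1 = mean(exposed) - b0
    return b0, b1


def dispersion_ratio(counts):
    """Sample variance divided by mean; values well above 1 suggest over-dispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion ratio is undefined when the mean is zero")
    return variance(counts) / m


b0, b1 = identity_link_fit(not_quarantined, quarantined)
sccd = -b1  # secondary cases averted per quarantined index case
print(f"intercept={b0:.2f}, quarantine coefficient={b1:.2f}, SCCD={sccd:.2f}")
print(f"variance/mean, not quarantined: {dispersion_ratio(not_quarantined):.2f}")
```

The identity link is what makes the coefficient directly interpretable as an absolute difference in counts; with the more usual log link, the same coefficient would instead be a log rate ratio.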
Statistical power was a limitation of our analysis and most studies of real outbreaks . As the goal of outbreak management is to minimize events, small samples must be considered. Procedures assuming large samples tend to overstate precision relative to bootstrapping. Other authors in this field have used bootstrap variance methods as well .
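A percentile bootstrap for the difference in mean secondary cases could look like the following sketch. It assumes index cases are resampled with replacement within each quarantine group, which may differ from the resampling scheme actually used in this analysis; the counts, function name, seed and number of resamples are all hypothetical choices for illustration.

```python
import random
from statistics import mean

# Hypothetical secondary-case counts per index case (illustrative only).
not_quarantined = [0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 0]
quarantined = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]


def bootstrap_sccd_ci(unexposed, exposed, n_resamples=2000, alpha=0.05, seed=2003):
    """Percentile bootstrap interval for mean(unexposed) - mean(exposed).

    Index cases are resampled with replacement within each group, the
    difference in means is recomputed for every resample, and the interval
    is read off the sorted differences.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        u = rng.choices(unexposed, k=len(unexposed))
        e = rng.choices(exposed, k=len(exposed))
        diffs.append(mean(u) - mean(e))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


point_estimate = mean(not_quarantined) - mean(quarantined)
ci_lower, ci_upper = bootstrap_sccd_ci(not_quarantined, quarantined)
print(f"SCCD = {point_estimate:.2f}, 95% bootstrap CI ({ci_lower:.2f}, {ci_upper:.2f})")
```

With few index cases and skewed counts, intervals of this kind are often wide, and whether they exclude zero depends entirely on the data at hand.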
Random assignment of individuals to quarantine is not ethical, and in some jurisdictions comprehensive quarantine procedures may eliminate any control arm . In North America and Europe, voluntary quarantine practices are favoured and some degree of non-adherence is inevitable [6, 22], so both non-quarantined and quarantined groups will be observed. However, selection bias related to health status, employment, family structure and other factors may confound the association between quarantine status and observed transmissions. The best possible observational design would permit evaluation of the decision made by public health officials to place individuals under quarantine, and would apply analyses both by intention to treat and taking compliance into account.
Retrospectively, we explored the possibility of identifying all individuals screened by public health staff for potential quarantine and contact tracing, regardless of final disposition. This was not feasible. Practices varied with respect to when a record was initiated (e.g., in one health unit a file might have been opened even with an unfounded inquiry, whereas elsewhere a record was generated only with a confirmed contact link and symptoms). Within Public Health records, we were able to confirm 140 quarantined false positive "cases". These were quarantined contacts who became ill with possible SARS symptoms but were subsequently excluded as SARS cases, and who had at least one identified community contact. Future cost-benefit studies should include information on such groups (e.g., ). Legitimate costs are incurred for and by these false positive cases and their contacts, and these should be taken into account. Because people without the disease cannot spread it, uneven distribution of such individuals across the infection control conditions being compared could bias estimates of impact. As we found, the rate of false positive cases (resulting in no transmission) may have a large influence on the apparent benefit. Several case reports discussed problems with prospective record-keeping, in terms of detailed contact tracing and the implications of time-lags in serological testing (including those never tested), as challenges to both outbreak management and research [4, 35, 41].
Measurement error is also likely with other important information, such as the level of contact and therefore the number of individuals at risk at each contact level. With delayed contact tracing, contact level may even be assigned after secondary infection is known (unblinded), and so could be biased toward closer contact where transmission had already happened, and toward less close contact where the contact remained well.
Papers on the SARS experience have spoken about the importance of data management resources and described core data to be tracked during an outbreak (e.g., ). None made explicit recommendations for statistical evaluation of control measures. Planning to report statistics familiar to other areas of health care evaluation may improve the quality and comparability of data collected.
Our approach demonstrates that existing outbreak data may yield more information to evaluate outbreak control measures than has been reported. Further research presenting quantitative differences in outcomes attributable to measures such as quarantine would be useful in several ways. First, it would add to evidence on cost-effectiveness. Second, it would facilitate further methodological development in this field. Pooled re-analysis of existing outbreak data across several settings would ameliorate statistical power problems and increase the scientific contribution from these important databases.
Challenges in interpretation and communication
Policy-makers rely on estimates of the impact of population-based preventive measures, which should derive from actual experience as well as theoretical forecasting. Evidence also needs to be understood. It has been debated whether the NNT statistic achieves its goal of facilitating decision-making in other health care settings [17–20]. Further thought and discussion are needed as to how meaningful an NNQ statistic might be for decision-making in outbreak planning, relative to other expressions of attributable case reductions, such as the SCCD also proposed here, or other metrics.
No variant on attributable risk difference or NNT can be interpreted without consideration of the absolute costs of not acting, and the harm side of the decision . This discussion must include the severity of the illness, as well as harms of intervention to the individual and society which are difficult to quantify and value-laden . Quarantine includes potential non-health related harms including civil liberties and may include economic and other costs to the individual .
Finally, studies to evaluate control measures for one agent may not be generalizable to other agents. Measures to restrict close contact probably made an important contribution to the control of SARS outbreaks . Evidence of this is accumulating slowly and should be taken into consideration for future outbreaks of SARS or similar droplet-spread agents without significant transmission in the asymptomatic phase. However, the applicability of this evidence to the current experience with pandemic influenza is less certain. Ferguson et al.  present simulation data suggesting community quarantine and isolation may play a role in influenza control, but comment that this would presume such measures were feasible. Transmission patterns for influenza, however, make it less likely that contact tracing and quarantine would be fast enough to avoid transmission, which is greatest in the earliest stage of infection . Under such circumstances, use of a 'severe'  measure such as quarantine is probably not justified, since such efforts are likely to yield little benefit.