Quantitative comparison of SARS-CoV-2 nucleic acid amplification test and antigen testing algorithms: a decision analysis simulation model

Background Antigen tests for SARS-CoV-2 offer advantages over nucleic acid amplification tests (NAATs, such as RT-PCR), including lower cost and rapid return of results, but show reduced sensitivity. Public health organizations recommend different strategies for utilizing NAATs and antigen tests. We sought to create a framework for the quantitative comparison of these recommended strategies based on their expected performance. Methods We utilized a decision analysis approach to simulate the expected outcomes of six testing algorithms analogous to strategies recommended by public health organizations. Each algorithm was simulated 50,000 times in a population of 100,000 persons seeking testing. Primary outcomes were number of missed cases, number of false-positive diagnoses, and total test volumes. Outcome medians and 95% uncertainty ranges (URs) were reported. Results Algorithms that use NAATs to confirm all negative antigen results minimized missed cases but required high NAAT capacity: 92,200 (95% UR: 91,200-93,200) tests (in addition to 100,000 antigen tests) at 10% prevalence. Selective use of NAATs to confirm antigen results when discordant with symptom status (e.g., symptomatic persons with negative antigen results) resulted in the most efficient use of NAATs, with 25 NAATs (95% UR: 13-57) needed to detect one additional case compared to exclusive use of antigen tests. Conclusions No single SARS-CoV-2 testing algorithm is likely to be optimal across settings with different levels of prevalence and for all programmatic priorities. This analysis provides a framework for selecting setting-specific strategies to achieve acceptable balances and trade-offs between programmatic priorities and resource constraints. Supplementary Information The online version contains supplementary material available at 10.1186/s12889-021-12489-8.


Background
The COVID-19 pandemic, caused by the SARS-CoV-2 virus, continues to cause significant morbidity, mortality, and economic hardship worldwide. Diagnostic testing is a cornerstone of COVID-19 response strategies in the U.S. and globally [1,2]. Nucleic acid amplification tests (NAATs, such as real-time reverse transcription-polymerase chain reaction [RT-PCR]) and antigen tests are used to diagnose current infection with SARS-CoV-2 virus. NAATs are sensitive tests for SARS-CoV-2 infection and are often utilized as "gold-standard" assays for the diagnosis of COVID-19 [3]. However, programmatic implementation of NAATs may face challenges, such as long turnaround times, which hamper the ability of testing programs to interrupt transmission [4]. Additionally, NAATs often carry substantial costs associated with reagents, equipment, personnel training and salaries, and quality control. Antigen tests offer several advantages over NAATs for SARS-CoV-2 testing programs, including lower costs, point-of-care administration, and rapid return of results. In particular, use of serial antigen testing may provide benefits over NAATs for controlling outbreaks in some settings, such as congregate living facilities [5]. To expand COVID-19 testing availability, the U.S. government distributed 150 million antigen tests in 2020 [6].
Open Access. *Correspondence: pgx5@cdc.gov. COVID-19 Response Team, Centers for Disease Control and Prevention (CDC), 1600 Clifton Road NE, Atlanta, USA.
Despite the advantages of lower costs and faster turnaround time, antigen tests are generally less sensitive than NAATs for diagnosis of COVID-19, particularly for persons without COVID-19 symptoms [3]. In many cases, it is recommended to confirm the results of antigen tests with the use of more sensitive NAATs [5]. Several strategies for the use of antigen tests and NAATs have been recommended by public health organizations such as the U.S. Centers for Disease Control and Prevention (CDC) [5], the World Health Organization (WHO) [7], and the European Centre for Disease Prevention and Control (ECDC) [8]. Depending on program goals, different strategies may be optimal for maximizing case detection, minimizing lost productivity, or minimizing the use of NAAT testing. To date, there has been no quantitative comparison of the expected performance and testing efficiency of these different strategies at various levels of prevalence. In this analysis, we evaluated the diagnostic performance and testing volumes of SARS-CoV-2 antigen and NAAT programs under six diagnostic algorithms using a simulation-based decision analysis approach.

Population and Model Structure
We evaluated outcomes of a modeled population of 100,000 persons seeking community-based SARS-CoV-2 testing (rather than facility-based serial testing) in settings of 5%, 10%, 15%, and 20% prevalence of SARS-CoV-2 infection. (Numerical results summarized in the text focus on the 10% prevalence level for conciseness.) Prevalence levels can vary substantially over time and geographically [9]; the levels modeled here were selected as representative of the range of percent positivity by RT-PCR in a majority of U.S. states in March 2021 [10]. Model input parameter estimates were derived from antigen test evaluations in the U.S. from September to December 2020 (Table 1). Because these primary data were collected within U.S. populations, this analysis represents expected outcomes in a U.S. setting.
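We evaluated six diagnostic algorithms, which were adapted from current recommendations for SARS-CoV-2 antigen testing in various settings. These algorithms are illustrated in Fig. 1 and can be summarized as follows: (A) NAAT Only: each person is tested for SARS-CoV-2 infection by a NAAT. (B) Antigen (Ag) Only: each person is tested using a single antigen test, the result of which is used as a definitive diagnosis. This algorithm represents settings with access to point-of-care antigen tests, but no access to NAAT. (C) NAAT Confirmation for Sx/Ag-neg & Asx/Ag-pos: each person receives an antigen test, followed by a NAAT for symptomatic (Sx) persons with negative antigen results and asymptomatic (Asx) persons with positive antigen results. (D) NAAT Confirmation for Ag-neg: each person receives an antigen test, followed by a NAAT for those with negative antigen results. (E) Repeat Ag for Ag-neg: each person receives an antigen test, followed by a repeat antigen test for those with negative antigen results. (F) NAAT for Asx & Sx/Ag-pos: asymptomatic persons receive a NAAT; symptomatic persons receive an antigen test followed by a NAAT for those with positive antigen results.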
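The routing logic of the six algorithms can be sketched as a pair of small decision functions. This is an illustrative Python encoding inferred from the algorithm names and descriptions used in this article, not the authors' R implementation; the function names `gets_antigen` and `gets_naat` and the single-letter labels are assumptions made for the example.

```python
from typing import Optional

ALGORITHMS = ("A", "B", "C", "D", "E", "F")  # labels as used in the text


def gets_antigen(algorithm: str, symptomatic: bool) -> bool:
    """Return True if the person receives an initial antigen test."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm}")
    if algorithm == "A":          # NAAT Only: no antigen testing
        return False
    if algorithm == "F":          # asymptomatic persons go straight to NAAT
        return symptomatic
    return True                   # B, C, D, E: everyone gets an antigen test


def gets_naat(algorithm: str, symptomatic: bool,
              antigen_positive: Optional[bool]) -> bool:
    """Return True if the person receives a NAAT.

    antigen_positive is None when no antigen test was performed.
    """
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm}")
    if algorithm == "A":
        return True
    if algorithm in ("B", "E"):   # assumed: no NAAT in Ag Only or Repeat Ag
        return False
    if algorithm == "C":          # NAAT when antigen result and symptoms disagree
        return symptomatic != bool(antigen_positive)
    if algorithm == "D":          # NAAT for every negative antigen result
        return not antigen_positive
    # F: NAAT for asymptomatic persons and for symptomatic antigen-positives
    return (not symptomatic) or bool(antigen_positive)


def gets_repeat_antigen(algorithm: str, antigen_positive: Optional[bool]) -> bool:
    """Return True if a repeat antigen test follows a negative first result."""
    return algorithm == "E" and antigen_positive is False
```

Encoding algorithm (C) as a disagreement check between symptom status and antigen result mirrors the "discordant with symptom status" wording used in the abstract.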

Parameterization and Sampling
Parameters from empirical studies used for model simulations are summarized in Table 1. Antigen test sensitivity and specificity were assumed to be conditional on the binary symptom status of the person evaluated [11]; the prevalence of symptoms was modeled independently for infected and uninfected populations. We made the parsimonious assumption that sensitivity and specificity of NAATs are 100%, as NAATs are typically considered the "gold standard" for diagnosis of SARS-CoV-2 infection. Sensitivity and specificity of repeat antigen testing were assumed to be conditional upon negative initial antigen results. Mathematical definitions for algorithms are provided in Supplementary Table S1. Parameters were sampled from triangular distributions (defined by a modal value and upper/lower bounds, characterized in Table 1) using Latin hypercube sampling to generate 50,000 simulations of each algorithm at each prevalence level. Outcomes are reported as the median and 95% uncertainty range (UR) of simulations for each scenario. URs can be interpreted as the range of outcomes that can be expected for algorithms under the most- and least-optimistic scenarios described by input parameter ranges. All calculations and analyses were performed using R software version 4.0.2 (R Core Team, Vienna, Austria). Code for the algorithm simulations can be found on the CDC collaborative software GitHub site (https://github.com/CDCgov/SARS-CoV-2-NAAT-and-Antigen-Testing-Algorithms).
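The sampling scheme described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' R code: the parameter names and bounds in `HYPOTHETICAL_BOUNDS` are placeholders rather than the values in Table 1, and the nearest-rank quantile used for the uncertainty range is an assumption, since the quantile method is not stated in the text.

```python
import random
from statistics import median


def triangular_ppf(u, low, mode, high):
    """Inverse CDF of a triangular distribution (assumes low < high)."""
    cut = (mode - low) / (high - low)  # CDF value at the mode
    if u < cut:
        return low + ((high - low) * (mode - low) * u) ** 0.5
    return high - ((high - low) * (high - mode) * (1.0 - u)) ** 0.5


def latin_hypercube(n, bounds, rng):
    """Draw n parameter sets; bounds maps name -> (low, mode, high)."""
    columns = {}
    for name, (low, mode, high) in bounds.items():
        # One uniform draw inside each of n equal-probability strata,
        # shuffled so strata are paired at random across parameters.
        u = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(u)
        columns[name] = [triangular_ppf(x, low, mode, high) for x in u]
    return [{name: columns[name][i] for name in columns} for i in range(n)]


def median_and_ur(values, lower=0.025, upper=0.975):
    """Median and nearest-rank 95% uncertainty range of simulated outcomes."""
    s = sorted(values)
    pick = lambda q: s[min(len(s) - 1, int(q * len(s)))]
    return median(s), pick(lower), pick(upper)


# Hypothetical bounds for illustration only (NOT the Table 1 values).
HYPOTHETICAL_BOUNDS = {
    "sens_sx": (0.70, 0.80, 0.90),   # antigen sensitivity, symptomatic
    "sens_asx": (0.35, 0.50, 0.65),  # antigen sensitivity, asymptomatic
}

samples = latin_hypercube(1000, HYPOTHETICAL_BOUNDS, random.Random(2021))
sens_sx_summary = median_and_ur([s["sens_sx"] for s in samples])
```

Stratifying each parameter's unit interval before inverting the triangular CDF is what distinguishes Latin hypercube sampling from simple random sampling: every equal-probability slice of each input distribution is represented exactly once.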

Primary Outcomes
Primary outcomes of interest were numbers of missed cases (persons with SARS-CoV-2 infection who receive a definitive diagnosis of "uninfected" by antigen testing with no recommendation for additional testing), false-positive diagnoses (uninfected persons with a definitive diagnosis of "infected" by antigen testing with no recommendation for additional testing), and numbers of antigen tests and NAATs performed per 100,000 persons evaluated. Secondary outcomes (including person-time of lost productivity) and sensitivity analyses are available in the Supplementary Materials. Positive and negative predictive values of each algorithm are depicted in Supplementary Figure S1.
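To make these outcome definitions concrete, the following Python sketch computes expected missed cases, false-positive diagnoses, and test volumes for a single parameter set under three of the algorithms: (B) Ag Only, (C) NAAT Confirmation for Sx/Ag-neg & Asx/Ag-pos, and (D) NAAT Confirmation for Ag-neg. It assumes perfect NAATs, as in the model, and uses hypothetical parameter values (`EXAMPLE_PARAMS`) that are not taken from Table 1; the function name and dictionary keys are likewise assumptions for illustration.

```python
def expected_outcomes(algorithm, n, prevalence, p):
    """Expected primary outcomes for algorithms B, C, or D (perfect NAAT)."""
    infected = n * prevalence
    uninfected = n - infected

    # Stratify by symptom status, modeled separately by infection status.
    inf_sx = infected * p["sx_inf"]
    inf_asx = infected - inf_sx
    un_sx = uninfected * p["sx_uninf"]
    un_asx = uninfected - un_sx

    # Antigen results within each stratum.
    fn_sx = inf_sx * (1 - p["sens_sx"])
    fn_asx = inf_asx * (1 - p["sens_asx"])
    tp_asx = inf_asx - fn_asx
    fp_sx = un_sx * (1 - p["spec_sx"])
    fp_asx = un_asx * (1 - p["spec_asx"])
    tn_sx = un_sx - fp_sx
    tn_asx = un_asx - fp_asx

    if algorithm == "B":    # antigen result is definitive for everyone
        missed, false_pos, naats = fn_sx + fn_asx, fp_sx + fp_asx, 0.0
    elif algorithm == "C":  # NAAT for Sx/Ag-neg and Asx/Ag-pos only
        missed = fn_asx     # asymptomatic false negatives are not retested
        false_pos = fp_sx   # symptomatic false positives are not retested
        naats = (fn_sx + tn_sx) + (tp_asx + fp_asx)
    elif algorithm == "D":  # NAAT for every antigen-negative person
        missed = 0.0
        false_pos = fp_sx + fp_asx
        naats = fn_sx + fn_asx + tn_sx + tn_asx
    else:
        raise ValueError("this sketch covers algorithms B, C, and D only")

    return {"missed": missed, "false_pos": false_pos,
            "naats": naats, "antigen_tests": float(n)}


# Hypothetical parameter values for illustration only (NOT from Table 1).
EXAMPLE_PARAMS = {
    "sx_inf": 0.5, "sx_uninf": 0.2,      # symptom prevalence
    "sens_sx": 0.8, "sens_asx": 0.5,     # antigen sensitivity
    "spec_sx": 0.99, "spec_asx": 0.99,   # antigen specificity
}

results = {alg: expected_outcomes(alg, 100_000, 0.10, EXAMPLE_PARAMS)
           for alg in ("B", "C", "D")}
```

In a full simulation, a function of this kind would be evaluated once per sampled parameter set, and the resulting distributions summarized by their median and 95% UR.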

Incremental Outcomes and Trade-Off Analysis
To characterize the potential consequences of adopting different testing algorithms in settings of varying NAAT capacity, we calculated, relative to the (A) NAAT Only algorithm, the incremental number of missed cases and the number of NAATs saved (how many fewer NAATs were needed) under each algorithm. These incremental measures, expressed as a quotient representing the number of NAATs saved for each additional missed case compared to the (A) NAAT Only algorithm, indicate the NAAT savings achievable under different algorithms and the consequent trade-off in additional missed cases.
A similar incremental outcome was evaluated by comparing different testing algorithms to the (B) Ag Only algorithm and calculating the number of additional NAATs needed and consequent trade-off of additional cases detected. These measures are also presented as a quotient representing the number of additional NAATs needed for each additional case detected.
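Both quotients can be written as short functions. The Python sketch below is illustrative only: the function names, dictionary keys, and example counts are hypothetical and do not come from the study's results. A zero denominator is mapped to positive infinity, matching the interpretation of an algorithm that saves NAATs without missing any additional cases.

```python
import math


def naats_saved_per_missed_case(outcome, naat_only):
    """NAATs saved for each additional missed case, relative to (A) NAAT Only."""
    saved = naat_only["naats"] - outcome["naats"]
    extra_missed = outcome["missed"] - naat_only["missed"]
    if extra_missed == 0:
        # NAATs saved at no cost in missed cases: ratio is unbounded.
        return math.inf if saved > 0 else math.nan
    return saved / extra_missed


def naats_needed_per_case_detected(outcome, ag_only):
    """Additional NAATs for each additional case detected, relative to (B) Ag Only."""
    extra_naats = outcome["naats"] - ag_only["naats"]
    extra_detected = ag_only["missed"] - outcome["missed"]
    if extra_detected == 0:
        return math.inf if extra_naats > 0 else math.nan
    return extra_naats / extra_detected


# Hypothetical expected counts per 100,000 persons (illustration only).
naat_only = {"naats": 100_000, "missed": 0}
ag_only = {"naats": 0, "missed": 3_500}
selective = {"naats": 22_040, "missed": 2_500}      # e.g. an algorithm like (C)
confirm_neg = {"naats": 92_600, "missed": 0}        # e.g. an algorithm like (D)

saved_ratio = naats_saved_per_missed_case(selective, naat_only)
needed_ratio = naats_needed_per_case_detected(selective, ag_only)
unbounded = naats_saved_per_missed_case(confirm_neg, naat_only)
```

Higher values of the first quotient are more favorable (more NAATs saved per case missed), whereas lower values of the second are more favorable (fewer NAATs spent per additional case found).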

Primary Outcomes
Primary outcomes for each algorithm are presented in Fig. 2. Total testing volume remained constant for the (A) NAAT Only and (B) Ag Only algorithms, at 100,000 NAATs or antigen tests, respectively. Antigen testing also remained constant at 100,000 tests for (C) NAAT

Incremental Outcomes and Trade-Offs
Incremental outcomes of simulations under algorithms compared to corresponding simulations under the (A) NAAT Only algorithm are depicted in Fig. 3A (plotted as additional missed cases vs. NAATs saved, compared to (A) NAAT Only) at a level of 10% prevalence. The quotient of these measures is defined as the ratio of NAATs saved per additional missed case in Fig. 3B. The (D) NAAT Confirmation for Ag-neg algorithm had a ratio of positive infinity, resulting from zero additional missed cases (and a small number of NAATs saved). The (C) NAAT Confirmation for Sx/Ag-neg & Asx/Ag-pos algorithm had the most favorable ratio among remaining algorithms: at 10% prevalence, a median of 46 NAATs were saved per additional missed case (95% UR: 29-83) compared to (A) NAAT Only.
Incremental outcomes compared to the (B) Ag Only algorithm are depicted in Fig. 3C (plotted as additional cases detected vs. additional NAATs needed) at 10% prevalence. For both incremental outcomes, the order of algorithm favorability remained constant across prevalence levels; however, the absolute differences between algorithms shrank as prevalence increased. A summary and synthesis of algorithms to achieve a key programmatic priority, balancing missed cases and NAAT volume, is presented in Table 2; similar summaries for other programmatic priorities are presented in Supplementary Table S3.

Discussion
In this analysis, we utilized a decision analysis approach to provide a quantitative comparison of different strategies for the use of antigen tests and NAATs in SARS-CoV-2 testing programs. The six algorithms evaluated reflect differing priorities for testing in populations based on resources, SARS-CoV-2 prevalence, and tolerance for missed cases and false positives. Multiple reports have found that antigen tests are less sensitive than NAATs [12][13][14][15][16][17][18] and will result in some antigen false-negative results among cases. Our analysis provides a quantitative framework for public health practitioners who are planning or evaluating community-based testing programs. A reference guide applying the results of our analyses to programmatic decisions, along with key priorities and indicators, is included in Table 2 and Supplementary Table S3. For programs intended to minimize missed cases, algorithms (A) NAAT Only, (C) NAAT Confirmation for Sx/Ag-neg & Asx/Ag-pos, and (D) NAAT Confirmation for Ag-neg are most preferable; selecting between these algorithms depends on tolerance for missed cases and available NAAT capacity. For programs intended to minimize NAAT volume, algorithms (B) Ag Only, (C) NAAT Confirmation for Sx/Ag-neg & Asx/Ag-pos, and (E) Repeat Ag for Ag-neg are most preferable; selecting between these algorithms depends on tolerance for missed cases and available NAAT and antigen test capacity. Predictive values (Supplementary Figure S1) can also provide key indicators of algorithm performance, particularly for individual and clinical decisions; however, programs should interpret predictive values with caution, as algorithms with high predictive values may still result in unwanted outcomes (e.g., large numbers of missed cases) at the population level.
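The caution about predictive values can be illustrated with a short calculation. The Python sketch below uses hypothetical counts for an antigen-only program in 100,000 persons at 10% prevalence; the counts and the function name are assumptions for illustration and are not results from this analysis.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from counts of true/false positives and negatives."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical antigen-only outcomes per 100,000 persons (illustration only):
# 10,000 infected (6,500 detected, 3,500 missed) and 90,000 uninfected.
tp, fp, tn, fn = 6_500, 900, 89_100, 3_500

ppv, npv = predictive_values(tp, fp, tn, fn)
share_of_cases_missed = fn / (tp + fn)
```

Under these assumed counts the negative predictive value exceeds 96%, yet 35% of infections go undetected: a negative result is usually correct for the individual because most people tested are uninfected, while the program as a whole still misses a large share of cases.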
Each algorithm evaluated in this analysis is rooted in strategies currently recommended by public health organizations [except for (A) NAAT Only, an idealized baseline]. Each strategy recommended is articulated with important nuances; algorithms analyzed here are intended to be analogous to, but not exact reproductions of these strategies. Guidance from WHO and ECDC distinguishes strategies for antigen testing in communities with low and high prevalence of SARS-CoV-2 infection. In high prevalence settings, WHO recommends considering repeat antigen testing for those with negative results [7], analogous to (E) Repeat Ag for Ag-neg; ECDC indicates that negative tests should be confirmed with RT-PCR [8], analogous to (D) NAAT Confirmation for Ag-neg. In low prevalence settings following negative antigen results, WHO recommends clinical evaluation for suspect cases in lieu of confirmatory NAATs [7], analogous to (B) Ag Only; ECDC does not recommend antigen testing for asymptomatic persons and recommends confirmatory RT-PCR for symptomatic persons with positive antigen results [8], analogous to (F) NAAT for Asx & Sx/Ag-pos. CDC interim guidance recommends a unified strategy for testing across settings analogous to (C) NAAT Confirmation for Sx/Ag-neg & Asx/Ag-pos [5].
This decision analysis approach necessarily simplifies complex factors that may impact SARS-CoV-2 testing programs, and therefore results may not be representative of all testing programs. Our analysis does not account for individual-level variations (except symptom status) in test performance, such as patient age or sex. (However, empirical data indicate that these factors are not associated with significant differences in test performance [19].) This analysis is intended to be representative of community-based testing rather than facility-based serial testing (where each person is tested on a recurring basis). Our results would therefore overestimate the numbers of missed cases and testing volumes in serial testing programs. This analysis also does not evaluate dynamic transmission-related outcomes intrinsic to the intervention (as a consequence of detected/missed cases) which have been evaluated previously [20,21]. Finally, this decision analysis approach is used to estimate expected outcomes under a theoretical perfect implementation of each algorithm to highlight the fundamental distinctions between testing algorithms (independent of implementation challenges).
Table 2 Summary and synthesis of algorithms for balancing missed cases and NAAT volume
(E) eliminates NAAT entirely but substantially increases missed cases (23% compared to (A)). This will save between 87 (at 5% prevalence) and 22 (at 20% prevalence) NAATs for each additional case missed.
At low prevalence, cases are rare and many NAATs are needed for each case detected in (C), (D) and (E).
As prevalence increases, cases increase more than NAAT volume increases and fewer NAATs are needed for each case detected in (C), (D) and (E). Absolute numbers of missed cases increase more and (E) than (C) (and remain 0 for (A) and (D)).
As prevalence increases, the efficiency of (C), (D), and (E) becomes more favorable, while the negative consequences of (C) and (E) become less favorable.
As a decision analysis model, our approach allows for a standardized comparison of the performance of all algorithms and, while specific settings or populations may differ from the one modeled, our conclusions about the relative benefits of each algorithm are portable for programmatic decisions across settings.
The results of our analysis are dependent on the accuracy and generalizability of the input parameter estimates used. Several reports have described the performance characteristics of multiple antigen tests, with comparable results across reports [12][13][14][15][16][17][18]. Programs implementing antigen tests with performance characteristics substantively different from the distributions described in Table 1 are likely to have different numbers of missed cases, depending on the assay's sensitivity. (This may include the influence of vaccination, as there is limited current evidence of the performance of antigen tests among vaccinated individuals.) Additionally, as variants of SARS-CoV-2 virus continue to emerge, the sensitivity of antigen tests for detecting prevalent variants may have a substantial impact on the performance of algorithms implementing antigen tests; however, early reports have found antigen tests perform similarly across multiple different SARS-CoV-2 variants [22]. By contrast, only one report to date has evaluated the performance of immediate repeat antigen testing [13], and this may not be representative of settings where immediate repeat antigen testing performs with higher sensitivity. Importantly, this parameterization does not reflect the sensitivity of delayed repeat antigen testing (e.g., as recommended by ECDC for confirmation of negative results after 2-4 days when RT-PCR capacity is limited [8]). Finally, we adopted a simplifying assumption that NAATs have 100% sensitivity and specificity, as NAATs are typically considered the "gold standard" for diagnosis of SARS-CoV-2 infection. However, NAATs may have lower sensitivity early in the course of infection [23] and remain positive during a patient's post-infectious recovery [24]. Therefore, in our approach, the prevalence among persons seeking testing is representative of currently and recently infected persons detectable by NAATs at the time of testing, and some "missed cases" in this approach may represent post-infectious persons still detectable by NAAT.

Conclusions
Our results provide the first quantitative comparison of the expected performance of different strategies for community-based SARS-CoV-2 testing programs recommended by public health organizations. None of the algorithms evaluated in this analysis is likely to be optimal in all settings and for all programmatic priorities, and this analysis provides a framework for selecting setting-specific strategies to achieve an acceptable balance and trade-offs between programmatic priorities and constraints. As global responses to the COVID-19 pandemic continue to evolve and adapt, our results contribute to the body of evidence informing SARS-CoV-2 testing strategies.