Population-level counterfactual trend modelling to examine the relationship between smoking prevalence and e-cigarette use among US adults

Background: Studies have suggested that some US adult smokers are switching away from smoking to e-cigarette use. Nationally representative data may reflect such changes in smoking by assessing trends in cigarette and e-cigarette prevalence. The objective of this study is to assess whether and how much smoking prevalence differs from expectations since the introduction of e-cigarettes.

Methods: Annual estimates of smoking and e-cigarette use in US adults varying in age, race/ethnicity, and sex were derived from the National Health Interview Survey. Regression models were fitted to smoking prevalence trends before e-cigarettes became widely available (1990–2009) and trends were extrapolated to 2019 (counterfactual model). Smoking prevalence discrepancies, defined as the difference between projected and actual smoking prevalence from 2010 to 2019, were calculated to evaluate whether actual smoking prevalence differed from that expected from counterfactual projections. The correlation between smoking discrepancies and e-cigarette use prevalence was investigated.

Results: Actual overall smoking prevalence from 2010 to 2019 was significantly lower than counterfactual predictions. The discrepancy was significantly larger as e-cigarette use prevalence increased. In subgroup analyses, discrepancies in smoking prevalence were more pronounced for cohorts with greater e-cigarette use prevalence, namely adults ages 18–34, adult males, and non-Hispanic White adults.

Conclusion: Population-level data suggest that smoking prevalence has dropped faster than expected, in ways correlated with increased e-cigarette use. This population movement has potential public health implications.


Background
The net effect of e-cigarette use on cigarette smoking at the population level is not well quantified. E-cigarette use could affect cigarette smoking in two main opposing ways. First, e-cigarettes could act as a catalyst to smoking among non-smokers (former and never) who would not have initiated or re-initiated cigarette smoking had it not been for e-cigarettes; such 'gateway' effects have been inferred among adolescents and young adults [1,2]. Second, e-cigarettes could displace smoking via substitution, as smokers 'switch' from cigarettes to e-cigarettes, and non-smokers are diverted from smoking initiation. The combination of these processes determines the population-level impact of e-cigarette use on smoking prevalence.
Concerning the gateway, longitudinal studies have reported significant associations between e-cigarette use among non-smoking adolescents and subsequent smoking initiation [3]. However, Lee et al. [4] and Chan et al. [5] have argued that this effect is not causal, but rather due to common liabilities, that is, shared risk factors for both vaping and smoking, such as parental smoking and delinquent behavior, which predispose adolescents to both forms of nicotine use and which are not adequately controlled in such analyses [6-8]. Conversely, it has been hypothesized that e-cigarette use among non-smoking adolescents may prevent those who otherwise would have smoked cigarettes from doing so, as their e-cigarette use may replace cigarette smoking, rather than lead to cigarette smoking. This so-called 'diversion' effect has been observed in multiple studies [9-13].
Concerning switching, randomized trials have indicated potential for e-cigarettes to help adult smokers switch away from combustible cigarettes [14,15], but some of these studies have been criticized methodologically [16]. Some cohort studies of individuals purchasing particular ENDS products in real-world settings have demonstrated high switching rates [17], with reduced cigarette consumption among dual users [18], and minimal smoking initiation and relapse among baseline never and former smokers using e-cigarettes [19,20]. However, other cohort studies have come to opposite conclusions, suggesting that e-cigarette use does not prevent relapse to cigarette smoking [21,22]. Using different analytic techniques, economic studies examining cross-elasticities between cigarettes and e-cigarettes have suggested these products are economic substitutes [23-25], which would suggest that e-cigarette use would reduce the likelihood of smoking. Agent-based population modeling also suggests that the introduction of e-cigarettes would be expected to reduce smoking prevalence [26].
Another useful approach to determining the overall impact of e-cigarette use on smoking prevalence (i.e., the combination of gateway and substitution effects) at the population level is to model expected trends in smoking prevalence, and then assess whether the introduction of e-cigarettes was associated with a net deviation from the expected smoking prevalence, either an increase (gateway) or decrease (substitution). Such modeling studies have generally found that the introduction of e-cigarettes was associated with more rapid declines in smoking prevalence [9,26-29]. The present study uses this approach to assess whether and how much the introduction of e-cigarettes in the US may be correlated with declining smoking prevalence among adults. To test whether declining smoking prevalence is correlated with increasing e-cigarette use among adults, analyses examine subpopulations in which this correlation is especially likely to be evident. If e-cigarette use is correlated with smoking prevalence, the correlation should be greater in populations with higher e-cigarette prevalence. Use of electronic cigarettes by US adults is particularly concentrated among cigarette-smoking younger adults and males [30-32]. Thus, discrepancies in expected versus actual smoking prevalence are examined in age, race/ethnicity, and sex cohorts whose e-cigarette use prevalence differs.

Sample
Annual smoking prevalence estimates for US adults were derived from 29 waves of the National Health Interview Survey (NHIS), an annual, cross-sectional, population-representative health survey with a geographically clustered sampling design administered by the US Centers for Disease Control and Prevention's National Center for Health Statistics [33]. NHIS provides trend data for cigarette smoking dating back decades, and has included data on e-cigarette use starting in 2014, thus providing an appropriate source of nationally representative data for these analyses. NHIS interviewed 17,317-43,732 individuals on their tobacco use behaviors each year between 1990 and 2019. Data from the 2020 NHIS, while available, were not included in the present study due to serious potential confounding by the COVID-19 pandemic, both with respect to data collection procedures (shifting from in-person to all-telephone interviews) and COVID-related impacts on cigarette smoking prevalence, which are beyond the scope of this study.
Current smokers were defined as adults who had smoked at least 100 lifetime cigarettes and who 'now' smoked cigarettes 'every day' or 'some days' [34].
Similarly, current e-cigarette users were defined from 2014 (the first year that e-cigarette use was assessed in NHIS) to 2019 as those respondents who now used e-cigarettes every day or some days. Cumulative lifetime measures of e-cigarette use were not surveyed in NHIS; therefore, 'established' use could not be defined. In 2014, e-cigarettes were defined as "electronic cigarettes, often called e-cigarettes" without explicit reference to nicotine. From 2015 to 2018, e-cigarettes were defined as "vape-pens, hookah-pens, e-hookahs, or e-vaporizers… usually contain[ing] liquid nicotine." In 2019, e-cigarettes were defined as "Electronic cigarettes (e-cigarettes)… include electronic hookahs (e-hookahs), vape pens, e-cigars, and others… usually contain[ing] nicotine… These questions concern electronic vaping products for nicotine use. The use of electronic vaping products for marijuana use is not included in these questions." For full definitions, see Additional File 1.
The prevalence of current smoking and current e-cigarette use was determined for three age cohorts (18-34 years, 35-54 years, and 55+ years, following Axelsson et al. [35]), three race/ethnicity cohorts (Hispanic, non-Hispanic (NH) White, and NH Black), and two sex cohorts (female and male). These cohorts were selected to maximize the sample sizes used in each prevalence estimate. Analyses were not repeated for the other race/ethnicity categories in NHIS because their very low sample sizes produced coefficients of variation (relative standard errors) >30%; excluding such estimates is standard practice with NHIS data [34].

Analyses
A cut-off year was determined from the NHIS data using the knee (or inflection point) identification algorithm 'Kneedle' published by Satopaa et al. [36]. This algorithm identified 2010 as the inflection point in NHIS cigarette smoking prevalence data. The year 2010 was also used as the cut-off year in cigarette and e-cigarette use trend modelling studies by Wagner and Clifton [29], Foxon and Selya [9], and Selya and Foxon [11]. Indeed, data from objective financial analyses by Wells Fargo and Agora Financial suggest minimal e-cigarette market presence prior to 2010 compared to after [37].
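The idea behind knee detection can be illustrated with a minimal sketch. This is not the Kneedle implementation used in the study; it is a simplified, Kneedle-inspired heuristic that normalises both axes and picks the point farthest from the straight chord joining the first and last observations. The function name `find_knee` and the prevalence series are hypothetical and chosen only for illustration.

```python
def find_knee(xs, ys):
    """Return the x whose point lies farthest from the chord joining the endpoints."""
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    # Normalise both axes to [0, 1] so the result does not depend on units.
    xn = [(x - x_lo) / (x_hi - x_lo) for x in xs]
    yn = [(y - y_lo) / (y_hi - y_lo) for y in ys]

    def gap(i):
        # Vertical distance between the curve and the straight chord at point i.
        chord = yn[0] + (yn[-1] - yn[0]) * (xn[i] - xn[0]) / (xn[-1] - xn[0])
        return abs(yn[i] - chord)

    return xs[max(range(len(xs)), key=gap)]


# Hypothetical prevalence series (percent), NOT NHIS data:
# a slow decline up to 2010 followed by a faster decline afterwards.
years = list(range(1990, 2020))
prevalence = [25.0 - 0.2 * (y - 1990) if y <= 2010 else 21.0 - 0.7 * (y - 2010)
              for y in years]

knee_year = find_knee(years, prevalence)
print(knee_year)
```

On this synthetic series the change in slope is placed at 2010 by construction, so the heuristic recovers that year; real survey data are noisier, which is why a smoothing-based method such as Kneedle is preferable in practice.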
Linear weighted least squares regression models relating smoking prevalence to year were fitted from 1990 to 2009 (before the cut-off), and these were used to generate best-fit estimates for 2010-2019 (after the cut-off) to model the counterfactual: i.e., what would have been expected to happen to smoking prevalence in the US in each year if e-cigarettes had not been introduced in 2010. These projections were compared to the actual NHIS smoking prevalence estimates for 2010-2019. The difference between these two, which we will refer to as the 'discrepancy' in cigarette smoking prevalence, can provide information on the effect of e-cigarettes on smoking prevalence among US adults. The discrepancy is defined as $d = y_p - y_a$, where $y_p$ is the projected smoking prevalence and $y_a$ is the actual NHIS smoking prevalence, such that positive values of $d$ mean that actual smoking prevalence is lower than expected from projections.
Linear weighted least squares regression models were fitted to NHIS adult e-cigarette use prevalence from 2014 (the first year e-cigarette use was assessed) to 2019 (with e-cigarette use prevalence defined as zero in 2010). These models were used to estimate e-cigarette use prevalence from 2010 to 2013.
The correlation between e-cigarette use prevalence from 2010 to 2019 (model-based e-cigarette use estimates for 2010-2013; actual NHIS e-cigarette use prevalence estimates for 2014-2019) and cigarette smoking discrepancies from 2010 to 2019 was then investigated by calculating Pearson correlation coefficients with two-tailed p-values (alpha = 0.05). These analyses were repeated for the three age cohorts (18-34, 35-54, 55+), three race/ethnicity cohorts (Hispanic, NH White, NH Black), and two sex cohorts (female, male). Goodness of fit was evaluated with the root mean square error (RMSE), which is appropriate for forecasts on means [38].
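The counterfactual projection and the discrepancy can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the weights are assumed to be inverse variances (1/SE²), which the text does not specify, and all prevalence values and standard errors are hypothetical. The helper name `wls_line` is likewise introduced only for this example.

```python
def wls_line(xs, ys, ws):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical pre-cut-off data (percent), NOT NHIS estimates.
pre_years = list(range(1990, 2010))
pre_prev = [25.0 - 0.2 * (y - 1990) for y in pre_years]
pre_se = [0.4] * len(pre_years)                 # hypothetical standard errors
weights = [1.0 / se ** 2 for se in pre_se]      # assumption: inverse-variance weights

a, b = wls_line(pre_years, pre_prev, weights)

# Hypothetical post-cut-off "actual" prevalence for a few years.
actual = {2010: 20.5, 2015: 17.0, 2019: 14.7}
projected = {year: a + b * year for year in actual}            # y_p
discrepancy = {year: projected[year] - actual[year] for year in actual}  # d = y_p - y_a

for year in sorted(discrepancy):
    print(year, round(projected[year], 2), round(discrepancy[year], 2))
```

A positive `discrepancy` value means actual prevalence fell below the extrapolated pre-2010 trend, matching the sign convention in the definition of d.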
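As a rough check on how a Pearson coefficient relates to significance with only ten annual points, the sketch below converts r to a t-statistic, t = r·sqrt((n − 2)/(1 − r²)), and compares it with the two-tailed critical value for 8 degrees of freedom (2.306 at alpha = 0.05). The r values are those reported in the Results; the function names are introduced here for illustration, and the study itself reports p-values computed in Python with SciPy rather than this shortcut.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for testing r = 0 with n paired observations (n - 2 df)."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))


T_CRIT_DF8 = 2.306   # two-tailed critical value, alpha = 0.05, 8 degrees of freedom
N_YEARS = 10         # annual points, 2010-2019

# Correlations reported in the Results section of this paper.
reported = [("all adults", 0.803), ("females", 0.634), ("ages 55+", 0.115)]
for label, r in reported:
    t = t_statistic(r, N_YEARS)
    print(label, round(t, 2), abs(t) > T_CRIT_DF8)
```

With n = 10, a correlation needs to exceed roughly 0.63 in magnitude to reach significance, which is why the reported r = 0.634 for females sits right at the p = 0.05 boundary.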
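NHIS prevalence estimates were calculated in SAS version 9.4 using PROC SURVEY to account for complex survey design. All other analyses were performed in Python version 3.7.6 with the packages NumPy version 1.18.1, SciPy version 1.4.1, Uncertainties version 3.1.5, Kneed version 0.7.0, and Matplotlib version 3.1.3.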

Sensitivity tests
In 2019, NHIS underwent a questionnaire redesign which, among other changes, shortened the survey length and changed the e-cigarette question wording (described above) [39]. As a sensitivity test, correlations were re-run excluding the 2019 point estimates (the last time point analyzed) to account for possible variation in findings.
As a second sensitivity test, correlations between smoking discrepancies and e-cigarette use prevalence were recalculated excluding the regression-estimated e-cigarette prevalence estimates. Another sensitivity test examined the effect of alternative cut-off years centered around the Kneedle-identified cut-off year of 2010. Finally, a sensitivity test used an exponential decay function instead of a linear function in the regression analyses, following Foxon and Selya [9] (this form allows the change in users across time to depend on the number of users at a given time and is consistent with the hardening hypothesis [40]).
In addition, to consider the effect of major, distinct national population interventions, the impacts of the Family Smoking Prevention and Tobacco Control Act (FSPTCA) and the CDC's 'Tips From Former Smokers' ('Tips®') campaign were considered. This was done by taking quantitative estimates for the association between these two interventions and US adult smoking prevalence from the published literature [41,42] and comparing these estimates to the smoking discrepancy or prevalence observed in the present study. If the literature estimates for the decrease in smoking prevalence expected due to Tips® and the FSPTCA do not account for the smoking discrepancy or prevalence observed in the present study, this suggests these interventions alone do not explain the observed smoking discrepancy, which allows for a possible association between e-cigarette use and the observed smoking discrepancy (among other factors).
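An exponential decay trend, y(t) = A·exp(−k·(t − t₀)), corresponds to dy/dt = −k·y, so the yearly change is proportional to the current prevalence. One common way to fit such a curve is ordinary least squares on ln(y); the sketch below takes that route as an assumption, since the text does not state how the non-linear fit was performed or weighted. The data are hypothetical and the name `fit_exponential` is introduced only for this example.

```python
import math


def fit_exponential(ts, ys):
    """Fit y = A * exp(-k * (t - ts[0])) by least squares on ln(y); returns (A, k)."""
    t0 = ts[0]
    xs = [t - t0 for t in ts]
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mx, ml = sum(xs) / n, sum(logs) / n
    slope = (sum((x - mx) * (l - ml) for x, l in zip(xs, logs))
             / sum((x - mx) ** 2 for x in xs))
    intercept = ml - slope * mx
    return math.exp(intercept), -slope


# Hypothetical pre-cut-off prevalence (percent) decaying at 1% per year; NOT NHIS data.
years = list(range(1990, 2010))
prev = [25.0 * math.exp(-0.01 * (y - 1990)) for y in years]

A, k = fit_exponential(years, prev)
projected_2019 = A * math.exp(-k * (2019 - years[0]))
print(round(A, 3), round(k, 4), round(projected_2019, 2))
```

Over a short window with a slow decline, an exponential and a straight line are nearly indistinguishable, which is in keeping with the similar RMSE values the paper reports for the linear and non-linear fits.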

Main results
Table 1 shows the combined NHIS sample distribution. The total sample consists of approximately 870,000 observations (N = 870,652) and is majority NH White, majority female, majority never-smoking and never-e-cigarette-using, and approximately evenly distributed by age category.
Root mean square errors of all models were consistent and small relative to the y-axis scale, ranging from 0.518 to 1.115, at least one order of magnitude smaller than cigarette smoking prevalence (see Additional File 1, Supplementary Table 1).
Figure 1 shows the results of counterfactual trend modelling among all adults. Smoking prevalence declined steadily from 1990 to 2010. This decline apparently accelerated in the post-2010 period, where actual smoking prevalence was as much as approximately 3.4 ± 0.5 (SE) percentage points lower than projected. This smoking discrepancy coincided with a rise in e-cigarette use prevalence to approximately 4.5 ± 0.2% of adults in 2019. The correlation between smoking discrepancy and e-cigarette use prevalence from 2010 to 2019 was high and statistically significant (Pearson r = 0.803, p = 0.005).
Figure 2 shows the results by age group. Smoking prevalence declined steadily from 1990 to 2010 among 18-34 and 35-54 year olds, while smoking prevalence was more stable among those aged 55+. The discrepancy between projected and actual smoking prevalence was most pronounced among 18-34 year olds, with discrepancies up to 8.0 ± 0.9 percentage points. This age cohort also had the highest e-cigarette use prevalence, with approximately 8.2 ± 0.4% of 18-34 year olds being current e-cigarette users in 2019. Smoking discrepancies were approximately half as pronounced among 35-54 year olds as they were among 18-34 year olds, but were still substantial (up to 3.5 ± 0.… percentage points). E-cigarette use prevalence among 35-54 year olds was approximately …
Figure 3 shows the results by sex cohort. Similar to the age cohort results, smoking discrepancies were most pronounced for the cohort with the highest e-cigarette use prevalence. Among males, smoking discrepancies up to 4.2 ± 0.6 percentage points were observed, while among females, smoking discrepancies up to 2.5 ± 0.6 percentage points were observed. E-cigarette use prevalence meanwhile was approximately 5.5 ± 0.3% among males and 3.5 ± 0.2% among females in 2019. Correlation between smoking discrepancy and e-cigarette use was stronger among males (r = 0.869, p = 0.001) than among females (r = 0.634, p = 0.05).
Fig. 4 shows the modelling results by race/ethnicity cohort. From 1990 to 2019, smoking prevalence declined consistently among all three race/ethnicity cohorts. Smoking prevalence discrepancies up to 4.2 ± 0.6 percentage points were observed among the NH White cohort, whereas discrepancies were less apparent among the NH Black and Hispanic cohorts (up to 1.9 ± 1.2 and 2.0 ± 0.8 percentage points respectively). E-cigarette use prevalence in 2019 was highest among NH White individuals (5.1 ± 0.2%) compared to NH Black (3.4 ± 0.4%) and Hispanic (2.8 ± 0.3%) individuals. Finally, correlation between e-cigarette use prevalence and cigarette smoking discrepancy was greatest for the NH White cohort (r = 0.804, p = 0.005) followed by the NH …

Sensitivity test results
Results from the main analyses were largely robust to the five sensitivity tests described in the Methods, namely, (1) excluding the 2019 point estimates due to NHIS survey changes; (2) excluding the regression-estimated  2).
For the 55+ cohort, the main correlation result differed more substantially from the sensitivity test results (main result: r = 0.115, test range: r = -0.011 to 0.452); however, correlations were consistently low (below r = 0.5) and non-significant in all analyses for this cohort.
For the 35-54 cohort, while the main correlation result was high (r = 0.614), the correlations across sensitivity tests ranged from low (r = 0.386) to high (r = 0.692). The same is true for the Hispanic cohort, for which the sensitivity test correlations also ranged widely (r = 0.175-0.728) but reached significance only in sensitivity tests and not in the main analysis.
Root mean square errors for non-linear models were the same as those for the linear models to between one and three significant figures (see Additional File 1, Supplementary Table 1), suggesting little difference between the linear and non-linear fits for these data.

Other considerations
The effects of the FSPTCA and the CDC's 'Tips®' campaign, which represent major, distinct national population interventions, were considered by comparing quantitative estimates from the published literature for the association between these two interventions and smoking prevalence with the smoking discrepancy or prevalence observed in the present study.
The association between the Tips® campaign and smoking prevalence is quantified in the literature by a CDC study which estimated approximately one million Tips® campaign-associated sustained quits between 2012 and 2018 [41]. This equates to a 0.4 percentage point decrease in smoking prevalence (because one million adults represent approximately 0.4% of the US adult population [43]). By comparison, in the present study, a 3.3 ± 0.5 percentage point smoking discrepancy was observed among all adults in 2018. Because this 3.3 ± 0.5 percentage point discrepancy is much greater than the 0.4 percentage point decrease in smoking prevalence associated with Tips®, Tips® does not explain the smoking discrepancy observed.
The association between the FSPTCA and smoking prevalence is quantified in the literature by a study which estimated a 0.6% reduction in US adult smoking prevalence each quarter following implementation of the FSPTCA in June 2009 [42]. Cumulatively, this would result in a 24% relative reduction in adult smoking prevalence from mid-2009 to mid-2019 (0.6% times 40 quarters). NHIS smoking prevalence among all adults in 2009 was approximately 20.6 ± 0.4% (present study). Applying the 24% reduction associated with the FSPTCA to the 2009 NHIS smoking prevalence gives a predicted adult smoking prevalence of approximately 15.7% in 2019 attributable to the FSPTCA. By comparison, the actual NHIS smoking prevalence in 2019 was approximately 14.0% (95% CI: 13.5-14.5%). Because the 15.7% prevalence predicted from FSPTCA effects lies above the upper bound of this confidence interval, FSPTCA effects alone do not explain the smoking prevalence observed.
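The Tips® comparison reduces to a short piece of arithmetic, sketched below. The adult population figure of 250 million is an assumption back-calculated from the statement that one million adults is about 0.4% of US adults; the approximate 95% lower bound uses a normal approximation (1.96 × SE), which is also an assumption rather than something the paper reports.

```python
TIPS_SUSTAINED_QUITS = 1_000_000      # estimate for 2012-2018 cited in the text [41]
US_ADULT_POPULATION = 250_000_000     # ASSUMPTION: implied by "one million is about 0.4%"

# Percentage-point change in prevalence attributable to Tips-associated quits.
tips_effect_pp = 100.0 * TIPS_SUSTAINED_QUITS / US_ADULT_POPULATION

DISCREPANCY_2018_PP = 3.3             # smoking discrepancy among all adults, 2018
DISCREPANCY_SE_PP = 0.5               # its standard error

# ASSUMPTION: normal approximation for an approximate 95% lower bound.
lower_bound_pp = DISCREPANCY_2018_PP - 1.96 * DISCREPANCY_SE_PP

# Part of the discrepancy left over after subtracting the Tips-associated decrease.
unexplained_pp = DISCREPANCY_2018_PP - tips_effect_pp

print(round(tips_effect_pp, 2), round(lower_bound_pp, 2), round(unexplained_pp, 2))
```

Even the approximate lower bound of the 2018 discrepancy is several times larger than the Tips®-associated decrease, which is the basis for the conclusion in the paragraph above.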
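The FSPTCA comparison can be reproduced in a few lines. The sketch follows the additive accumulation described in the text (0.6% per quarter over 40 quarters) and applies the result as a relative reduction to the 2009 prevalence; the variable names are introduced here for illustration only.

```python
QUARTERLY_REDUCTION_PCT = 0.6      # estimated reduction per quarter [42]
QUARTERS = 40                      # mid-2009 to mid-2019
PREVALENCE_2009 = 20.6             # NHIS adult smoking prevalence in 2009 (percent)
ACTUAL_2019_CI = (13.5, 14.5)      # 95% CI for actual 2019 prevalence (percent)

# Additive accumulation of the quarterly effect, as described in the text.
cumulative_reduction_pct = QUARTERLY_REDUCTION_PCT * QUARTERS

# Prevalence expected in 2019 if only the FSPTCA effect had operated.
predicted_2019 = PREVALENCE_2009 * (1.0 - cumulative_reduction_pct / 100.0)

# The FSPTCA-only prediction is compared with the upper confidence bound.
exceeds_ci = predicted_2019 > ACTUAL_2019_CI[1]

print(round(cumulative_reduction_pct, 1), round(predicted_2019, 1), exceeds_ci)
```

Because the FSPTCA-only prediction sits above the upper confidence bound of the observed 2019 prevalence, the observed prevalence is lower than this intervention alone would account for.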

Discussion
The aim of this research was to use population-level data to examine the correlation in trends between e-cigarette use and smoking prevalence among US adults from 2010 to 2019. Results suggest that actual smoking prevalence was lower than it otherwise would have been if trends from 1990 to 2009 (before e-cigarettes became widely available) had continued uninterrupted. Further, the discrepancy between actual and predicted smoking prevalence tended to be highest in groups with higher e-cigarette use prevalence, such as adults aged 18-34, adult males, and non-Hispanic White adults.
Overall, the sensitivity analyses largely confirmed results from the primary analysis. However, sensitivity test results ranged more widely for the age 35-54 and Hispanic cohorts. Some of the variations showed a stronger association between e-cigarette prevalence and discrepancy in smoking prevalence, suggesting that the main analysis is conservative.
These lower-than-expected smoking prevalences, correlated with e-cigarette use, suggest population-level displacement of cigarettes by e-cigarettes, consistent with extant modelling literature [28,29]. A significant overall decline in adult cigarette smoking is observed, above what was otherwise expected, even for younger adults, among whom the reported 'gateway' effect is claimed to be strong and e-cigarette use is relatively high [1]. Levy et al. [27] also identified similar vaping-related reductions in smoking prevalence and, consistent with our results, noted that these effects were primarily driven by younger adults aged 18-44.
Low smoking initiation and relapse among baseline never and former smokers using e-cigarettes have also been noted in longitudinal cohort studies [19,20]. However, use by unintended groups (e.g., non-smokers) is of high concern, and efforts to reduce use in these populations should continue to be a high priority.
The predictions for smoking prevalence in the present study may be compared to predictions from previous modelling efforts. A dynamic simulation model developed by Mendez and Warner [44] and similarly calibrated with NHIS data predicted an overall adult smoking prevalence of 16.8% for 2020. The smoking trend among all adults from 1990 to 2009 in the present study predicts an adult smoking prevalence of 16.3% (95% CI: 15.4-17.2%) when this trend is projected to 2020. This is statistically consistent with Mendez and Warner, which supports the modelling methods of the present study. Importantly, neither projection accurately predicts the actual 2020 smoking prevalence of 12.5 ± 0.3% from NHIS. This further suggests the introduction of some effect on US adult smoking prevalence circa 2010 which is not accounted for by simple extrapolation of prior trends or by models based on population dynamics structures. The lower-than-expected smoking prevalence in this study, as well as in Levy et al. [27] and Wagner and Clifton [29], suggests that this unaccounted-for effect coincides with the introduction and use of e-cigarettes among adults.
Because the analysis used cross-sectional data, the results presented here are subject to the usual limitations of such data, including selection bias, response bias, and an inability to infer causality, because changes in behavior between survey waves may reflect differences between samples rather than a trend [45]. Additionally, these data and this methodology cannot precisely separate the effects of e-cigarettes' introduction from other market or policy changes that may also have affected smoking prevalence declines since 2009. However, we show that two of the major changes (the 'Tips®' campaign and the FSPTCA) do not explain the observed smoking discrepancy, even when optimistically assuming their impacts are sustained over the last decade. Other important demographic distinctions may exist, for example trends among 18-24 year olds. However, analyses were limited by low sample size for these subpopulations. Lastly, when comparing across age cohorts over a 10-year period, it should be noted that part of the population in one cohort would have aged into the next cohort (e.g., respondents aged 18-34 in 2009 will be aged 28-44 in 2019, which partially overlaps the 35-54 category).

Conclusion
This analysis of nationally representative data supports an association between the availability of e-cigarettes and decreased cigarette smoking at the population level. Consistent with a substitution effect, subgroups of US adults reporting higher prevalence of e-cigarette use show larger discrepancies from the expected trend in cigarette smoking prevalence.

List of abbreviations
4 using PROC SURVEY to account for complex survey design.All other analyses were performed in Python version 3.7.6 with the packages NumPy version 1.18.1,Scipy version 1.4.1,Uncertainties version 3.1.5,Kneed version 0.7.0, and Matplotlib version 3.1.3.

Fig. 1 Trends in Smoking and E-Cigarette Use Prevalence among All Adults

Fig. 2 Trends in Smoking and E-Cigarette Use Prevalence by Age Group

Fig. 3 Trends in Smoking and E-Cigarette Use Prevalence by Sex

Fig. 4 Trends in Smoking and E-Cigarette Use Prevalence by Race/Ethnicity

Table 1 Combined Sample Characteristics