Skip to main content

Association between vaccination rates and COVID-19 health outcomes in the United States: a population-level statistical analysis



Population-level vaccine efficacy is a critical component of understanding COVID-19 risk, informing public health policy, and mitigating disease impacts. Unlike individual-level clinical trials, population-level analysis characterizes how well vaccines worked in the face of real-world challenges like emerging variants, differing mobility patterns, and policy changes.


In this study, we analyze the association between time-dependent vaccination rates and COVID-19 health outcomes for 48 U.S. states. We primarily focus on case-hospitalization risk (CHR) as the outcome of interest, using it as a population-level proxy for disease burden on healthcare systems. Performing the analysis using Generalized Additive Models (GAMs) allowed us to incorporate real-world nonlinearities and control for critical dynamic (time-changing) and static (temporally constant) factors. Dynamic factors include testing rates, activity-related engagement levels in the population, underlying population immunity, and policy. Static factors incorporate comorbidities, social vulnerability, race, and state healthcare expenditures. We used SARS-CoV-2 genomic surveillance data to model the different COVID-19 variant-driven waves separately, and evaluate if there is a changing role of the potential drivers of health outcomes across waves.


Our study revealed a strong and statistically significant negative association between vaccine uptake and COVID-19 CHR across each variant wave, with boosters providing additional protection during the Omicron wave. Higher underlying population immunity is shown to be associated with reduced COVID-19 CHR. Additionally, more stringent government policies are generally associated with decreased CHR. However, the impact of activity-related engagement levels on COVID-19 health outcomes varied across different waves. Regarding static variables, the social vulnerability index consistently exhibits positive associations with CHR, while Medicaid spending per person consistently shows a negative association. However, the impacts of other static factors vary in magnitude and significance across different waves.


This study concludes that despite the emergence of new variants, vaccines remain highly correlated with reduced COVID-19 harm. Therefore, given the ongoing threat posed by COVID-19, vaccines remain a critical line of defense for protecting the public and reducing the burden on healthcare systems.

Peer Review reports


By March 1st, 2023, the COVID-19 pandemic caused over 102 million reported cases and 1.1 million deaths in the United States. Vaccine development and distribution have been at the forefront of efforts to combat the impact of the disease. Three vaccines are currently available in the U.S., developed by Pfizer-BioNTech, Moderna, and Johnson & Johnson. Initial randomized clinical trials demonstrated the safety and efficacy of these vaccines, with vaccine efficacies against severe disease (hospitalization and death) ranging from 73.1 to 96.7% [1,2,3]. The clinical trials were designed to estimate the direct effect of vaccines against severe disease at the individual level [4]. However, as vaccines roll out to a broader population, uncertainties such as the emergence of new variants, variable immune responses, the quality of cold-chain storage, and other confounding factors can impact a vaccine’s efficacy [5]. Hence, evaluating real-world vaccine protection against COVID-19 health outcomes poses a challenge.

Several published studies have attempted to quantify the real-world impact of the COVID-19 vaccines on health outcomes. For example, a study in Qatar assessed the vaccines’ effectiveness against severe, critical, or fatal Omicron infections using test-negative case-control analysis, and found previous infections and vaccination are effective against symptomatic Omicron infections [6]. An observational study conducted in Israel using national surveillance data showed that the two doses of the Pfizer-BioNTech mRNA vaccines are 97.2% effective in preventing COVID-19-related hospitalizations [7]. A Danish study estimated vaccine effectiveness against COVID-19 hospitalization using a cohort study design, and found that two doses of the vaccine provide high protection against hospitalization for the Alpha and Delta variant, and even higher protection against hospitalization for the Omicron variant [8]. A similar cohort study was applied in Singapore and the United Kingdom to determine whether booster shots reduce the severity of COVID-19 infections during the Omicron wave, and found consistent results that the risk of severe COVID-19 outcomes reduced after receiving booster mRNA vaccines [9, 10].

Most existing literature on the population-level effects of COVID-19 vaccination is based on individual-level data and observational studies. Specifically, these studies relied upon detailed individual-level data to assess the direct effectiveness of vaccination by comparing health outcomes between vaccinated and unvaccinated individuals exposed to the same environment. However, these studies may be subject to confounding by unmeasured factors and inconsistent quality of individual-level data. Further, in the U.S., such high-resolution data is unavailable at the population-level, so alternative strategies must be engaged to evaluate the impact of vaccine at a regional level.

One such approach is to rely on compartmental and agent-based models to simulate transmission and disease outcomes both in the presence and absence of vaccines implementation for the same population. Watson et al. applied this method to estimate the impact of varying vaccine uptake rates on mortality across multiple countries and found that vaccines prevented 14·4 million COVID-19 deaths in 2021 [11]. However, this approach is subject to many assumptions and is limited in its ability to estimate accurate effectiveness. Alternatively, statistical methods such as time series and regression analysis can be implemented to evaluate the association between vaccination coverage and healthcare outcomes across different locations. One study using this strategy evaluated the association between vaccination coverage and the COVID-19 cases growth rate for all 50 U.S. states in the U.S. using a structural nested mean model and found a 1% increase in vaccination coverage was associated with a 1.02% reduction in case growth rate [12]. However, the scope of this study is limited to cases between March and May 2021. Another study utilized linear regression to analyze vaccine coverage and natural immunity in relation to mortality during the Delta and Omicron waves. It found that vaccine coverage reduced COVID-19 mortality, but seroprevalence and prior infection rates were not associated with mortality [13]. However, this method has limitations in capturing dynamic changes and non-linear relationships between variables. A different study by Bollyky et al. [14] applied regression analysis to determine how vaccination coverage amongst other factors (e.g., presence of comorbidities, political partisanship, race, and ethnicity) impacted health outcomes (standardized infections and deaths) in the U.S. at the state level, and determined that higher vaccination rates were associated with lower death rates. The scope of this study varies from ours in its focus on the association between static variables and COVID-19 health outcomes for a fixed time window between January 1st, 2020, and July 31st, 2022, while our study expands the analysis by incorporating novel dynamic variables to capture behavioral changes over time, and explicitly evaluating the different variants independently. A recent study evaluated the time-varying relationship between vaccination, mobility, and COVID-19 health outcomes before and after the Omicron waves [15]. They found the significance of the vaccine’s impact in reducing case rates diminished during the Omicron surge, while its efficacy in lowering case-fatality rates remained substantial throughout the pandemic.

Our study contributes to the existing literature by prioritizing case-hospitalization risk as the outcome variable, breaking aggregated mobility into activity-related engagement levels, modeling previous infections as a dynamic variable, including an interaction between the completed primary series and booster rate for the Omicron wave, and considering the critical static factors such as comorbidities, social vulnerability, race, and state healthcare expenditures. Despite numerous studies assessing the effectiveness of vaccines, most have not accounted for the relative impact of vaccines across different populations and variant waves, while considering dynamic potential confounding factors. Therefore, a more comprehensive understanding of vaccines’ impact across diverse populations and COVID-19 waves is crucial in developing informed public health policies that can effectively mitigate the spread of the virus and ensure equitable distribution of healthcare resources.


Study design

The primary objective of this study is to analyze the association between COVID-19 vaccination rates and COVID-19 case-hospitalization risk (CHR) in the U.S. while controlling for potential confounding effects. Time-dependent COVID-19 CHR is chosen as the modeled response variable to gain insights into the factors influencing COVID-19 harm; CHR serves as both a proxy for disease severity at an individual level, and captures the burden on the healthcare system at a population level. We use Generalized Additive Models (GAMs) to perform the analysis because of their ability to capture nonlinear dynamics. Data used include novel dynamic covariates that may potentially contribute to COVID-19 CHR, such as naturally derived immunity from prior COVID-19 infection, local healthcare infrastructure, activity-related engagement levels in the population, and government policies, alongside various static variables that have been identified to be significant in previous studies [14, 16, 17] such as comorbidities, social vulnerability index (SVI), race, and state healthcare expenditures. By controlling for these factors, we aim to provide a more comprehensive understanding of the association between vaccination rate and COVID-19 CHR at the population level. To further elucidate the role of potential driving factors, we also model reported case incidence rates (CIR) as a separate response variable and compared the factors associated with COVID-19 transmission versus those associated with COVID-19 CHR. Our framework explicitly captures the spatial variation in the modeled relative associations through a variable transformation procedure (discussed in detail in the methods section). The study was conducted for 48 states in the U.S. for the period between April 19th, 2021, the date at which the vaccines were approved for all adults in the U.S., to March 1st, 2022. This period covers the pre-Delta (characterized by the predominance of the Alpha variant and other variants), Delta, and Omicron waves (predominance of the BA.1.1 variant) of COVID-19, which are each evaluated independently. To distinguish between COVID-19 variant-driven waves, we utilized SARS-CoV-2 genomic surveillance data and identified the dominant variant for each state and point in time, to determine time windows so the distinct variant-driven waves can be modeled independently. For the Omicron wave, we also considered the added benefit of booster doses on COVID-19 health outcomes. Specifically, we evaluated the interaction between the completed primary series and booster rate on reducing COVID-19 CHR. West Virginia and New Hampshire were omitted in this study due to a decrease in cumulative vaccination rates over the study period, likely due to reporting errors. Results from this analysis help improve our understanding of the real-world relative impact of the available COVID-19 vaccines against COVID-19 CHR at the population-level over time, and can help inform future public health policies to reduce harm.

Data sources and collection

We collected state-level time-series data and static variables from publicly available databases. All time-series data were aggregated to the weekly level. A summary of the variables and their respective sources are listed in Tables 1 and 2, and detailed explanations of each variable are provided in Appendix Sect. 1.2. A 3-week moving average was applied to all time-series variables to mitigate the effects of potential noise and reporting issues, with the exception of the government policy index. Regarding the age groups covered in the dataset, with the exception of activity-related engagement levels, all variables include data for both children and adults.

Table 1 Summary of dynamic variables in the model
Table 2 Summary of static variables in the model

Dynamic variable transformation

To ensure the precise estimation of each dynamic variable’s impact, a variable transformation mechanism must be used to account for the effects of time trends in the data. For example, the completed primary series rate is always increasing with time for all locations modeled, hence it can be difficult to distinguish how much of the observed associations between vaccination rate and COVID-19 health outcomes are due to the variable interaction or the passage of time. Moreover, the main focus of this study lies in modeling spatial differences and considering location-specific variations that influence the observed associations. Consequently, we applied the following transformation to all dynamic variables to remove the time trend and redefine the relative variable (\( {RV}_{i}^{t}\)):

$$ {RV}_{i}^{t} = \frac{{V}_{i}^{t}}{\frac{1}{n}\sum _{j}{V}_{j}^{t}}$$

Where \( {RV}_{i}^{t}\) represents the transformed variable for state \( i\) at week \( t\), \( {V}_{i}^{t}\)represents the original variable for state \( i\) at week \( t\) without the transformation, \( \frac{1}{n}\sum _{j}{V}_{j}^{t}\) represents the mean of the original variable at week \( t\), over all locations being modeled n, e.g., the national mean across the U.S. A \( {RV}_{i}^{t}\) larger than one indicates that state \( i\) has a higher variable value compared to the national mean at week t, while \( {RV}_{i}^{t}\) lower than one indicates that state \( i\) has a lower variable value compared to the national mean, at week t. After normalization, the final set of time-dependent variables included in the analysis are: Relative case-hospitalization rates (\( {RCHR}_{i}^{t}\)), relative reported case-incidence rate (\( {RCIR}_{i}^{t}\)), relative completed primary series rate (\( RCPS{R}_{i}^{t}\)), relative booster rate (\( RB{R}_{i}^{t}\)), relative weekly testing rate (\( RWT{R}_{i}^{t}\)), relative gym visits (\( RG{V}_{i}^{t}\)), relative physician visits (\( RP{V}_{i}^{t}\)), relative university visits (\( RU{V}_{i}^{t}\)), relative previous infection (\( {RPI}_{i}^{t}\)), and relative government policy (\( RG{P}_{i}^{t}\)). These newly transformed variables enable an explicit evaluation of the relative association between each of them and the COVID-19 health outcome of interest within a single multi-state model. Moreover, this variable transformation procedure facilitates assessing individual state’s performance relative to national dynamics. It emphasizes evaluating the expected outcomes when a state’s performance diverges from the national average.

The dynamic variables, with and without variable transformation, are visually depicted in Appendix figure S2. Among all the variables, the rankings of \( RCPS{R}_{i}^{t}\) remain relatively stable across time, as seen in Appendix figure S2b2. This stability indicates a more consistent spatial-temporal pattern of variation among vaccination rates across states. On the other hand, all other dynamic variables exhibit more noticeable spatial ranking changes over time. The changing spatial-temporal rankings of other dynamic variables highlight the importance of considering spatial differences through time and evaluating their influence on COVID-19 health outcomes.

Statistical analysis

The generalized additive model (GAM) was selected as the statistical model for this analysis because of its ability to capture complex and nonlinear relationships between the set of covariates and the outcome variables of interest in each state. We independently model each variant-driven wave during the study period to allow for different driving factors for different variants. To define the variant waves, we clustered each state-week pair based on the dominant circulating variant based on SARS-CoV-2 genomic surveillance data downloaded from GISAID [30]. The three waves are classified as: (1) Pre-Delta Wave, (2) Delta Wave, (3) Omicron Wave, and each state is labeled with its most dominant variant each week to define the windows. Details of this classification are described in Appendix Sect. 1.1, and the assignment of state-week pairs is shown in Appendix figure S1.

The primary set of models treat weekly state-level RCHR as the response variable, with separate models generated independently for each variant wave, namely Pre-Delta-RCHR, Delta-RCHR, and Omicron-RCHR. These three models have the form:

$$\begin{array}{*{20}{l}}{RCHR_i^t \sim Gamma(\mu,\varphi )}\\{\begin{array}{*{20}{l}}{{\rm{log}}\left( \mu\right) = \alpha+ {f_1}\left( {RCPSR_i^{t - 2}} \right) + {f_2}\left( {RWTR_i^{t - 2}} \right) + }\\{{f_3}\left( {RGV_i^{t - 2}} \right) + {f_4}\left( {RPV_i^{t - 2}} \right) + }\end{array}}\\{\begin{array}{*{20}{l}}{{f_5}\left( {RUV_i^{t - 2}} \right) + {f_6}\left( {RGP_i^{t - 2}} \right) + }\\{{f_7}\left( {RPI_i^t} \right) + {\beta _1}\left( {Blackproportion} \right) + }\end{array}}\\{\begin{array}{*{20}{l}}{{\beta _2}\left( {SVI} \right) + {\beta _3}\left( {{\rm{Adultsathighrisk}}} \right)}\\{ + {\beta _4}\left( {{\rm{Medicaidspending}}} \right)}\end{array}}\end{array}$$

Where \( \alpha \) represents the intercept, \( {\beta }_{i}\) represent the parametric coefficients of each static variable, and \( {f}_{\text{i}}\) are spline smooth functions of the relative dynamic variables. Additionally, a model is constructed for the Omicron wave, incorporating an interaction between completed primary series and booster rate (Omicron-Booster-RCHR). The Omicron-Booster-RCHR has the form:

$$\begin{array}{*{20}{l}}{RCHR_i^t \sim Gamma(\mu,\varphi )}\\{\begin{array}{*{20}{l}}{{\rm{log}}\left( \mu\right) = \alpha+ {f_1}(RCPSR_i^{t - 2},RBR_i^{t - 2}) + }\\{{f_2}\left( {RWTR_i^{t - 2}} \right) + {f_3}\left( {RGV_i^{t - 2}} \right) + {f_4}\left( {RPV_i^{t - 2}} \right) + }\end{array}}\\{\begin{array}{*{20}{l}}{{f_5}\left( {RUV_i^{t - 2}} \right) + {f_6}\left( {RGP_i^{t - 2}} \right) + }\\{{f_7}\left( {RPI_i^t} \right) + {\beta _1}\left( {Blackproportion} \right) + }\end{array}}\\{\begin{array}{*{20}{l}}{{\beta _2}\left( {SVI} \right) + {\beta _3}\left( {{\rm{Adultsathighrisk}}} \right) + }\\{{\beta _4}\left( {{\rm{Medicaidspending}}} \right)}\end{array}}\end{array}$$

Where \( {f}_{1}\) represent a smooth interaction function between \( RCPS{R}_{i}^{t-2}\) and \( RB{R}_{i}^{t-2}\). For all the mentioned models above, the weekly state-level RCHR is assumed to follow a Gamma distribution with a log link. This choice of the Gamma family accounts for the positively skewed distribution of the outcome variable. We use thin plate regression splines as the smoothing basis for all \( {f}_{\text{i}}\) and set the basis dimension to three to maximize the interpretability of the models. The basis dimension refers to the maximum possible complexity of each smooth term; a large basis dimension could overfit the data and result in highly non-linear relationships between input and outcome variables.

To consider the sequential process of infection leading to hospitalization we introduce a time lag between each of the input variables relative to the outcome variable, which is denoted by the superscript. The timeline of this model is introduced as follows: the modeled relative case-hospitalizations rate (\( {RCHR}_{i}^{t}\)), occur at time t. Infections resulting in hospitalization, are assumed to occur at time t-2, to account for a one week incubation period [31], and one additional week between symptoms onset and hospitalization [32]. Note, this timeline aligns with the definition of the CHR variable, which is normalized by the number of reported infections one week prior, which assumes a one week delay between when infection occurred and when it is reported. To accurately reflect the conditions presented at the time of infection, each of the variables related to vaccination (\( RCPS{R}_{i}^{t-2}\)), activity-related engagement levels (\( RG{V}_{i}^{t-2}\), \( RP{V}_{i}^{t-2}\), \( RU{V}_{i}^{t-2}\)), policy (\( RG{P}_{i}^{t-2}\)), and testing (\( RWT{R}_{i}^{t-2}\)) are also lagged by two weeks relative to the case-hospitalization risk. Lastly, the past infections variable, defined as stated above to capture the role of recently acquired immunity from infection in protecting from severe disease upon reinfection, is equal to the total infection rate in the population summed over the prior 4 to 16 weeks. This time window is explicit in the definition of \( {RPI}_{i}^{t}\) (see Appendix Sect. 1.2). A sensitivity analysis explores diverse immunity durations and is detailed in Appendix Sect. 2.5.

A secondary set of analogous models treats RCIR as the response variable, namely the Pre-Delta-RCIR, the Delta-RCIR, the Omicron-RCIR, and the Omicron-Booster-RCIR. The first three models adopt the same form as Eq. (1), while the Omicron-Booster-RCIR follows the same form as Eq. (2). To account for the sequential process leading to infections, all lags between dynamic covariates, and RCIR have been reduced by one week. This results in eight models, with four models fit to RCHR, and four models fit to RCIR. The exact formulation of models with RCIR as outcome variable are documented in Appendix Sect. 2.6.

The selection of covariates for each model relies on correlation-based feature selection, taking into account both Pearson’s correlation between variables and the concurvity measures derived from GAMs. For comprehensive information on the selection criteria of static variables, please refer to Appendix Sect. 2.1. Additionally, Appendix Sect. 2.2 provides detailed documentation on the selection of dynamic variables, particularly those related to activity-related engagement levels. The impact of each dynamic variable is quantified by computing the Accumulated Local Effects (ALE) of each smooth term on outcome variables. The local effect refers to the change in model output when a particular input feature is changed while keeping all other features constant. The ALE method aggregates the local effects of each input feature across its entire range. By accumulating these local effects, we gain insight into how changes in each input variable influence the outcome variable across its entire range. Data processing, visualization, and analysis were carried out using R 4·0 and Python 3·8.


GAMs analysis for the relative case-hospitalization rate (RCHR) as the outcome

Fig. 1
figure 1

Results for the Pre-Delta-RCHR (Blue), the Delta-RCHR (Orange), and the Omicron-RCHR (Red). a-g: Accumulated local effects (ALE) of dynamic variables. Shaded areas in each plot indicate 95% confidence intervals. h-k: Estimated slopes for each static variables, the upper and lower band indicate 95% confidence intervals. m: Deviance explained for each model. ‘***’: variable significant at p < 0.001. ‘**’: variable significant at p < 0.01. ‘*’: variable significant at p < 0.05. ‘’: variable significant at p > 0.05

In our analysis, we evaluated goodness-of-fit based on several metrics. For models with relative case-hospitalization risk (RCHR) as the outcome variable, the deviance explained ranges between 46.8% and 72.3% (Fig. 1m) for each variant wave. Moreover, we assessed the correlation between observed RCHR and predicted RCHR, which exhibited strong positive correlations ranging from 0.67 to 0.81 (Appendix Sect. 3.1). These findings provide compelling evidence of the models’ effectiveness in capturing and predicting the case-hospitalization rate.

The relative completed primary series rate, and relative previous infections consistently displayed strong negative associations with RCHR across different waves (Fig. 1a and b). Of particular note is that relative previous infections consistently ranked the highest in terms of ALE across the different waves. Figure 1c reveals the impact of the relative government response index gradually flattening out from the pre-Delta to the Omicron wave. Regarding activity-related engagement levels, their effects on RCHR appear inconsistent across different waves, as exemplified by the relative physician visits, which slightly changed from negative to positive effects as the analysis progressed from the pre-Delta to the Omicron wave (Fig. 1e). Lastly, the relative weekly testing rate served as a control variable to address the state-level differences in testing rates. The result revealed a negative correlation between the relative weekly testing rate and RCHR. Nevertheless, it is noteworthy that this association exhibited a decrease from the pre-Delta wave to the Omicron wave, as illustrated in Fig. 1g.

Regarding the static variables, adults at high risk exhibited a declining positive association with RCHR. Additionally, states with higher Social Vulnerability Index (SVI) consistently showed higher RCHR. Among racial groups, the proportion of Black positively associated with RCHR during the Pre-Delta wave but did not exhibit a significant impact since the Delta wave. With healthcare systems variables, Medicaid spending per person consistently showed a negative association with RCHR.

Fig. 2
figure 2

Results of Omicron-Booster-RCHR for just the Omicron wave with the additional inclusion of an interaction effect between the relative completed primary series rate and the relative booster rate. a: Two-dimensional contour plot for the interaction between relative completed primary series rate and relative booster rate. The deeper red indicates a more positive effect on the RCHR, and the deeper blue indicates a more negative effect to the RCHR. b-g: Accumulated local effects (ALE) of dynamic variables. Shaded areas in each plot indicate 95% confidence intervals. h: Estimated slopes for each static variables, the upper and lower band indicate 95% confidence intervals. ‘***’: variable significant at p < 0.001. ‘**’: variable significant at p < 0.01. ‘*’: variable significant at p < 0.05. ‘’: variable significant at p > 0.05

With the exception of the completed primary series rate, the effects of all other variables modeled in the Omicron-Booster-RCHR remained consistent with the results for the Omicron-RCHR shown in Fig. 1. Figure 2a show the interaction between two vaccine-related variables in a two-dimensional variable space. The solid black lines represent the contour lines. The contour lines correspond to points that have an equivalent impact on the hospitalization rate, with the values marked on each line indicating the actual interaction effect of these points on the RCHR. Figure 2a reveals that the RCHR decreases along the direction of increasing the relative booster rate and the relative completed primary series rate.

GAMs analysis for the relative reported case-incidence rate (RCIR) as the outcome

Fig. 3
figure 3

Results for the Pre-Delta-RCIR (Blue), the Delta-RCIR (Orange), and the Omicron-RCIR (Red). a-g: Accumulated local effects (ALE) of dynamic variables. Shaded areas in each plot indicate 95% confidence intervals. h-k: Estimated slopes for each static variables, the upper and lower band indicate 95% confidence intervals. m: Deviance explained for each model. ‘***’: variable significant at p < 0.001. ‘**’: variable significant at p < 0.01. ‘*’: variable significant at p < 0.05. ‘’: variable significant at p > 0.05

The GAMs using relative reported case-incidence rate (RCIR) as the outcome variable consistently demonstrate lower performance than those GAMs with RCHR as the outcome variable. Specifically, all GAMs for RCIR have deviance explained values below 40%, and correlations between observed RCIR and predicted RCIR range from 0.43 to 0.61 (Appendix Sect. 3.2). The observed performance pattern indicates a more intricate and dynamic relationship concerning COVID-19 transmission, particularly evident during the Omicron wave.

Figure 3a illustrates a strong negative association between the relative completed primary series rate and RCIR during the Pre-Delta and Delta waves. However, this association vanished during the Omicron wave, coinciding with a decline in model performance (Fig. 3m). The ALE plot of the relative previous infection rate (Fig. 2b) revealed an insignificant association between previous infection and RCIR during the pre-Delta and Delta waves but a significant negative association during the Omicron wave. Additionally, when the relative government policy index is greater than one, the ALE plots demonstrate a negative trend; however, the magnitude of this effect is relatively smaller compared to other dynamic variables examined in the analysis. Similar to GAMs for RCHR, the activity-related engagement levels exhibited inconsistent patterns across different waves. Notably, the ALE of relative university visits reverses direction from negative to positive between the pre-Delta wave and the later two waves.

For the static variables, adults at high risk were consistently positively associated with RCIR across different waves. However, the other static variables, including racial groups, SVI, and healthcare expenditures, do not show a consistent or significant impact across different waves.

Figure 4 illustrates the results of the Omicron-Booster-RCIR for just the Omicron wave with the additional inclusion of an interaction effect between the completed primary series rate and the relative booster rate. This interaction effect is presented as a dimension contour map in Fig. 4a.

Fig. 4
figure 4

Results of the Omicron-Booster-RCIR for just the Omicron wave with the additional inclusion of an interaction effect between the relative completed primary series rate and the relative booster rate. a: Two-dimensional contour plot for the interaction between relative completed primary series rate and relative booster rate. b-g: Accumulated local effects (ALE) of dynamic variables. Shaded areas in each plot indicate 95% confidence intervals. h: Estimated slopes for each static variables, the upper and lower band indicate 95% confidence intervals. ‘***’: variable significant at p < 0.001. ‘**’: variable significant at p < 0.01. ‘*’: variable significant at p < 0.05. ‘’: variable significant at p > 0.05

The incorporation of the relative booster rate does not result in an improvement in the model fit; the deviance explained for Model Omicron-Booster-RCIR remains at 17%. As depicted in Fig. 4a, it is evident that only states with both a high relative completed primary series rate and a high relative booster rate exhibits a slightly negative impact, approximately − 0.1, on the RCIR. The findings from Model Omicron-RCIR, when combined with Omicron-Booster-RCIR, suggest that the covariates examined in this study do not contribute significantly to explaining the variation in RCIR during the Omicron wave. These results highlight the need for further research to identify other factors that may better capture the dynamics of COVID-19 transmission during this specific period.


This analysis aims to characterize the relationship between population-level COVID-19 vaccine administration and pandemic-induced healthcare burdens, taking into account essential and confounding real-world processes. Our results point to three significant conclusions:

  • Population-level vaccination is always significantly associated with reduced COVID-19 case-hospitalization risk.

  • Increased recent (1–4 months prior) infections are also consistently and strongly associated with reduced case-hospitalization risk.

  • Local factors, activity-related engagement levels, and policy measures are important to the model’s explanatory power, supporting the importance of considering these factors on population-level outcomes. However, their associations are inconsistent over time and across different variants.

Each of these conclusions is explained in more detail in the sections below. In each section, we discuss the findings regarding case-hospitalization risk and compare them with the results related to the reported case incidence rate. In general, our results strongly support the importance of population-level vaccination and align with extant research on the role of acquired immunity in reducing severe outcomes. However, it should be noted that the case incidence rate has a reduced association with vaccination during the Omicron wave and much less consistently meaningful associations with previous infection rates. Additionally, our analysis reflects the complexity of the evolution of human behavior during the pandemic, given the dynamic role of activity-related engagement levels and policy. It also supports the recognition of the epidemiological vulnerability of socially and economically underserved communities.

Vaccines protect against COVID-19 case-hospitalization risk for pre-Delta, Delta and Omicron waves

Our study reveals a strong and statistically significant association between vaccine uptake rates and reduced COVID-19 case-hospitalization risk. This relationship was consistent across each of the variant waves modeled, and is consistent with earlier findings that vaccine protection against severe illnesses does not significantly wane in response to new variants. In contrast, when we modeled reported case-incidence rates as the response variable, we observed a decreasing effect of vaccines from the pre-Delta to the Omicron wave (Fig. 3a). This outcome aligns with existing literature highlighting the rapid waning of the vaccines’ effectiveness against infection [33, 34]. Nonetheless, while vaccines may offer reduced protection against infection, our results indicate that they continue to provide substantial protection against hospitalization risk and help alleviate the burden on healthcare systems. Additionally, although the value of booster shots for protection against severe cases of COVID-19 is still being studied [35], results from our analysis provide evidence supporting the effectiveness of booster doses against hospitalization risk caused by the Omicron variant (Fig. 2). Conversely, the findings obtained from our Omicron-Booster-RCIR model reveal that the interaction between the booster and completed primary series rates has a relatively limited impact on Omicron infection (Fig. 4). However, it is crucial to emphasize that despite the diminished effectiveness of mRNA boosters against Omicron infections, vaccines still serve the essential purpose of reducing the harm of COVID-19 in the face of emerging variants.

Immunity from recent infection protects against COVID-19 case-hospitalization risk upon reinfection

Higher past COVID-19 infection levels in a population are associated with a decrease in COVID-19 case-hospitalization risk, indicating immunity gained from infection can provide some protection against severe disease in the event of reinfection in the future, but only for a limited period of time. Our study utilized the total number of cases reported in a 12-week window, ranging from 4 to 16 weeks prior to the time period modeled, as a proxy for recently acquired immunity, and found a strong negative association between the previous infection rate and future case-hospitalization risk. These results were consistent across the different variant waves. This finding aligns with other case-control studies that found previous infections showed strong effectiveness against severe, critical, or fatal COVID-19 [6, 36]. Our analysis indicates that prior infections from up to 6 months ahead are associated with decreased hospitalization risk, but 4 to 16 weeks has the strongest effect (see Appendix Sect. 2.4 and 2.5 for this sensitivity analysis). While the waning of natural immunity has been established in molecular and clinical research [37], our analysis provides additional insight at the population-level. In our models with case-incidence rate as the outcome variable, we found an insignificant association between previous infection and case-incidence rate during the pre-Delta and Delta waves. However, during the Omicron wave, there was a significant negative association (Fig. 3b). This finding contrasts with existing literature that found, at the individual level, previous infection protected against infection pre-Omicron, but this effectiveness decreases substantially during the Omicron wave [38]. Nevertheless, at the population level, the number of infected individuals is considerably higher during the Omicron wave than earlier, while a smaller proportion remains susceptible. Consequently, the cumulative impact of previous infections becomes more pronounced. These results highlight that previous infections have a variable and inconsistent impact on reinfection at the individual and population levels.

Local factors contribute to variation in COVID-19 health outcomes

Existing clinical and statistical studies [14, 16, 17] have identified critical indicators for COVID-19 health outcomes including demographics, comorbidities, social vulnerability index (SVI), and healthcare expenditures. Results from our model using RCHR as outcome variables indicate that the SVI is positively associated with COVID-19 case-hospitalization risk across all variant waves (Fig. 1i). This finding is consistent with existing literature [14, 17], which suggests that individuals from socially vulnerable regions are more likely to experience harmful COVID-19 outcomes. For each new variant wave, the proportion of adults at high risk was less associated with case-hospitalization risk than for the prior wave (Fig. 1h). This result aligns with a cohort study that the hazard ratio of hospital admissions with the Omicron variant, compared to the Delta variant, showed a more significant drop in the elder age group compared to individuals younger than 20 [39]. Our results reveal an insignificant association between black proportion and case-hospitalization risk during the Delta and Omicron waves, which differs from previously identified positive associations across all waves [14]. In the United States, the eligibility for Medicaid varies by state, but generally, individuals and families with incomes up to 138% of the federal poverty level may qualify for coverage [40]. Our results reveal a consistent negative association between state-level Medicaid spending per person and COVID-19 case-hospitalization risk (Fig. 1k), which indicates the potential protective effect of healthcare expenditures in mitigating the impact of the pandemic on vulnerable groups. In contrast to the case-hospitalization risk models, the case-incidence rate models indicate that there is no evidence for consistent or significant associations with demographics, SVI, or healthcare expenditures across variant waves, except for adults at high risk consistently positively associated with case-incidence rate (Fig. 3). These results suggest that dynamic COVID-19 infection risk is complex and changes over time, and the factors contributing to transmission vary across waves. Further research is needed for a more comprehensive understanding of the complex and evolving nature of COVID-19 transmission.

Activity-related engagement levels are associated with COVID-19 health outcomes

At the beginning of the pandemic, several studies evaluated the association between mobility and COVID-19 transmission with inconsistent findings [41, 42]. One possible reason for this inconsistency is that aggregated mobility data may not accurately reflect the risk of dynamically changing human behaviors, given that a minority of travel activities could be accountable for a significant majority of infections [43]. Furthermore, the connection between mobility and harmful health outcomes remains unclear. Our study uses disaggregated mobility patterns to capture diverse behaviors between populations, specifically relative activity-related engagement levels, to explore the association between these variables and COVID-19 severity. To achieve this, we divided activities into subcategories based on their purpose. University visits were used to represent school-related activities, gym visits to signify high-risk indoor activities, and physician visits to indicate healthcare-related visits.

Results shown in Fig. 1d indicate that the state-week pairs with relatively higher gym visits are expected to observe higher case-hospitalization risk during the Pre-Delta wave. However, this association did not reach statistical significance during the Delta and Omicron waves. In contrast, when we modeled the case-incidence rate as the outcome variable, our analysis revealed a minor effect of gym visits, as shown in Fig. 3d. The positive impact of gym visits during the Pre-Delta wave may be linked to infections among unvaccinated individuals engaging in indoor activities. It is supported by existing research that unvaccinated individuals have a 2.6 times higher likelihood of contracting SARS-CoV-2 than vaccinated individuals during indoor activities [44]. Moreover, unvaccinated individuals exhibit a higher likelihood of hospitalization [45], leading to a strong positive association between gym visits and case-hospitalization risk during the initial phases of vaccination distribution. In addition to indoor activity, we also observed a significant association between case-hospitalization risk and overall visits to hospitals, medical centers, and Outpatient Care Centers. Unlike indoor activity, this association transitioned from negative to positive between the pre-Delta to Omicron wave. One possible explanation for this finding is that as the pandemic evolved, the public became more familiar with the disease and more tolerant of at-home symptom management; thus, those COVID-19 patients that sought medical care were more likely to be those with more severe symptoms. Finally, visits to the university were found to have a relatively minor impact on case-hospitalization risk. We hypothesize that this is due to the young and relatively healthy demographic that frequents visiting schools, while still vulnerable to contracting the SARS-CoV-2 virus, they are less likely to experience severe outcomes from COVID-19 infection. This hypothesis is further supported by the findings from the case-incidence rate model, which identified a positive association with university visits during the Delta and Omicron waves (Fig. 3d). It is worth noting that during the Pre-Delta wave, school visits negatively impacted the case-hospitalization risk. However, this impact changed to a positive association for the later waves. These observations align with existing research, which has demonstrated that the younger population exhibits the highest increase in susceptibility to the Delta variant compared to the pre-Delta variant [46].

More stringent government public health policy is associated with reduced COVID-19 case-hospitalization risk

Our results indicate that more stringent government policies were associated with reduced COVID-19 case-hospitalization risk during the Pre-Delta and Delta wave. This is consistent with previous studies [47]. In particular, we found that state-week pairs with a significantly high government response index (indicating stricter policy) have a stronger negative effect on the case-hospitalization risk (Fig. 1c). However, this negative effect decreased over time, and was least evident during the Omicron wave. The reduced effect of the policy during Omicron is likely due to a complex combination of factors, including the increasing population level immunity from both widespread adoptions of vaccines and prior exposure providing more protection from severe disease during this period, combined with a reduction in the government’s response to the pandemic over time.

Additionally, weekly testing rates were shown to be negatively associated with case-hospitalization risk. While this result does not imply a causative relationship between testing rate and COVID-19 severity, there are various reasons why testing rates may be linked to case-hospitalization risk. Firstly, it represents a proxy input feature to capture the level of healthcare infrastructure available to a population. Second, it directly impacts the reported case incidence rate, as the number of reported cases in a region is a direct function of local testing availability, thus increased testing will lead to higher reported case rates, and lower case-hospitalization risk. Third, increased testing can lead to more cases being identified, and thus impact people’s awareness and behavior during an outbreak. For these reasons testing rate is included as a potential control factor in our model.


As with all modeling studies, this work is subject to several limitations. Firstly, this study was primarily designed to determine the association between various potential risk factors and COVID-19 outcomes, rather than to establish causality between these variables. Thus, our findings may reflect the role of unobserved confounding factors excluded from our study. Another potential limitation is due to the application at the state-level. The aggregation of the data to the state-level is unable to capture the heterogeneities of the communities within each state, and it is possible that different associations exist at the local level, than are identified at the state-level. Additionally, we believe the use of the case-hospitalization risk in a given state at a given time is a plausible choice as a proxy for disease severity at an individual-level, and captures the burden on the healthcare system at a population-level. However, it is subject to variable case reporting and data quality issues across states, which may arise due to uneven testing capacity, reporting delays or at-home testing. Moreover, it is important to note that we have not captured potential behavior changes across time, such as alterations in self-protection and risk perception during the pandemic, which could significantly influence the population-level health outcomes. Lastly, it is important to acknowledge that our variable transformation, while facilitating a deeper understanding of relative changes, does come with the inherent consequence of diminishing the original meaning these variables initially conveyed.


This research utilizes publicly available real-world data to provide robust evidence of the efficacy of vaccines against COVID-19 case-hospitalization risk across various variant waves in the United States. More importantly, this paper concludes that booster shots offer additional protection against severe COVID-19 during the Omicron waves. Despite the emergence of new variants, vaccines remain the most effective intervention for mitigating the harm of COVID-19 and reducing burden on healthcare systems. Therefore, given the ongoing threat posed by COVID-19 and its potential variants, vaccines continue to be the best line of defense for protecting public health and preventing the further spread of the virus.

Data availability

All the data and code used for the analysis is available from



Accumulated local effects


Case-hospitalization risk


Reported case incidence rate


Coronavirus 2019


Generalized Additive Models


Relative booster rate


Relative case-hospitalization risk


Relative case incidence rate


Relative completed primary series rate


Relative government policy


Relative gym visits


Relative previous infection


Relative physician visits


Relative university visits


Relative variable


Relative weekly testing rate


Social vulnerability index


United States


  1. Baden LR, El Sahly HM, Essink B, Kotloff K, Frey S, Novak R, et al. Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine. N Engl J Med. 2021;384:403–16.

    Article  CAS  PubMed  Google Scholar 

  2. Sadoff J, Gray G, Vandebosch A, Cárdenas V, Shukarev G, Grinsztejn B, et al. Safety and efficacy of single-dose Ad26.COV2.S vaccine against Covid-19. N Engl J Med. 2021;384:2187–201.

    Article  CAS  PubMed  Google Scholar 

  3. Thomas SJ, Moreira ED, Kitchin N, Absalon J, Gurtman A, Lockhart S, et al. Safety and efficacy of the BNT162b2 mRNA Covid-19 vaccine through 6 months. N Engl J Med. 2021;385:1761–73.

    Article  CAS  PubMed  Google Scholar 

  4. Kim JH, Marks F, Clemens JD. Looking beyond COVID-19 vaccine phase 3 trials. Nat Med. 2021;27:205–11.

    Article  CAS  PubMed  Google Scholar 

  5. Patel MK, Bergeri I, Bresee JS, Cowling BJ, Crowcroft NS, Fahmy K, et al. Evaluation of post-introduction COVID-19 vaccine effectiveness: summary of interim guidance of the World Health Organization. Vaccine. 2021;39:4013–24.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Altarawneh HN, Chemaitelly H, Ayoub HH, Tang P, Hasan MR, Yassine HM, et al. Effects of previous infection and vaccination on symptomatic omicron infections. N Engl J Med. 2022;387:21–34.

    Article  CAS  PubMed  Google Scholar 

  7. Haas EJ, Angulo FJ, McLaughlin JM, Anis E, Singer SR, Khan F, et al. Impact and effectiveness of mRNA BNT162b2 vaccine against SARS-CoV-2 infections and COVID-19 cases, hospitalisations, and deaths following a nationwide vaccination campaign in Israel: an observational study using national surveillance data. The Lancet. 2021;397:1819–29.

    Article  CAS  Google Scholar 

  8. Gram MA, Emborg H-D, Schelde AB, Friis NU, Nielsen KF, Moustsen-Helms IR, et al. Vaccine effectiveness against SARS-CoV-2 infection or COVID-19 hospitalization with the Alpha, Delta, or Omicron SARS-CoV-2 variant: a nationwide Danish cohort study. PLOS Med. 2022;19:e1003992.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Ng OT, Marimuthu K, Lim N, Lim ZQ, Thevasagayam NM, Koh V, et al. Analysis of COVID-19 incidence and severity among adults vaccinated with 2-dose mRNA COVID-19 or inactivated SARS-CoV-2 vaccines with and without boosters in Singapore. JAMA Netw Open. 2022;5:e2228900.

    Article  PubMed  PubMed Central  Google Scholar 

  10. Agrawal U, Bedston S, McCowan C, Oke J, Patterson L, Robertson C, et al. Severe COVID-19 outcomes after full vaccination of primary schedule and initial boosters: pooled analysis of national prospective cohort studies of 30 million individuals in England, Northern Ireland, Scotland, and Wales. The Lancet. 2022;400:1305–20.

    Article  Google Scholar 

  11. Watson OJ, Barnsley G, Toor J, Hogan AB, Winskill P, Ghani AC. Global impact of the first year of COVID-19 vaccination: a mathematical modelling study. Lancet Infect Dis. 2022;22:1293–302.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Wang R, Wang J, Hu T, Zhou X-H. Population-level effectiveness of COVID-19 vaccination program in the United States: causal analysis based on structural nested mean model. Vaccines. 2022;10:726.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Shioda K. Population-level relative effectiveness of the COVID-19 vaccines and the contribution of naturally acquired immunity n.d.

  14. Bollyky TJ, Castro E, Aravkin AY, Bhangdia K, Dalos J, Hulland EN, et al. Assessing COVID-19 pandemic policies and behaviours and their economic and educational trade-offs across US states from Jan 1, 2020, to July 31, 2022: an observational analysis. The Lancet. 2023;0140673623004610.

  15. Hu S, Xiong C, Zhao Y, Yuan X, Wang X. Vaccination, human mobility, and COVID-19 health outcomes: empirical comparison before and during the outbreak of SARS-Cov-2 B.1.1.529 (omicron) variant. Vaccine. 2023:S0264410X23006175.

  16. Ogedegbe G, Ravenell J, Adhikari S, Butler M, Cook T, Francois F, et al. Assessment of racial/ethnic disparities in hospitalization and mortality in patients with COVID-19 in New York City. JAMA Netw Open. 2020;3:e2026881–1.

    Article  PubMed  PubMed Central  Google Scholar 

  17. Gallo Marin B, Aghagoli G, Lavine K, Yang L, Siff EJ, Chiang SS, et al. Predictors of COVID-19 severity: a literature review. Rev Med Virol. 2021;31:1–10.

    Article  CAS  PubMed  Google Scholar 

  18. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20:533–4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. HHS Protect Public Data Hub. n.d. (accessed 13 Jan 2023).

  20. COVID-19/. data_tables/vaccine_data/us_data at master · govex/COVID-19. GitHub n.d. (accessed 13 Jan 2023).

  21. CDC. COVID Data Tracker. Cent Dis Control Prev. 2020. (accessed 13 Jan 2023).

  22. COVID-19/data_tables. /testing_data at master · govex/COVID-19. GitHub n.d. (accessed 13 Jan 2023).

  23. Places Data Curated for Accurate Geospatial Analytics | SafeGraph. n.d. (accessed 13 Jan 2023).

  24. Du H, Dong E, Badr HS, Petrone ME, Grubaugh ND, Gardner LM. Incorporating variant frequencies data into short-term forecasting for COVID-19 cases and deaths in the USA: a deep learning approach. EBioMedicine. 2023;89:104482.

    Article  PubMed  PubMed Central  Google Scholar 

  25. Hale T, Webster S, Petherick A, Phillips T, Kira B. Oxford COVID-19 government response tracker (OxCGRT). Last Updat. 2020;8:30.

    Google Scholar 

  26. Bureau UC. accessed April 30,. State Population by Characteristics: 2010–2019. CensusGov n.d. (2023).

  27. CDC/ATSDR SVI Data and Documentation Download | Place and Health | ATSDR. 2022. (accessed 13 Feb 2023).

  28. Adults at Higher Risk of Serious Illness if Infected with Coronavirus. KFF 2020. (accessed 13 Feb 2023).

  29. Health Insurance Coverage of the Total Population. KFF 2022. (accessed 30 Apr 2023).

  30. Khare S, Gurry C, Freitas L, Schultz MB, Bach G, Diallo A, et al. GISAID’s role in pandemic response. China CDC Wkly. 2021;3:1049.

    Article  PubMed  PubMed Central  Google Scholar 

  31. Quesada JA, López-Pineda A, Gil-Guillén VF, Arriero-Marín JM, Gutiérrez F, Carratala-Munuera C. Incubation period of COVID-19: a systematic review and meta-analysis. Rev Clínica Esp Engl Ed. 2021;221:109–17.

    Article  CAS  Google Scholar 

  32. Faes C, Abrams S, Van Beckhoven D, Meyfroidt G, Vlieghe E, Hens N, et al. Time between symptom onset, hospitalisation and recovery or death: statistical analysis of Belgian COVID-19 patients. Int J Environ Res Public Health. 2020;17:7560.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Goldberg Y, Mandel M, Bar-On YM, Bodenheimer O, Freedman L, Haas EJ, et al. Waning immunity after the BNT162b2 vaccine in Israel. N Engl J Med. 2021;385:e85.

    Article  CAS  PubMed  Google Scholar 

  34. Tenforde MW, Self WH, Naioti EA, Ginde AA, Douin DJ, Olson SM, et al. Sustained effectiveness of Pfizer-BioNTech and Moderna vaccines against COVID-19 associated hospitalizations among adults—United States, March–July 2021. Morb Mortal Wkly Rep. 2021;70:1156.

    Article  CAS  Google Scholar 

  35. Chenchula S, Karunakaran P, Sharma S, Chavan M. Current evidence on efficacy of COVID-19 booster dose vaccination against the Omicron variant: a systematic review. J Med Virol. 2022;94:2969–76.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Carazo S, Skowronski DM, Brisson M, Barkati S, Sauvageau C, Brousseau N, et al. Protection against omicron (B. 1.1. 529) BA. 2 reinfection conferred by primary omicron BA. 1 or pre-omicron SARS-CoV-2 infection among health-care workers with and without mRNA vaccination: a test-negative case-control study. Lancet Infect Dis. 2023;23:45–55.

    Article  CAS  PubMed  Google Scholar 

  37. Bobrovitz N, Ware H, Ma X, Li Z, Hosseini R, Cao C, et al. Protective effectiveness of previous SARS-CoV-2 infection and hybrid immunity against the omicron variant and severe disease: a systematic review and meta-regression. Lancet Infect Dis. 2023;23:556–67.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Altarawneh HN, Chemaitelly H, Hasan MR, Ayoub HH, Qassim S, AlMukdad S, et al. Protection against the Omicron variant from previous SARS-CoV-2 infection. N Engl J Med. 2022;386:1288–90.

    Article  PubMed  Google Scholar 

  39. Nyberg T, Ferguson NM, Nash SG, Webster HH, Flaxman S, Andrews N, et al. Comparative analysis of the risks of hospitalisation and death associated with SARS-CoV-2 omicron (B.1.1.529) and delta (B.1.617.2) variants in England: a cohort study. The Lancet. 2022;399:1303–12.

    Article  CAS  Google Scholar 

  40. Blavin F, Karpman M, Kenney GM, Sommers BD. Medicaid versus marketplace coverage for near-poor adults: effects on out-of-pocket spending and coverage. Health Aff (Millwood). 2018;37:299–307.

    Article  PubMed  Google Scholar 

  41. Badr HS, Du H, Marshall M, Dong E, Squire MM, Gardner LM. Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study. Lancet Infect Dis. 2020;20:1247–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Kraemer MU, Yang C-H, Gutierrez B, Wu C-H, Klein B, Pigott DM, et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science. 2020;368:493–7.

    Article  CAS  PubMed  Google Scholar 

  43. Chang S, Pierson E, Koh PW, Gerardin J, Redbird B, Grusky D, et al. Mobility network models of COVID-19 explain inequities and inform reopening. Nature. 2021;589:82–7.

    Article  CAS  PubMed  Google Scholar 

  44. Wienkes H, Vilen K, Lorentz A, Gerlach D, Wang X, Saupe A, et al. Transmission of and infection with COVID-19 among vaccinated and unvaccinated attendees of an indoor wedding reception in Minnesota. JAMA Netw Open. 2022;5:e220536.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Havers FP, Pham H, Taylor CA, Whitaker M, Patel K, Anglin O, et al. COVID-19-associated hospitalizations among vaccinated and unvaccinated adults 18 years or older in 13 US States, January 2021 to April 2022. JAMA Intern Med. 2022;182:1071.

    Article  PubMed  PubMed Central  Google Scholar 

  46. Chun JY, Jeong H, Kim Y. Age-varying susceptibility to the Delta variant (B.1.617.2) of SARS-CoV-2. SSRN Electron J. 2021.

    Article  Google Scholar 

  47. Hale T, Angrist N, Hale AJ, Kira B, Majumdar S, Petherick A, et al. Government responses and COVID-19 deaths: global evidence across multiple pandemic waves. PLoS ONE. 2021;16:e0253116.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references


We want to thank Maximilian Marshall for providing valuable insights and suggestions that have contributed to enhancing the quality of this paper.


This work was funded the NSF RAPID Award ID 2108526, NSF Award ID 2229996, and CDC Contract #75D30120C09570.

Author information

Authors and Affiliations



LG, and HD contributed to the conceptualization and design of the study. HD and SS collected the data and conducted the analysis. HD led the writing of the original draft. HD, SS, and LG edited the manuscript, discussed results, and provided feedback regarding the manuscript. LG supervised the study and acquired funding. HD and SS have verified the underlying data. All authors had full access to the data and approved the manuscript for publication.

Corresponding author

Correspondence to Hongru Du.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1:

Supplementary Data. Supplementary Methods. Supplementary Results

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Du, H., Saiyed, S. & Gardner, L.M. Association between vaccination rates and COVID-19 health outcomes in the United States: a population-level statistical analysis. BMC Public Health 24, 220 (2024).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: