Exploring the relationship between mobility and COVID-19 infection rates for the second peak in the United States using phase-wise association

Human mobility plays an important role in the dynamics of infectious disease spread. Evidence from the initial nationwide lockdowns for COVID-19 indicates that restricting human mobility is an effective strategy to contain the spread. While a direct correlation was observed early on, it is not known how mobility impacted COVID-19 infection growth rates once lockdowns were lifted, primarily due to modulation by other factors such as face masks and social distancing, and to the non-linear patterns of both mobility and infection growth. This paper introduces a piece-wise approach to better explore the phase-wise association between state-level COVID-19 incidence data and anonymized mobile phone data for various states in the United States. Prior literature analyzed the linear correlation between mobility and the number of cases during the early stages of the pandemic. However, tracking and forecasting COVID-19 infections requires capturing the non-linear dynamics of case growth and mobility, which the piece-wise approach accomplishes. The associations between mobility and case growth rate varied widely across the phases of the epidemic curve after the stay-at-home orders were lifted. Mobility growth patterns had a strong positive association of 0.7 with the growth in the number of cases, with a lag of 5 to 7 weeks, during the fast-growth phase of the pandemic, but only for the 20 states that had a peak between July 1 and September 30, 2020. Overall, though, mobility cannot be used to predict the rise in the number of cases after initial lockdowns have been lifted. Our analysis explores the gradually diminishing value of mobility associations in the later stage of the outbreak and indicates that the relationship between mobility and the increase in the number of cases, once lockdowns have been lifted, is tenuous at best, with no strong relationship between these signals. However, we identify the remnants of the last associations in specific phases of the growth curve.


Introduction
COVID-19 has spread rapidly worldwide, with nearly 99 million confirmed cases and more than 2.1 million deaths reported globally as of January 25, 2021 [1]. Public health officials continue to promote social distancing, face masks, and handwashing as effective mechanisms to contain the COVID-19 outbreak [2], especially given delays in mass vaccination and the growing number of new COVID-19 strains [3]. Many prior studies have highlighted the importance of tracking human mobility as an essential measure to understand and predict the spread of COVID-19 [2][4][5][6][7]. Local governments continue to track human mobility in their communities through anonymized cell phone data made available through various data providers [8][9][10]. Several prior studies [11][12][13][14][15][16][17][18][19][20] report a strong association between mobility and stay-at-home orders. Further analysis by Gatalo et al. [21] found that the early association between mobility and COVID-19 incidence withered after the analysis was expanded to later epochs. This weakening association was likely due to non-pharmaceutical interventions such as face masks, handwashing, maintaining physical distance, avoiding large gatherings, and school closings.
The incidence rate of COVID-19 does not follow a linear pattern, but rather a pattern of rise and fall, i.e., a logistic or power-law pattern, depending on the community's response to contain the spread [22][23][24][25]. Similarly, as communities started to reopen, mobility did not follow a linear pattern either. It is important to capture the association between mobility and infection rate in the presence of this non-linearity. To our knowledge, no study has comprehensively evaluated the association of mobility with multiple phases of the pandemic growth pattern. This analysis is important because public health officials and policymakers could potentially use mobility as a predictor to detect impending local spikes in cases, which are increasingly beginning to overwhelm clinical capacity.
This study examines the association of mobility with the incidence growth rate of COVID-19 by segmenting the infection growth curve into multiple phases. We used the formulations of the different phases of Batista and Wu's logistic growth model [25,26] to extract the phases of the epidemic curve for various states in the United States. The United States presents a unique scenario since the peaks occurred across distinct epochs, within different geographic regions with independent administrative units, and were thus influenced by multiple sets of factors. We specifically study how mobility affected case growth in various states following the lifting of initial stay-at-home orders.

Data collection and analysis

Infection data
The confirmed case data was retrieved from The New York Times [27] and the JHU CSSE COVID-19 data [28]. This data provides county-level information on daily reported cases. We aggregated the data to weekly reports at the state level.
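The aggregation step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(iso_date, state, county, new_cases)` is hypothetical rather than either provider's actual schema, the counts are assumed to be daily new cases (cumulative counts would need differencing first), and weeks are taken to be ISO calendar weeks, which the text does not specify.

```python
from collections import defaultdict
from datetime import date


def weekly_state_cases(records):
    """Sum daily new cases per (state, ISO year, ISO week).

    `records` is an iterable of (iso_date, state, county, new_cases) tuples;
    this layout is a hypothetical simplification, not the providers' schema.
    """
    totals = defaultdict(int)
    for iso_date, state, _county, new_cases in records:
        iso = date.fromisoformat(iso_date).isocalendar()
        # iso[0] is the ISO year, iso[1] the ISO week number.
        totals[(state, iso[0], iso[1])] += new_cases
    return dict(totals)


# Made-up illustrative records, not real case counts.
sample = [
    ("2020-07-06", "Arizona", "Maricopa", 10),  # Monday, ISO week 28
    ("2020-07-06", "Arizona", "Pima", 5),
    ("2020-07-12", "Arizona", "Maricopa", 7),   # Sunday, still ISO week 28
    ("2020-07-13", "Arizona", "Maricopa", 4),   # Monday, ISO week 29
]
weekly = weekly_state_cases(sample)
print(weekly)
```

Summing over whole weeks folds the weekday reporting differences into a single value per week, at the cost of fewer sample points.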

Mobility data
State-level mobility datasets and metrics were provided by Descartes Labs. The Descartes index provides a normalized aggregated mobility measure obtained from anonymized mobile device locations. The mobility metric we use is the percentage change in mobility relative to pre-pandemic baseline behavior (02/17/2020 to 03/07/2020) [7].

Fig. 1 Various phases of the Logistic growth model for the State of Arizona

Infection growth rate
We use the infection growth rate from the Classical Logistic Growth Model, originally developed by Haberman [29]. The logistic growth model is defined by the differential equation (eq. 1):
$$\frac{dC(t)}{dt} = r\,C(t)\left(1 - \frac{C(t)}{K}\right) \tag{1}$$
where $C(t)$ is the cumulative number of confirmed cases at any given time $t$ for an individual state, and $\frac{dC(t)}{dt}$ is the rate of change in the number of cases. The intrinsic growth rate per unit time $r$ and the expected epidemic capacity $K$ are estimated from the cases recorded over time using non-linear least-squares curve fitting methods, namely the Levenberg-Marquardt, Trust Region Reflective, and Nelder-Mead methods that are available as part of the LMFIT Python package [30]. The peak of the curve (i.e., the inflection point), denoted by $t_p$, occurs when $\frac{d^2C}{dt^2} = 0$, at which point the number of cases is $C_p = K/2$.
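As a worked sketch of why the peak sits at half the epidemic capacity, assume the standard logistic form $dC/dt = rC(1 - C/K)$ and fix the integration constant by $C(t_p) = K/2$. The closed form below is the textbook solution, not a formula quoted from the fitting code used in this study.

```latex
\begin{aligned}
\frac{dC}{dt} &= r\,C\left(1 - \frac{C}{K}\right)
  \quad\Longrightarrow\quad
  C(t) = \frac{K}{1 + e^{-r\,(t - t_p)}}, \\
\frac{d^2C}{dt^2} &= r\,\frac{dC}{dt}\left(1 - \frac{2C}{K}\right) = 0
  \quad\Longrightarrow\quad
  C_p = \frac{K}{2}, \\
\left.\frac{dC}{dt}\right|_{t = t_p} &= r\,\frac{K}{2}\left(1 - \frac{1}{2}\right) = \frac{rK}{4}.
\end{aligned}
```

Under these assumptions the weekly growth rate is largest at $t_p$, where it equals $rK/4$, and the curve is symmetric about that point.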

Phase-wise correlation
Our central idea is to separate the epidemic curve (which we also refer to as the peak) into multiple phases (or intervals) for the model, rather than fit a single correlation between mobility and case growth for the entire epidemic curve. We adopt a piece-wise correlation to study how the correlation varies across these distinct phases of the epidemic curve. We draw our inspiration from the general idea of piece-wise correlation and conditional correlation (also referred to as time-varying correlation, or dynamic correlation), which has been applied in several domains such as image processing [31], econometrics [32,33], and bioinformatics [34] in situations where the relationship between variables is non-linear and, as a result, the degree of correlation, slope, and intercept vary across space or time.
The separation of the epidemic curve into five phased intervals was done using the empirical approach adopted by Batista and Wu [25,26] to separate the fitted logistic growth curve. The phased intervals are as follows: (a) Phase-I is the early growth (or ascending) phase, where $t < t_p - 2/r$; (b) Phase-II is the fast growth phase, which falls between the end of the lag phase (or slow growth phase, $t < t_p - 2/r$) and the peak $t_p$, i.e. $t_p - 2/r < t < t_p$; (c) Phase-III is the transition from fast growth to steady state, $t_p < t < t_p + 2/r$; (d) Phase-IV is the steady state, $t_p + 2/r < t < 2t_p$; and finally (e) Phase-V is the ending phase, $t > 2t_p$. These phases are illustrated for the logistic growth curve for the State of Arizona in Fig. 1.
The peak time $t_p$ is at the 27th week, the slow growth phase lasts until $t = 22$ weeks, and the fast growth phase, shown in grey shading, spans the 23rd to the 27th week. While five phases are defined theoretically, in a practical setting Phase-V is only apparent in the post-hoc analysis of the epidemic. Given that many states have multiple peaks or constant changes in incidence growth rate, the tail end of Phase-V runs into Phase-I of the next peak and is considered the start of the second peak of the epidemic. Given that our focus is to understand whether increased mobility affects growth in the number of cases, we limited our analysis to Phase-I through Phase-IV. Our sample includes all 20 states that completed a peak after the relaxation of initial stay-at-home orders (i.e. between July and September 2020). Before July, i.e. between March and June 2020, 14 states had a peak. Several other states continued to have peaks after September, but we limited our analysis to states that had a complete peak before September.

Fig. 2 States classified according to whether they had an early peak (before July), mid peak (July to August), or late peak (September onwards)
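The five phase intervals defined above can be sketched as a small classifier. This is an illustration, not the authors' implementation: it assumes the boundary offset from the peak is $2/r$, and because the definitions use strict inequalities it arbitrarily assigns a boundary point to the later phase. The function name and the value $r = 0.4$ per week are hypothetical; the latter is chosen only so that $t_p - 2/r = 22$ matches the Arizona example.

```python
def epidemic_phase(t, t_p, r):
    """Return the phase (1-5) of time t for a logistic curve with peak t_p and rate r."""
    half_width = 2.0 / r  # assumed offset 2/r between phase boundaries and the peak
    if t < t_p - half_width:
        return 1  # Phase-I: early (slow) growth
    if t < t_p:
        return 2  # Phase-II: fast growth
    if t < t_p + half_width:
        return 3  # Phase-III: fast growth to steady state
    if t < 2 * t_p:
        return 4  # Phase-IV: steady state
    return 5      # Phase-V: ending phase


# Illustrative values only: t_p = 27 weeks as in the Arizona example;
# r = 0.4 per week is inferred for this sketch and is not a reported fit.
phases = {week: epidemic_phase(week, t_p=27, r=0.4) for week in (20, 25, 29, 40, 60)}
print(phases)
```

Labelling each week this way gives the groups of sample points over which a per-phase correlation can then be computed.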
The cross-correlation between mobility and growth rate at various time lags for each phase $p$ is obtained using Pearson's correlation coefficient (eq. 2), which measures the linear relationship between the two variables, incidence growth rate $I$ and mobility index $M$, for the various phases (or intervals) of the logistic growth model:
$$\rho_p = \frac{\sum_{i=1}^{n_p} (I_{p,i} - \bar{I}_p)(M_{p,i} - \bar{M}_p)}{\sqrt{\sum_{i=1}^{n_p} (I_{p,i} - \bar{I}_p)^2}\,\sqrt{\sum_{i=1}^{n_p} (M_{p,i} - \bar{M}_p)^2}} \tag{2}$$
where $\bar{I}_p$ is the mean incidence growth rate during phase $p$, $\bar{M}_p$ is the mean lagged change in mobility during phase $p$, $I_{p,i}$ and $M_{p,i}$ are the incidence growth rate and mobility rate at each sample point $i$, and $n_p$ is the total number of weeks during phase $p$.
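A minimal sketch of the lagged, phase-restricted Pearson correlation described above is given below. The function names and the synthetic series are assumptions made for illustration; the alignment convention (growth in week w paired with mobility in week w minus the lag) is one plausible reading of "lagged change in mobility" and is not confirmed by the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_phase_correlation(growth, mobility, phase_weeks, lag):
    """Correlate growth in the given weeks with mobility `lag` weeks earlier.

    `growth` and `mobility` are weekly lists indexed by week number;
    `phase_weeks` lists the week indices belonging to one phase.
    """
    weeks = [w for w in phase_weeks if w - lag >= 0]
    return pearson([growth[w] for w in weeks], [mobility[w - lag] for w in weeks])


# Synthetic series for illustration only: growth copies mobility 2 weeks later.
mobility = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
growth = [0.0, 0.0] + mobility[:-2]
rho = lagged_phase_correlation(growth, mobility, phase_weeks=range(3, 8), lag=2)
print(round(rho, 6))
```

The sketch does not guard against a phase with fewer than two weeks or a constant series, where the denominator is zero and the coefficient is undefined. Short phases also make the estimate noisy, which is worth keeping in mind when comparing coefficients across states.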

Mobility trends of various states
The states were separated into three categories (as shown in Fig. 2): states that had an early surge between March and June (early peak), states that peaked in July and August (mid peak), and states that peaked after August (late peak). It is worth noting that (based on data until November 30) the groups of states that reached their peak in these epochs have been largely contiguous, i.e., they clustered in space. The northeast peaked early; states like New York and New Jersey peaked in March and April. The rest of the coast peaked in summer, with states like Louisiana and Florida leading the outbreak front. The midlands peaked last, in autumn and winter. Figure 3 shows the mobility, the reported cases, and the case growth rate for all 50 states in alphabetical order.
We notice that there are multiple surges (which we refer to as peaks). As discussed previously, when there is a surge in a state, public health officials respond through non-pharmaceutical interventions to flatten the curve. Three significant points were used to model the peak in each state. The inflection point (the tip of the peak) is the time when the infection rate reaches its highest value. The point of transition to the fast-growth phase (i.e. Phase-II) is where the infection growth rate begins to transition from the slow growth phase (Phase-I) to the fast-growth phase. Finally, the transition to Phase-III is where the infection growth rate transitions from the fastest deceleration phase to the slow deceleration phase. These phases are shown in Fig. 3. The states that had completed a peak (i.e. had at least all 3 phases) between July and September are highlighted with a red dashed border. The background of each state plot is colored by the corresponding epidemic phase of the state. The three trend lines shown for each state are (a) the incidence of cases, (b) an automated piece-wise logistic growth model fitted to the incidence of cases, and (c) the mobility in the state as measured by the Descartes index. All 3 trend lines were min-max scaled so that they share common axes. The X-axis represents time, marked by months for major ticks and weeks for minor ticks. The Y-axis represents the scaled values of each series.
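The min-max scaling applied to the three trend lines can be sketched as below. This is an illustration under assumptions: the function name is hypothetical, the input numbers are made up, and mapping a constant series to 0.0 is an arbitrary choice that the text does not address.

```python
def min_max_scale(values):
    """Rescale a sequence linearly so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: arbitrary choice of 0.0
    return [(v - lo) / (hi - lo) for v in values]


# Made-up numbers standing in for one state's weekly series.
scaled = min_max_scale([20, 50, 80, 35])
print(scaled)
```

Because each series is scaled by its own range, the plotted curves are comparable in shape and timing but not in magnitude.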
Of the 50 states in the United States, 14 reported their peak number of cases early in the US epidemic, during April and May; 20 had their peak during July and August and are within the scope of our study; and the other 16 were still reporting an increasing number of cases in September. It should be noted that while the 20 states had their peak during July and August, their Phase-II may have started as early as May, and their Phase-III may have ended as late as October. For example, the fast-growth phase for the state of New Mexico spanned from May to July. Similarly, the State of Iowa's Phase-III extended until November.

Relationship between mobility trends and infection growth-rate
Decisions on lifting stay-at-home orders varied across states, with no national mandate. States also differed in terms of prior infections, non-pharmaceutical interventions such as face masks, reopening guidelines, and population density. This combination of factors resulted in complex variations in mobility and infection growth rate across different states. Performing a single linear correlation between mobility and case growth rate over the entire period did not yield any consistent correlations. This was also reported by Gatalo et al. [21]. We adopt a piece-wise association analysis approach to capture the non-linear nature of infection growth and to understand whether the correlations are associated with a particular phase. We also investigate the time lags between mobility and infection rates for various phases. The correlation was not consistently present across phases, and when a correlation exists, it is not consistent within each phase. Table 1 shows the correlation between mobility and infection growth with a lag of 5 to 7 weeks. The correlations observed for Phase-I, Phase-II, and Phase-III are not necessarily consistent. The average correlation is observed to be 0.15 ± 0.47 for a lag of 5 weeks between the change in mobility and the incidence rate of cases. The corresponding values for Phases I, II, III, and IV are −0.49 ± 0.5, −0.17 ± 0.57, −0.32 ± 0.43, and −0.33 ± 0.86, respectively. The associations are much stronger for Phase-II, for states that peaked during July and August of the US epidemic, than for the rest of the phases. Another interesting observation is that for Phase-I, 13 out of 20 states have a strong negative correlation at a 6-week lag. This variation is due to a wide range of factors related to stay-at-home orders, prior infection seeding in communities, and increased testing rates. The dominant reason for the negative correlation, however, is likely that the mobility values at a 5 to 7-week lag fall within March and April, when mobility in several states was declining due to stay-at-home orders.
These results demonstrate that piece-wise correlations capture the relationship between incidence growth rate and the change in mobility more accurately than a single linear correlation computed across multiple phases. Table 2 shows the correlation between the case incidence growth rate and the change in mobility for Phase I of the pandemic during the study period. States that had a longer Phase-I and Phase-II, like Idaho and Iowa, have a higher correlation than states like Arizona and Nevada, whose epidemic growth phases were considerably shorter. Tables 3 and 4 show the correlations for all 50 states for various lag periods for Phases II and III.

Discussion
Monitoring mobility trends could potentially inform mitigation measures towards slowing the spread of COVID-19. It can help predict the fast growth phase with exponential growth. The disparities in mobility and case incidence rate across the country, during fall and winter, indicate high variability in mitigation measures and pandemic behavior in various states across the United States. Given the non-linear nature of both mobility and case growth trends, we adopted a piece-wise approach to analyze the association between mobility and case growth rate.
Both mobility and the dynamics of epidemic spread vary quite widely in many aspects. First, the growth and lag dynamics differ across time scales, i.e. days, weeks, or months. Our choice of weekly case numbers was motivated by the need to account for testing delays and the need for an adequate sample size. We need to further investigate how the growth patterns vary across different temporal granularities. Second, we can observe that the growth curve patterns also differ across states. This difference is due to a combination of factors, including population density, differences in the restrictions introduced by state and local authorities, and differences in how people adhere to social distancing restrictions under prolonged stay-at-home orders. Finally, mobility dynamics vary widely across different states. We observe that, in most cases, the data until November shows that states with low mobility had lower per-capita cases, whereas most states had a case increase in November irrespective of stay-at-home orders or mobility. These combinations of factors affect the relationship between mobility and infection growth rate.
This study uses an interval (or piece-wise) approach to address the non-linear trend arising from non-pharmaceutical interventions and pandemic behavior to flatten the curve. We used a Logistic Growth Model to separate the infection growth curve into multiple phases and applied correlation to the individual phases. Most communities are experiencing multiple COVID-19 infection waves, as infection rates are modulated by lockdowns and other forms of non-pharmaceutical interventions followed by periods of relaxation.
We found that mobility has a strong correlation with case growth in the fast-growth phase, with a lag of 5 to 7 weeks, but only in states with an early second peak. We found that the second peak characteristics differed from those of the first peak and were consistently more protracted in their response to the mobility signal. When we examined the correlation between mobility and the number of cases for March to September, we observed that the correlations were not consistent (Fig. 4). In this representative figure, we show the relationship between the mobility index and the incidence growth values for the state of Louisiana with a lag of 6 weeks, for the pandemic as a whole and for each of the phases described in Fig. 1.
The lack of consistency between mobility and the number of cases for March to September could partly be due to variations in how states relaxed social distancing guidelines over time. During the initial stay-at-home-order period, all the states underwent a sudden drop in mobility. This decrease in mobility correlated well with a drop in the number of cases for most states, after a lag period. However, the uptick in mobility in various states did not follow a consistent pattern after reopening. For instance, mobility in California stayed consistently below 60% until September. In contrast, states like Florida reopened quickly, with mobility returning to as much as 80%. We did not observe a consistent positive correlation across multiple states during the relaxing of stay-at-home orders.
Prior literature explored a linear association between mobility and the number of cases. In this work, we argue that the distribution is not a simple linear trend. Instead, we adopt a logistic growth model that is more faithful to the characteristics of the data and show the differential contribution of the phases, which vary significantly amongst themselves but also show consistent patterns that can be exploited in predicting subsequent spikes. In other words, a spike in mobility 5 to 7 weeks ahead may be an indicator of a state experiencing a peak. This advance warning could potentially help states prepare for when hospitals may be overwhelmed.
As we see in this instance, and in most states with a contemporaneous second peak, the correlation is much stronger in Phase-II than in the other phases. The association between mobility and the number of cases is weaker in the other phases. While mobility decreases slightly after the number of cases increases, it is likely that changes in public behavior, such as increased social distancing, mask use, testing, and other precautionary measures, led to a rapid decrease in the number of cases. More research is needed to confirm the influence of these factors, and to understand why, in some instances, the number of cases fell while mobility stayed the same during the remaining phases of the pandemic.
Mobility is a useful indicator, and publicly available mobility datasets from SafeGraph and Descartes Labs can be leveraged in the early part of the pandemic to monitor population behavior in response to public health directives. However, their value wears off rapidly. We attempted a finer-grained analysis to examine whether the correlation is maintained in specific phases. While some interesting insights were gained, the analysis did not yield a practical tool that can generally predict the number of cases beyond the first peak, because a noisier set of factors complicates the analytic space. The approach may, however, be useful in countries where a stricter and more consistent set of mitigation directives has been applied, thus presenting a simpler analytic space.

Limitations
This study presents a novel way to examine the association between mobility and infection rates for various states in the United States. There are several areas where this study can potentially be improved. First, this study focuses on the association between the change in mobility and the increase in the number of cases. It does not take into account the many other factors, such as the use of masks, social distancing, the effect of regulations, and varying public compliance, that could have contributed to the number of cases. Without detailed information on each of these variables, it is not possible to model the causal effect of these factors on the incidence growth rate. Second, the associations are computed at a weekly granularity to overcome non-uniform case reporting, where a higher number of cases are reported on Mondays and Tuesdays while fewer cases are reported over the weekend. This leads to a sample size of 37 weeks during the study period; as more data is collected, future studies can look at longer periods and larger sample sizes to validate these results. Third, the case data might be prone to errors due to both reporting issues and outliers in testing when states update their case numbers post-hoc. We partly handle these issues by using the incidence growth rate fitted to the Logistic Growth Model, which smooths the data, rather than the number of cases directly. Finally, the mobility data considers the distance traveled by individuals but does not capture the number of individuals making the trip. Incorporating the number of trips or individuals might help clarify the relationship between the number of cases and the mobility of individuals.

Conclusion
We analyzed the trends in reported COVID-19 cases and mobility for various states in the United States. We noticed that the prior literature explored the correlation using simple linear modeling despite a power-law pattern in the case distribution; we instead modeled the cases using a Logistic Growth Model, which is more faithful to the distribution of the case data. This allowed us to perform multiple piece-wise (phase-wise) linear correlations. We found, however, that the associations were quite asymmetrical across the phases. Despite attempting to fit both linear and piece-wise correlations to the second wave, we were unable to find consistent patterns that would allow us to predict the rise in the number of cases. Although we note several insights into the distribution of the case data and its associations with mobility, we conclude that it is not productive to associate mobility with cases beyond the first peak. This is consistent with the findings of Gatalo et al. [21].