  • Research article
  • Open access

Using GAM functions and Markov-Switching models in an evaluation framework to assess countries’ performance in controlling the COVID-19 pandemic

Abstract

Background

The COVID-19 pandemic has prompted several initiatives to better understand its behavior, and some projects are monitoring its evolution across countries, which naturally leads to comparisons made by those using the data. However, most “at a glance” comparisons may be misleading because the curve that describes the evolution of COVID-19 differs across countries, as a result of their underlying geopolitical or socio-economic characteristics. Therefore, this paper contributes to the scientific endeavour by creating a new evaluation framework to help stakeholders adequately monitor and assess the evolution of COVID-19 in countries, considering the occurrence of spikes, "secondary waves" and structural breaks in the time series.

Methods

Generalized Additive Models were used to model cumulative and daily curves for confirmed cases and deaths. The Root Relative Squared Error and the Percentage Deviance Explained measured how well the models fit the data. A local min-max function was used to identify all local maxima in the fitted values. The pure Markov-Switching and the family of Markov-Switching GARCH models were used to identify structural breaks in the COVID-19 time series. Finally, a quadrants system was developed to identify countries that are more/less efficient in the short/long term in controlling the spread of the virus and the number of deaths. These methods were applied to the time series of 189 countries, collected from the Centre for Systems Science and Engineering at Johns Hopkins University.

Results

Our methodology proves more effective in explaining the evolution of COVID-19 than growth functions worldwide, in addition to standardizing the entire estimation process in a single type of function. Besides, it highlights several inflection points and regime-switching moments, as a consequence of people’s diminished commitment to fighting the pandemic. Although Europe is the most developed continent in the world, it is home to most countries with an upward trend and considered inefficient, for confirmed cases and deaths.

Conclusions

The new outcomes presented in this research will allow key stakeholders to check whether or not public policies and interventions in the fight against COVID-19 are having an effect, to easily identify examples of best practice, and to promote such policies more widely around the world.


Background

Pandemics and major epidemic outbreaks are not unlikely events, contrary to what common sense may imply. They are real threats. History tells us the effects on mankind of the Black Death in the 14th century and the Spanish flu in 1918. Over the past three decades, the number of reported outbreaks of highly pathogenic or highly transmissible infectious diseases has increased enormously [1, 2].

The number of deaths directly attributable to these outbreaks is not always large. However, a pandemic might have a catastrophic impact if it is not taken seriously, due to the non-linearity of its transmission in a world that is highly interconnected through long-range transportation [3, 4]. This is an ideal setting for the widespread transmission of COVID-19. As of September 15, 2021, more than 226 million people worldwide have been infected, and more than 4.1 million people have died since the first case was detected in December 2019 in China, according to data gathered by the Centre for Systems Science and Engineering [CSSE] at Johns Hopkins University [5, 6].

Several initiatives are conducting careful research worldwide to better understand the behavior of COVID-19, such as those modelling the reproductive ratio [7, 8], the mortality rate [9, 10], the influence of climatic variables [11, 12], and the short-run impact on the global economy [13, 14], whilst adopting the precautionary principle of averting the risk of ruin [4, 15].

In parallel, some projects are being undertaken to monitor the evolution of COVID-19 across countries [5, 16, 17], which naturally leads to comparisons made by those who use the data. Clearly, these “at a glance” comparisons may be misleading because the curve that should explain the evolution of COVID-19 is different across countries as a result of the underlying geopolitical or socio-economic characteristics.

To deal with this situation, [18] proposed a framework to monitor and evaluate the performance of public policies in confronting COVID-19, classifying them as more/less efficient in the short/long term through a quadrants system based on a set of non-linear growth functions (exponential, logistic, Gompertz, Weibull, Richards). These functions are not highly accurate in the long run owing to the spikes and “second waves” now evident in the COVID-19 data worldwide [19]. They predict only one inflection point, i.e., the global maximum, and do not detect local maxima or minima over time. The exception is the exponential function, which goes to infinity and may be anticipating a regime change in the COVID-19 series; growth functions find it difficult to model such behavior.

This paper contributes to the scientific endeavor by creating a new evaluation framework to help stakeholders (policymakers, public sector health workers, resilience managers, and the general public) to:

  1. Adequately monitor the evolution of reported confirmed cases and deaths in countries, considering the occurrence of spikes and “secondary waves”;

  2. Identify structural breaks in the confirmed cases and deaths curves;

  3. Assess the performance of their actions in the face of the spread of COVID-19.

We incorporate new evidence that pandemic fatigue is taking hold. This decrease in commitment to fighting the pandemic alters the behavior of the forecast errors present in the COVID-19 curves, causing a structural break in the variance of the residuals, or forecast errors. We are therefore able to use Markov-Switching models on the residuals of our forecasts to identify regime-switching in the COVID-19 time series. Our new methodology proves more effective in explaining the evolution of COVID-19 than growth functions worldwide, including highlighting several inflection points and regime-switching moments. Moreover, results from this research can be used by managers, for example, to provide an econometric justification for the prioritizing of vaccination programmes in the health care sector.

Methods

The generalized additive models

Generalized Additive Models (GAM) are generally regarded as an extension of generalized linear models [20, 21]. These models use linear predictors that are themselves sums of smooth functions of the predictor variables (e.g., polynomials, bins, or running means, among others), with splines as the basic building blocks used to model relationships. This is the main difference from linear models, in which each predictor enters the model directly, multiplied by a scalar coefficient [22]. The linear model is a special and limiting case of a GAM [23].

The GAM framework allows for the dependence of the response on the predictor variables to be specified flexibly. The model is, therefore, specified in terms of smooth, or basis, functions. However, to obtain this convenience and flexibility, it is necessary to:

  • Determine an appropriate representation for the smooth functions;

  • Choose how smooth they should be.

A GAM, in its simplest form, can be represented as in Eq. 1, where \(y_{t}\) is the response variable, \(x_{t}\) is a predictor variable, \(b_{j}(x)\) is a basis function as described above, \(\beta_{j}\) are the unknown coefficients, and \(\epsilon_{t}\) is a zero-mean, i.i.d. random variable. The basis dimension k (the number of knots in the basis) is part of the model specification and controls the degree of smoothness of the model.

$$ y_{t} = \sum_{j=1}^{k}b_{j}(x)\beta_{j} + \epsilon_{t} $$
(1)
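As a minimal illustration of the deterministic part of Eq. 1, the sketch below evaluates \(\sum_{j} b_{j}(x)\beta_{j}\) for a piecewise-linear ("tent") basis. The tent basis, the knot positions, and the coefficient values are assumptions chosen for readability; they are not the spline bases or the estimates used in this study.

```python
def tent_basis(knots, j, x):
    """Piecewise-linear basis b_j: equals 1 at knots[j] and 0 at the neighbouring knots."""
    center = knots[j]
    if x == center:
        return 1.0
    # Rising part, between the previous knot and the centre.
    if j > 0 and knots[j - 1] < x < center:
        return (x - knots[j - 1]) / (center - knots[j - 1])
    # Falling part, between the centre and the next knot.
    if j < len(knots) - 1 and center < x < knots[j + 1]:
        return (knots[j + 1] - x) / (knots[j + 1] - center)
    return 0.0


def smooth_value(knots, beta, x):
    """Evaluate sum_j b_j(x) * beta_j, the systematic part of Eq. 1."""
    return sum(tent_basis(knots, j, x) * beta[j] for j in range(len(knots)))


knots = [0.0, 10.0, 20.0, 30.0]  # hypothetical knots, so k = 4
beta = [0.0, 5.0, 3.0, 8.0]      # hypothetical coefficients
print(smooth_value(knots, beta, 15.0))
```

With this particular basis the fitted curve is simply the linear interpolation of the coefficients between knots, which makes the role of k visible: fewer knots leave less room for the curve to bend.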

Thus, Eqs. (2) and (3) show the GAM functions for confirmed cases and deaths from COVID-19, respectively, with \(\zeta _{t} \thicksim N(0,\sigma ^{2})\) random variables.

$$\begin{array}{@{}rcl@{}} {confirmed}_{t} &=& \sum_{j=1}^{100}b_{j}({day}_{t})\beta_{j} + \zeta_{t} \end{array} $$
(2)
$$\begin{array}{@{}rcl@{}} {deaths}_{t} &=& \sum_{j=1}^{100}b_{j}({day}_{t})\beta_{j} + \zeta_{t} \end{array} $$
(3)

These equations do not directly yield the inflection points, that is, the moments at which the growth rate starts to decrease. The same limitation is common when growth functions [24] are used to investigate outbreaks and epidemics [2, 18, 25]. To counter this limitation, Eqs. (4) and (5) show the GAM functions for daily cases and deaths from COVID-19.

$$\begin{array}{@{}rcl@{}} {dailycases}_{t} &=& \sum_{j=1}^{50}b_{j}({day}_{t})\beta_{j} + \varsigma_{t} \end{array} $$
(4)
$$\begin{array}{@{}rcl@{}} {dailydeaths}_{t} &=& \sum_{j=1}^{50}b_{j}({day}_{t})\beta_{j} + \varsigma_{t} \end{array} $$
(5)

Using time series of daily cases or deaths, unlike cumulative series, it is possible to estimate their smoothed curves. These curves approximate the growth rate functions of the cumulative series, which allows us to identify their inflection points. In this case, we set k = 50 to increase the degree of smoothness, using \(\varsigma _{t} \thicksim Pois(\lambda)\) i.i.d. random variables. Recall that the lower the value of k, the smoother the fitted curve, which helps to find the most relevant inflection points, namely the local maxima. Otherwise, the fitted curve would be very wiggly, turning nearly every point into a local minimum or maximum.

The coefficients of Eqs. (1-5) are estimated by restricted maximum likelihood (REML), a bias-reducing alternative to maximum likelihood (ML), given that the latter tends to underestimate the variance components. Moreover, compared with the generalized cross-validation (GCV) estimator, REML tends to be more resistant to occasional severe over-fitting: its optimum tends to be more pronounced relative to sampling variability, and it has less tendency to develop phantom minima when there is no real signal in the data, with an \(O(n^{-4/5})\) computational cost [21, 26].

To measure how well the models fit the data, the Root Relative Squared Error (RRSE) was used for Eqs. (2-3), while the percentage deviance explained was used for Eqs. (4-5). The percentage deviance explained, which is a generalization of \(R^{2}\), compares the deviance of the fitted model (based on the sum of squares of its deviance residuals) with the null deviance (the same quantity when the covariate effects are set to zero): it is the proportion of the null deviance that the fitted model accounts for [21]. The lower the RRSE and the higher the deviance explained, the better the fit.
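The two goodness-of-fit measures can be sketched as follows. The RRSE is written in its usual form (squared errors of the model relative to those of a mean-only predictor), and the deviance explained assumes a Poisson response whose null model is the sample mean; both are assumptions about the exact formulas, which the text does not spell out, and the function names are hypothetical.

```python
import math


def rrse(actual, fitted):
    """Root Relative Squared Error: 0 is a perfect fit, 1 matches a mean-only predictor."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return math.sqrt(sse / sst)


def poisson_deviance(actual, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) taken as 0."""
    total = 0.0
    for y, m in zip(actual, mu):
        log_term = y * math.log(y / m) if y > 0 else 0.0
        total += log_term - (y - m)
    return 2.0 * total


def deviance_explained(actual, fitted):
    """Proportion of the null deviance removed by the fitted model."""
    null_mu = [sum(actual) / len(actual)] * len(actual)
    return 1.0 - poisson_deviance(actual, fitted) / poisson_deviance(actual, null_mu)


daily = [2.0, 4.0, 6.0, 8.0]        # hypothetical daily counts
smoothed = [2.5, 3.5, 6.5, 7.5]     # hypothetical fitted values
print(rrse(daily, smoothed), deviance_explained(daily, smoothed))
```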

The local.min.max function from the spatialEco R package [27] was used to identify all local maxima or peaks in the fitted values, also known as inflection points. This method is simpler to explain and more straightforward than simulating multivariate normal random deviates, as proposed by [28].
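The idea of locating peaks in the fitted values can be sketched with a simple neighbour comparison. This is only an illustration of the concept, not the implementation in the spatialEco package; it ignores plateaus and the two end points, and the input values are hypothetical.

```python
def local_maxima(values):
    """Return the indices whose value is strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


fitted = [1, 3, 7, 5, 4, 6, 9, 8, 2]  # hypothetical smoothed daily values
print(local_maxima(fitted))  # [2, 6]
```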

The Markov-Switching structural break functions

When using Eqs. (2-3), we assumed that the residuals follow a normal distribution, with a constant variance: \(\zeta _{t} \thicksim N(0,\sigma ^{2})\).

It appears, however, that a new, more contagious wave of COVID-19 is spreading faster than the first outbreak in spring 2020, according to top scientists [19, 29]. Member States across the WHO European Region are reporting emerging pandemic fatigue in their populations. Pandemic fatigue is an expected and natural reaction to the prolonged nature of this crisis and the associated inconvenience and hardship. It poses a serious threat to efforts to control the spread of the virus; see [30] for a policy framework for reinvigorating the public to prevent the pandemic.

In the same vein, [19] provides evidence that people relaxed their commitment to non-pharmacological measures to combat COVID-19 (mask-wearing, hand-washing, and social distancing) after the first wave of contagion. Furthermore, [31] showed that social distancing can result in an estimated 65% reduction in new COVID-19 cases, while [32], using the situation in Manitoba, Canada as an example, verified that relaxing social distancing to levels of contact that are 50% of what they were before COVID-19 may result in over 35% of the population infected at the same time. Both studies corroborate our contention, reinforced by [30], that pandemic fatigue is taking hold.

This decrease in commitment to fighting the pandemic alters the behavior of the forecast errors present in the COVID-19 curves, causing a structural break in the variance of the residuals, or forecast errors. We are particularly looking for this phenomenon in the residuals because the lower the RRSE, the greater the possibility of the residuals of the series being stationary, thus allowing, for instance, the use of the Markov-Switching (MS) models.

The main features of the MS models are:

  1. The regime that occurs at time t is determined by an unobservable random process, \(S^{i}_{t}\);

  2. Each regime is assumed to be a first-order Markov process, that is, the current regime only depends on the previous one [33].

The Lamoureux and Lastrapes test [34] was used to verify the occurrence of structural breaks in the variance of the residuals of Eqs. (2-3). The test consisted of estimating the α and β coefficients of a GARCH(1,1) model applied to the standardized residuals. If the sum of these coefficients is very close to 1, the existence of structural breaks in the residuals is confirmed, which implies the existence of at least two regimes. This finding suggests that a change in people’s behavior might be contributing to a greater increase in infected people, as pointed out by [19].
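The persistence check behind this test can be sketched as below, taking the GARCH(1,1) coefficients as already estimated. The recursion is the standard GARCH(1,1) variance equation; the starting value, the tolerance that defines "very close to 1", and the function names are assumptions made for illustration.

```python
def garch_variance_path(residuals, omega, alpha, beta):
    """Conditional variances: sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Start the recursion at the sample variance (an assumption; other choices are common).
    sigma2 = [sum((e - mean) ** 2 for e in residuals) / n]
    for t in range(1, n):
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


def is_highly_persistent(alpha, beta, tolerance=0.05):
    """Flag a possible structural break when alpha + beta is within `tolerance` of 1."""
    return alpha + beta >= 1.0 - tolerance


path = garch_variance_path([1.0, -1.0, 2.0], omega=0.1, alpha=0.2, beta=0.7)
print(path, is_highly_persistent(0.2, 0.79))
```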

The next step was to select the best approach to represent the structural breaks in the residuals of Eqs. (2-3). To keep the modeling simple, we assume the existence of two regimes: low (high) variance, which represents a stronger (weaker) commitment to non-pharmacological measures to combat COVID-19. Our first approach is to consider a pure Markov-Switching (MSwM) model of variance, as proposed by [35]. Here the persistence in the variance (previous values keep affecting subsequent values) occurs due to the regime-switching of the variance process. This model can be described using the following set of equations and definitions:

$$ \begin{aligned} \zeta_{t} \thicksim N(0,\sigma^{2}) \end{aligned} $$
(6)
$$ \begin{aligned} {}\sigma^{2}_{t} = \sigma^{2}_{1}S_{1t} + \sigma^{2}_{2}S_{2t} \end{aligned} $$
(7)
$$ \begin{aligned} {}\sigma^{2}_{1} < \sigma^{2}_{2} \end{aligned} $$
(8)
$$ \begin{aligned} {} S_{kt} = 1 \text{ if } S_{t} = k, \text{ otherwise } S_{kt} = 0, \quad k = 1,2 \end{aligned} $$
(9)
$$ \begin{aligned} {}p(S_{t} = 1 | S_{t-1} = 1) = p_{11};p(S_{t} = 2 | S_{t-1} = 1) = 1 - p_{11} \end{aligned} $$
(10)
$$ \begin{aligned} {}p(S_{t} = 2 | S_{t-1} = 2) = p_{22};p(S_{t} = 1 | S_{t-1} = 2) = 1 - p_{22} \end{aligned} $$
(11)
$$ {}\begin{aligned} L(\epsilon,\theta) = \sum_{t=1}^{T}\sum_{i=1}^{2} \frac{p_{ii}}{\sqrt{2\pi\sigma^{2}_{i}}}\exp\left\{\frac{-\left(\epsilon_{t} - \mu_{i} \right)^{2}}{2\sigma^{2}_{i}} \right\} \end{aligned} $$
(12)

where the vector of parameters \(\theta \equiv \left \{\mu _{1},\mu _{2},\sigma ^{2}_{1}, \sigma ^{2}_{2}, p_{11},p_{22} \right \}\) can be estimated by a log-likelihood function in Eq. 12 using numerical methods [36]. With p11 and p22, it is possible to construct the transition matrix, essential to calculate the 1-step ahead regime. We used the MSwM R package [37] to estimate the vector θ. When using this approach, it is important to highlight that we assume that the variance remains constant within each regime.
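To make the two-regime mechanics concrete, the sketch below runs a standard filtering recursion (often called the Hamilton filter) for given parameter values: at each step it predicts the regime probabilities with \(p_{11}\) and \(p_{22}\), weights them by the normal density of the residual in each regime, and renormalises. It is not the estimation routine of the MSwM package, the parameter values are hypothetical, and starting from the stationary regime probabilities is an assumption.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of N(mean, variance) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def regime_filter(residuals, var1, var2, p11, p22, mu1=0.0, mu2=0.0):
    """Return the filtered P(S_t = 1 | data up to t) and the accumulated log-likelihood."""
    # Stationary probability of regime 1, used as the starting point (an assumption).
    prob1 = (1.0 - p22) / (2.0 - p11 - p22)
    filtered, loglik = [], 0.0
    for eps in residuals:
        # One-step-ahead probability of regime 1 from the transition matrix.
        pred1 = p11 * prob1 + (1.0 - p22) * (1.0 - prob1)
        joint1 = pred1 * normal_pdf(eps, mu1, var1)
        joint2 = (1.0 - pred1) * normal_pdf(eps, mu2, var2)
        density = joint1 + joint2
        prob1 = joint1 / density
        filtered.append(prob1)
        loglik += math.log(density)
    return filtered, loglik


residuals = [0.1, -0.2, 0.1, 3.0, -2.5, 2.8]  # hypothetical standardized residuals
probs, loglik = regime_filter(residuals, var1=0.1, var2=3.0, p11=0.95, p22=0.95)
print([round(p, 3) for p in probs])
```

In this toy series the probability of the low-variance regime is high for the first three small residuals and collapses once the large residuals arrive, which is the pattern the probability lines in a regime plot display.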

The second approach considers the family of Markov-Switching GARCH models [38], where the variance can be time-varying in each regime (k=1,2). This means that the persistence in the variance occurs both for the shocks and the regime-switching in the parameters of the variance process.

We used the set of equations and conditional distributions (with zero mean and unit variance) available from the MSGARCH R package, as shown in Tables 1 and 2, respectively [45]. Thus, it is possible to estimate up to 30 types of models.

$$ L\left(\psi,I_{t}\right) = \prod_{t=1}^{T}f(\epsilon_{t}|\psi,I_{t-1}) $$
(13)
Table 1 MSGARCH models
Table 2 MSGARCH conditional distributions

Let ψ be the vector of model parameters. The likelihood function is defined in Eq. 13, and the maximum likelihood estimator \(\widehat {\psi }\) is obtained by maximizing the logarithm of (13), where f(εt|ψ,It−1) denotes the density of εt given past observations, It−1 and model parameters ψ.

Results

Use of GAM functions to predict COVID-19 historical series

Countries’ data were collected using the tidycovid19 R package [6]. The results for the COVID-19 curves in the USA are shown in Fig. 1 as an example of the modeling proposed by Eqs. (2-5). The red points are the actual values, the black solid lines are the fitted values, the dashed blue lines are the 99% forecasting intervals, and the vertical dashed purple lines are the inflection points.

Fig. 1 COVID-19 curves for the USA, as of September 15, 2021. Panel A shows cumulative confirmed cases; panel B shows cumulative deaths; panel C shows daily confirmed cases, and panel D shows daily deaths

Figure 2 shows the results of Eqs. (2-5) applied to several countries worldwide, represented by purple/red dots. The median of the RRSE shows that the models fit the data well, both for confirmed cases and deaths (panels A-B). Moreover, when countries are classified as “less accurate”, that is, the RRSE is above the median, the maximum RRSE is close to 0.15, except in one case, which is considered to be an outlier.

Fig. 2 Boxplots for RRSE and Deviance explained worldwide, on September 15, 2021. Panel A shows cumulative cases; panel B shows cumulative deaths; panel C shows daily cases, and panel D shows daily deaths

Similarly, the panels in Fig. 2 reporting results for deviance explained also suggest a good fit for Eqs. (4-5) to smooth the daily cases and death time series for the countries, given that their medians are above 74% (the higher, the better).

As a comparison, we considered several growth functions (exponential, logistic, Gompertz, Weibull, and Richards) [24] to fit the confirmed cases of the countries. Next, we computed their RRSE. To estimate the growth functions parameters, we based our estimation on the previous study by [18], choosing the function that best fits each country using RRSE.

Figure 3 shows the RRSE for each of the five growth functions considered, for the confirmed cases, given that [18] analyzed only this time series. Our results clearly show that the GAM functions outperform those of [18] for the fitting of the COVID-19 time series, in addition to standardizing the entire estimation process in a single type of function.

Fig. 3 Boxplots for RRSE of growth functions worldwide on September 15, 2021

Identifying structural breaks in the COVID-19 curves

Continuing with the COVID-19 series for the USA as an example, the coefficients of the GARCH(1,1) model [\(\sigma ^{2}_{t} = \omega + \alpha \epsilon ^{2}_{t-1} + \beta \sigma ^{2}_{t-1}\)] were estimated for the residuals of Eqs. (2-3) using the rugarch R package [46], assuming a Normal distribution, as illustrated in Table 3, at the 1% significance level. In both models, the sum of the α and β coefficients is equal to 1, confirming the existence of a structural break in the residuals, as shown in Figure 4. In other words, the parameters of the variance of the residuals change over time.

Table 3 GARCH models result for residuals for the USA, on September 15, 2021

Figure 4 shows the results for the pure Markov-Switching models (MSwM) for the confirmed cases and deaths from COVID-19 in the USA, on September 15, 2021. The black and red lines are the probabilities of being, respectively, in the regime of low (σ1,confirmed=0.32;σ1,deaths=0.48) and high (σ2,confirmed=1.85;σ2,deaths=1.59) variance/std. deviation, while the blue line is the standardized residuals.

Fig. 4 MSwM probabilities for confirmed cases and deaths, linked to COVID-19 for the USA

Regarding confirmed cases, the MSwM models indicate that a regime switch from low variance to high variance started on November 8th, 2020 (five days after the United States presidential election). On February 16, 2021, the USA returned to the low variance regime, after the first vaccine had been administered on December 14, 2020 [47]. However, on July 17, 2021, the country returned to the high variance regime, mainly due to the proliferation of the delta variant, together with the decrease in the effectiveness of vaccines against this new variant of COVID-19 [48]. As for the number of deaths, the regime-switching from low variance to high variance, and vice-versa, follows a pattern similar to that observed for confirmed cases.

Figure 5 shows the results for the Markov-Switching GARCH models (MSGARCH) for the COVID-19 curves, for confirmed cases and deaths in the USA, as of September 15, 2021. As in Fig. 4, the black and red lines are the probabilities of being in the regime of low and high variance/std. deviation, while the blue line shows standardized residuals.

Fig. 5 MSGARCH probabilities for confirmed cases and deaths, linked to COVID-19 for the USA

Given that 30 models were estimated for both curves, the choice of the best model was based on the following criteria: all model parameters must be significant at the 5% level, and the best model must have the lowest Bayesian Information Criteria [BIC]. The estimated coefficients for the best models are shown in Table 4, for confirmed cases and deaths.
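The selection rule described above can be sketched as a filter followed by a minimum. The BIC is written in its usual form, \(k\ln(n) - 2\ln L\); the candidate names, log-likelihoods, and p-values are hypothetical placeholders, not results from this study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_model(candidates, n_obs, level=0.05):
    """Keep models whose parameters are all significant at `level`, then take the lowest BIC."""
    eligible = [c for c in candidates if all(p < level for p in c["p_values"])]
    if not eligible:
        return None
    return min(eligible, key=lambda c: bic(c["loglik"], len(c["p_values"]), n_obs))


# Hypothetical candidates: one entry per fitted specification.
candidates = [
    {"name": "A", "loglik": -500.0, "p_values": [0.01, 0.02, 0.03]},
    {"name": "B", "loglik": -480.0, "p_values": [0.01, 0.20, 0.01, 0.03]},  # one parameter not significant
    {"name": "C", "loglik": -495.0, "p_values": [0.001] * 5},
]
best = select_model(candidates, n_obs=600)
print(best["name"])
```

Note that candidate B has the highest likelihood but is discarded by the significance filter before the BIC comparison takes place.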

Table 4 MSGARCH estimated parameters for the USA COVID-19 curves

Figs. 4 and 5 both show the mean of the probabilities \(p(S_{t}=1|S_{t-1}=1)=p_{11}\) and \(p(S_{t}=2|S_{t-1}=2)=p_{22}\) for confirmed cases and deaths, taken from their transition matrices. The higher these values, the more persistent the regime. Otherwise, regime changes would occur too easily, the use of Markov-Switching models would make little sense, and a single-regime GARCH model could better explain the low/high variability of the residuals. Therefore, the best model is the one with the highest \(mean(p_{11},p_{22})\), which suggests that the MSGARCH explains the regime-switching for confirmed cases slightly better, while the MSwM better explains deaths.
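The persistence criterion can be sketched as follows. The expected regime duration, \(1/(1-p_{ii})\), is a standard property of a first-order Markov chain and is added here only to give the probabilities an interpretable scale; the tie-breaking rule and all numeric values are assumptions.

```python
def mean_persistence(p11, p22):
    """Average probability of staying in the current regime."""
    return (p11 + p22) / 2.0


def expected_duration(p_stay):
    """Expected number of consecutive periods spent in a regime with stay probability p_stay."""
    return 1.0 / (1.0 - p_stay)


def more_persistent_model(mswm_probs, msgarch_probs):
    """Pick the model with the higher mean(p11, p22)."""
    mswm_score = mean_persistence(*mswm_probs)
    msgarch_score = mean_persistence(*msgarch_probs)
    # Ties go to the MSwM (an assumption; the text does not say how ties are resolved).
    return "MSwM" if mswm_score >= msgarch_score else "MSGARCH"


# Hypothetical (p11, p22) pairs for the two model families.
print(more_persistent_model((0.97, 0.95), (0.98, 0.96)))
print(expected_duration(0.95))
```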

Figures 6 and 7 display the classification of the Markov-Switching model that better explains the structural breaks amongst countries, on September 15, 2021, following the above-mentioned criteria of choice. For the confirmed cases and deaths, the MSwM accounts for 96 and 83 countries, respectively, while the MSGARCH accounts for 93 and 98 countries, in that order.

Fig. 6 Markov-Switching classification for the confirmed cases worldwide

Fig. 7 Markov-Switching classification for the deaths worldwide

When the MSGARCH is chosen, the eGARCH-ged model is the most frequent choice (the mode) for the confirmed cases, selected 12 out of 93 times, while the eGARCH-sged and eGARCH-snorm models are the modes for the deaths, selected 14 out of 98 times. It is essential to highlight that the variance in the eGARCH model responds asymmetrically to rises and falls in COVID-19 numbers, as determined by the \(\alpha_{2}\) parameter. In the USA example, for deaths, the variance increases when the residual is positive (\(\alpha_{2}>0\)) in the high regime, whereas in the low regime the variance increases when the residual is negative (\(\alpha_{2}<0\)).

Besides, we can add two other possibilities of classification for the countries regarding the occurrence of structural breaks in the confirmed cases and deaths from COVID-19 curves. Again, we contend that pandemic fatigue causes a change in people’s behavior and contributes to a greater increase in infected people, as pointed out by [19].

The first possibility was to use Partitioning Around Medoids (PAM) [49], which relies on the K-medoid algorithm, a robust alternative to the K-means algorithm because it is less sensitive to noise and outliers. Moreover, it employs the silhouette method to find the optimal number of clusters k over a range of possible values. This method measures the quality of a clustering through the Average Silhouette Width (ASW), which ranges from −1 to 1: the higher, the better. We defined 1 to 10 as the range of possible values for k, and we found two to be the best number of clusters for the confirmed cases (ASW=0.48) and deaths (ASW=0.49).
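The quantity used to compare clusterings can be sketched directly from its definition: for each point, the mean distance to its own cluster is compared with the mean distance to the nearest other cluster. The sketch assumes Euclidean distance and takes the cluster labels as given; it does not perform the PAM search itself, and the points are hypothetical.

```python
import math


def average_silhouette_width(points, labels):
    """Mean silhouette over all points, for a given assignment of points to clusters."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    def mean_distance(p, cluster, skip):
        ds = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == cluster and j != skip]
        return sum(ds) / len(ds) if ds else None

    widths = []
    for i, p in enumerate(points):
        a = mean_distance(p, labels[i], skip=i)
        if a is None:
            widths.append(0.0)  # singleton cluster: silhouette set to 0 by convention
            continue
        b = min(mean_distance(p, c, skip=i) for c in clusters if c != labels[i])
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]  # hypothetical coordinates
print(average_silhouette_width(points, [0, 0, 1, 1]))
```

Repeating the calculation for each candidate k and keeping the k with the largest value reproduces the selection logic described in the text.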

The second possibility was to define quadrants from the medians of the axes, equally splitting the countries’ sample per axis, and dividing the Cartesian plane into four regions (I-IV), interpreted in a counter-clockwise direction. Region I is defined as having small “low-variability”/great “high-variability” [confirmed=41; deaths=46], region II as having small “low-variability”/small “high-variability” [confirmed=54; deaths=45], region III as having great “low-variability”/small “high-variability” [confirmed=41; deaths=46], and region IV as having great “low-variability”/great “high-variability” [confirmed=53; deaths=44].
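The quadrant assignment can be sketched with the medians of the two axes as cut-offs. The region labels follow the definitions in the paragraph above; treating the “low-variability” measure as the x-axis, sending ties at the median to the "small" side, and the country names and values are all assumptions.

```python
from statistics import median


def classify_regions(low_variability, high_variability):
    """Map each country to region I-IV using the medians of both axes as cut-offs."""
    x_median = median(low_variability.values())
    y_median = median(high_variability.values())
    regions = {}
    for country, x in low_variability.items():
        great_low = x > x_median                          # great "low-variability"
        great_high = high_variability[country] > y_median  # great "high-variability"
        if not great_low and great_high:
            regions[country] = "I"
        elif not great_low and not great_high:
            regions[country] = "II"
        elif great_low and not great_high:
            regions[country] = "III"
        else:
            regions[country] = "IV"
    return regions


# Hypothetical values for four hypothetical countries.
low = {"A": 0.2, "B": 0.3, "C": 0.9, "D": 0.8}
high = {"A": 0.9, "B": 0.1, "C": 0.2, "D": 0.8}
print(classify_regions(low, high))
```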

Overall, on September 15, 2021, Figs. 6-7 show how heterogeneous the countries were concerning the commitment of their populations to non-pharmacological measures to combat COVID-19. Countries in Region II seem to have the best performance in this regard.

A proposal for the COVID-19 evaluation framework

In Figs. 8-9, the x-axis represents the cumulative number of days since the first COVID-19 case, and the y-axis is the natural logarithm of the current number of COVID-19 confirmed cases or deaths. The same quantity sets the size of the circle for each country, to facilitate a relative comparison between them: the bigger, the worse. To mediate the relationship between the axes, one more variable is considered: the inflection point.

Fig. 8 An evaluation framework for the confirmed cases worldwide

Fig. 9 An evaluation framework for the deaths worldwide

Like Figs. 6 and 7, Figs. 8 and 9 were divided into quadrants from the medians of the axes, equally splitting the countries’ sample per axis, and dividing the Cartesian plane into four regions (I-IV), also interpreted in a counter-clockwise direction.

Region I is defined as having “short-term inefficiency” [confirmed=22; deaths=26], region II as having “short-term efficiency” [confirmed=73; deaths=67], region III as having “long-term efficiency” [confirmed=22; deaths=24], and region IV as having “long-term inefficiency” [confirmed=72; deaths=64].

Therefore, the key concept of efficiency here is interpreted as preventing the number of confirmed cases and deaths from increasing over time, while the same chart also shows the behavior of the growth rate.

Moreover, using the PAM clustering [49], clusters were also found in Figs. 8-9, thus showing heterogeneity in their performance, even in the same quadrant (ASW=0.89 for the confirmed cases, ASW=0.81 for the deaths).

Discussion

In the Background section, we mentioned the role of geopolitical and socio-economic characteristics in explaining the evolution of COVID-19. However, it is also important to comment on the role of the psychological effects on people caused by the spread of COVID-19 around the world.

For example, several studies have assessed the fear of healthcare professionals or medical students of being infected with COVID-19, as well as how this fear affects their physical, mental, and emotional health [50-53]. Among the various results obtained using the Fear of COVID-19 Scale [54], women are more afraid than men of being infected by COVID-19, causing greater impacts on the quality of their physical, mental and emotional health. Perhaps, for these reasons, women, more than men, are also more likely to adopt non-pharmacological prevention measures.

Therefore, our evaluation framework also allows us to conjecture that countries classified as efficient, whether in the short or long term, also have the highest levels of the physical, mental, and emotional quality of their health professionals or medical students. The importance of this is that these professionals work on the front line in the fight against COVID-19.

Although not the main focus of this study, our evaluation framework can also be applied to assess the vaccination deployment worldwide, in order to contribute to the perception of vaccine safety and increase willingness to receive it, as pointed out by [55]. Recently, [56] have made available a free-to-access global dataset that tracks the scale and rate of vaccine rollout. For instance, Fig. 10 shows, on September 15, 2021, the distribution of countries for the total vaccinated per hundred, since the 1st dose applied. In this case, countries in regions I and IV are classified as “short-term efficiency” and “long-term efficiency”, respectively.

Fig. 10 An evaluation framework for the total vaccinated per 100 worldwide

Finally, on September 15, 2021, the results presented in Figs. 8-9 for the countries can be summarized as follows, considering the regions as defined in the World Bank’s Development Indicators:

  1. For the confirmed cases, 105 out of the 189 countries showed a growth rate on a downward trend;

  2. For the deaths, 99 out of the 181 countries showed a growth rate on a downward trend;

  3. For confirmed cases and deaths, most of the countries on a downward trend are located in Sub-Saharan Africa and Europe & Central Asia. However, most of the countries on an upward trend are located in Europe & Central Asia, Sub-Saharan Africa, and Latin America & Caribbean, in this order;

  4. For the confirmed cases, most of the countries considered efficient (quadrants II and III) are located in Sub-Saharan Africa (42), East Asia & Pacific (16), and Latin America & Caribbean (16). On the other hand, most of the countries considered inefficient (quadrants I and IV) are located in Europe & Central Asia (38), Latin America & Caribbean (17), and Middle East & North Africa (17);

  5. For the deaths, most of the countries considered efficient (quadrants II and III) are located in Sub-Saharan Africa (39), Europe & Central Asia (17), and Latin America & Caribbean (13). On the other hand, most of the countries considered inefficient (quadrants I and IV) are located in Europe & Central Asia (34), Latin America & Caribbean (18), and Middle East & North Africa (13);

  6. The United States, India, and Brazil have the highest confirmed cases among all countries, but only the USA has its growth rate on an upward trend;

  7. Regarding the deaths, the United States, Brazil, India, Mexico and Peru have the highest number of victims among all countries, but Peru has its growth rate on a downward trend.

The figures mentioned above from our evaluation framework show that the most developed countries are not necessarily the most efficient in combating COVID-19. Europe is the most developed continent in the world and is home to 4 of the 7 members that constitute the G7, but most countries are trending upwards and considered inefficient, for confirmed cases and deaths.

Furthermore, even though 43.3% of the world’s population has received at least one dose of the COVID-19 vaccine [56], Figs. 8 and 9 illustrate that several countries are again facing waves of contagion, including those that were pioneers in vaccinating their populations, like Israel [57]. Another example comes from Figure 1: the USA is experiencing a new wave of infections and deaths similar to the one it experienced in mid-December 2020, when it started its vaccination campaign against COVID-19.

Conclusion

The purpose of this paper is to propose a new framework to monitor and assess, daily, the performance of countries in the fight against COVID-19. Our process will provide a greater understanding by stakeholders (policy-makers, public health workers, and the general public) of the evolution of the disease in each country, thereby improving public policies for mitigating or suppressing the effects of COVID-19 on society ahead of obtaining a vaccine. Our new methodology proves more effective in explaining the evolution of COVID-19 worldwide than traditional growth functions, including highlighting several inflection points and regime-switching moments. Moreover, results from this research can be used by managers, for example, to provide an econometric justification for the prioritizing of vaccination programmes in the health care sector.

The use of GAM functions to predict confirmed cases and deaths proves adequate, even with the occurrence of spikes or “second waves” in these series. Our new approach also allows the identification of several inflection points throughout the daily confirmed cases and deaths series: an advance when compared to traditional growth functions.
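To make the idea of locating turning points concrete, the following is a minimal Python sketch, not the R scripts used in this study. It scans a vector of fitted daily values for local maxima; a local maximum of the daily curve corresponds to an inflection point of the cumulative curve. The function name `local_maxima` and the synthetic two-wave series are illustrative assumptions, standing in for the fitted values of a GAM.

```python
import math

def local_maxima(values):
    """Return the indices i where values[i] is strictly greater than both neighbours."""
    peaks = []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            peaks.append(i)
    return peaks

# Synthetic "fitted" daily series with two waves (illustrative data only,
# not taken from any country's COVID-19 series).
fitted = [
    100 * math.exp(-((t - 30) ** 2) / 200) + 150 * math.exp(-((t - 90) ** 2) / 300)
    for t in range(120)
]

peaks = local_maxima(fitted)
print(peaks)
```

Because the scan is applied to smoothed fitted values rather than raw counts, day-to-day reporting noise does not generate spurious peaks; on raw data a minimum-prominence rule would probably be needed.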

However, we recognize that growth functions are still important for monitoring pandemics and epidemic outbreaks in their early stages, as demonstrated by [18]. This brings us to the main limitation of our evaluation framework: the size of the epidemiological time series. We verified empirically that the minimum series length for the proper use of GAM functions and Markov-Switching models is 60 observations.

We incorporate new evidence that pandemic fatigue is taking hold, especially after the start of vaccination. This decrease in commitment to fighting the pandemic alters the behavior of the forecast errors present in the COVID-19 curves, causing a structural break in the variance of the residuals. This allows us to use Markov-Switching (MS) models, built on the residuals of our forecasts, to identify this behavior through regime-switching in the COVID-19 time series. Applying the Markov-Switching models to the residuals of the GAM functions proves viable for identifying structural breaks in these series, pointing to a prevalence of the MSwM models for confirmed cases and of the MSGARCH models for deaths. In addition, when the MSGARCH is chosen, the prevalent model is the eGARCH, indicating that the variance responds asymmetrically to rises and falls in COVID-19 numbers.
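The regime-identification step can be illustrated with a minimal Python sketch of the Hamilton filter for a two-regime switching-variance model. This is not the authors' code and it is much simpler than the MSwM and MSGARCH models fitted in the paper: the regime standard deviations and the transition matrix are assumed known rather than estimated by maximum likelihood, the residuals are synthetic, and the name `hamilton_filter` is hypothetical.

```python
import math
import random

def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def hamilton_filter(residuals, sds, trans, init=(0.5, 0.5)):
    """Filtered regime probabilities P(S_t = k | e_1..e_t) for a two-regime model.

    residuals: forecast errors e_t
    sds: (sd_low, sd_high), regime standard deviations (assumed known here)
    trans: trans[i][j] = P(S_t = j | S_{t-1} = i)
    """
    filt = list(init)
    out = []
    for e in residuals:
        # Prediction step: propagate the previous probabilities through the chain.
        pred = [sum(filt[i] * trans[i][j] for i in range(2)) for j in range(2)]
        # Update step: weight each regime by the likelihood of the residual.
        joint = [pred[j] * normal_pdf(e, sds[j]) for j in range(2)]
        total = sum(joint)
        filt = [p / total for p in joint]
        out.append(filt)
    return out

# Synthetic residuals: 100 low-variance days followed by 100 high-variance days.
random.seed(1)
residuals = [random.gauss(0, 1.0) for _ in range(100)] + \
            [random.gauss(0, 5.0) for _ in range(100)]

probs = hamilton_filter(residuals, sds=(1.0, 5.0),
                        trans=[[0.98, 0.02], [0.02, 0.98]])
high = [p[1] for p in probs]
# Average probability of the high-variance regime in each half of the sample.
print(round(sum(high[:100]) / 100, 2), round(sum(high[100:]) / 100, 2))
```

The filtered probability of the high-variance regime should stay low in the first half and rise after the break, which is the kind of signal used to date a structural break in the variance of the residuals. A full application would estimate the parameters and, for MSGARCH, let the variance within each regime evolve over time.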

Finally, our new framework for assessing the effectiveness of countries in controlling the spread of COVID-19 in their territories, as well as the number of deaths, provides a new lens for visualizing and understanding the world panorama. It helps to identify the countries with the most effective strategies, and it allows additional explanatory variables to be used on the y-axis, such as the death rate among infected people. The new outcomes presented in this research will allow key stakeholders to check whether or not public policies and interventions in the fight against COVID-19 are having an effect. Examples of best practice can then be identified easily and such policies promoted more widely around the world. Not least, the application of our evaluation framework to the vaccine dataset developed by [56] is our main recommendation for future studies.

Availability of data and materials

Countries’ data were collected using the tidycovid19 R package [6]. Once the package is installed, public access to the database is free and open. For the link to the global database of COVID-19 vaccinations, see [56]. In addition, all R code used in this study (Additional file 1, Additional file 2, and Additional file 3) is available as supplementary material, together with a list of country abbreviations (Additional file 4), to promote and disseminate our findings widely.

Abbreviations

CSSE: Centre for Systems Science and Engineering

GAM: Generalized Additive Models

REML: restricted maximum likelihood

ML: maximum likelihood

GCV: generalized cross-validation

RRSE: Root Relative Squared Error

WHO: World Health Organization

MSwM: pure Markov-Switching

MSGARCH: Markov-Switching Generalized Autoregressive Conditional Heteroskedasticity

GED: generalized error distribution

PAM: Partitioning Around Medoids

ASW: Average Silhouette Width

References

  1. Houlihan CF, Whitworth JA. Outbreak science: recent progress in the detection and response to outbreaks of infectious diseases. Clin Med. 2019; 19(2):140–44. https://doi.org/10.7861/clinmedicine.19-2-140.

  2. Chowell G, Hyman JM. Mathematical and Statistical Modeling for Emerging and Re-emerging Infectious Diseases. Cham: Springer; 2016, p. 356. https://doi.org/10.1007/978-3-319-40413-4. https://link.springer.com/10.1007/978-3-319-40413-4.

  3. Bar-Yam Y. Transitions to extinction: pandemics in a connected world. 2016. https://necsi.edu/transition-to-extinction. Accessed 04 Jan 2021.

  4. Norman J, Bar-Yam Y, Taleb NN. Systemic risk of pandemic via novel pathogens – Coronavirus: A note. 2020. https://necsi.edu/systemic-risk-of-pandemic-via-novel-pathogens-coronavirus-a-note. Accessed 04 Jan 2021.

  5. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020; 20(5):533–34. https://doi.org/10.1016/S1473-3099(20)30120-1.

  6. Gassen J. tidycovid19: Download, Tidy and Visualize Covid-19 Related Data. 2020. https://joachim-gassen.github.io/tidycovid19/. Accessed 07 Dec 2020.

  7. Annan JD, Hargreaves JC. Model calibration, nowcasting, and operational prediction of the COVID-19 pandemic. medRxiv. 2020; preprint. https://doi.org/10.1101/2020.04.14.20065227.

  8. Hilton J, Keeling MJ. Estimation of country-level basic reproductive ratios for novel Coronavirus (SARS-CoV-2/COVID-19) using synthetic contact matrices. PLoS Comput Biol. 2020; 16(7):1008031. https://doi.org/10.1371/journal.pcbi.1008031.

  9. Vasconcelos GL, Macêdo AMS, Ospina R, Almeida FAG, Duarte-Filho GC, Brum AA, Souza ICL. Modelling fatality curves of COVID-19 and the effectiveness of intervention strategies. PeerJ. 2020; 8:9421. https://doi.org/10.7717/peerj.9421.

  10. Baud D, Qi X, Nielsen-Saines K, Musso D, Pomar L, Favre G. Real estimates of mortality following COVID-19 infection. Lancet Infect Dis. 2020; 20(7):773. https://doi.org/10.1016/S1473-3099(20)30195-X.

  11. Wu Y, Jing W, Liu J, Ma Q, Yuan J, Wang Y, Du M, Liu M. Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries. Sci Total Environ. 2020; 729(10):139051. https://doi.org/10.1016/j.scitotenv.2020.139051.

  12. Shi P, Dong Y, Yan H, Zhao C, Li X, Liu W, He M, Tang S, Xi S. Impact of temperature on the dynamics of the COVID-19 outbreak in China. Sci Total Environ. 2020; 728(1):138890. https://doi.org/10.1016/j.scitotenv.2020.138890.

  13. McKibbin W, Fernando R. The Global Macroeconomic Impacts of COVID-19: Seven Scenarios. 2020. https://www.brookings.edu/research/the-global-macroeconomic-impacts-of-covid-19-seven-scenarios/. Accessed 08 June 2020.

  14. Lin Z, Meissner C. Health vs. Wealth? Public Health Policies and the Economy During Covid-19. Technical report. Cambridge: National Bureau of Economic Research; 2020. https://doi.org/10.3386/w27099. http://www.nber.org/papers/w27099.pdf.

  15. Taleb NN, Read R, Douady R, Norman J, Bar-Yam Y. The precautionary principle (with application to the genetic modification of organisms). 2014. https://arxiv.org/abs/1410.5787. Accessed 04 Jan 2021.

  16. EndCoronavirus.org. Some are winning, some are not. Which countries do best in beating COVID-19? 2021. https://www.endcoronavirus.org/countries. Accessed 08 June 2020.

  17. Financial Times. Coronavirus tracked: the latest figures as countries fight to contain the pandemic. 2021. https://www.ft.com/content/a2901ce8-5eb7-4633-b89c-cbdf5b386938. Accessed 08 June 2020.

  18. de Oliveira AMB, Mandal A, Power GJ, Felipe IJdS. Monitoring COVID-19 in Brazil: an application of growth functions to assess the performance of States. Rev Tecnol Soc. 2021; 17(46):264. https://doi.org/10.3895/rts.v17n46.12363.

  19. Maragakis LL. Coronavirus Second Wave? Why Cases Increase. 2020. https://www.hopkinsmedicine.org/health/conditions-and-diseases/coronavirus/first-and-second-waves-of-coronavirus. Accessed 07 Dec 2020.

  20. Hastie TJ, Tibshirani RJ. Generalized Additive Models. London: Chapman and Hall; 1990, p. 335.

  21. Wood SN. Generalized Additive Models. New York: Chapman and Hall/CRC; 2017, p. 496. https://doi.org/10.1201/9781315370279. https://www.taylorfrancis.com/books/9781498728348.

  22. Wright DB, London K. Modern Regression Techniques Using R: a Practical Guide for Students and Researchers. London: Sage publications Ltd; 2009, p. 216.

  23. Jones K, Almond S. Moving out of the Linear Rut: The Possibilities of Generalized Additive Models. Trans Inst Br Geogr. 1992; 17(4):434. https://doi.org/10.2307/622709.

  24. Seber GAF, Wild CJ. Nonlinear Regression. Wiley Series in Probability and Statistics. Hoboken: John Wiley and Sons; 1989, p. 768. https://doi.org/10.1002/0471725315. http://doi.wiley.com/10.1002/0471725315.

  25. Rodriguez CRS, Valdés LS, Shkedy Z, Vega VS, Escobar CR, Pérez AM. Model uncertainty in the comparison of two single dengue outbreaks. Rev Invest Oper. 2020; 41(3):344–51.

  26. Wood SN. Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc Ser B Stat Methodol. 2011; 73(1):3–36. https://doi.org/10.1111/j.1467-9868.2010.00749.x.

  27. Evans JS. spatialEco: Spatial Analysis and Modelling Utilities. 2021. https://cran.r-project.org/package=spatialEco. Accessed 07 Dec 2020.

  28. Izadi F. Generalized additive models to capture the death rates in Canada COVID-19. arXiv. 2020:1–19. http://arxiv.org/abs/2008.01030. Accessed 09 Nov 2020.

  29. Skynews. Coronavirus: Second COVID-19 wave faster than the first, warns top European scientist. 2020. https://news.sky.com/story/coronavirus-second-covid-19-wave-faster-than-the-first-warns-top-european-scientist-12112000. Accessed 07 Dec 2020.

  30. WHO. Pandemic Fatigue: Reinvigorating the Public to Prevent Pandemic Fatigue. 2020. https://apps.who.int/iris/bitstream/handle/10665/335820/WHO-EURO-2020-1160-40906-55390-eng.pdf. Accessed 04 Jan 2021.

  31. McGrail DJ, Dai J, McAndrews KM, Kalluri R. Enacting national social distancing policies corresponds with dramatic reduction in COVID19 infection rates. PLoS ONE. 2020; 15(7):0236619. https://doi.org/10.1371/journal.pone.0236619.

  32. Shafer LA, Nesca M, Balshaw R. Relaxation of social distancing restrictions: Model estimated impact on COVID-19 epidemic in Manitoba, Canada. PLoS ONE. 2021; 16(1):0244537. https://doi.org/10.1371/journal.pone.0244537.

  33. Franses PH, van Dijk D. Nonlinear Time Series Models in Empirical Finance. Cambridge: Cambridge University Press; 2000, p. 298. https://doi.org/10.1017/CBO9780511754067. https://ebooks.cambridge.org/ref/id/CBO9780511754067.

  34. Lamoureux CG, Lastrapes WD. Persistence in Variance, Structural Change, and the GARCH Model. J Bus Econ Stat. 1990; 8(2):225–34. https://doi.org/10.1080/07350015.1990.10509794.

  35. Kim C-J, Nelson CR, Startz R. Testing for mean reversion in heteroskedastic data based on Gibbs-sampling-augmented randomization. J Empir Finance. 1998; 5(2):131–54. https://doi.org/10.1016/S0927-5398(97)00015-7.

  36. Engel C, Hamilton JD. Long swings in the dollar: are they in the data and do markets know it? Am Econ Rev. 1990; 80(4):689–713.

  37. Sanchez-Espigares JA, Lopez-Moreno A. MSwM: Fitting Markov Switching Models. 2018. https://cran.r-project.org/package=MSwM. Accessed 08 June 2020.

  38. Haas M. A New Approach to Markov-Switching GARCH Models. J Financ Econom. 2004; 2(4):493–530. https://doi.org/10.1093/jjfinec/nbh020.

  39. Engle RF. Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica. 1982; 50(4):987. https://doi.org/10.2307/1912773.

  40. Bollerslev T. Generalized autoregressive conditional heteroskedasticity. J Econ. 1986; 31(3):307–27. https://doi.org/10.1016/0304-4076(86)90063-1.

  41. Nelson DB. Conditional Heteroskedasticity in Asset Returns: A New Approach. Econometrica. 1991; 59(2):347. https://doi.org/10.2307/2938260.

  42. Glosten LR, Jagannathan R, Runkle DE. On the Relation between the Expected Value and the Volatility of the Nominal Excess Return on Stocks. J Financ. 1993; 48(5):1779–801. https://doi.org/10.1111/j.1540-6261.1993.tb05128.x.

  43. Zakoian J-M. Threshold heteroskedastic models. J Econ Dyn Control. 1994; 18(5):931–55. https://doi.org/10.1016/0165-1889(94)90039-6.

  44. Trottier D-A, Ardia D. Moments of standardized Fernandez–Steel skewed distributions: Applications to the estimation of GARCH-type models. Finance Res Lett. 2016; 18:311–16. https://doi.org/10.1016/j.frl.2016.05.006.

  45. Ardia D, Bluteau K, Boudt K, Catania L, Trottier D-A. Markov-Switching GARCH Models in R : The MSGARCH Package. J Stat Softw. 2019; 91(4):1–38. https://doi.org/10.18637/jss.v091.i04.

  46. Ghalanos A. rugarch: Univariate GARCH models. 2020. https://cran.r-project.org/package=rugarch. Accessed 08 June 2020.

  47. BBCUSACanada. Covid-19: first vaccine given in US as roll-out begins. 2020. https://www.bbc.com/news/world-us-canada-55305720. Accessed 11 Jan 2021.

  48. Guarino B, McGinley L. Vaccines show declining effectiveness against infection overall but strong protection against hospitalization amid delta variant. 2021. https://www.washingtonpost.com/health/2021/08/18/covid-vaccine-effectiveness/. Accessed 30 Aug 2021.

  49. Kassambara A. Practical Guide to Cluster Analysis in R: Unsupervised Machine Learning. Marseille: STHDA; 2017, p. 187. http://www.sthda.com/english/articles/25-clusteranalysis-in-r-practical-guide/. Accessed 08 June 2020.

  50. Konstantinov V, Berdenova S, Satkangulova G, Reznik A, Isralowitz R. COVID-19 Impact on Kazakhstan University Student Fear, Mental Health, and Substance Use. Int J Ment Health Addict. 2020. https://doi.org/10.1007/s11469-020-00412-y.

  51. Barbosa-Camacho FJ, García-Reyna B, Cervantes-Cardona GA, Cervantes-Pérez E, Chavarria-Avila E, Pintor-Belmontes KJ, Guzmán-Ramírez BG, Bernal-Hernández A, Ibarrola-Peña JC, Fuentes-Orozco C, González-Ojeda A, Cervantes-Guevara G. Comparison of Fear of COVID-19 in Medical and Nonmedical Personnel in a Public Hospital in Mexico: a Brief Report. Int J Ment Health Addict. 2021. https://doi.org/10.1007/s11469-021-00600-4.

  52. Isralowitz R, Konstantinov V, Gritsenko V, Vorobeva E, Reznik A. First and Second Wave COVID-19 Impact on Russian Medical Student Fear, Mental Health and Substance Use. J Loss Trauma. 2021; 26(1):94–96. https://doi.org/10.1080/15325024.2021.1872274.

  53. Mittal R, Su L, Jain R. COVID-19 mental health consequences on medical students worldwide. J Community Hosp Intern Med Perspect. 2021; 11(3):296–98. https://doi.org/10.1080/20009666.2021.1918475.

  54. Ahorsu DK, Lin C-Y, Imani V, Saffari M, Griffiths MD, Pakpour AH. The Fear of COVID-19 Scale: Development and Initial Validation. Int J Ment Health Addict. 2020. https://doi.org/10.1007/s11469-020-00270-8.

  55. Hao F, Wang B, Tan W, Husain SF, McIntyre RS, Tang X, Zhang L, Han X, Jiang L, Chew NWS, Tan BY-Q, Tran B, Zhang Z, Vu GL, Vu GT, Ho R, Ho CS, Sharma VK. Attitudes toward COVID-19 vaccination and willingness to pay: comparison of people with and without mental disorders in China. BJPsych Open. 2021; 7(5):146. https://doi.org/10.1192/bjo.2021.979.

  56. Mathieu E, Ritchie H, Ortiz-Ospina E, Roser M, Hasell J, Appel C, Giattino C, Rodés-Guirao L. A global database of COVID-19 vaccinations. Nat Hum Behav. 2021; 5(7):947–53. https://doi.org/10.1038/s41562-021-01122-8.

  57. Avis D. Israel’s Covid surge shows the world what’s coming next. 2021. https://www.bloomberg.com/news/articles/2021-09-07/israel-s-covid-surge-shows-the-world-what-s-coming-next. Accessed 13 Sep 2021.

Acknowledgements

Not applicable.

Funding

No funding was obtained for this study.

Author information

Authors’ contributions

All authors conceptualised the idea for this paper. Data collection and data analysis were undertaken by AMBO. AMBO wrote the first draft of the manuscript with critical reviews from JMB, AM, LK and GJP. All authors reviewed and contributed to all drafts of the manuscript including the final manuscript. The authors read and approved the final manuscript.

Authors’ information

Not applicable.

Corresponding author

Correspondence to Abdinardo M. B. de Oliveira.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

Script1.

Additional file 2

Script2.

Additional file 3

Script3.

Additional file 4

List of countries’ abbreviations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

de Oliveira, A.M., Binner, J.M., Mandal, A. et al. Using GAM functions and Markov-Switching models in an evaluation framework to assess countries’ performance in controlling the COVID-19 pandemic. BMC Public Health 21, 2173 (2021). https://doi.org/10.1186/s12889-021-11891-6

  • DOI: https://doi.org/10.1186/s12889-021-11891-6

Keywords