Using meta-learning to recommend an appropriate time-series forecasting model

Talkhi, Nasrin; Akhavan Fatemi, Narges; Jabbari Nooghabi, Mehdi; Soltani, Ehsan; Jabbari Nooghabi, Azadeh

doi:10.1186/s12889-023-17627-y

Research
Open access
Published: 10 January 2024

BMC Public Health volume 24, Article number: 148 (2024)

Abstract

Background

Many forecasting algorithms are available for univariate time series, ranging from simple to sophisticated and computationally intensive. In practice, selecting the most appropriate one is difficult because so many candidates exist. Making an informed choice requires expert knowledge, which is not always feasible because of limited time, money, and manpower.

Methods

In this study, we used coronavirus disease 2019 (COVID-19) data comprising the daily absolute numbers of confirmed, death, and recovered cases in 187 countries from February 20, 2020, to May 25, 2021. Two popular forecasting models, the Auto-Regressive Integrated Moving Average (ARIMA) model and the exponential smoothing state-space model with Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components (TBATS), were used to forecast the data. The forecasts were evaluated by the root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and symmetric mean absolute percentage error (SMAPE) criteria in order to label each time series. Various characteristics of each time series, based on the univariate time series structure, were extracted as meta-features. Finally, four machine-learning classification algorithms, namely support vector machine (SVM), decision tree (DT), random forest (RF), and artificial neural network (ANN), were used as meta-learners to recommend an appropriate forecasting model.

Results

The findings showed that the DT model performed better than the other meta-learners in classifying the time series. The accuracy of DT in the training and testing phases was 87.50% and 82.50%, respectively. The sensitivity of the DT algorithm in the training phase was 86.58% and its specificity was 88.46%. Moreover, the sensitivity and specificity of the DT algorithm in the testing phase were 73.33% and 88%, respectively.

Conclusion

In general, the meta-learning approach was able to predict the appropriate forecasting model (ARIMA or TBATS) based on some time series features. Given some characteristics of a COVID-19 time series, the DT model can recommend either ARIMA or TBATS to forecast the trend of confirmed, death, and recovered cases.

Introduction

In December 2019, a novel coronavirus emerged in Wuhan City, Hubei province of China [1]. The virus spread rapidly around the world, and the outbreak became a pandemic. Reports from the World Health Organization (WHO) showed that the virus expanded to all countries of the world, negatively affecting personal life, economy, industry, etc. [2].

This virus could survive on the surface for a few days and transmit rapidly from human to human [3]. The symptoms of this disease were fatigue, general weakness, difficulty breathing, chest pain, sore throat, fever, acute respiratory distress, muscular pain, etc. However, the majority of people had no symptoms [3, 4].

Governments implemented interventions and strategies such as maintaining social distancing, wearing masks, staying at home, not gathering in public places, etc., to reduce the pandemic trend [5]. However, after more than a year and many interventions to deal with the virus, this disease was still the cause of death of many people [6]. Considering its widespread distribution, the virus could recombine the genomes and create a new mutation. Therefore, this infectious disease was likely to appear periodically in humans [1].

Compared with some previous pandemics, SARS-CoV-2 had a considerably bigger impact than the SARS coronavirus pandemic. In terms of mortality, COVID-19 is comparable with previous flu pandemics. However, compared with the swine flu pandemic (H1N1, 2009; also Spanish flu, 1918), COVID-19 seemed relatively severe, because it required more people to be hospitalized [7, 8]. In 2014, Ebola emerged as a virus with an average fatality rate of 50%.

One major difference between Ebola and COVID-19 is the mode of spread. Ebola is spread during the last stage of the disease through blood and sweat, whereas the coronavirus is transmitted through the air.

In conclusion, regardless of the mortality rate or the number of confirmed cases, COVID-19 had a devastating worldwide impact. Undoubtedly, scientists, statisticians, etc. will continue to learn more about how COVID-19 stacks up against other viruses [9].

In previous studies, statistical and machine-learning methods were also applied to forecast these pandemics [10,11,12].

Forecasting future confirmed cases, deaths, and recoveries indicates whether the COVID-19 trend is increasing or decreasing and allows the measures needed to save people’s lives to be planned; forecasting models can therefore be very important and helpful.

In the current situation, data analysts have an important role to play. Forecasting the future behavior of a viral infection such as the coronavirus with statistical, mathematical, and machine-learning models can give governments and politicians useful advance information about the behavior of the virus and predict the number of infected and death cases in the coming days. Nowadays, statistical techniques and machine learning algorithms are widely used in the medical field with successful results [13].

Time series forecasting has been a very active area of research since the 1950s [14]. The guidelines for time series analysis, finding an appropriate Auto Regressive Integrated Moving Average (ARIMA) model, and investigating autocorrelation function (ACF) and partial autocorrelation function (PACF) values of a time series are summarized in [15]. In the 1990s, the characteristics extracted from univariate time series were used to select the appropriate forecasting model for the first time [14].

Meta-learning supports data mining tasks [16]. In the time series literature, the term ‘meta-learning’ was first used in [17]. In the time series area, meta-learning denotes the process of automatically acquiring knowledge to identify the best forecasting model, an idea that originated in the machine-learning community [14]. In other words, meta-learning refers to the process of investigating the relationship between learning strategies and tasks [16]. Its main property is to understand the nature of the data and to learn, from the characteristics extracted from time series, which forecasting model is best for a particular data type [16].

The meta-learning framework includes three major steps as follows: (a) fitting the forecasting models and their performance evaluation, (b) extracting the characteristics of time series, and (c) rule induction (Fig. 1), which can lead to a recommender system.

Many studies have been done in this area. For instance, Malki et al. conducted three studies in the field of COVID-19 disease [18,19,20]. In one of these studies, the Seasonal AutoRegressive Integrated Moving Average (SARIMA) model was applied to predict the spread of the coronavirus in several countries. In another, machine learning approaches were used to predict the spread of COVID-19 in many countries. In a study conducted by Harbola et al., the COVID-19 outbreak was forecasted using long short-term memory (LSTM). The LSTM model showed that the number of infected COVID-19 cases increased exponentially every week [21].

Therefore, along with all the studies that have been done [22,23,24,25,26,27], the existence of a recommender system that suggests the appropriate model for making future predictions can also be helpful and practical as well as save time and costs.

The main goal of this study was to design a recommender system using a meta-learning approach, accepting the time and cost that building such a system requires. Using time series features, the system selects and recommends the better of two popular forecasting models, Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing state-space model with Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components (TBATS). The results of this research can also be applied to the prevalence of other infectious diseases, such as respiratory diseases.

Methodology

Data description

In this study, the COVID-19 data included the absolute numbers of confirmed, death, and recovered cases per day from February 20, 2020, to May 25, 2021, for 187 countries. The data were obtained from the GitHub online repository. In total, there were 561 series (187 series related to confirmed cases, 187 to death cases, and 187 to recovered cases). We randomly selected 400 of these series to construct the model.

Forecasting methods

The ARIMA model is applied to non-stationary time series, which are made stationary with operations such as differencing, logarithm, root, etc. It is the most well-known model used for time series forecasting problems [25, 27]. The model combines an auto-regressive (AR) model, a moving average (MA) model, and a white noise process [28]. In fact, an ARIMA model is an ARMA time series model applied to a series that has been differenced d times, and it is denoted by ARIMA(p, d, q) [29]. Multiplicative seasonal ARIMA models are defined with non-seasonal orders p, d, and q, seasonal orders P, D, and Q, and seasonal period s (ARIMA(p, d, q)(P, D, Q)[s]) [29].
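For reference, the ARIMA(p, d, q) model described above is commonly written with the backshift operator B, where B y_t = y_{t-1}. The notation below is the usual textbook form rather than one given in this article, and the symbols φ_j, θ_j, and ε_t (white noise) are introduced here for illustration only:

$$ \phi(B)\,{(1-B)}^{d}\,y_t=\theta(B)\,\varepsilon_t,\qquad \phi(B)=1-\sum_{j=1}^{p}\phi_j B^{j},\qquad \theta(B)=1+\sum_{j=1}^{q}\theta_j B^{j} $$

With d = 1, for example, the ARMA part is fitted to the first differences y_t - y_{t-1} instead of to the raw series.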

BATS and TBATS are two time series models able to capture seasonal patterns [30]. The BATS model is an extension of traditional seasonal or state-space models. In addition, these models handle nonlinearity using the Box-Cox transformation. The B in BATS refers to the Box-Cox transformation; the remaining letters (A, T, and S) refer to ARMA errors, trend, and seasonal components, respectively. TBATS was introduced as a more flexible version of BATS that uses Fourier series to represent the seasonal components of a time series [31].
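The Box-Cox step mentioned above can be illustrated with a minimal Python sketch. It uses the standard definition of the transformation; the function names and the parameter name `lam` are illustrative assumptions and do not come from the article, which fitted TBATS in R.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lam -> 0 limit of (y**lam - 1) / lam is the natural logarithm.
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)
```

Because the transformation is defined only for positive values, a daily count series containing zeros would need some adjustment (for example, a shift) before it could be applied; the article does not describe how such cases were handled.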

In this study, the first 80% of the observations in each series were used as training data and the remaining 20% as testing data. Both models, ARIMA and TBATS, were fitted to each of the 400 series. Using forecasting evaluation metrics (error measures), each time series was labeled with the more appropriate of the two models.
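The splitting and labeling step can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the sketch assumes that each model is summarized by a single test-set error value, whereas the article does not state how the four error measures were combined into one label.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a series into a leading training part and a trailing test part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def label_series(errors_by_model):
    """Return the name of the model with the smallest test error.

    errors_by_model maps a model name to one error value (an assumption;
    the article uses four error measures).
    """
    return min(errors_by_model, key=errors_by_model.get)


# Toy example with ten observations and made-up error values.
train, test = chronological_split(list(range(10)))
label = label_series({"ARIMA": 12.5, "TBATS": 9.1})
```

The split is chronological rather than random, so the models are always evaluated on the most recent observations of each series.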

Error measures

Four error measures were used to evaluate and validate the forecasting performance of the models. All four evaluation metrics measure the difference between the prediction values and the real outcome values or errors. The root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and symmetric mean absolute percentage error (SMAPE) were applied to compare the accuracy of forecasting models. Smaller values of these error measures indicate more accurate model prediction. The formulas for these error measures are as follows:

$$ \text{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left(y_i-\widehat{y}_i\right)}^{2}}, $$

$$ \text{MAE}=\frac{1}{N}\sum_{i=1}^{N}\left|y_i-\widehat{y}_i\right|, $$

$$ \text{MAPE}=\frac{1}{N}\sum_{i=1}^{N}\frac{\left|y_i-\widehat{y}_i\right|}{y_i}\times 100\%, $$

$$ \text{SMAPE}=\frac{1}{N}\sum_{i=1}^{N}\frac{\left|y_i-\widehat{y}_i\right|}{\left|y_i+\widehat{y}_i\right|}\times 100\%. $$
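The four error measures translate directly into code. The sketch below follows the formulas as written above, taking y_i as the observed value, ŷ_i as the forecast, and N as the number of test observations (an interpretation based on the surrounding text); the function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(y - f) for y, f in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    n = len(actual)
    return sum(abs(y - f) / y for y, f in zip(actual, predicted)) / n * 100


def smape(actual, predicted):
    """Symmetric MAPE as defined above; undefined if any y + f equals 0."""
    n = len(actual)
    return sum(abs(y - f) / abs(y + f) for y, f in zip(actual, predicted)) / n * 100
```

For observed values [100, 200] and forecasts [110, 180], the absolute errors are 10 and 20, so MAE is 15 and MAPE is 10%. MAPE and SMAPE divide by the observed value or by the sum of observed and forecast values, so days with zero counts would need special handling.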

Meta-feature extraction

Various characteristics of time series based on the univariate time series structure have been investigated [14, 32, 33]. In our study, we considered a set of hand-selected features that describe the characteristics of a time series. Functions available in R software v.4.0.2 were used to extract these characteristics, or meta-features. To summarize the time series structure, 30 characteristics were selected; they are listed in Table 1.
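To make the idea of a meta-feature concrete, the sketch below computes two simple summaries of a series, its mean and its lag-1 autocorrelation. It is only an illustration in Python: the article extracted 30 features with R functions, and the function and key names here are assumptions, not the feature definitions of Table 1.

```python
def series_mean(xs):
    """Arithmetic mean of the series."""
    return sum(xs) / len(xs)


def acf1(xs):
    """Lag-1 autocorrelation of the series."""
    m = series_mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        # Constant series: autocorrelation is undefined; 0.0 is a choice made here.
        return 0.0
    num = sum((xs[t] - m) * (xs[t - 1] - m) for t in range(1, len(xs)))
    return num / denom


def extract_meta_features(xs):
    """Summarize one series as a small feature dictionary (illustrative only)."""
    return {"mean": series_mean(xs), "acf1": acf1(xs), "length": len(xs)}


features = extract_meta_features([1, 2, 3, 4, 5])
```

Each series is reduced to one fixed-length row of such values, which is what allows series of different countries and case types to be compared by a single classifier.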

Table 1 Hand-selected extracted features on time series

Meta-learning

The goal of meta-learning is to “…understand how learning itself can become flexible according to the domain or task under study” [34]. The process of meta-learning transforms the problem space into a feature space. Moreover, the extracted meta-features are applied as input and class labels are applied as the outcome in meta-learners. The class labels are the best forecasting algorithm for each time series [35].

The meta-learner can be any of several supervised machine-learning algorithms [36]. In this study, four machine-learning algorithms, namely decision tree (DT), support vector machine (SVM), artificial neural network (ANN), and random forest (RF), were applied as meta-learners.

DT is a non-parametric machine-learning method. It is a popular tree-based method applied to both classification and regression [33, 37]. The advantage of tree-based methods is that they are flexible, can solve non-linear problems with many dimensions, and keep the model easy to interpret [37]. SVM is a powerful and effective technique [38]. It is a system that learns from data and is applied to classification and regression problems [3]. Another well-known algorithm in the artificial intelligence area is the ANN, which is modeled on biological human neurons [39]. In many fields, ANNs have achieved great success in solving various real-world problems. Moreover, they are recognized as a powerful tool for identifying and exploring the relationship between network inputs (extracted meta-features in the current study) and outputs (created labels in the current study) [39]. The random forest model is one of the most successful ensemble methods [40]. It is a stronger learner than a single decision tree [41].

To perform this study, the various characteristics of each time series were extracted to obtain the dataset for classification analysis. This dataset was then randomly divided into training (80%) and testing (20%) data. Next, the meta-learners were used to predict ARIMA or TBATS as the recommended forecasting model for each time series. In addition, the DT can provide practical rules for prediction tasks.

Results

According to the results (Fig. 2), an increasing and periodic trend was observed in the global trend of confirmed, death, and recovered cases from February 20, 2020, to May 25, 2021. The maximum numbers of infected, death, and recovered cases in the world were 1,498,213, 21,577, and 6,606,167, respectively. For a better representation, the observation value of 6,606,167 was multiplied by 0.01.

Two of the most powerful forecasting models, namely ARIMA and TBATS, were fitted to all of the time series. The RMSE, MAE, MAPE, and SMAPE were used to identify the more appropriate of the two models for each time series. In this way, all 400 series were labeled as either ARIMA or TBATS.

In the next phase, we extracted 30 features as meta-features from the available time series. Finding appropriate time series features for classification is not straightforward, as time series analysis is a complex issue. Thus, the features were selected by hand. These features and their descriptions are summarized in Table 1.

After providing the data frame required for the classification task, in the next phase, we intended to classify time series. For this purpose, SVM, DT, ANN, and RF were applied as meta-learners or classifiers.

Ten-fold cross-validation (k-fold CV with k = 10) was used for hyper-parameter tuning and model evaluation on the training and test datasets using the RMSE, MAE, MAPE, and SMAPE criteria. The model with the smallest prediction error was then selected. The accuracy of the meta-learners is presented in Table 2.
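The fold construction behind k-fold cross-validation can be sketched in a few lines. The example assumes, for illustration, that the folds are built on the 320 training series (80% of 400); the article does not specify this detail, and the function name and seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (training_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(320, k=10))
```

Each series serves as validation data exactly once, and a candidate hyper-parameter setting is scored by averaging its error over the ten validation folds.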

Table 2 Accuracy of classification algorithms

The DT model had a better performance in the classification of time series. The detailed results of the DT classifier, including confusion matrix, accuracy, sensitivity, specificity, etc. are shown in Table 3.

Table 3 Confusion matrix of DT algorithm in the train and test phases

The tree plot is visualized in Fig. 3. The 18 extracted rules are reported in Table 4. Rule 1 shows that if a time series has e_acf1 < -0.0513 and curvature < -70.3295, then ARIMA is an appropriate model for forecasting its future trend with a probability of 0.962. Meanwhile, if a time series has e_acf1 >= -0.0513, nonlinearity >= 0.0931, flat_spots >= 65, linearity < 16070.36, mean < 699.7255, and mean >= 459.6322, then the predicted class is TBATS with a probability of 0.909. Using these characteristics, we can therefore predict TBATS as an appropriate forecasting model for such a time series.
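The two rules quoted above can be written as a small lookup function. The thresholds and probabilities are taken from the text; the function name and the dictionary layout are illustrative assumptions, and the remaining 16 rules of Table 4 are not reproduced, so the function returns None when neither quoted rule applies.

```python
def recommend_model(features):
    """Apply the two decision-tree rules quoted in the text.

    Returns a (model, probability) tuple, or None if neither rule matches.
    """
    if features["e_acf1"] < -0.0513 and features["curvature"] < -70.3295:
        return ("ARIMA", 0.962)
    if (
        features["e_acf1"] >= -0.0513
        and features["nonlinearity"] >= 0.0931
        and features["flat_spots"] >= 65
        and features["linearity"] < 16070.36
        and 459.6322 <= features["mean"] < 699.7255
    ):
        return ("TBATS", 0.909)
    return None


# Hypothetical feature values chosen only to trigger each rule.
arima_case = recommend_model({"e_acf1": -0.2, "curvature": -100.0})
tbats_case = recommend_model(
    {
        "e_acf1": 0.1,
        "curvature": 5.0,
        "nonlinearity": 0.5,
        "flat_spots": 70,
        "linearity": 1000.0,
        "mean": 500.0,
    }
)
```

Expressed this way, a recommendation only requires the meta-features of a new series, without fitting either forecasting model first.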

Table 4 Extracted rules of DT algorithm

Discussion

Today, forecasting is widely applied in many fields, such as marketing, finance, healthcare, etc. An accurate forecast of the future can be very helpful and provide information on efficiency and cost reduction [42]. Machine learning has grown rapidly and dramatically in the fields of medicine and healthcare [43]. Moreover, it has been used in the field of prediction with successful results [43].

Also, it is a modern method including sophisticated algorithms used in time series and forecasting. In fact, machine learning attempts to discover and extract the patterns and concepts embedded in large data and predict the desired target [44].

In this study, the DT algorithm, used as the meta-learner to recommend either the ARIMA or the TBATS forecasting model, achieved an accuracy of 87.50% and 82.50% in the training and test phases, respectively. Two other meta-learners (i.e., SVM and ANN algorithms) had lower accuracy than the DT algorithm in both the training and test phases.

The sensitivity and specificity in the training phase of the DT algorithm were obtained as 86.66% and 88.38%, respectively. In addition, these values in the test phase were 73.33% and 88%, respectively. Thus, the meta-learning approach can predict the appropriate forecasting model (ARIMA and TBATS) with 82.50% accuracy.
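The relationship between these test-phase figures can be checked with a short calculation. The counts below are not read from Table 3; they are back-calculated under the assumption that the test set contains 80 series (20% of 400), and the article does not state here which of ARIMA and TBATS is treated as the positive class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Assumed counts: 30 positive and 50 negative test series (80 in total).
test_metrics = classification_metrics(tp=22, fn=8, tn=44, fp=6)
```

Under these assumed counts, 22 of 30 positives and 44 of 50 negatives are classified correctly, which gives 66 correct out of 80 and is consistent with the reported 73.33% sensitivity, 88% specificity, and 82.50% accuracy.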

It should also be noted that initially four of the strongest statistical models for time series forecasting, including ARIMA, TBATS, ETS (Error Trend and Seasonality, or exponential smoothing), and multiple aggregation prediction algorithm (MAPA) were considered. In the labeling phase, ETS and MAPA had a low frequency and were excluded. Thus, the analyses were performed using the ARIMA and TBATS models.

To the best of our knowledge, this approach has not been applied to recommend a forecasting model based on meta-learning so far. However, many studies in different countries have been conducted to find the best forecasting model with the least forecasting error.

In our previous study [22], the MLP and Holt-Winters models were identified as the appropriate models to forecast the number of confirmed and death cases. The web application for visualizing the results is available at http://shiny.um.ac.ir/jabbarinm/Covid19/.

Some previous studies have concluded that machine-learning models performed better than classical models such as the ARIMA. Yadav et al. applied some models such as the support vector regression (SVR) model to forecast the future number of total, active, and recovered cases. They also compared the results of the proposed method with other well-known regression models such as simple linear regression and polynomial regression [3].

Yang et al. (2020) used the ARIMA models to forecast the number of new confirmed and death cases in Italy [45]. The ANN was applied by Farooq and Bazaz in the five worst-affected states of India. An online incremental learning technique was performed along with the ANN model. They forecasted the future behavior of COVID-19 disease for the coming 30 days [46].

Christie et al. compared three forecasting methods, including ARIMA, single exponential smoothing (SES), and double exponential smoothing (DES) using the MAPE, and RMSE measures. They showed that the ARIMA is the best model for forecasting COVID-19 disease [47].

Rostami-Tabar and Rendon-Sanchez used a simple multiple linear regression model using the calls received in a call center (phone call data) and fitted the ARIMA, ETS, seasonal naive, prophet, and a regression model without call data. They concluded that the simple multiple linear regression model with call data performed better than other models [48]. It is believed that the models used in this study are very accurate with important predictive variables and high predictability. However, the main limitations of this study include the unavailability of more data and effective predictor variables.

Moftakhar et al. used two ANN and ARIMA models to forecast the number of future cases in the coming 30 days in Iran. They concluded that the ARIMA model was a more accurate method [24], which is similar to our results.

The MLP model proposed by Pontoh et al. was identified as an appropriate model for forecasting the numbers of confirmed, death, and recovered cases using cumulative data [4].

Khan, Saeed, and Ali used the daily absolute confirmed, death, and recovered cases in Pakistan from March 8 to June 27, 2020. They fitted a VAR model to forecast new infected cases and new recovered cases in the next 10 days, i.e. on the 3rd of July [49].

It should be noted that it cannot be said with certainty that machine-learning models perform better than other existing models or that classical models perform better than machine-learning models. Each model can have different results over a time window compared to other models. This depends on the type and nature of the data, the circumstances, and the time window under consideration.

This study has some limitations. First, data for some countries were not fully reported and were therefore unusable, so we had to exclude them. Second, even though we considered four of the strongest statistical forecasting models, two of them (ETS and MAPA) were selected as the appropriate model for only a few time series; as a result, we set these two models aside and continued the study with the other two. Third, machine-learning forecasting models were not used alongside the statistical models because of their complexity, time requirements, and cost; therefore, only statistical forecasting models were used. Fourth, various machine-learning algorithms can be used as meta-learners, and only four of them were compared in the current study because the work is extensive and time-consuming. However, the index values showed that the final model (DT) has relatively accurate predictive ability.

Conclusion

In this study, we built a recommender system that selects a forecasting model between ARIMA and TBATS using the meta-learning process. Our results showed that among the four meta-learners, namely SVM, DT, ANN, and RF, the DT algorithm had better predictive accuracy. The DT algorithm, with 87.50% accuracy in the training phase and 82.50% accuracy in the test phase, was therefore the best model and provided some practical rules. These rules recommend one of the two models, ARIMA or TBATS, to forecast health time series data such as the confirmed, death, and recovered COVID-19 cases in each country according to the characteristics of the time series.

Data availability

The dataset used to train and evaluate the models is publicly available at https://ourworldindata.org/coronavirus-source-data/. Additionally, datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

Zhu N, et al. A novel coronavirus from patients with pneumonia in China, 2019. New England journal of medicine; 2020.
Brem A, Viardot E, Nylund PA. Implications of the coronavirus (COVID-19) outbreak for innovation: which technologies will improve our lives? Technol Forecast Soc Chang. 2021;163:120451.
Article Google Scholar
Yadav M, Perumal M, Srinivas M. Analysis on novel coronavirus (COVID-19) using machine learning methods. Volume 139. Chaos, Solitons & Fractals; 2020;110050.
Pontoh RS, et al. Covid-19 modelling in South Korea using a Time Series Approach. Int J Adv Sci Technol. 2020;29(7):1620–32.
Google Scholar
Belhadi A, et al. Manufacturing and service supply chain resilience to the COVID-19 outbreak: lessons learned from the automobile and airline industries. Technol Forecast Soc Chang. 2021;163:120447.
Article Google Scholar
Ballı S. Data analysis of Covid-19 pandemic and short-term cumulative case forecasting using machine learning time series methods. Chaos Solitons Fractals. 2021;142:110512.
Article PubMed Google Scholar
Joi P. How does COVID-19 compare to past pandemics? 2020; Available from: https://www.gavi.org/vaccineswork/how-does-covid-19-compare-past-pandemics.
WHO. Coronavirus disease (COVID-19): Similarities and differences between COVID-19 and Influenza. 2021; Available from: https://www.who.int/news-room/questions-and-answers/item/coronavirus-disease-covid-19-similarities-and-differences-with-influenza#:~:text=Both%20viruses%20share%20similar%20symptoms,COVID%2D19%20can%20be%20fatal.
How does COVID-19 compare to other pandemics (H1N1, Ebola). 2023; Available from: https://www.tulsaspinehospital.com/virtualcare/articles/how-does-covid-19-compare-other-pandemics-h1n1-ebola.
Morris M, et al. Neural network models for influenza forecasting with associated uncertainty using web search activity trends. PLoS Comput Biol. 2023;19(8):e1011392.
Article CAS PubMed PubMed Central Google Scholar
Ristic B, Dawson P. Real-time forecasting of an epidemic outbreak: Ebola 2014/2015 case study. in 2016 19th International Conference on Information Fusion (FUSION). 2016.
Tsan YT et al. The prediction of influenza-like illness and respiratory Disease using LSTM and ARIMA. Int J Environ Res Public Health, 2022;19(3).
Srinivas M, Lin YY, Liao HYM. Deep dictionary learning for fine-grained image classification. in 2017 IEEE International Conference on Image Processing (ICIP). 2017.
Lemke C, Gabrys B. Meta-learning for time series forecasting and forecast combination. Neurocomputing. 2010;73(10):2006–16.
Google Scholar
Makridakis S, Wheelwright S, Hyndman R. Forecasting: Methods and Applications, third ed., John Wiley, New York, 1998. 1998, New York: John Wiley.
Wang X, Smith-Miles K, Hyndman R. Rule induction for forecasting method selection: Meta-learning the characteristics of univariate time series. Neurocomputing. 2009;72(10):2581–94.
Article Google Scholar
Prudêncio R, Ludermir T. Using machine learning techniques to combine forecasting methods. in Australasian Joint Conference on Artificial Intelligence. 2004. Springer.
Malki Z, et al. ARIMA models for predicting the end of COVID-19 pandemic and the risk of second rebound. Neural Comput Appl. 2021;33(7):2929–48.
Article Google Scholar
Malki Z, et al. The COVID-19 pandemic: prediction study based on machine learning models. Environ Sci Pollut Res. 2021;28(30):40496–506.
Article Google Scholar
Malki Z, et al. Association between weather data and COVID-19 pandemic predicting mortality rate: machine learning approaches. Chaos Solitons Fractals. 2020;138:110137.
Article Google Scholar
Khanna A et al. Data Analytics and Management: Proceedings of ICDAM. 2021: Springer.
Talkhi N, et al. Modeling and forecasting number of confirmed and death caused COVID-19 in IRAN: a comparison of time series forecasting methods. Biomed Signal Process Control. 2021;66:102494.
Article Google Scholar
Nishiura H, et al. The rate of Underascertainment of Novel Coronavirus (2019-nCoV) infection: estimation using Japanese passengers data on evacuation flights. J Clin Med. 2020;9(2):419.
Article Google Scholar
Moftakhar L, Seif M, Safe MS. Exponentially Increasing Trend of Infected Patients with COVID-19 in Iran: A Comparison of Neural Network and ARIMA Forecasting Models Iranian Journal of Public Health, 2020;49(Supple 1).
Yonar H, et al. Modeling and forecasting for the number of cases of the COVID-19 pandemic with the curve estimation models, the Box-Jenkins and Exponential Smoothing methods. Eurasian J Med Oncol. 2020;4(2):160–5.
Google Scholar
Ceylan Z. Estimation of COVID-19 prevalence in Italy, Spain, and France. Sci Total Environ. 2020;729:138817.
Article Google Scholar
Papastefanopoulos V, Linardatos P, Kotsiantis S. COVID-19: a comparison of Time Series methods to Forecast percentage of active cases per Population. Appl Sci. 2020;10(11):3880.
Article CAS Google Scholar
Almasarweh M, Alwadi S. ARIMA model in predicting banking stock market data. Mod Appl Sci. 2018;12(11):4.
Google Scholar
Cryer JD, Chan KS. Time Series Analysis: With Applications in R. Vol. 2nd edition. 2008: Springer-Verlag New York.
Grzegorz S. Forecasting Time Series with Multiple Seasonalities using TBATS in Python. 2019; Available from: https://medium.com/intive-developers/forecasting-time-series-with-multiple-seasonalities-using-tbats-in-python-398a00ac0e8a.
De Livera AM, Hyndman RJ, Snyder RD. Forecasting Time Series with Complex Seasonal patterns using exponential smoothing. J Am Stat Assoc. 2011;106(496):1513–27.
Article Google Scholar
Ma S, Fildes R. Retail sales forecasting with meta-learning. Eur J Oper Res. 2021;288(1):111–28.
Article Google Scholar
Tanaka S et al. A clinical prediction rule for predicting a delay in quality of life recovery at 1 month after total knee arthroplasty: a decision tree model. J Orthop Sci, 2020.
Vilalta R, Drissi Y. A Perspective View and Survey of Meta-Learning. Artif Intell Rev. 2002;18(2):77–95.
Article Google Scholar
Ali AR, Gabrys B, Budka M. Cross-domain Meta-learning for time-series forecasting. Procedia Comput Sci. 2018;126:9–18.
Article Google Scholar
Prudêncio RBC, Ludermir TB. Meta-learning approaches to selecting time series models. Neurocomputing. 2004;61:121–37.
Article Google Scholar
Yang L, et al. A regression tree approach using mathematical programming. Expert Syst Appl. 2017;78:347–57.
Article Google Scholar
Yousaf M, et al. Statistical analysis of forecasting COVID-19 for upcoming month in Pakistan. Chaos Solitons Fractals. 2020;138:109926.
Article PubMed PubMed Central Google Scholar
Niazkar HR, Niazkar M. Application of artificial neural networks to predict the COVID-19 outbreak. Global Health Research and Policy. 2020;5(1):50.
Article Google Scholar
Yoon J. Forecasting of real GDP growth using machine learning models: gradient boosting and Random Forest Approach. Comput Econ. 2021;57(1):247–65.
Article Google Scholar
Xue L, et al. A data-driven shale gas production forecasting method based on the multi-objective random forest regression. J Petrol Sci Eng. 2021;196:107801.
Makridakis S. Forecasting: its role and value for planning and strategy. Int J Forecast. 1996;12(4):513–37.
Doupe P, Faghmous J, Basu S. Machine Learning for Health Services Researchers. Value in Health. 2019;22(7):808–15.
Shailaja K, Seetharamulu B, Jabbar MA. Machine Learning in Healthcare: A Review. In: 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA). 2018.
Yang Q, et al. Research on COVID-19 based on ARIMA model—Taking Hubei, China as an example to see the epidemic in Italy. J Infect Public Health. 2020.
Farooq J, Bazaz MA. A deep learning algorithm for modeling and forecasting of COVID-19 in five worst affected states of India. Alexandria Eng J. 2021;60(1):587–96.
Christie N, Basri MH. Personal Protective Equipment Demand Forecasting and Inventory Management during COVID-19 Case Study: Public Hospital at Bandung, Indonesia. In: International Conference on Management, Economics & Finance. 2021.
Rostami-Tabar B, Rendon-Sanchez JF. Forecasting COVID-19 daily cases using phone call data. Appl Soft Comput. 2021;100:106932.
Khan F, Saeed A, Ali S. Modelling and forecasting of new cases, deaths and recover cases of COVID-19 by using Vector Autoregressive model in Pakistan. Chaos Solitons Fractals. 2020;140:110189.

Acknowledgements

The authors would like to thank all the staff who gathered the COVID-19 data, as well as the editors and reviewers. This research (Mehdi Jabbari Nooghabi’s work) was supported by a grant from Ferdowsi University of Mashhad (No. 2/52975).

Funding

No funding.

Author information

Authors and Affiliations

Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran
Nasrin Talkhi
Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran
Narges Akhavan Fatemi & Mehdi Jabbari Nooghabi
Surgical Oncology Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
Ehsan Soltani & Azadeh Jabbari Nooghabi

Authors

Nasrin Talkhi
Narges Akhavan Fatemi
Mehdi Jabbari Nooghabi
Ehsan Soltani
Azadeh Jabbari Nooghabi

Contributions

Nasrin Talkhi, Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran. talkhin3@mums.ac.ir. Contribution: Formal analysis; Investigation; Methodology; Software; Writing – original draft.
Narges Akhavan Fatemi, Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran. n.akhavan_f@yahoo.com. Contribution: Formal analysis; Methodology.
Mehdi Jabbari Nooghabi, Department of Statistics, Ferdowsi University of Mashhad, Mashhad, Iran. jabbarinm@um.ac.ir. Contribution: Writing – review & editing; Methodology; Validation; Supervision.
Ehsan Soltani, Surgical Oncology Research Center, Mashhad University of Medical Sciences, Mashhad, Iran. soltaniE@mums.ac.ir. Contribution: Data curation; Writing – review & editing.
Azadeh Jabbari Nooghabi, Surgical Oncology Research Center, Mashhad University of Medical Sciences, Mashhad, Iran. jabbarinaz@yahoo.com; jabbaria@mums.ac.ir. Contribution: Data curation; Writing – review & editing; Interpretation of results.

Corresponding author

Correspondence to Mehdi Jabbari Nooghabi.

Ethics declarations

Ethics approval and consent to participate

This study was exempt from seeking explicit informed consent as it was a secondary analysis of existing data.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Talkhi, N., Akhavan Fatemi, N., Jabbari Nooghabi, M. et al. Using meta-learning to recommend an appropriate time-series forecasting model. BMC Public Health 24, 148 (2024). https://doi.org/10.1186/s12889-023-17627-y

Received: 05 August 2023
Accepted: 31 December 2023
Published: 10 January 2024
DOI: https://doi.org/10.1186/s12889-023-17627-y
