Blood donation projections using hierarchical time series forecasting: the case of Zimbabwe’s national blood bank

Abstract

Background

The discrepancy between blood supply and demand requires accurate forecasts of the blood supply at any blood bank. Accurate blood donation forecasting gives blood managers empirical evidence in blood inventory management. The study aims to model and predict blood donations in Zimbabwe using hierarchical time series. The modelling technique allows one to identify, say, a declining donor category, and in that way, the method offers feasible and targeted solutions for blood managers to work on.

Methods

Monthly blood donation data for the period 2007 to 2018, collected from the National Blood Service Zimbabwe (NBSZ), were used. The data were disaggregated by gender and by blood group type within each gender category. The model was validated against actual blood donation data from 2019 and 2020. Model performance was evaluated using the Mean Absolute Percentage Error (MAPE), which revealed notable, and expected, discrepancies only during the Covid-19 pandemic period.

Results

Blood group O had the highest monthly yield mean of 1507.85 and 1230.03 blood units for male and female donors, respectively. The top-down forecasting proportions (TDFP) under ARIMA, with a MAPE value of 11.30, was selected as the best approach and the model was then used to forecast future blood donations. The blood donation predictions for 2019 had a MAPE value of 14.80, suggesting alignment with previous years' donations. However, starting in April 2020, the Covid-19 pandemic disrupted blood collection, leading to a significant decrease in blood donation and hence a decrease in model accuracy.

Conclusions

The gradual decrease in future blood donations exhibited by the predictions calls for blood authorities in Zimbabwe to develop interventions that encourage blood donor retention and regular donations. The Covid-19 pandemic distorted blood donation patterns such that the developed model did not capture the significant drop in blood donations during the pandemic period. Other shocks, such as further pandemics and other disasters, will inevitably affect the blood donation system. Thus, forecasting future blood collections with a high degree of accuracy requires robust mathematical models that can factor in, at short notice, the impact of various shocks to the system.

Introduction

Blood transfusion requirements are on the rise globally as a result of accidents, diseases and advanced surgeries. Zimbabwe often experiences shortages in the majority blood group O during public holiday periods. This is mainly due to the high demand for clinical blood transfusion as a result of a surge in road accidents and injuries during such periods [1]. Type O blood is the most needed blood type in transfusion centres and more than 52% of Zimbabweans are in blood group O [2]. Accurate forecasts of the number of volunteer donors and blood donations help blood service managers to manage their categories of blood inventory and to plan accordingly for the education and recruitment of voluntary non-remunerated blood donors and subsequent blood collection.

In the blood supply chain, forecasting the future blood supply is a critical step in ensuring the adequate availability of safe blood when clinical transfusion is required. Accurate and coherent blood donation forecasting provides blood managers with empirical evidence regarding when to order blood, when to educate and recruit new blood donors, the estimated quantities of each blood group to collect, and the potential donor categories to target.

The blood supply chain is dynamic, and as such, some studies have expressed concern over the potential reduction in blood donations emanating from multiple factors including donor demographical variations [3]. Time series analysis can be used to understand the patterns in blood donation data and help blood managers in predicting future blood donations. This information is useful for minimising volatility in blood stocks and preventing blood stockouts. Understocking blood has detrimental effects on patient safety in the healthcare system, while overstocking results in wasteful discards of outdated blood [4,5,6,7,8,9]. Accurate and coherent forecasts of blood donations are a vital part of the decision support system for blood centre authorities, who need to know the future blood supply given the surge in daily blood demand [10].

The blood supply chain is dependent upon a finite number of donors. This is then aggravated by the fact that blood donation is very irregular and uncertain [11]. Blood donations/demand estimates based on employee opinions, experience, and intuition rather than quantitative models are currently used to determine both the current and future blood provisions in most blood centres globally, and especially in developing countries [12]. The unavailability or non-use of quantitative models in estimating blood donations can indeed cause volatility and uncertainty in the blood supply chain. Without these models, it can be difficult to accurately predict how much blood will be available for clinical transfusions, and this can lead to shortages or excesses of blood in certain areas. The application of quantitative prediction models in a blood bank helps to reduce errors in decision-making about the quantity of blood to be supplied and demanded [13].

With increased demand for blood and blood components against a declining voluntary blood donor pool, improving the availability and safety of the blood supply, and forecasting become vital for sustaining any blood bank to meet its core mandate. Numerous techniques have been applied in time series forecasting in general, such as autoregressive integrated moving average (ARIMA), exponential smoothing (ES), fuzzy systems (FS), artificial neural networks (ANN), logistic regression, support vector machine (SVM) and hierarchical time series forecasting. Hierarchical time series forecasting allows forecasting of time series at different levels of a hierarchical structure whilst preserving the relationships and dependencies within the hierarchy. Furthermore, the forecasts at each hierarchical level are aggregated or disaggregated to give the forecasts at higher or lower levels in the hierarchy.

The correlation between donor-specific characteristics and blood donations can result in huge datasets of time series, which can then be classified into clusters or hierarchies. The essence of hierarchical forecasting in blood donation is derived from the fact that blood donors can be categorised into various clusters such as gender, blood group type, and donor status. Data from the NBSZ indicate that male blood donors constitute about 54% of the donor pool, with female donors accounting for the remaining 46%. Also, the donations are classified according to the ABO blood group system, with blood group O accounting for 54%, blood group A for 24%, blood group B for 18% and blood group AB for 4%.

Hierarchical time series forecasting is effective for hierarchically organised data which can be aggregated and disaggregated at different levels [14]. In the blood supply chain, the total blood donation forecast is required at the top level of the hierarchy for inventory planning, resource allocation and other blood drive logistics. It is possible to create a hierarchical structure that captures the relationships between the different categories of donors. For example, at the highest level of the hierarchy, there may be forecasts for the overall national blood supply. At the next level down, there may be forecasts by donor gender, viz: male and female. At the level below that, there may be forecasts by blood group type, viz: A, B, AB, and O. When these levels based on donor characteristics are not factored in, the forecasts may be incoherent and interest groups less well targeted, so that the demand for a particular blood type in a given area may not be met.

The aim of the study is to use a hierarchical time series forecasting approach to predict blood donation patterns. By using this approach, it is possible to create more accurate and detailed forecasts for blood donations, taking into account the relationships and dependencies between different categories of donors. This can help blood banks to better manage their inventory and ensure that they have an adequate number of units of the right blood group, in a given area, at the right time to meet patient needs. To the best of our knowledge, and considering the available blood supply chain forecasting literature, the application of hierarchical time series to blood supply chain problems has not been investigated, especially in the context of Zimbabwe and Africa. Hierarchical forecasting is a very instrumental statistical technique to support decision-making in most supply chains [15], hence its application in blood donation projections is vital.

Literature review

Hierarchical time series forecasting is relatively new to the forecasting field, but it has gained wider application in recent years [16]. Many phenomena in the real world, such as stock prices, weather, consumer demand, tourism demand and the blood supply system, just to mention a few, can be modelled using hierarchical time series. However, the correlation between different points of the time series makes some of the algorithms less versatile in forecasting [13].

A multivariate time series model based on long short-term memory (LSTM) was used to forecast blood donation and demand during the Covid-19 pandemic at Tehran Blood Centre in Iran [17]. The LSTM is a recurrent neural network-based deep learning model. The study results showed that the forecasting model reduced blood shortage and wastage by 5.5% when compared to existing forecasting methods, such as ARIMA.

Time series models were used to forecast blood donation at a university medical care centre in Portugal [11]. The study developed six models, viz: ETS, Holt-Winters, autoregressive neural networks, ARIMA, double-seasonal Holt-Winters, and exponential smoothing (ES). The study concluded that trend lines of donations were better modelled by different models at different forecasting horizons. However, the ARIMA model outperformed all the other models in generating forecasts, hence the ARIMA model is part of the hierarchical forecasting approach adopted in this study.

The supply of blood at blood centres in Taiwan was forecasted using data from the Taiwan Blood Services Foundation [18]. The authors applied two different forecasting techniques, viz: time series and machine learning. Under time series, they employed autoregressive (AUTOREG), ARMA, ARIMA, seasonal ARIMA (SARIMA), seasonal exponential smoothing model (ESM) and Holt-Winters. Under the machine learning algorithms, they used ANN and multiple regression. The study results showed that time series forecasting methods (seasonal ESM and ARIMA models) generated accurate predictions when compared to machine learning algorithms. Hence, this study adopts the ARIMA and ETS models.

Another study concurred that blood donation is influenced by transfusion demand [19]. The study forecasted red blood cell demand using three time series methods, viz: ARIMA, Holt-Winters and a neural-network-based method. The results showed that a SARIMA model produced accurate forecasts over a shorter time horizon of one year, while ES outperformed the other methods over longer time horizons.

A further study stated that managing blood supply and demand was difficult in most blood banks globally [20]. The authors highlighted the need for accurate and reliable blood supply and demand forecasting models. They conducted a study at the National Health Service Blood and Transplant in England using four different time series methods, which were selected using the minimum mean squared error (MMSE) and weighted least squares error (WLSE). The methods yielded similar results.

A study concluded that there is no single statistical forecasting technique that is universally better and applicable at all times [21]. The authors conducted blood demand forecasting in Finland and the Netherlands. They applied moving averages (MA), ETS, ARIMA, autoregressive neural networks (NNAR), seasonal naïve (SNAIVE), method averaging (AVG), seasonal trend decomposition methods (STL and STLF), dynamic seasonal method (TBATS), dynamic regression (DYNREG), multilayer perceptrons (MLP) and extreme learning machine (ELM). The model performances were compared using mean absolute percentage errors (MAPEs). The results showed that DYNREG performed better than the other approaches in generating forecasts.

Another study emphasised the importance of accurate predictions in blood provision [22]. Conducted at Shirazi blood centre in Iran, it applied ARIMA, ANN and hybrid approaches to forecast the demand for different blood groups. Mean Square Error (MSE) and Mean Absolute Error (MAE) were used to compare and validate the fitted models. The results showed that the ARIMA model outperformed the other models in forecasting accuracy.

A further study forecasted platelet demand in the blood supply chain at Canadian Blood Services [23]. The study used five different forecasting methods, viz: ARIMA, Prophet, lasso regression (least absolute shrinkage and selection operator), random forest and LSTM (Long Short-Term Memory) networks. The results showed that with limited data, multivariate models performed better than univariate models. However, with adequate data, ARIMA models produced similar results to multivariate methods. The current study uses both the ARIMA and the ETS models under the hierarchical forecasting approach.

Material and methods

Secondary data used in this study corresponds to the grand total of blood collections (blood units) from the five regional blood centres in Zimbabwe based on specific donor characteristics (gender and blood group). This information is useful for developing a hierarchical structure for forecasting blood donations, as it enables the researchers to consider the relationships and dependencies between different categories of blood group types and blood donors. The data was collected from the NBSZ Laboratory Information Management System (LIMS) and annual reports which are freely available on the link https://nbsz.co.zw/, where certain blood donations information is captured in aggregate form. Monthly blood donation data covering the period 2007 to 2018 was used in the forecasting, giving a total of 144 monthly observations.

Following the approach in [14], a tree diagram of blood donations comprising a two-level hierarchical structure is presented in Fig. 1. The tree diagram is constructed from the disaggregated blood data, categorised according to two variables: gender (Male and Female) and blood group type (A, B, AB and O). Level 0 represents the total blood donations in Zimbabwe. Level 1 denotes the first disaggregation by gender (Male (M) and Female (F)). Level 2 denotes further disaggregation by blood group according to the ABO blood group system (A, B, AB and O).

Fig. 1

Blood donations hierarchical structure based on donor gender and blood group type

The R package hts is used to generate the forecasts using the bottom-up, top-down and optimal combination methods. The ETS and ARIMA methods are used to generate the forecasts for the individual series.

Hierarchical forecasting techniques

According to Fig. 1, level 0 gives the completely aggregated blood donations (total blood donations), denoted by \({Y}_{TB,t}\) where \(t=1, 2, 3, \dots , 144\), which are obtained by adding all the series at level 1 or all the series at level 2. Level 1 represents data disaggregated according to gender (male and female), and level 2 represents data further disaggregated by blood group within each gender. Level 1 and level 2 series are denoted by \({Y}_{i,t}\), where \(i\) denotes the node in the hierarchical tree diagram. Forecasts for each level were estimated using the bottom-up, top-down and optimal combination approaches. The approach with the lowest MAPE was used to generate forecasts for the blood centre.

Let \({{\varvec{Y}}}_{t}\) and \({{\varvec{S}}}_{11\times 8}\) be the vector of the blood data and the summing matrix storing the hierarchical structure shown in Fig. 1, respectively.

$${{\varvec{Y}}}_{t}={\left[{Y}_{TB,t}, {Y}_{M,t}, {Y}_{F,t}, {Y}_{AM,t}, {Y}_{BM,t}, {Y}_{ABM,t}, {Y}_{OM,t}, {Y}_{AF,t}, {Y}_{BF,t}, {Y}_{ABF,t}, {Y}_{OF,t}\right]}^{\prime}$$
(1)
$${\varvec{S}}= \left(\begin{array}{cccccccc}1& 1& 1& 1& 1& 1& 1& 1\\ 1& 1& 1& 1& 0& 0& 0& 0\\ 0& 0& 0& 0& 1& 1& 1& 1\\ 1& 0& 0& 0& 0& 0& 0& 0\\ 0& 1& 0& 0& 0& 0& 0& 0\\ 0& 0& 1& 0& 0& 0& 0& 0\\ 0& 0& 0& 1& 0& 0& 0& 0\\ 0& 0& 0& 0& 1& 0& 0& 0\\ 0& 0& 0& 0& 0& 1& 0& 0\\ 0& 0& 0& 0& 0& 0& 1& 0\\ 0& 0& 0& 0& 0& 0& 0& 1\end{array}\right)$$
(2)

Making use of the summing matrix (\({\varvec{S}}\)) and the vector \({{\varvec{Y}}}_{2,t}\) of the eight bottom-level (level 2) series, Eq. 1 can be written as

$${{\varvec{Y}}}_{t}={\varvec{S}}{{\varvec{Y}}}_{2,t}$$
(3)
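
As an illustration of Eqs. 2 and 3, the summing matrix can be built from the hierarchy in Fig. 1 and applied to one month of bottom-level values. The Python sketch below is not the code used in the study (which relies on R); the helper names and the donation counts are hypothetical and serve only to show how the aggregation works.

```python
# Bottom-level series in the order used in Eq. 1 (this is the column order of S).
BOTTOM = ["AM", "BM", "ABM", "OM", "AF", "BF", "ABF", "OF"]


def summing_matrix():
    """Return the 11 x 8 summing matrix S as a list of rows."""
    total_row = [1] * len(BOTTOM)
    # One row per gender: 1 where the series name ends with that gender letter.
    gender_rows = [[1 if name.endswith(g) else 0 for name in BOTTOM]
                   for g in ("M", "F")]
    # Identity block: each bottom-level series maps to itself.
    identity_rows = [[1 if i == j else 0 for j in range(len(BOTTOM))]
                     for i in range(len(BOTTOM))]
    return [total_row] + gender_rows + identity_rows


def aggregate(S, bottom_values):
    """Compute Y_t = S * Y_{2,t} for one month of bottom-level values."""
    return [sum(s * v for s, v in zip(row, bottom_values)) for row in S]


S = summing_matrix()
# Hypothetical donation counts for one month, ordered as in BOTTOM.
y_bottom = [700, 300, 100, 1500, 600, 250, 90, 1200]
y_all = aggregate(S, y_bottom)  # [total, male, female, then the 8 bottom series]
```

The first three entries of `y_all` are the total, male and female donations, and the remaining eight reproduce the bottom-level values unchanged.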

The bottom-up method

The bottom-up approach forecasts each series at the lowest level of the hierarchy individually and then aggregates these forecasts upwards to generate forecasts for the higher levels. The method is based on first forecasting the individual blood donations for blood group types A, B, AB and O. The total number of blood donations for each gender is calculated by summing the forecasted donations across all blood groups within that gender. Then, by summing the donations for each gender, one obtains the total number of blood donations for the blood bank. In other words, the approach produces individual base forecasts at the lowest level of the hierarchy and combines the forecasts upwards through \({\varvec{S}}\). Thus, the approach starts by producing h-step-ahead forecasts for the individual bottom-level time series (\(n = 8\)):

\({\widehat{Y}}_{AM,h},\) \({\widehat{Y}}_{BM,h},\) \({\widehat{Y}}_{ABM,h},\) \({\widehat{Y}}_{OM,h},\) \({\widehat{Y}}_{AF,h},\) \({\widehat{Y}}_{BF,h},\) \({\widehat{Y}}_{ABF,h},\) and \({\widehat{Y}}_{OF,h}\)

These forecasts are aggregated to get the h-step-ahead forecasts for the higher level (level 1). The level 1 h-step-ahead forecasts (\({\widetilde{Y}}_{M,h}\) and \({\widetilde{Y}}_{F,h}\)) are given by

$${\widetilde{Y}}_{M,h}={\widehat{Y}}_{AM,h}+{\widehat{Y}}_{BM,h}+ {\widehat{Y}}_{ABM,h}+{\widehat{Y}}_{OM,h}$$
(4)
$${\widetilde{Y}}_{F,h}={\widehat{Y}}_{AF,h}+{\widehat{Y}}_{BF,h}+ {\widehat{Y}}_{ABF,h}+{\widehat{Y}}_{OF,h}$$
(5)

The summing matrix (\({\varvec{S}})\) will combine the h-step-ahead forecasts up the hierarchical structure. For the bottom-up approach, the forecasts are combined using the formula:

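$${\widetilde{{\varvec{Y}}}}_{h}={\varvec{S}}{\widehat{{\varvec{Y}}}}_{K,h},$$
(6)

where \(K=2\) denotes the bottom level of the hierarchy (levels \(k=0, 1, 2\)) and \({\widehat{{\varvec{Y}}}}_{K,h}\) is the vector of the eight bottom-level base forecasts.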

The advantage of the bottom-up approach is that no information is lost, since forecasts are generated at the lowest or base level of the hierarchy. The major setbacks of the method are that it performs poorly on highly aggregated data and that it does not take into account the correlations between the series. Also, too many data points at the base level of the hierarchy require more runtime to generate forecasts. The bottom-up method is not effective in the case of complex and multi-layered hierarchies [24]. The time series at the lowest levels often have little structure and are therefore difficult to forecast, and the resulting forecasting errors accumulate as they are aggregated up the hierarchy.
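
The aggregation in Eqs. 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: the function name and the base forecasts are hypothetical, and the gender of each series is read from the last letter of its name.

```python
def bottom_up(base_forecasts):
    """Aggregate bottom-level base forecasts into level 1 and level 0 forecasts.

    base_forecasts maps a series name such as "AM" or "OF" to its
    h-step-ahead base forecast; the last letter of the name is the gender.
    """
    male = sum(v for k, v in base_forecasts.items() if k.endswith("M"))    # Eq. 4
    female = sum(v for k, v in base_forecasts.items() if k.endswith("F"))  # Eq. 5
    coherent = {"Total": male + female, "M": male, "F": female}
    coherent.update(base_forecasts)  # bottom-level forecasts are kept unchanged
    return coherent


# Hypothetical one-step-ahead base forecasts (blood units), for illustration only.
base = {"AM": 720.0, "BM": 310.0, "ABM": 115.0, "OM": 1500.0,
        "AF": 590.0, "BF": 260.0, "ABF": 95.0, "OF": 1230.0}
forecasts = bottom_up(base)
```

Because every upper-level forecast is a plain sum of bottom-level forecasts, the result is coherent by construction, but any error in a bottom-level forecast is carried into the gender and total forecasts.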

The top-down method

This method forecasts the highest level of the hierarchy first and then splits the forecast to generate estimates for the lower levels using a set of proportions or factors. These proportions include average historical proportions, proportions of the historical averages and forecast proportions [16, 25]. Historical data is used in the calculation of the proportions and the approach has the ability to yield reliable forecasts for the aggregate levels [26]. The average historical proportions formula is:

$${p}_{i}= \frac{1}{N}\sum_{t=1}^{N}\frac{{Y}_{i,t}}{{Y}_{t}}$$

where \(i=1, 2, \dots , {m}_{k}.\) According to [26], each proportion is the average of the historical proportions of a bottom-level series relative to the aggregated series (\({Y}_{t})\) over \(t=1, 2, 3, \dots , N\) \(\left(N=144\right).\) Under the forecast proportions method, the proportions are instead computed from the base forecasts. Taking the bottom-level series \({Y}_{OF,t}\) in Fig. 1 as an example, we have

$${p}_{OF}=\left(\frac{{\widehat{y}}_{OF,t}}{{\widehat{S}}_{F,t}}\right)\left(\frac{{\widehat{y}}_{F,t}}{{\widehat{S}}_{Total,t}}\right)$$

where \({\widehat{S}}_{Total,t}={\widehat{y}}_{M,t}+ {\widehat{y}}_{F,t}\) and \({\widehat{S}}_{F,t}={\widehat{y}}_{AF,t}+{\widehat{y}}_{BF,t}+{\widehat{y}}_{ABF,t}+{\widehat{y}}_{OF,t}.\)
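
The two kinds of proportions above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the hts implementation: the function names and all numbers are hypothetical, and the hierarchy is assumed to be the two-level gender and blood group structure of Fig. 1.

```python
def average_historical_proportions(bottom_history, total_history):
    """p_i = (1/N) * sum_t Y_{i,t} / Y_t for each bottom-level series i."""
    n = len(total_history)
    return {
        name: sum(y / tot for y, tot in zip(series, total_history)) / n
        for name, series in bottom_history.items()
    }


def forecast_proportions(base):
    """Forecast proportions for the gender -> blood group hierarchy.

    base maps every node except the total ("M", "F", "AM", ..., "OF")
    to its base forecast for a single horizon.
    """
    s_total = base["M"] + base["F"]  # sum of the level 1 base forecasts
    props = {}
    for gender in ("M", "F"):
        children = [k for k in base if len(k) > 1 and k.endswith(gender)]
        s_gender = sum(base[c] for c in children)  # sum of that node's children
        for child in children:
            props[child] = (base[child] / s_gender) * (base[gender] / s_total)
    return props


# Hypothetical base forecasts; the male node (2600) deliberately differs
# from the sum of its children (2650) to show that coherence is not required.
base = {"M": 2600.0, "F": 2200.0,
        "AM": 700.0, "BM": 320.0, "ABM": 110.0, "OM": 1520.0,
        "AF": 600.0, "BF": 250.0, "ABF": 90.0, "OF": 1260.0}
props = forecast_proportions(base)
total_forecast = 4700.0  # hypothetical base forecast of total donations
top_down = {name: total_forecast * p for name, p in props.items()}
```

The proportions sum to one, so the disaggregated forecasts in `top_down` add back up to the total forecast.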

The advantage of the method is that it provides reliable forecasts for the higher levels in the hierarchy and is useful when the lower-level series are noisy and difficult to forecast. The major setback of the method is that there is a general loss of information, resulting in less accurate forecasts at the base or lower levels of the hierarchy [27].

Optimal combination method

Hyndman et al. [14] proposed an optimal combination approach for forecasting that utilises all the available information and combinations in a hierarchy. This approach involves making independent forecasts at all levels, which are then reconciled using a linear regression model. The resulting forecasts are coherent and based on weights obtained by solving a system of equations that respect the relationships between the different levels of the hierarchy. The method estimates the unknown expected future values of the series at the lowest level, K, of the hierarchy. These unknown means are collected in the vector \({{\varvec{\beta}}}_{n}(h)\), thus,

$${{\varvec{\beta}}}_{n}(h)=E[{{\varvec{Y}}}_{k,n+h}|{{\varvec{Y}}}_{1}, {{\varvec{Y}}}_{2},\dots ,{{\varvec{Y}}}_{n}]$$

Here \({{\varvec{Y}}}_{t}\) represents the vector of all observations at time t, while \({{\varvec{Y}}}_{k,n+h}\) represents the vector of observations at the bottom level K. The base forecasts (\({\widehat{{\varvec{Y}}}}_{n}\left(h\right))\) are written in regression form to give:

$${\widehat{{\varvec{Y}}}}_{n}\left(h\right)= {\varvec{S}}{{\varvec{\beta}}}_{n}(h)+{{\varvec{\varepsilon}}}_{h}$$

where \({{\varvec{\varepsilon}}}_{h}\) denotes a white noise process with covariance matrix \({{\varvec{\Sigma}}}_{h}\), which is difficult to find in large hierarchies [26]. However, [14] proposed approximating the white noise process by the forecast error in the bottom level, thus \({{\varvec{\varepsilon}}}_{h}\approx {\varvec{S}}{{\varvec{\varepsilon}}}_{k,h}\). Under this assumption, the errors satisfy the same aggregation constraint as the dataset, resulting in

$${{\varvec{\Sigma}}}_{h}={\varvec{S}}\,{\text{Var}}({{\varvec{\varepsilon}}}_{k,h}){{\varvec{S}}}^{\prime}$$

The optimal combination approach has a key advantage in that it is capable of producing highly accurate forecasts in comparison to both top-down and bottom-up methods. Additionally, it allows for unbiased forecasts to be generated at all levels while minimising the loss of information. This approach also enables the utilisation of diverse independent forecasting methods, such as ARIMA and ETS, at each level to generate the most accurate forecasts possible. However, one significant drawback of the optimal combination approach is that it can become very complex and computationally intensive when dealing with numerous time series.
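
Under the error assumption stated above, the generalised least squares estimate in [14] reduces to ordinary least squares, so the reconciled forecasts are \({\varvec{S}}({{\varvec{S}}}^{\prime}{\varvec{S}})^{-1}{{\varvec{S}}}^{\prime}{\widehat{{\varvec{Y}}}}_{n}(h)\). The pure-Python sketch below illustrates that calculation on a deliberately tiny hierarchy (one total with two children). The function names and numbers are hypothetical, and the simple linear solver is a stand-in for whatever routine a real package would use.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def reconcile_ols(S, base_forecasts):
    """Return S (S'S)^{-1} S' y_hat, the OLS-reconciled forecasts."""
    St = transpose(S)
    StS = matmul(St, S)
    Sty = [sum(s * y for s, y in zip(row, base_forecasts)) for row in St]
    beta = solve(StS, Sty)  # estimated bottom-level means
    return [sum(s * b for s, b in zip(row, beta)) for row in S]


# Toy hierarchy: total = child 1 + child 2, so S is 3 x 2.
S_small = [[1, 1], [1, 0], [0, 1]]
# Hypothetical incoherent base forecasts: 4 + 5 does not equal 10.
reconciled = reconcile_ols(S_small, [10.0, 4.0, 5.0])
```

After reconciliation the total equals the sum of the two children, with the initial discrepancy of one unit spread across all three series. The same function applies unchanged to the 11 by 8 summing matrix of Eq. 2.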

Forecasting individual series

ETS and ARIMA are the methods used to forecast the individual series. The general ARIMA model can be expressed as

$${Y}_{t}-{\Phi }_{1}{Y}_{t-1}-{\Phi }_{2}{Y}_{t-2}-\dots -{\Phi }_{p}{Y}_{t-p}={a}_{t }+{\Theta }_{1}{a}_{t-1 }+{\Theta }_{2}{a}_{t-2 }+\dots +{\Theta }_{q}{a}_{t-q}$$

where the \(\Phi\)'s and \(\Theta\)'s are model parameters.

\({Y}_{t}\) – is the stationary series,

\({\Phi }_{p}\) – is the coefficient of the pth AR term, where p is the order of the AR term,

\({\Theta }_{q}\) – is the coefficient of the qth MA term, where q is the order of the MA term,

\({a}_{t}\) – is the error term.
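
To make the roles of the AR and MA coefficients concrete, the sketch below computes a one-step-ahead forecast from the equation above, given a stationary series and known coefficients. It is only an illustration: the function name and the inputs are hypothetical, pre-sample values and errors are assumed to be zero, and no parameter estimation or differencing is attempted (the study relies on the default algorithms in R for that).

```python
def arma_one_step(y, phi, theta):
    """One-step-ahead forecast of a stationary ARMA(p, q) series.

    y     : observed stationary series Y_1, ..., Y_n
    phi   : AR coefficients [Phi_1, ..., Phi_p]
    theta : MA coefficients [Theta_1, ..., Theta_q]
    Pre-sample observations and errors are treated as zero (an assumption).
    """
    residuals = []  # a_t = Y_t minus its one-step prediction
    for t in range(len(y) + 1):
        ar = sum(phi[i] * y[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * residuals[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        prediction = ar + ma
        if t == len(y):
            return prediction  # forecast for the first unobserved period
        residuals.append(y[t] - prediction)


# Hypothetical mean-adjusted series and coefficients, for illustration only.
forecast = arma_one_step([2.0, 1.0], phi=[0.5], theta=[0.4])
```

Each prediction combines the most recent observations, weighted by the \(\Phi\)'s, with the most recent prediction errors, weighted by the \(\Theta\)'s.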

The general forms of the Holt-Winters with permanent constant, linear trend and multiplicative seasonal variations are:

$${\widetilde Y}_t=\alpha\left(\frac{Y_t}{S_{t-s}}\right)+\left(1-\alpha\right)({\widetilde Y}_{t-1}+{\widetilde B}_{t-1}),$$
$${\widetilde B}_t=\beta\left({\widetilde Y}_t-{\widetilde Y}_{t-1}\right)+(1-\beta){\widetilde B}_{t-1},$$
$$S_t=\gamma\left(\frac{Y_t}{{\widetilde Y}_t}\right)+\left(1-\gamma\right)S_{t-s},$$

where the smoothing parameters (\(\alpha ,\) \(\beta\) and \(\gamma\)) take values between 0 and 1. The smoothed series and the seasonal period are denoted \({\widetilde{Y}}_{t}\) and \(s\) respectively. Both the ETS and the ARIMA default algorithms are incorporated in the R package hts. The mean absolute percentage error (MAPE) was used to assess the forecasting performance of the models. The MAPE formula is:

$$MAPE=\frac{100}{m}\sum_{t=1}^{m}\left|\frac{y_t-{\widehat y}_t}{y_t}\right|,$$

where \({y}_{t}\) are the actual blood donation values observed, \({\widehat{y}}_{t}\) are predicted blood donation values by the model and \(m\) is the prediction period.
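
The Holt-Winters recursions and the MAPE defined above can be sketched as two small Python functions. This is a hedged illustration, not the study's code: the function names and inputs are hypothetical, the Holt-Winters function performs a single update and leaves the choice of initial level, trend and seasonal indices to the caller, and the MAPE is expressed in percent (multiplied by 100), which is assumed to be how reported values such as 11.30 are to be read.

```python
def holt_winters_update(y, level_prev, trend_prev, season_lag, alpha, beta, gamma):
    """One update of multiplicative Holt-Winters for a new observation y.

    season_lag is the seasonal index from s periods earlier (S_{t-s}).
    Returns the new level, trend and seasonal index.
    """
    level = alpha * (y / season_lag) + (1 - alpha) * (level_prev + trend_prev)
    trend = beta * (level - level_prev) + (1 - beta) * trend_prev
    season = gamma * (y / level) + (1 - gamma) * season_lag
    return level, trend, season


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical state and observation, for illustration only.
level, trend, season = holt_winters_update(
    y=121.0, level_prev=100.0, trend_prev=2.0, season_lag=1.1,
    alpha=0.5, beta=0.5, gamma=0.5)

# Hypothetical actual and forecast values: each forecast is off by 10%.
error = mape([100.0, 200.0], [90.0, 220.0])
```

Because the MAPE divides by the actual values, it is scale-free and can be compared across series of very different sizes, such as blood groups O and AB, but it is undefined when an actual value is zero.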

Model validation

The disruptions caused by the Covid-19 pandemic altered blood donation patterns, complicating the forecasting of blood donations for the pandemic period. The model validation used actual blood donation data from January 2019 to December 2020. The model's performance was evaluated through the Mean Absolute Percentage Error (MAPE), which suggested alignment with previous years' donations during the pre-pandemic period. However, the MAPE confirmed notable discrepancies between forecasts and observed values during the period of the Covid-19 pandemic.

Results

Data and descriptive statistics

Table 1 gives information on the structure of the hierarchy as depicted in Fig. 1.

Table 1 Hierarchy of blood donations by gender and blood group types

The descriptive statistics of the data are shown in Table 2.

Table 2 Descriptive statistics

Monthly mean blood donations for blood type A were 725.02 and 591.47 for males and females, respectively. Blood group O had the highest monthly mean, as expected, at 1507.85 and 1230.03 for male and female donors respectively. Blood group AB had the lowest mean donations, at 115.09 and 94.69 for male and female donors respectively. The negative kurtosis (platykurtic) indicates lighter tails than a normal distribution, that is, fewer values located in the tails and thus no cases of extreme values or outliers.

The characteristics of the disaggregated blood donations are depicted in Fig. 2.

Fig. 2

Time series plots based on donor gender and blood group type from 2007 – 2018

In Fig. 2, the total blood donations at level 0 exhibit some seasonality. There are no significant variations in the blood donation patterns even though there were some periods of decline in blood donations. At level 1, the male donations (M) surpassed the female donations (F). It is evident from Fig. 2 that blood group O donations (OM and OF) have the highest volumes, followed by blood group A, with blood group AB the lowest. At level 2, male blood group O had a maximum of 2888 units, female blood group O had a maximum of 2269 units and female blood group AB had the lowest maximum of 175 units. Such insights help blood centre authorities to plan for blood donor education and recruitment scheduling, fixed and mobile drives and blood collection, and to meet clinical blood transfusion needs.

Forecasting accuracy evaluation

The MAPE was used to assess the forecasting performance of the models on an out-of-sample basis. Table 3 presents the accuracy measures for both ETS and ARIMA as forecasting methods.

Table 3 Forecast error measures (MAPE)

The average accuracy measures for each model are given in the row named “Average”. Table 3 shows that the TDFP approach produces the smallest MAPE values under the ARIMA forecasting method. The TDFP under ARIMA, with a MAPE of 11.30, is the best and is used to forecast future blood quantities. Table 4 and Fig. 3 show the out-of-sample forecasts for the next 60 months and their graphical display.

Table 4 Out-of-sample future blood forecasts
Fig. 3

Blood donation forecasts from 2019 – 2023 using TDFP under ARIMA

In Fig. 3, future blood donation forecasts are indicated by the dashed/dotted line(s) while the historical data are represented by solid line(s). At level 1, the projections show that male donations are higher than female donations. Similarly, at level 2, the projected donations for blood group O for males (OM) are higher than for their female (OF) counterparts. It is evident from all three levels in Fig. 3 that, based on the projections, future blood donations could remain steady or decline slightly for all the donor categories. This can be attributed to the real problem of a continuous decline in the number of regular voluntary blood donors in most blood centres. Low blood donations for blood group AB are projected to continue in the short to long term. This calls for the development of sound policies and interventions in blood donor and blood management (Table 5).

Table 5 Actual and forecast blood donations for 2019 and 2020

The blood donation predictions for 2019 had a MAPE value of 14.80, suggesting alignment with previous years' donations. However, starting in April 2020, the Covid-19 pandemic disrupted blood collection, leading to a significant decrease in blood donation and hence a decrease in model accuracy, and this is then reflected in a high MAPE value of 84.06.

Discussion

The aim of this study was to explore a blood donation forecasting technique that could generate accurate and coherent predictions. The blood donation data in Zimbabwe recorded by the NBSZ was categorised according to donor specific characteristics. These categories gave rise to hierarchical time series forecasting. The forecasts from the approach suggest the need for effective donor education and recruitment drives targeting blood group O donors, since they are universal blood donors and blood type O is always in high demand. These methods are strongly recommended as they give feasible solutions: they capture blood donor data dynamics and produce precise and sensible forecasts.

Previous studies have attributed patterns in blood supply to socio-demographic characteristics [28,29,30]. These donor specific characteristics give rise to clusters warranting the application of hierarchical forecasting. Therefore, there is need to make blood donation projections based on blood donor socio-demographic characteristics.

A study by [31] projected future blood donors in Birjand City, Iran using decision trees. Their models yielded poor performance based on the measures of accuracy. They concluded that the trees had numerous disaggregation of the data leading to data overfitting. Results from the current study indicated that the data disaggregation helped in generating accurate and coherent forecasts.

A time series analysis by aggregating blood donation frequency by month was conducted by [32]. The study results showed a stable blood supply for most months except in June and September periods which coincide with religious festivals in Saudi Arabia. The current study results showed seasonality in the donation patterns. The seasonality is linked to public holiday months and school holidays in Zimbabwe during the months of April, August and December each year.

Forecasting blood donation based on blood group prevalence is vital in managing blood supply at a blood bank [33]. Keeping track of dynamic changes in the donation prevalence of different blood groups is important since the distribution of the blood groups varies with time [34, 35]. Also, [36] concluded that blood donors with blood group O had higher frequency of blood donations and a lesser risk of lapsing, leading to the need for high blood donation volumes compared to other blood groups.

The current study shows that blood group O donors contribute the highest donation volumes of all the blood groups. This can be attributed to the fact that blood group O has the highest proportion in the donor population in Zimbabwe (52%). At the same time, blood group O donors are referred to as universal donors because blood type O can be transfused to patients with blood groups A and B in emergencies where there is no time to match blood types. However, current best practice is to transfuse group-specific blood. Such insights help blood centre authorities to plan blood donor education and recruitment, schedule blood drives and blood collections, and meet clinical blood transfusion needs.

The results from the blood donation projections by gender concur with those of other researchers, who found that male donations are consistently higher than female donations [37,38,39]. In the current study, male blood donors likewise had higher mean blood donations than female donors. Males donate more frequently because regulation allows them to donate every 12 weeks, compared to a 16-week interval for female donors. Other researchers have attributed the lower donation volumes of female donors to higher donor lapsing compared to male donors [36, 40].

Women generally donate blood less often than men because of deferrals resulting from iron depletion through menstrual blood loss.
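The effect of the 12-week and 16-week donation intervals on donation frequency can be worked out with simple arithmetic. The short sketch below assumes a 52-week year and a donor who returns as soon as regulation allows; the function name is hypothetical.

```python
WEEKS_PER_YEAR = 52  # assumption: a year is treated as 52 weeks


def max_donations_per_year(interval_weeks):
    """Long-run upper bound on donations per year for a minimum interval."""
    if interval_weeks <= 0:
        raise ValueError("interval_weeks must be positive")
    return WEEKS_PER_YEAR / interval_weeks


male_rate = max_donations_per_year(12)    # about 4.33 donations per year
female_rate = max_donations_per_year(16)  # 3.25 donations per year

# Ratio of the two upper bounds: 16 / 12, i.e. roughly one third more.
print(round(male_rate, 2), round(female_rate, 2), round(male_rate / female_rate, 2))
```

Under these assumptions the interval rule alone allows a fully compliant male donor about one third more donations per year than a female donor, before deferrals and lapsing are taken into account.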

The blood donation projections from the study have some clinical implications. Some previous studies have shown that the survival rate of patients transfused with blood from male donors was higher than that of patients transfused with blood from female donors [41]. Therefore, the higher proportion of male donors in the pool is vital in clinical blood transfusion. Also, Zimbabwe often experiences shortages of blood group O. Therefore, the higher proportion of blood group O donors in the projections will help blood authorities in rationalising blood donor education and recruitment to minimise blood shortfalls.

The blood collection trend took a downturn from April 2020, when the government of Zimbabwe introduced Covid-19 lockdown restrictions to reduce the spread of the disease. These measures rendered most blood collection sites inaccessible because the movement of people was restricted. The NBSZ had to rely on community-based and walk-in blood donors, which resulted in a 40% decrease in units of blood collected compared to 2019. The same negative impact of the Covid-19 pandemic can be observed from 2021 up to June 2022. Alternative models could therefore be developed in future studies to account for the impact of pandemics when forecasting blood donations. A time series model with an intervention component would be an ideal candidate, since it focuses on the shock or pulse that follows an event such as a pandemic.
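The idea behind an intervention model can be sketched with a step dummy that switches on at the start of the shock. The example below is a deliberately simplified illustration: it estimates a level shift by ordinary least squares on a 0/1 dummy and ignores the ARIMA noise structure that a full intervention analysis would include. All names and numbers are hypothetical; the series is invented so that the drop mirrors the 40% decrease reported above.

```python
def step_dummy(n, start):
    """Step intervention: 0 before index `start`, 1 from `start` onward."""
    return [1 if t >= start else 0 for t in range(n)]


def pulse_dummy(n, at):
    """Pulse intervention: 1 only at index `at`, 0 elsewhere."""
    return [1 if t == at else 0 for t in range(n)]


def level_shift(series, dummy):
    """OLS fit of y_t = mu + omega * dummy_t + e_t for a 0/1 dummy.

    With a single binary regressor, mu is the mean of the periods where
    the dummy is 0 and omega is the difference between the two means.
    """
    pre = [y for y, d in zip(series, dummy) if d == 0]
    post = [y for y, d in zip(series, dummy) if d == 1]
    if not pre or not post:
        raise ValueError("dummy must contain both 0 and 1 periods")
    mu = sum(pre) / len(pre)
    omega = sum(post) / len(post) - mu
    return mu, omega


# Hypothetical monthly collections: six months before and six after a shock.
series = [5000.0, 5100.0, 4900.0, 5000.0, 5050.0, 4950.0,
          3000.0, 3100.0, 2900.0, 3000.0, 3050.0, 2950.0]
dummy = step_dummy(len(series), start=6)

mu, omega = level_shift(series, dummy)
percent_change = 100.0 * omega / mu
print(mu, omega, percent_change)
```

A step dummy suits a lasting change such as a prolonged lockdown, whereas `pulse_dummy` would represent a one-off disruption confined to a single month.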

Conclusion

The discrepancy between blood supply and demand, and the perishability of blood and blood components, can be alleviated somewhat through accurate forecasts of the blood supply at any blood bank. Such accurate and coherent forecasts help guard against the risks of understocking and overstocking blood, a scarce and perishable resource. Thus, accurate statistical forecasting methods play a significant role in future blood donation projections. The top-down, bottom-up and optimal combination approaches were adopted in the study, each with its own merits and demerits. The ETS and ARIMA methods were used to generate the forecasts. The TDFP approach under ARIMA, which had the smallest MAPE, was considered the best and was then used to forecast future blood donations.
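To make the top-down forecast proportions idea and the MAPE criterion concrete, the sketch below disaggregates a top-level forecast through a gender and blood group hierarchy. It follows one common formulation of forecast proportions, in which each node's share is its base forecast divided by the sum of its siblings' base forecasts. The study's actual implementation is not shown in this section, so the function names, the fixed base forecasts (standing in for ARIMA output) and all figures are assumptions.

```python
from math import isclose


def mape(actual, forecast):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def tdfp(total_forecast, gender_base, group_base):
    """Disaggregate a top-level forecast using proportions of base forecasts.

    At each level a node's share is its own base forecast divided by the
    sum of the base forecasts of all nodes that share its parent.
    """
    revised = {}
    gender_sum = sum(gender_base.values())
    for gender, groups in group_base.items():
        gender_forecast = total_forecast * gender_base[gender] / gender_sum
        group_sum = sum(groups.values())
        for group, base in groups.items():
            revised[(gender, group)] = gender_forecast * base / group_sum
    return revised


# Hypothetical one-month base forecasts (blood units); not NBSZ figures.
total_forecast = 5000.0
gender_base = {"male": 2900.0, "female": 2300.0}
group_base = {
    "male": {"O": 1500.0, "A": 700.0, "B": 550.0, "AB": 100.0},
    "female": {"O": 1200.0, "A": 600.0, "B": 450.0, "AB": 80.0},
}

revised = tdfp(total_forecast, gender_base, group_base)

# Coherence: the bottom-level forecasts add up to the top-level forecast.
assert isclose(sum(revised.values()), total_forecast)

# MAPE on a toy pair of series: both errors are 10% of the actual value.
print(round(mape([100.0, 200.0], [110.0, 180.0]), 2))
```

The base forecasts at the lower levels need not be coherent with each other; only their proportions are used, which is why the revised bottom-level forecasts always sum to the top-level figure.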

Forecasts of future blood donations indicated a slight decrease in total blood donations. This suggests the need for blood centre authorities to develop sound blood donor management interventions. Such interventions include an integrated strategy covering the entire blood safety value chain, including donor education, targeted recruitment and retention, scheduled fixed and mobile blood donation drives, safe blood collection and donor care, and adequate resource allocation.

Study results showed that blood donations from blood group O donors have the highest volumes of donations compared to the other blood groups. Also, blood donations by the male gender are higher than donations by their female counterparts. These trends are attributed to the higher proportions of donors in these categories.

This study will contribute to the body of knowledge on the adoption of coherent and accurate hierarchical forecasting methods in ensuring an adequate and safe blood supply chain in a low-resource setting like Zimbabwe.

This study has potential limitations. The lack of prior research on the topic limited the scope of the current study. The impact of the Covid-19 pandemic distorted the blood donation patterns such that the developed model did not capture the significant drop in blood donations during the pandemic period. Other shocks, such as a surge in global pandemics and other disasters, will inevitably affect the blood donation system, which means that future blood supplies remain under threat. Thus, forecasting future blood collections with a high degree of accuracy requires robust mathematical models which factor in the impact of various shocks to the system at short notice. Door-to-door blood donation drives are not out of the question in such instances.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author and the National Blood Service Zimbabwe upon reasonable request.

Abbreviations

NBSZ: National Blood Service Zimbabwe

ETS: Error-Trend-Seasonality

ARIMA: Autoregressive Integrated Moving Average

MAPE: Mean Absolute Percentage Error

TDFP: Top-Down Forecasting Proportions

FS: Fuzzy Systems

ANN: Artificial Neural Networks

SVM: Support Vector Machine

HTS: Hierarchical Time Series

MSE: Mean Square Error

MAE: Mean Absolute Error

References

  1. Muleya T. (2021, December 24). Blood group ‘O’ in short supply. The Herald. https://www.herald.co.zw/blood-group-o-in-short-supply/.

  2. Moyo-Ndlovu T. (2022, January 19). Blood Group O in short supply: NBSZ. https://www.chronicle.co.zw/blood-group-o-in-short-supply-nbsz/.

  3. An M-W, Reich NG, Crawford SO, Brookmeyer R, Louis TA, Nelson KE. A Stochastic Simulator of a Blood Product Donation Environment with Demand Spikes and Supply Shocks. PLoS ONE. 2011;6(7):e21752.

  4. Mansur A, Vanany I, Indah AN. Blood Supply Chain Challenges: Evidence from Indonesia. 2019.

  5. Maeng J, Sabharwal K, Ülkü MA. Vein to vein: exploring blood supply chains in Canada. Journal of Operations and Supply Chain Management. 2018;11(1):1–13. ISSN 1984-3046. http://bibliotecadigital.fgv.br/ojs/index.php/joscm/article/view/62179. Accessed 19 Feb 2020. https://doi.org/10.12660/joscmv11n1p1-13.

  6. Najafi M, Ahmadi A, Zolfagharinia H. Blood inventory management in hospitals: Considering supply and demand uncertainty and blood transhipment possibility. Oper Res Health Care. 2017;15:43–56.

  7. Pierskalla WP. Supply chain management of blood banks. In Operations research and health care (pp. 103–145). 2005. Springer, Boston, MA.

  8. Fortsch SM, Khapalova EA. Reducing uncertainty in demand for blood. Oper Res Health Care. 2016;9:16–28.

  9. Hosseinifard Z, Abbasi B, Fadaki M, Clay NM. Post disaster Volatility of Blood Donations in an Unsteady Blood Supply Chain. Decis Sci. 2020. https://doi.org/10.1111/deci.12381.

  10. Alajrami E, Abu-Nasser BS, Khalil AJ, Musleh MM, Barhoom AM, Naser SA. Blood donation prediction using artificial neural network. Int J Acad Eng Res. 2019;3(10):1–7.

  11. Bischoff F, Koch MC, Rodrigues PP. Predicting Blood Donations in a Tertiary Care Centre Using Time Series Forecasting. Studies in health technology and informatics vol. 258; 2019, 135–139.

  12. Williamson LM, Devine DV. Challenges in the management of the blood supply. Lancet. 2013;381(9880):1866–75. https://doi.org/10.1016/S0140-6736(13)60631-5.

  13. Gökler SH, Boran H. Prediction of Demand for Red Blood Cells Using Artificial Intelligence Methods. Academic Platform Journal of Engineering and Smart Systems 10(2), 86–93, 2022; https://doi.org/10.21541/apjess.1078920.

  14. Hyndman RJ, Ahmed RA, Athanasopoulos G, Shang HL. Optimal combination forecasts for hierarchical time series. Comput Stat Data Anal. 2011;55(9):2579–89.

  15. Abolghasemi M, Tarr G, Bergmeir C. Machine learning applications in hierarchical time series forecasting: Investigating the impact of promotions. Int J Forecast. 2022. https://doi.org/10.1016/j.ijforecast.2022.07.004.

  16. Athanasopoulos G, Ahmed RA, Hyndman RJ. Hierarchical forecasts for Australian domestic tourism. Int J Forecast. 2009;25(1):146–66.

  17. Shokouhifar M, Ranjbarimesan M. Multivariate time-series blood donation/demand forecasting for resilient supply chain management during COVID-19 pandemic. Cleaner Logistics and Supply Chain. 2022;5:100078. https://doi.org/10.1016/j.clscn.2022.100078.

  18. Shih H, Rajendran S. Comparison of Time Series Methods and Machine Learning Algorithms for Forecasting Taiwan Blood Services Foundation’s Blood Supply. J Healthc Eng. 2019;2019:6123745. https://doi.org/10.1155/2019/6123745. (PMID:31636879;PMCID:PMC6766103).

  19. Pereira A. Performance of time-series methods in forecasting the demand for red blood cell transfusion. Transfusion. 2004;44(5):739–46. https://doi.org/10.1111/j.1537-2995.2004.03363.x.

  20. Nandi AK, Roberts DJ, Nandi AK. Prediction paradigm involving time series applied to total blood issues data from England. Transfusion. 2020;60(3):535–43. https://doi.org/10.1111/trf.15705.

  21. Turkulainen EV, Wemelsfelder ML, Janssen MP, Arvas M. A robust autonomous method for blood demand forecasting. Transfusion. 2022;62(6):1261–8. https://doi.org/10.1111/trf.16870.

  22. Sarvestani SE, Hatam N, Seif M, Kasraian L, Lari FS, Bayati M. Forecasting blood demand for different blood groups in Shiraz using auto regressive integrated moving average (ARIMA) and artificial neural network (ANN) and a hybrid approaches. Sci Rep. 2022;12(1):22031. https://doi.org/10.1038/s41598-022-26461-y.

  23. Motamedi M, Li N, Down D, Heddle N. Demand Forecasting for Platelet Usage: from Univariate Time Series to Multivariate Models; 2021.

  24. Dangerfield BJ, Morris JS. Top-down or bottom-up: Aggregate versus disaggregate extrapolations. Int J Forecast. 1992;8(2):233–41.

  25. Gross CW, Sohl JE. Disaggregation methods to expedite product line forecasting. J Forecast. 1990;9(3):233–54.

  26. Morgan L. Forecasting in Hierarchical models. 2015.

  27. Pennings CL, van Dalen J. Integrated hierarchical forecasting. Eur J Oper Res. 2017;263(2):412–8.

  28. Shenga N, Thankappan K, Kartha C, Pal R. Analyzing sociodemographic factors amongst blood donors. J Emerg Trauma Shock. 2010;3(1):21–5. https://doi.org/10.4103/0974-2700.58667.

  29. Burgdorf KS, Simonsen J, Sundby A, Rostgaard K, Pedersen OB, et al. Socio-demographic characteristics of Danish blood donors. PLoS ONE. 2017;12(2):e0169112. https://doi.org/10.1371/journal.pone.0169112.

  30. Greinacher A, Fendrich K, Hoffmann W. Demographic changes: the impact for safe blood supply. ISBT Science Series. 2010;5:239–43. https://doi.org/10.1111/j.1751-2824.2010.01377.x.

  31. Ashoori M, Alisade S, Hosseiny Eivary HS, Hosseiny Eivary SS. A model to predict the sequential behaviour of healthy blood donors using data mining; J Research Health, Early View 10 Jan 2015.

  32. Alkahtani A, Jilani M. Predicting Return Donor and Analyzing Blood Donation Time Series using Data Mining Techniques. International Journal of Advanced Computer Science and Applications. 2019; 10. https://doi.org/10.14569/IJACSA.2019.0100816.

  33. Ferguson E, Murray C, O’Carroll RE. Blood and organ donation: health impact, prevalence, correlates, and interventions. Psychol Health. 2019;34:1073–104. https://doi.org/10.1080/08870446.2019.1603385.

  34. Piersma TW, Bekkers R, de Kort W, Merz EM. Blood donation across the life course: the influence of life events on donor lapse. J Health Soc Behav. 2019;60:257–72. https://doi.org/10.1177/0022146519849893.

  35. Debele GJ, Fita FU, Tibebu M. Prevalence of ABO and Rh Blood Group Among Volunteer Blood Donors at the Blood and Tissue Bank Service in Addis Ababa, Ethiopia. J Blood Med. 2023;14:19–24. https://doi.org/10.2147/JBM.S392211.

  36. Weidmann C, Müller-Steinhardt M, Schneider S, Weck E, Klüter H. Characteristics of Lapsed German Whole Blood Donors and Barriers to Return Four Years after the Initial Donation. Transfus Med Hemother. 2012 Feb;39(1):9–15. https://doi.org/10.1159/000335602. Epub 2011 Dec 23. PMID: 22896761; PMCID: PMC3388618.

  37. Bakhos J, Khalife M, Teyrouz Y, Saliba Y. Blood Donation in Lebanon: A Six-Year Retrospective Study of a Decentralized Fragmented Blood Management System. Cureus. 2022;14(2):e21858. https://doi.org/10.7759/cureus.21858.

  38. Bani M, Giussani B. Gender differences in giving blood: a review of the literature. Blood Transfus. 2010;8(4):278–87. https://doi.org/10.2450/2010.0156-09.

  39. Kasraian L, Esfahani SA, Foruozandeh H. Reasons of under-representation of Iranian women in blood donation. Haematol Transfus Cell Ther. 2021;43(3):256–62. https://doi.org/10.1016/j.htct.2020.03.009.

  40. Germain M, Glynn SA, Schreiber GB, Gélinas S, King M, Jones M, Bethel J, Tu Y. Determinants of return behavior: a comparison of current and lapsed donors. Transfusion. 2007;47(10):1862–70. https://doi.org/10.1111/j.1537-2995.2007.01409.x. (PMID: 17880613).

  41. Siekierska B, Tomaszek L, Kurleto P, Turkanik E, Mędrzycka-Dąbrowska W. Blood donation practice and its associated factors among Polish population: secondary data analysis. Front Public Health. 2023;11:1251828. https://doi.org/10.3389/fpubh.2023.1251828.

Acknowledgements

The authors would like to extend their sincere gratitude to the NBSZ staff for their critical role in facilitating access to the data used in this study.

Funding

This study received no funding.

Author information

Contributions

CC contributed towards the study conceptualisation and design, literature, data collection and analysis, results interpretation and the manuscript write-up. TM assisted in data analysis. DC reviewed and corrected misconceptions and approved the manuscript for submission.

Corresponding author

Correspondence to Coster Chideme.

Ethics declarations

Ethics approval and consent to participate

The blood donation data were used in this study as approved by the General/Human Research Ethics Committee (GHREC) of the University of the Free State, South Africa (Ethical Clearance number: UFS-HSD2023/1370). Furthermore, permission to use the data was granted by the NBSZ authorities. There was no direct interaction with the individual blood donors, and the identity of the blood donors remained anonymous; only identification numbers were used for each donor.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Chideme, C., Chikobvu, D. & Makoni, T. Blood donation projections using hierarchical time series forecasting: the case of Zimbabwe’s national blood bank. BMC Public Health 24, 928 (2024). https://doi.org/10.1186/s12889-024-18185-7

  • DOI: https://doi.org/10.1186/s12889-024-18185-7

Keywords