Source of data
This study obtained and analyzed data from the Demographic and Health Surveys (DHS) across all the participating twelve countries in West Africa these are; Ghana, Benin, Cote d’ Ivoire, Guinea, Liberia, Mali, Niger, Nigeria, Sierra Leone, Burkina Faso, Gambia and Togo. Countries that were included in this study had their DHS conducted between the years 2012 to 2014. Countries that collected their data before 2012 and after 2014 were excluded in this analysis. This study was considered only in West Africa due to its relatively high rates of under-five mortality in the African Continent and the World as a whole. The DHS conducted across some selected countries in the world are based on nationally representative samples from all the participating countries. The collection of this data across West African countries use a similar selection approach via stratified two-stage sampling techniques. The information gathering approach is via structured interviews administered by well-trained research assistants or field workers. Data used in this study were extracted from the child and women’s questionnaire which contains information from 5 years prior to the survey date and questions about maternal factors, birth history and community factors as well. Country specific samples were; Burkina Faso (15,162), Benin (12,290), Cote D’Ivoire (7149), Ghana (5675), Gambia (7798), Guinea (6977), Liberia (6432), Mali (9964), Nigeria (30,540), Niger (12,767), Sierra Leone (8359) and Togo (6581). The overall combined sample is 129,693.
Study outcome
The main outcome for this study was time-to-under-five mortality (death occurring to a child before he/she reaches the age of five) among twelve different countries in the West African sub-Region. Under-five mortality is defined as mortality occurring from the age of zero months to 59 months. Therefore, the dependent variable in this study was defined as “the risk of death occurring between 0 to 59 months period. The outcome variable was thus survival time in months for the children under the age of five. Children who died under the age of five were deemed to have had the event and assigned the number 1. Those who did not die within the period were censored and assigned the number 0. This study allowed us to determine whether factors that have either positive or negative effects in under-fives are similar or different across the region. Women were asked about the age of children, including (month and year) birth of a child born alive, sex of the child and whether the child was still alive or dead. With dead children, information from their mothers regarding age at death was also obtained. Stillbirth or miscarriages were not included in this study.
Study setting
Explanatory variables
The exposure variables that were of interest and considered in the analysis are; mother’s age group (15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49), type of residence (urban, rural), mother’s level of education (No education, Primary, Secondary and Higher), birth status (singleton birth, multiple births), sex of the child (male, female), wealth index (poorest, poorer, middle, rich, richer), birth order (1st - 3rd, 4th - 6th, 7th + children), religion (Christian, Muslim, No religion, other religion and Traditional), place of delivery (home vs health facility), mode of delivery (caesarean section or no caesarean section), weight of the child (small (less than 2.5 kg), average (2.5 kg < =weight < =4.5 kg) and large (> 4.5 kg)) these were birth weights recorded for children 5 years preceding the survey from written records or mother’s recall of the size of the child at birth and preceding birth interval (< 12 months, 11 to 23 months, 24 to 35 months, > 35 months).
Some of the included independent variables in the model were selected based on their significance at the bivariate level (hazard ratios of variables that did not have 1 included in their credible intervals) whilst others were based on recommendation from literature. We checked for a 2-way interaction effect using all possible combinations of the exposure variables via a two-model approach, that is a model with the interaction effect referred to as “full model” and another without the interaction effect called “half model”. After which, we run a likelihood ratio test (in that the half_model was nested within the full_model) and using the p-values with reference to an alpha level of 0.05, the model with interactions was rejected because it was not significant.
Analytical procedure
A Bayesian parametric proportional hazards modeling approach was adopted for this study. We looked at the effects of specifying different models with or without a frailty term on the distribution of under-five mortality rate estimates for each country and the combined data from all the countries. Frailties were modelled according to the number of regions (following DHS classification) of the country and further into whether the respondents were either residing in a rural or urban setting. For instance, Ghana had 10 regions at the time of the survey and so within a region, respondents were either residing in a rural community or urban. Therefore, Ghana had 20 strata. The rest are as follows; Burkina Faso 26, Benin 23, Cote D’ Ivoire 21, Gambia 14, Guinea 15, Liberia 10, Mali 11, Nigeria 12, Niger 15, Sierra Leone 8 and Togo 11 strata. The frailty was specified to control for the heterogeneity between residence and across regions [13]. We specified three different distributional (the exponential, Weibull and Gompertz) forms for the hazard function in two different dimensions. One dimension assumes that community level variations are constant or do not vary and therefore, there is no heterogeneity between groups. The second dimension assumes no heterogeneity within (clusters or community) groups but between (clusters or community) groups and so a shared (gamma) frailty model is specified. Therefore, in the first set of the models, it is assumed that community level effects are not of particular interest and therefore the data follows either the standard exponential, Weibull or Gompertz regression model. In the second stage, we assumed a variation between communities and therefore made use of a frailty term to account for the variations using the parametric proportional hazards model framework as specified above.
The Cox proportional hazards regression is one of the popular statistical models used in analyzing censored survival data. The Cox model does not assume any specific form of the baseline hazard function, as an alternative to the Cox model, one can make assumptions about the shape of the underlying hazard function by using a parametric model; parametric models directly estimate absolute effects in addition to relative effects [28]. The hazard function is often of fundamental interest since it represents an important aspect of the time course of the disease in question [29]. Due to our interest in estimating whether the hazards of death in under-fives among the twelve countries is either decreasing, increasing or constant, we made use of only parametric proportional regression models. One of the advantages of the parametric models is that, there are better fit models over Cox when the shape of the hazard is known.
There were six models specified for this work. The first three were Bayesian regression models (exponential, Weibull and Gompertz) specified and fitted with the assumption that community heterogeneity (frailty) was insignificant. The second three Bayesian regression models, same as above, include a gamma shared frailty term with the assumption of a significant unobserved effect (presence of heterogeneity). Analysis were carried out on each model via the Bayesian approach for all the data sets. Comparison of the models were carried out using the deviance information criteria (DIC) and the Bayes factors (BF). The DIC is the Bayesian version of the frequentist AIC and BIC. It has two components, the goodness of fit represented by \( \overline{D}\left(\theta \right) \) and the model complexity term pD. This in effect makes \( DIC=\overline{D}\left(\theta \right)+ pD \). Smaller values of the DIC are more preferable to larger values. The Bayes Factor relies on the expression that, the posterior odds are a product of the prior odds and the BF. If we assume that two models are equally probable, then the posterior odds will be equal to that of the Bayes Factor. Therefore, a model with a Bayes Factor > 1 compared to the other is more preferable. These analyses were carried out using Stata version 15 software.
Test of proportionality under survival analysis
Schoenfeld residual test and a graphical approach were used to test for the proportional hazard’s assumption conditions. The Schoenfeld test hypothesizes that some variables do not vary with time. This hypothesis implies that variables remain constant over the study period and therefore satisfy the proportionality assumption under the PH model.
Models with and without frailty terms
The proportional hazards model without a frailty term
The proportional hazards model specifies that the hazard at some time t for an individual with covariate x can be expressed as
$$ h\left(t|x\right)={h}_0(t)\exp \left(\overset{`}{X}\beta \right) $$
(1)
where h0(t) is the baseline hazard function, X’ represents the vector of covariates, β the regression coefficients and S0(ti) the survival function.
The likelihood function L(D| h0(t), β) that can be expressed in the form of a right censored data (for the under-five mortality) on n number of subjects is
$$ L\left(D|{h}_0(t),\beta \right)={\prod}_{i=1}^n{\left\{{h}_0\left({t}_i\right)\mathit{\exp}\left({\overset{`}{X}}_i\beta \right)\right\}}^{\delta_i}\left({S}_0{\left({t}_i\right)}^{\mathit{\exp}\left({\overset{`}{X}}_i\beta \right)}\right) $$
(2)
The proportional hazards model with a frailty term
In this analysis, we specify a shared frailty model which implies that similar observations within a group have similar characteristics or frailty but these frailties differ between groups. Frailty models in survival analysis account for unobserved heterogeneity that occurs because some observations are more failure-prone and therefore, more “frail” than other observations. We assume that the survival times for say the ith subject (i = 1 . . . n) in the jth group (j = 1 . . . m) is denoted by Tij with an unobserved frailty parameter given as ui (for the jth group). With this, the hazard function for the proportional hazards model is given as
$$ h\left({t}_{ij}|{X}_{ij},{u}_j\right)={h}_0\left({t}_{ij}\right)\exp \left({\overset{`}{X}}_{ij}\beta \right){u}_j $$
(3)
where u1, . . ., um represent the frailty and h0(t), Xij and β are the baseline hazards, vector of covariates and regression coefficients respectively. The uj’s are independently and identically distributed with mean 1 and variance θ. The frailty distribution for each of uj is assumed to be independent gamma following Clayton [12] and given as
$$ {u}_j\sim Gamma\left(\eta, \eta \right),j=1,\dots, m $$
(4)
where η is the unknown variance of uj. We specify the following distribution for the frailty, which is
$$ X\sim Gamma\left(a,b\right)\propto {x}^{a-1}\exp \left(- bx\right),\mathrm{for}\ x>0,a>0\ \mathrm{and}\ b>0 $$
Description of exponential, Weibull and Gompertz distributions
Exponential and Weibull distributions
The exponential distribution is a special case of the Weibull distribution, that is suitable for modeling data with constant hazard. In other words, the hazards of the exponential distribution of an event occurring is constant. The Weibull distribution is more suitable for modeling data with monotone hazard rates that are either increasing or decreasing exponentially with time.
The hazard and survival functions of the Weibull distribution are
$$ h(t)= p\alpha {t}^{p-1} $$
$$ S(t)=\exp \left(-\alpha {t}^p\right) $$
(5)
If p = 1, the hazard and survival function of the Weibull distribution as described in eq. (5) reduces to that of the exponential. The parameter α is known as the scale parameter of the Weibull distribution. This parameter is parametrized for both exponential and Weibull regression models as
$$ {\alpha}_j=\exp \left({X}_j\beta \right) $$
(6)
This expression, eq. (6) is similar to that given in eq. (1). In this case, there is no auxiliary variable for the exponential distribution but for the Weibull which is the shape parameter (p).
Therefore, the proportional hazards models as described in eq. (3) if specified for the exponential and Weibull distributions have their baseline hazards given respectively as
$$ {h}_0(t)=p{t}^{p-1} $$
(7)
where p is the shape parameter estimated from the data.
Gompertz distribution
The Gompertz distribution has been extensively used in the medical field for modeling mortality data. Like the Weibull distribution, the Gompertz is also a two-parameter distribution. The hazard and survival functions of the Gompertz distribution are
$$ h(t)=\alpha\ \exp \left(\gamma t\right) $$
$$ S(t)=\mathit{\exp}\left\{-\alpha \gamma \left(\gamma t-1\right)\right\} $$
(8)
The baseline hazards for the Gompertz regression model is
$$ {h}_0(t)=\exp \left(\upgamma t\right) $$
(9)
where γ is an auxiliary parameter estimated from the data.
When the auxiliary parameter (γ) is positive, its hazard function increases with time but if negative, it decreases with time. It is worth mentioning that if γ is zero, the hazard function is reduced to the exponential.
Bayesian proportional hazards model with/without a frailty term
The posterior probability density function which summarizes our beliefs about a particular parameter is obtained via the Bayes’ rule as
$$ \pi\ \left(\theta |D\right)=\frac{\pi \left(\theta \right)L\left(D|\theta \right)}{\int_{\Theta}\pi \left(\theta \right)L\left(D|\theta \right) d\theta} $$
(10)
Which can be summarized as
$$ \pi \left(\theta |D\right)\propto \pi \left(\theta \right)L\left(D|\theta \right) $$
(11)
Therefore, the posterior distribution can be obtained from eq. (11) as
$$ \pi \left({h}_0(t),\beta |D\right)\propto {\prod}_{i=1}^n{\left\{{h}_0\left({t}_i\right)\mathit{\exp}\left({\overset{`}{X}}_i\beta \right){u}_i\right\}}^{\delta_i}\left({S}_0{\left({t}_i\right)}^{\mathit{\exp}\left({\overset{`}{X}}_i\beta \right)}\right)\pi \left(\beta \right) $$
(12)
where the baseline hazards function h0(ti) as provided in eq. (12) takes the form 1, ptp − 1 and exp(γt) for the exponential, Weibull and Gompertz distribution respectively. We specified normal distribution with mean μ0 = 0 and variance \( {\sigma}_0^2=100 \) as priors for the regression coefficients βs with a probability density function
$$ f\left(x|\ {\mu}_0,{\sigma}_0^2\right)=\frac{1}{\sqrt{2\pi {\sigma}_0^2}}{e}^{-\frac{{\left(x-{\mu}_0\right)}^2\ }{2{\sigma_0}^2}} $$
(13)
In analyzing the frailty parameter (u) via the Bayesian approach, we adopt a gamma distribution with mean = 1 and a variance = 1000 which is a conjugate prior for the hyperparameters η.