Understanding the determinants of under-five child mortality in Uganda including the estimation of unobserved household and community effects using both frequentist and Bayesian survival analysis approaches

Background Infant and child mortality rates are among the health indicators of importance in a given community or country. It is the fourth millennium development goal that by 2015, all the United Nations member countries are expected to have reduced their infant and child mortality rates by two-thirds. Uganda is one of those countries in Sub-Saharan Africa with high infant and child mortality rates, therefore it is important to use sound statistical methods to determine which factors are strongly associated with child mortality which in turn will help inform the design of intervention strategies Methods The Uganda Demographic Health Survey (UDHS) funded by USAID, UNFPA, UNICEF, Irish Aid and the United Kingdom government provides a data set which is rich in information on child mortality or survival. Survival analysis techniques are among the well-developed methods in Statistics for analysing time to event data. These methods were adopted in this paper to examine factors affecting under-five child mortality rates (UMR) in Uganda using the UDHS data for 2011 in R and STATA software. Results Results obtained by fitting the Cox-proportional hazard model with frailty effects and drawing inference using both the frequentists and Bayesian approaches at 5 % significance level, show evidence of the existence of unobserved heterogeneity at the household level but there was not enough evidence to conclude the existence of unobserved heterogeneity at the community level. Sex of the household head, sex of the child and number of births in the past one year were found to be significant. The results further suggest that over the period of 1990–2015, Uganda reduced its UMR by 52 % . Conclusion Uganda has not achieved the MDG4 target but the 52 % reduction in the UMR is a move in the positive direction. Demographic factors (sex of the household head) and Biological determinants (sex of the child and number of births in the past one year) are strongly associated with high UMR. Heterogeneity or unobserved covariates were found to be significant at the household but insignificant at the community level.


Background
Infant and child mortality rates are important indicators of societal and national development as they serve as key markers of health equity and access [1,2]. Despite huge investments by national governments and developmental partners in improving access to health care, the reduction of infant and child mortality rates by two-thirds between 1990-2015 as stipulated in the fourth Millennium Development Goal (MDG4) [3,4] has not been attainable within low and middle income countries [5].
In response to the MDG4 most countries in the Sub-Saharan Africa region have instituted mechanisms and policies aimed at addressing weaknesses in their health systems and engaging policy makers to look at inequalities in outcomes. Despite these measures, most countries in the region have not met the MDG4 target [6]. Uganda in East Africa is a low income Sub-Saharan African country with a high UMR. The UMR for the five years immediately preceding the 2011 Uganda Demographic and Health survey (corresponding roughly to 2006-2010) was reported to be 90 deaths per 1,000 live births [7]. Previous studies have not suggested declines in UMR in Uganda. Over the period 1995-2000, the UMR increased from 147.3 to 151.5 deaths per 1000 live births. There is further evidence in literature that the UMR remained unchanged in the period of 1991-1995 but declined in the period of 2001-2005 to 125 deaths per live births. These disconcerting figures can be attributed to numerous factors, key among them is the heavy HIV/ AIDS burden in Uganda. In 2011 there were an estimated 1.4 million people living with HIV/AIDS in Uganda, of whom an estimated 190,000 were children. An estimated 62,000 people died from AIDS in 2011 and 1.1 million children have been orphaned by the devastating epidemic in Uganda [8].
In order to develop measures to reduce infant and child mortality rates, an assessment of individual and contextual determinants of child survival is necessary [4]. Based on existing theoretical frameworks, sex of the child, place of residence, birth intervals and maternal education have been identified as significant predictors of child survival. Within the Ugandan context, previous studies suggest a need to identify determinants of child survival so as to design relevant interventions and programs, appropriate to local and national needs [9]. Most studies on child survival in Uganda have employed standard survival methodologies, like the Cox-proportional hazard model, to identify factors associated with child mortality, ignoring unobserved heterogeneity at cluster or household level. This study uses a shared frailty model within the Bayesian Integrated Nested Laplace Approximation (INLA) paradigm [10,11] to investigate determinants of UMR in Uganda.

The data
The data used in this study was collected during the 2011 Uganda Demographic Health Survey (UDHS) which was carried out from May through December 2011 [12]. This was the fifth comprehensive survey conducted in Uganda as part of the world wide demographic and health surveys [13].
A representative sample of 10,086 households was selected during the 2011 UDHS. The sample was selected in two stages. A total of 404 enumeration areas (EAs) were selected from among a list of clusters sampled for the 2009/10 Uganda National Household Survey (2010 UNHS). In the second stage of sampling, households in each cluster were selected from a complete listing of households. Eligible women for the interview were aged between 15-49 years of age who were either usual residents or visitors present in the selected household on the night before the survey. Out of 9,247 eligible women, 8,674 were successively interviewed with a response rate of 94 % (91 % in urban and 95 % in rural areas). The study population for this analysis includes infants born between exactly one and five years preceding the 2011 UDHS; who were the outcomes of singleton deliveries and who either survived the infancy period or not.
Children born to women aged between 15-49 years of age from 4285 households and 404 communities were considered for this analysis. One has to note that we excluded children that died before one month (28 days and below) and it is also important to note that we excluded all births in the year 2011 (the year of the survey). The number of observation at this level was 6,692 representing the number of children dead or alive, born in the period of five years preceding the date of the survey.

The outcome variable
Under-five child mortality is defined as mortality from the age of 1 months to the age of 59 months. Therefore, the dependent variable in this study is "the risk of death occurring in an age interval in the 1-59 month period". The outcome variable was thus survival time in months of the children under the age of five.

Explanatory variables
Based on a literature survey [4,14,15] and limitations like high level of missingness in the dataset used, we assessed the nature of the response variables and the following covariates: mother's age group (less than 20 years, 20-29,30-39,40 + years); type of residence (Urban, Rural); mother's level of education (illiterate, primary, secondary and higher); partner's level of education (Illiterate, primary, secondary and higher); birth status (Singleton birth, multiple births); sex of the child (male, female); wealth index (poorest, poorer, middle, richer, richest); children ever born (one child, two children, three children, four and more); birth order (first child, second to third child, 4th-6th child); religion (Catholic, Muslim ,other Christians, others); types of toilet facility (flush toilet, pit latrine, no facility); mother's occupation (not-working, sales and service, agriculture); births in the past one year (no births, one birth, two births); children under the age of five in the household (no child, one child, two children, three, four); sex of the household head (male, female); source of drinking water (piped water, borehole, well, surface/rain/ pond/lake, others); mother's age at first birth (less than 20 years, 20-29, 30-39 years).

Preliminary survival analysis
The Schoenfeld residual test [16] was carried out in the R software using the cox.zph command. Under this approach it is assumed that the regression parameters of covariates do not vary with time. All those whose regression parameters changed with time do not satisfy the proportional hazard assumption and were therefore not included in the final Cox-PH model. The results of this analysis have been presented in Table 2.
The estimation and results were performed using the R software [17].
Three non-Bayesian models were considered. The first model (Model I) was the standard Cox Proportional Hazards model; the second (Model II) was a model with a household specific frailty term; and the final model (Model III) was a model with a community specific frailty term. Appendix 1 provides a detailed mathematical description of these models.

Bayesian survival analysis
In this study, a model that assumed that the time to death of the children under-five followed a Weibull distribution was used. The Weibull model for time to event is a popular parametric model because it inherently relaxes the assumption of constant hazard as is the case with the exponential distribution. This model was implemented with and without family and community effects so as to investigate the main factors that affect UMR in Uganda. Bayesian inference was carried out using the R library INLA [18] which implements the Integrated Nested Laplace approximation approach for latent Gaussian models [11].
Four distinct Bayesian survival models were considered: the first (Model IV) was a Bayesian Weibull survival model; the second (Model V) was a Bayesian Cox-PH model; the third (Model VI) was Bayesian (Weibull) model with community level frailty; and finally (Model VII) was a Bayesian (Weibull) model with household level frailty. Appendix 2 provides a detailed mathematical description of these models.

Analysis approach
Data analysis was done by using R and STATA software. The R libraries Mass, survival and packages frailty pack were used for the analysis of the data. STATA inbuilt commands for survival analysis were used to do the analysis under the frequentist approach. STATA command stptime was used to compute mortality rates. The main reason for using both the frequestist and Bayesian approaches was to have confidence in the results when the two approaches agree.

Results
Five years prior to the survey, Uganda had an UMR of 90 per 1000 live births which was almost 15 times the average rate in high-income countries (6 deaths per 1000 live births) [19]. The MDG4 target is aimed at reducing UMR by two-thirds. The time this target was set Uganda had an UMR of 147 per 1000 live births. From the analysis, the UMR is estimated to be 71.28 per 1000 live births, which is still high compared to the global UMR of 46 per 100 live births [19]. Despite the fact that Uganda has not achieved the MDG4, the results from our analysis suggest that the UMR for the country has reduced by 52 %. Table 1 shows the distribution of deaths of the children under the age of five at each factor level included in the analysis.
In this study most of the variables considered were categorical. For variables which were not categorical, their categorizations were adopted from previous research [20][21][22][23][24]. Table 1 presents the distribution of death of children under the age of five for each covariate considered in the analysis. It shows that among the illiterate mothers, out of the 4493 children born, 7.7 % died before celebrating their fifth birthday which was the highest death proportion followed by mothers who had completed primary education with 6.7 % of the deaths and lastly, mothers who had acquired secondary and higher education with 4.2 % proportion of deaths. The table also summarises the distribution of deaths and births of children for all the other covariates considered in this study. Table 2 presents the results for testing the proportional hazard assumption. Mother's education, total number of children ever born, type of place of residence, type of birth, previous birth interval and wealth index violated the proportionality hazard assumption(p-values less than 0.05). They were therefore not included in the fitted Cox-PH model. The other variables like sex of the household head, father's education, sex of the child, and number of births in the past one year, mother's age group and mother's age at first birth are the only factors that satisfied the PH assumption and were therefore included in the final model. Table 3 presents the factors that were strongly associated with high UMR. This table also summarizes the results of all the models considered for the frequentist approach.
Based on the Cox-PH model, the number of births in the past one year and the sex of the household head were found to be strongly associated with high mortality rates. The children whose mothers had more than one birth in the past on year were at a higher risk of death than those whose mothers had no birth at all. The children born in households headed by women were at a high risk of death than those born in households where the man is the head. A study done by [25,26], pointed out factors associated to UMR in Uganda as; mother's education, sex of the child, place of residence, birth intervals, household size, mother's age at first birth, duration of breast feeding, household hardship, place of delivery and mother's education. Some of which agree with the results from our study.
Lastly, Table 4 presents the results from the Bayesian analysis which leads to generally similar results but identifies another factor strongly associated to a high UMR as mothers' age group. Figs. 1, 2, 3, and 4 show the survival and the cumulative hazard curves for selected covariates considered in this study. These figures confirm the results presented in Tables 1, 2, 3, and 4 for these covariates.
By using the likelihood ratio test with a null hypothesis that the variance of the community frailty term is zero (θ = 0), the chi-square test statistic yielded a p-value of 0.052. At 0.05 level of significance, it implies that there is not enough evidence to show the existence of unobserved heterogeneity at community level. This statement implies that the survival times of children under the age of five within the same community can be well explained by the observed covariates considered in the study using the hazard ratios presented by the results from the analysis without the community frailty term. In this case therefore one can use the standard Cox-PH model because the results suggest that there is no difference on the conclusions that would be drawn about the data set.
In the case of household or family frailty, there were 4285 households in the sample considered for analysis. The variance component of the frailty term (household frailty) is θ = 1.78 , which is significantly different from zero, and gives evidence of the existence of the unobserved heterogeneity at family or household level. This implies that there are other factors affecting under-five child mortality at household level that are not explained by the observed covariates included in the model. The sources of the unobserved heterogeneity at the household level can be attributed to access to food, child care, sanitation and other factors that cannot be easily measured or observed at household level. Note that the variables which failed the PH assumption and were not modelled could contribute to the significance of this effect. The results further suggest that some households were associated with a higher risk of children dying before celebrating their fifth birthday than others. However this is an area which needs further research in order to explain the reasons for this unobserved heterogeneity at a local level.
These results indicate that more efficient interventions may be those that target individual households rather than communities. Interventions among many may include visits by the health or government officials to the households with children under the age of five to record any vital information on their health and household environment, educating women on health care for children under the age of five at household level and lastly door to door education for women at child bearing age on contraception . These interventions may be expensive but may realistically help reduce the high UMR in Uganda and hence help in achieving or realising the millennium development goal.

Discussion
The UMR estimated in this paper indicates a decline in the UMR for Uganda but still high compared to target rate. This suggests that the MDG4 target for Uganda has not been met despite the fact that we have reached the deadline. Uganda just like the other sub-Saharan Africa countries has not met the MDG4 target but is showing a steady decline in national UMR.     Number of births in the past one year, sex of the child and sex of the household head are the factors associated with increased risks of UMR in Uganda. All the above mentioned factors relate to inappropriate child spacing, customs and norms practised by families and communities, and lastly family related problems (Family violence). Similar results have been reported elsewhere in the literature [14,20,27]. From the results, women who had given birth to more than two children in the year had their children at a higher risk of death before reaching the age of five. This factor explains the inappropriate child birth intervals and may be a result of lack of knowledge of the available family planning methods. As evidenced from the data, only 25 % of the women in the sample were using Modern family planning methods like injections, pills among others, 3.4 % were using traditional methods, 47 % were non-users and intend to use later and lastly 23 % of these women did not intend to use these family planning methods at the time of the survey. The health facilities where these women go for antenatal care have failed to inform the women about the family planning methods available to them , 50 % of the women confessed that the health facilities did not inform them about family planning and only 27 % of the women claimed to have been well informed and 22 % of these women had missing information. This article supports the view that mothers or women should be made more aware of the contraception options available to increase birth intervals. This would lead to a reduction in UMR in the country and hence help the country to achieve the MDG4 sometime in the future. Fig. 1 The estimated survival curve for children under the age of five in female headed households is above that of male headed households. This implies that female headed households are associated to a low under-five child survival rate Fig. 2 The estimated cumulative hazard curve for the male children is above that of the female children indicating that boys are at a higher risk of death before celebrating their fifth birthday than girls . Fig. 3 The estimated survival curves show that women whose age at first birth was below 20 years and that of those who were above 30 years put their children at a high risk of death before celebrating their fifth birthday Fig. 4 The estimated survival curves show that women with secondary school and higher education increased the chance of survival for their children under the age of five. The women with no formal education put their children below the age of five at a higher risk of death Male children were at a high risk of death than their female counterparts. This may be due to the fact that majority of the tribes in the country have a cultural norm of viewing the girl child as a source of wealth through bride price [28]. In order to achieve the MDG4 target, policies that target factors like education, poverty reduction among others especially with emphasis in rural communities will help to break such cultural norms.
Female headed households were associated with an increased risk of UMR than those that are headed by the males. Since in most of the tribes in the country, a man is considered to be the bread winner and head of the household, finding a household headed by a woman is directly linked to a family that is insecure in a number of ways such as food availability and previous history of home violence. These women choose to leave their original marital homes with their children because of such ills. This is a problem because most of the women cannot work and at the same time take care of the family which are often large. Laws on marriage aimed at protecting women and children from domestic violence and also addressing the issue of who takes care of the children in case of a divorce should be passed by the legislature. These laws need to be enforced at the local administrative level rather than only being discussed at the national level. Education for the girl child should be emphasized so that women are fundamentally and financially capable of taking care of their children in the event of a separation. As evidenced from the data, the types of jobs most of these women do are odd jobs due to their low level of education. The data suggests that 789 men had acquired secondary and higher level of education and less than half that number for women had aquired the same level of education (331 women had secondary and higher education) .
The household and community variations summarise the effects of biological, parental competence, genetic, customs and other unobserved factors that are not accounted for by the fixed effects at household and community level respectively.
The results suggest that deaths tend to cluster in some households and to a smaller extent in some communities.

Conclusions
The UMR of 71.28 [95 % CI:65.11-77.44] per 1000 live births indicates a decline in the UMR for Uganda but still lagging behind on the achievement of the MDG4 target despite the deadline. Government interventions must address issues like passing the marriage law to ensure that children under the age of five are safe even after the divorce of their parents. Education of a girl child especially in the rural communities should be emphasized to break the cultural norms like taking the child (girl child) as a family wealth through bride price.
The results also suggest that government interventions should focus on small communities containing few household rather than a big community in order to reduce on the heterogeneity across households.
The paper also shows that the results from the Bayesian approach are consistent with those from the frequetist approach but in most of the research papers on under-five mortality, researchers have ignored the use of the Bayesian approaches despite their advantages over the Frequestists approaches.

Limitations of the analysis
Demographic health survey datasets are cross-sectional in nature and therefore prone to problems like high level of missingness due to failure of the respondents in recalling past events and the fact that some covariates which could help in the analysis may not be captured in the survey. The high level of missingness was evident in the 2011 UDHS dataset and among the covariates that had a high level of missingness include; birth intervals both preceding and succeeding with 1261 and 3812 cases respectively and number of antenatal visits with 2950 missing cases. Thus possible extensions for further research include the use of models that account for missing data in surveys as well as considering more flexible survival analysis models that do not necessarily rely on the proportional hazards assumption. More advanced methods like survival trees and random survival forests are also better options when analyzing large datasets and identifying more frail groups (frailty effects).

Strengths of the analysis
We used Bayesian inference. This is very special because Bayesian approaches have been found to have some advantages over the frequestist approaches. Below are some of the advantages of using a Bayesian approach for analyzing data over the frequentist approach: 1. Bayesian models allow for informative priors such that prior knowledge can be used to inform the current model. 2. Bayesian inference assumes (the data to be fixed) the observed data is fixed and the unknown parameters to be random which is the opposite of the frequentist inference. The Frequentists estimation is therefore not based on the data at hand but data at hand plus hypothetical repeated sampling in future with similar data. 3. There is no Frequentists probability distribution associated with the unknown parameters or hypotheses. Bayesian inference therefore estimates a full probability model. 4. Bayesian inference estimates the probability of the hypothesis given the data were as the frequentists estimate the probability of the data given the hypothesis. Hypothesis testing itself suggests that one should test for the hypothesis given the data.
Other strengths of the analysis are derived from the fact that we have used the Integrated Nested Laplace Approximation(INLA) for Bayesian inference. This is a simple but powerful tool for Bayesian inference. The other tools for Bayesian inference have not been programmed to handle large data sets as this (over 6000 cases) and in case one succeeds with programming it for a large dataset, it takes days or even weeks to get the results. With INLA the model that took the longest time took about 1081.709 seconds to run and the results are known to be close to those one would get if they used other software's for Bayesian like WinBUGS [18].

Appendix 1
Survival analysis techniques were used in this paper on the 2011 Uganda Demographic Health Survey data to examine the effect of frailty and other factors provided in the data set on under-five children survival in Uganda.
The Cox-proportional hazard model is a more general and the most commonly used model in modelling the hazard and survival function. The Cox model [29] has the from: Model 1: h(t|X i ) = h 0 (t)exp(X i T β).
The Cox-proportional hazard model assumes a proportional hazard implying that the model cannot be used in situations where the assumption is violated. Its strength lies in its ability to leave the baseline hazard unspecified and not dependent on any parametric distribution which may not be easy to discern.
Let N denote the number of children considered in the UDHS 2011 data set with each child in the data set belonging to a given household found in a given community. Let the total number of households or communities be denoted by G such that, given the i-th household or community that consists of n i children under the age of five, then; X G i¼1 n i ¼ N: We define the variable δ ¼ 1; child is dead at the time of interview 0; child is still alive at the time of interview: & as the censoring indicator. The hazard function of the j-th child of the i-th household or community is given as: where X ij is a vector of covariates for child j in the i-th household or community, u i the unobserved covariates and h 0 (t) denotes the baseline hazard function.
As per the above model formulation it implies that the variable z i = exp (u i ) is the frailty term. The hazard function can therefore be written as: Note that it is assumed that the Z i 's are independent with an identical probability density function denoted as f(z).