Mind the gap: what explains the rural-nonrural inequality in diarrhoea among under-five children in low and medium-income countries? A decomposition analysis

Background Diarrhoea poses serious health problems among under-five children (U5C) in Low- and Medium-Income Countries (LMIC), with a higher prevalence in rural areas. A gap exists in knowledge of the factors driving rural-non-rural inequalities in diarrhoea development among U5C in LMIC. This study investigates the magnitude of rural-non-rural inequalities in diarrhoea and the roles of individual-level and neighbourhood-level factors in explaining these inequalities. Methods Data on 796,150 U5C from 63,378 neighbourhoods across 57 LMIC, drawn from the most recent Demographic and Health Surveys (2010–2018), were analysed. The outcome variable was the recent experience of diarrhoea, while the independent variables consisted of individual- and neighbourhood-level factors. Data were analysed using multivariable Fairlie decomposition at p < 0.05 in Stata Version 16, while visualization was implemented in the R statistical package. Results Two-thirds (68.0%) of the children were from rural areas. The overall prevalence of diarrhoea was 14.2%: 14.6% among rural and 13.4% among non-rural children (p < 0.001).
From the analysis, the following 20 countries showed statistically significant pro-rural inequalities, with higher odds of diarrhoea in rural areas than in non-rural areas at the 5% alpha level: Albania (OR = 1.769; p = 0.001), Benin (OR = 1.209; p = 0.002), Burundi (OR = 1.399; p < 0.001), Cambodia (OR = 1.201; p < 0.031), Cameroon (OR = 1.377; p < 0.001), Comoros (OR = 1.266; p = 0.029), Egypt (OR = 1.331; p < 0.001), Honduras (OR = 1.127; p = 0.027), India (OR = 1.059; p < 0.001), Indonesia (OR = 1.219; p < 0.001), Liberia (OR = 1.158; p = 0.017), Mali (OR = 1.240; p = 0.001), Myanmar (OR = 1.422; p = 0.004), Namibia (OR = 1.451; p < 0.001), Nigeria (OR = 1.492; p < 0.001), Rwanda (OR = 1.261; p = 0.010), South Africa (OR = 1.420; p = 0.002), Togo (OR = 1.729; p < 0.001), Uganda (OR = 1.214; p < 0.001), and Yemen (OR = 1.249; p < 0.001). Pro-non-rural inequalities were found in 9 countries. The factors associated with pro-rural inequalities varied across the 20 countries. Overall, the main contributors to pro-rural inequality were neighbourhood socioeconomic status, household wealth status, media access, toilet types, maternal age and education. Conclusions The gaps in the odds of diarrhoea between rural and non-rural children were explained by individual-level and neighbourhood-level factors. Sustainable intervention measures that are tailored to country-specific needs could offer a better approach to closing rural-non-rural gaps in diarrhoea among U5C in LMIC.

than rural areas in LMIC [2]. Consequently, the determinants of childhood diarrhoea could be country- or region-specific, depending on differences in the living environment. Household-level and neighbourhood-level factors may be important in clarifying rural-non-rural differences in diarrhoea morbidity and mortality. This calls for immediate research to understand the decomposition effect of rural-non-rural differences in place of residence on diarrhoea episodes among U5C in LMIC.
Although the literature is replete with studies on the risk factors of diarrhoea, there is a dearth of information on the decomposition of factors associated with the development of diarrhoea by place of residence in LMIC. Hence, this study was designed to address this gap. Good knowledge of the core individual-level and neighbourhood-level factors driving rural-non-rural health disparities would assist LMIC in planning appropriate intervention measures aimed at improving population health outcomes and reducing the burden of diarrhoea among U5C.

Study design and data
Data from the Demographic and Health Surveys (DHS), collected periodically across LMIC, were used in this study. ICF Macro (USA), in conjunction with the ministries of health, offices of statistics and population commissions in the respective LMIC, conducts these periodic, cross-sectional, nationally representative, population-based household surveys. We pooled data from the most recent DHS conducted within the last ten years (2010-2018), available as of April 2020, that provided information on diarrhoea among U5C. Only 57 LMIC met these inclusion criteria and were included in this study. We analysed data on 796,150 U5C from 63,378 neighbourhoods across the 57 LMIC. In each country, the DHS used a multi-stage (usually from states/divisions/regions to districts to clusters), stratified sampling design. The households (the sampling units) were selected from the clusters, known as the primary sampling units (PSU) [21,22]. We applied the sampling weights provided in the data to all our analyses, to adjust for unequal cluster sizes and to ensure that our findings adequately represent the target population. The DHS uses similar surveys and research protocols, standardized questionnaires, and similar interviewer training, supervision and implementation in all the countries. For full details of the sampling methodologies, please visit dhsprogram.com.

Dependent variable
The outcome variable in this study is the recent experience of diarrhoea. Diarrhoea is defined as "passage of liquid stools three or more times a day" [4,5] and "recent experience of diarrhoea" as having any symptom of diarrhoea within two weeks before the interview date [23]. The mothers were asked if any of their U5C had diarrhoea within two weeks preceding the survey. The responses were binary: Yes or No.

Main determinant variable
The main determinant variable in this decomposition study is the rural-non-rural differential in the location of the mothers' residence. The DHS had already classified study clusters as either rural or non-rural using similar standard classification procedures at the time of the surveys, with minimal differences across countries in what constituted a rural area. We refer to children born to rural and non-rural women as rural and non-rural children respectively.

Independent variables
The explanatory variables in this study were selected from the factors associated with diarrhoea among U5C identified in the literature [5,20,24-28] and from Moseley's systematic conceptual framework for the study of child survival in developing countries [29]. They comprise individual-level and neighbourhood-level factors.

Individual-level factors
The individual-level factors consist of the child's, the mother's and the household's characteristics. Child's characteristics: sex (male versus female), age (under 1 year and 12-59 months), weight at birth (average+, small and very small), birth interval (firstborn, < 36 months and > 36 months) and birth order (1, 2, 3 and 4+). Mother's characteristics: maternal education (none, primary or secondary plus), maternal age (15 to 24, 25 to 34, 35 to 49), marital status (never, currently and formerly married) and employment status (working or not working). Household's characteristics: access to media (at least one of radio, television or newspaper), source of drinking water (improved or unimproved), toilet type (improved or unimproved), cooking fuel (clean fuel or biomass), housing materials (improved or unimproved) and household wealth index (poorest, poorer, middle, richer and richest).

Neighbourhood-level factors
The DHS uses "clusters" as the PSU, as people in the same cluster share similar contextual factors [21,22]. We used the word "neighbourhood" to describe the clustering of children within the same cluster and "neighbours" for members of the same cluster. The PSUs were identified using the most recent census in each country where a DHS was held. In this study, we considered neighbourhood socioeconomic status (SES) as a community-level variable. It was computed using a principal component factor analysis of the proportions of respondents within the same neighbourhood who had no education, belonged to a household in the poor wealth quintiles, and were unemployed.
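The construction of the neighbourhood SES index can be illustrated with a minimal sketch. It assumes that the index is the score on the first principal component of the three standardised neighbourhood proportions; the function name, the power-iteration routine and the five example neighbourhoods are hypothetical and are not taken from the study, and the principal component factor routine used in the analysis may differ in scaling.

```python
import math

def first_component_scores(rows, iterations=200):
    """Return each row's score on the first principal component.

    rows: one tuple per neighbourhood holding three proportions
    (no education, poor wealth quintiles, unemployed).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    # Standardise each column so the three proportions carry equal weight.
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    # Correlation matrix of the standardised columns.
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration converges to the leading eigenvector (first component).
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Orient the loadings so that their sum is positive; when all loadings
    # share a sign, a higher score then means a more deprived neighbourhood.
    if sum(v) < 0:
        v = [-x for x in v]
    return [sum(zr[j] * v[j] for j in range(p)) for zr in z]

# Hypothetical neighbourhoods: (no education, poor household, unemployed)
neighbourhoods = [
    (0.10, 0.15, 0.20),
    (0.30, 0.35, 0.40),
    (0.50, 0.55, 0.45),
    (0.70, 0.80, 0.60),
    (0.90, 0.85, 0.75),
]
ses_scores = first_component_scores(neighbourhoods)
```

In this toy example the scores are centred on zero and increase with deprivation, so they could be cut into ordered SES categories if required.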

Statistical analyses
We used both descriptive and inferential statistics in this study. Descriptive statistics such as charts, tables and percentages were used to show the distribution of respondents by country, outcome variable and other key variables. Bivariable analysis was conducted using the z-test for equality of the proportions who had diarrhoea among rural and non-rural children within each country and region (Table 1(a) and (b)). We also tested for associations between the explanatory variables and the outcome variable among the rural and non-rural groups of children (Table 2(a) and (b)). We carried out a country-level comparison of the prevalence of diarrhoea by computing the risk difference (RD) in the development of diarrhoea between U5C from rural and non-rural settings, and presented the results in Fig. 1. An RD greater than 0 indicates that diarrhoea is more prevalent among rural children (pro-rural inequality), whereas a negative RD indicates that diarrhoea is more prevalent among non-rural children (pro-non-rural inequality). We estimated the fixed effects as the weighted country-specific risk differences and the random effect as the overall risk difference irrespective of a child's country of residence. A forest plot was used to illustrate these distributions (Fig. 1), and charts were used to show the distributions of the RDs (Figs. 2 and 3). We conducted tests of heterogeneity, using an adapted z-test in Stata, to ascertain whether the 57 countries differ with regard to the odds ratio (OR) of having diarrhoea among rural versus non-rural children. We then carried out a test of homogeneity of ORs among the 20 countries with a significant OR to determine whether the odds of having diarrhoea in those countries are homogeneous. Lastly, an adjusted logistic regression was applied to the pooled cross-sectional data from the 57 LMIC to carry out a Fairlie decomposition analysis (FDA), and the results are presented in Fig. 4.
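As an illustration of the bivariable step, the sketch below computes the risk difference per 1000 children and the two-sided z-test for equality of two proportions. The function name and the counts are hypothetical, and the sketch ignores the sampling weights that were applied in the actual analysis.

```python
import math

def risk_difference_test(cases_rural, n_rural, cases_non_rural, n_non_rural):
    """Return (RD per 1000, z statistic, two-sided p-value)."""
    p_rural = cases_rural / n_rural
    p_non_rural = cases_non_rural / n_non_rural
    rd_per_1000 = 1000 * (p_rural - p_non_rural)
    # Pooled proportion under the null hypothesis of equal prevalence.
    pooled = (cases_rural + cases_non_rural) / (n_rural + n_non_rural)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_rural + 1 / n_non_rural))
    z = (p_rural - p_non_rural) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rd_per_1000, z, p_value

# Hypothetical country: 146 of 1000 rural and 134 of 1000 non-rural children
rd, z, p = risk_difference_test(146, 1000, 134, 1000)
```

A positive `rd` corresponds to pro-rural inequality and a negative `rd` to pro-non-rural inequality, matching the sign convention used for Fig. 1.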

Decomposition analysis
We applied the Fairlie multivariable decomposition based on a binary regression model. It belongs to the family of decomposition techniques used to quantify the contributions to differences in the prediction of an outcome of interest between two groups in multivariate models [30]. It is an extension of the Blinder-Oaxaca Decomposition Analysis (BODA) [31-33]. While the BODA works best for continuous outcomes, the Fairlie technique is designed for logit and probit models [34-38]. Fairlie et al. noted that nonlinear decomposition techniques help to overcome the challenges of the BODA when group differences are large for an independent variable [35]. We used the Fairlie method in this study because it was purposely developed for non-linear regression models, including the logit and probit models [38]. It decomposes the difference in proportions between two groups, based on either a probit or a logit model, into a portion explained by group differences in characteristics and an unexplained portion [30]. The decomposition analysis was carried out by calculating the difference between the predicted probability for one group (say Group A) using the other group's (say Group B) regression coefficients and the predicted probability for that group (Group A) using its own regression coefficients [35]. The Fairlie decomposition technique constrains the predicted probabilities to lie between 0 and 1.
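Fairlie et al. showed that the decomposition for a nonlinear equation, $Y = F(X\hat{\beta})$, can be expressed as:

$$\bar{Y}^A - \bar{Y}^B = \left[\sum_{i=1}^{N^A}\frac{F(X_i^A\hat{\beta}^A)}{N^A} - \sum_{i=1}^{N^B}\frac{F(X_i^B\hat{\beta}^A)}{N^B}\right] + \left[\sum_{i=1}^{N^B}\frac{F(X_i^B\hat{\beta}^A)}{N^B} - \sum_{i=1}^{N^B}\frac{F(X_i^B\hat{\beta}^B)}{N^B}\right] \qquad (1)$$

where $N^J$ is the sample size for group J [39]. In eq. (1), $\bar{Y}$ is not necessarily the same as $F(\bar{X}\hat{\beta})$, unlike in BODA where $F(X_i\beta) = X_i\beta$. The first term is the part of the gap in the binary outcome variable that is due to group differences in the distributions of X, and the second term is the part due to differences in the group processes determining levels of Y. The second term also captures the portion of the gap in the binary outcome variable that is due to group differences in unmeasurable or unobserved endowments.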
The total contribution is estimated as the difference between the average values of the predicted probabilities. Using coefficient estimates $\hat{\beta}^*$ from a logit regression model, and assuming $N^A = N^B$ with a one-to-one matching of observations across the two groups, the independent contributions of $X_1$ and $X_2$ to the group gap can be written as

$$\frac{1}{N^B}\sum_{i=1}^{N^B}\left[F(\hat{\alpha}^* + X_{1i}^A\hat{\beta}_1^* + X_{2i}^A\hat{\beta}_2^*) - F(\hat{\alpha}^* + X_{1i}^B\hat{\beta}_1^* + X_{2i}^A\hat{\beta}_2^*)\right]$$

and

$$\frac{1}{N^B}\sum_{i=1}^{N^B}\left[F(\hat{\alpha}^* + X_{1i}^B\hat{\beta}_1^* + X_{2i}^A\hat{\beta}_2^*) - F(\hat{\alpha}^* + X_{1i}^B\hat{\beta}_1^* + X_{2i}^B\hat{\beta}_2^*)\right]$$

respectively. The contribution of each variable to the gap is thus equal to the change in the average predicted probability from replacing the group B distribution with the group A distribution of that variable while holding the distributions of the other variable constant. To obtain accurate decomposition estimates, Fairlie et al. recommended replicating the decomposition on a minimum of 1000 subsamples and taking the mean of the estimates from the separate decompositions [35]. Further numerical details have been reported [35,36,38-40]. We used the "fairlie" command in Stata 16 (StataCorp, College Station, Texas, United States of America) to carry out the decomposition analysis, which quantifies how much of the gap between the "advantaged" (non-rural) and the "disadvantaged" (rural) groups is attributable to differences in specific measurable characteristics. Using the generalised structure of the model, we fitted a logistic model to determine the factors influencing diarrhoea occurrence among rural and non-rural children.
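The mechanics of the decomposition can be sketched as follows. This is a simplified illustration, not the analysis code: the covariate rows and the coefficient vectors are hypothetical, the coefficients are taken as given rather than estimated, the two groups are assumed to be of equal size and already matched one-to-one, and the repeated random subsampling recommended by Fairlie is omitted.

```python
import math

def logistic(x):
    """Logistic CDF F used by the logit model."""
    return 1.0 / (1.0 + math.exp(-x))

def mean_probability(rows, beta):
    """Average predicted probability; beta[0] is the intercept."""
    total = 0.0
    for row in rows:
        linear = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        total += logistic(linear)
    return total / len(rows)

def two_part_decomposition(x_a, x_b, beta_a, beta_b):
    """Return the (explained, unexplained) terms of eq. (1)."""
    # Group B covariates evaluated with group A coefficients.
    counterfactual = mean_probability(x_b, beta_a)
    explained = mean_probability(x_a, beta_a) - counterfactual
    unexplained = counterfactual - mean_probability(x_b, beta_b)
    return explained, unexplained

def variable_contributions(x_a, x_b, beta):
    """Switch one covariate at a time from its group A to its group B values.

    Assumes equal group sizes with rows already matched one-to-one.
    """
    current = [list(row) for row in x_a]
    previous = mean_probability(current, beta)
    contributions = []
    for j in range(len(x_a[0])):
        for row, row_b in zip(current, x_b):
            row[j] = row_b[j]
        updated = mean_probability(current, beta)
        contributions.append(previous - updated)
        previous = updated
    return contributions

# Hypothetical data: columns are (unimproved toilet, mother has no education).
rural = [(1, 1), (1, 0), (1, 1), (0, 1)]        # group A
non_rural = [(0, 0), (1, 0), (0, 1), (0, 0)]    # group B
beta_rural = (-2.0, 0.5, 0.4)        # hypothetical (intercept, b1, b2)
beta_non_rural = (-2.1, 0.3, 0.4)    # hypothetical
beta_pooled = (-2.05, 0.4, 0.4)      # hypothetical

explained, unexplained = two_part_decomposition(
    rural, non_rural, beta_rural, beta_non_rural)
contributions = variable_contributions(rural, non_rural, beta_pooled)
```

By construction, the explained and unexplained parts add up to the gap in average predicted probabilities, and the variable-specific contributions add up to the explained gap evaluated at the pooled coefficients. In practice the contributions depend on the order in which variables are switched, which is one reason the procedure is replicated over many random subsamples.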

Sample characteristics and analysis of inequality
The percentage of children from rural areas was 68.0%, ranging from 11.5% in Jordan to 90.8% in Burundi. The overall weighted prevalence of diarrhoea was 14.2%: 14.6% among rural and 13.4% among non-rural children (p < 0.001). The prevalence of diarrhoea among rural children ranged from 4.4% in the Maldives to 32.6% in Yemen, while it ranged from 2.7% in Armenia to 32.4% in Afghanistan among non-rural children. The results of the z-tests of equality of proportions are presented in Table 1(a) and (b). As shown in Table 2(a) and (b), all the explanatory variables considered were significantly associated (p < 0.05) with the occurrence of diarrhoea among all the children and by rural-non-rural location of residence, except the sex of household head, which was not significantly associated with the occurrence of diarrhoea (p = 0.058) among rural children. The prevalence of diarrhoea was consistently higher among infants than among those aged 12-59 months, both in rural areas (18% vs 14%) and in non-rural areas (16% vs 13%).

Diarrhoea among rural and non-rural under-five children
The risk differences, a measure of inequality, in having diarrhoea between children of women from rural and non-rural areas across the countries studied are presented in Figs. 1, 2 and 3. The prevalence of diarrhoea among both rural and non-rural children in each of the countries was also computed and is presented in Fig. 1. The prevalence of diarrhoea was generally higher in rural areas than in non-rural areas in all the countries except Afghanistan, Angola, Burkina Faso, Chad, Congo, Congo DR, Cote d'Ivoire, Dominican Rep, Gambia, Haiti, Malawi, Mozambique, Nepal, Niger, Papua New Guinea, Sierra Leone, Tanzania, and Timor-Leste.
The pro-rural differences in diarrhoea were widest in Burundi (48.99/1000 children) and the pro-non-rural RD was widest for Malawi (−47.63/1000) in Eastern Africa. In Middle Africa, the largest pro-rural difference was in Cameroon (57.89/1000) and the pro-non-rural RD was largest for Congo (−68.11/1000). In South-Eastern Asia, the pro-rural difference was widest in Myanmar (27.07/1000) and the pro-non-rural difference was widest in Timor-Leste (−56.14/1000). Irrespective of region, the fixed effect of pro-rural differences was widest in Togo (69.09/1000) while the fixed effect of pro-non-rural differences was widest in Papua New Guinea (86.92/1000). Overall, the random-effects estimate of the risk difference was 6.22 per 1000 children (95% confidence interval (CI): −0.50 to 12.93), evidence of a statistically insignificant overall pro-rural inequality. The greatest contribution (weight) to the random effect was from Nigeria and India at 2% each, while the least was from Comoros and Gabon at 1.4% each (Fig. 1).
Fig. 4 Contributions of differences in the distribution of 'compositional effect' of the determinants of having diarrhoea to the total gap between rural and non-rural children by countries
In Figs. 2 and 3, we used the colours red, orange and a third colour to indicate statistically significant pro-rural inequality, insignificant inequality and statistically significant pro-non-rural inequality respectively. Based on risk differences, four of the nine countries in Eastern Africa, one of the countries in Middle Africa, Egypt in Northern Africa, two in Southern Africa and seven countries in West Africa showed statistically significant pro-rural inequality. Two countries each in Western and Southern Asia, one country each in Central Asia, Central America and Southern Europe, and none in South America and Oceania had statistically significant pro-rural inequality in children having diarrhoea (Figs. 1, 2 and 3).
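The pooling of country-specific risk differences into an overall random-effects estimate can be sketched as below. The text does not name the estimator, so the DerSimonian-Laird method used here is an assumption, and the function name, effects and variances are hypothetical values chosen only for illustration.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool effects; return (pooled estimate, 95% CI, between-country variance)."""
    k = len(effects)
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add the between-country variance to each variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical risk differences per 1000 children and their variances
rds = [10.0, -5.0, 20.0]
rd_variances = [4.0, 9.0, 16.0]
pooled_rd, ci, tau2 = dersimonian_laird(rds, rd_variances)
```

A confidence interval that includes zero, as reported for the overall estimate above, indicates that the pooled inequality is not statistically significant.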

Relationship between prevalence of diarrhoea and magnitude of inequality
The relationships between the prevalence of diarrhoea and the magnitude of rural-non-rural inequality, a function of RD, across the 57 countries considered in this study are presented in Fig. 3. We grouped the countries into four distinct categories based on their prevalence of diarrhoea and on whether the differences were small or large: (i) high diarrhoea and high pro-rural inequality countries such as Togo, Yemen, Cameroon, Burundi and Namibia; (ii) high diarrhoea and high pro-non-rural inequality countries such as Afghanistan, Congo, Malawi and Papua New Guinea; (iii) low diarrhoea and high pro-rural inequality countries such as Nigeria, Egypt and South Africa; and (iv) low diarrhoea and high pro-non-rural inequality countries such as Timor-Leste, Tanzania and Mozambique.

Decomposition of rural-non-rural inequality in the prevalence of diarrhoea
We first computed the Mantel-Haenszel pooled estimate of the odds ratio (OR) of having diarrhoea among all the children while controlling for country. We estimated OR = 1.06 (95% CI: 1.04-1.07), tested whether OR = 1 using a z-test, and obtained z = 7.45. All the 20 countries had statistically significant odds ratios, with 95% confidence intervals above 1 and p-values below the 5% alpha level, as shown in Table 3. Pro-non-rural inequalities were evident in nine countries, while the remaining countries experienced insignificant inequalities.
To confirm whether the 20 countries were homogeneous with respect to the significantly higher odds of diarrhoea among rural than among non-rural children, we computed the Mantel-Haenszel pooled estimate of the OR of having diarrhoea among the children in the 20 countries while controlling for country. We obtained OR = 1.20 (95% CI: 1.17-1.23) and tested the homogeneity of the ORs: χ² = 144.75, degrees of freedom (d.f.) = 19, p < 0.001. All the tests were significant. We included only the 20 LMIC with significantly higher odds of diarrhoea among rural children than among non-rural children in the Fairlie decomposition analysis. Figure 4 shows the detailed decomposition of the part of the pro-rural inequality caused by compositional effects of the determinants of diarrhoea among under-five children across the pro-rural and pro-non-rural inequality countries. The "explained" (compositional component) and the "unexplained" (structural component) portions of the rural-non-rural inequalities are depicted in red and blue respectively in Fig. 4. The lighter the red, the lower the percentage contribution of the "explained" portion, and the lighter the blue, the lower the percentage contribution of the "unexplained" portion. We found wide variations in the factors associated with the pro-rural and pro-non-rural inequalities across the countries.
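The Mantel-Haenszel pooled odds ratio used above can be illustrated with a short sketch in which each country is one stratum. The function name and the 2x2 counts are hypothetical, sampling weights are ignored, and the homogeneity test is not reproduced here.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata.

    Each stratum is (a, b, c, d): rural cases, rural non-cases,
    non-rural cases, non-rural non-cases.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Two hypothetical countries, each with a stratum-specific OR of 2
countries = [(20, 80, 10, 80), (40, 60, 25, 75)]
pooled_or = mantel_haenszel_or(countries)
```

When every stratum has the same odds ratio, as in this toy example, the pooled estimate equals that common value; with heterogeneous strata it is a weighted summary, which is why a homogeneity test accompanies it.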
Generally, neighbourhood SES, household wealth quintile, access to media, toilet types, and maternal age and education were the most important factors in most countries. Specifically, in Myanmar, the largest contributions to pro-rural inequality in the prevalence of diarrhoea were from neighbourhood SES (414% higher in communities with the lowest SES), followed by household wealth index (128% higher among children from households in the poorest wealth quintiles), maternal education (79% higher among parents with no education), media access and toilet types. In India, the greatest contributors to the disparities were media access, neighbourhood SES, maternal education, maternal age and birth order. The disparities were better explained by household wealth quintiles, toilet type, and maternal age in Yemen whereas the most significant contributors were neighbourhood SES, household wealth quintile, access to media, toilet types and sources of drinking water in Yemen. Other factors such as the child's birth weight, age and sex, and the mother's employment and marital status made the lowest contributions to rural-non-rural inequality in the prevalence of diarrhoea across these countries.

Discussion
In this study, we pooled data from 57 LMIC to assess individual- and neighbourhood-level factors that explain the rural-non-rural inequalities in the development of childhood diarrhoea using the Fairlie multivariable decomposition analysis. The study was designed on the premise that there are disparities in the health status of children living in rural and non-rural areas in LMIC. We found significant disparities across countries in the factors associated with the pro-rural and pro-non-rural inequalities in the occurrence of diarrhoea. The findings show non-uniform variation in the prevalence of diarrhoea among children whose mothers reside in rural and non-rural communities. This points to the importance of residential inequalities in the occurrence of diarrhoea. The significant residence-related differences could be attributed to differences in individual characteristics across countries.
Similar to outcomes of the previous study, a higher prevalence of diarrhoea was found in the rural area as compared to non-rural areas of study. The pro-rural inequality found mostly in Asian countries as reported in the previous study [1] could be a result of a lack of social amenities and basic infrastructure needed in the rural area. In the non-rural settings, Southern and Western Asia shared the least and most burden of diarrhoea risk difference as reported in another study [41].
The study also identified factors associated with the occurrence of diarrhoea in LMIC. All the examined factors except sex of household head were significantly associated with the development of diarrhoea among U5C. Notably, in both rural and non-rural settings, infants were more predisposed to diarrhoea, as found in previous studies [17,42,43], and this is said to be more pronounced among female children, although this is contrary to some studies [44,45]. Furthermore, the age of the mother was significantly associated with the development of diarrhoea, as children of young mothers aged 15-24 were more exposed. This could be because many mothers at this age are still teenagers and some might not have the financial capability and knowledge to raise a healthy child, as supported in a previous study [46]. This study also revealed that children born to mothers with no education are more susceptible to diarrhoea. Educated mothers are more likely to have adequate knowledge of the importance of good hygiene than uneducated mothers. This position is supported by Fikire et al. in their study on understanding the determinants of delay in care-seeking for diarrhoeal diseases among mothers/caregivers with U5C in public health facilities in Southern Ethiopia [47].
Furthermore, this study shows that maternal unemployment [48], low economic status, and lack of access to media such as television and radio are associated with the risk of children developing diarrhoea [49]. As affirmed by other studies, children of average weight at birth [41], firstborns [46], and children born to parents with low access to infrastructural facilities such as improved toilets [50] and improved housing materials have a higher risk of diarrhoea. Access to improved toilet facilities allows for safer disposal of faeces and limits the risk of contact between diarrhoea-causing organisms and the human host [2].
Moreover, obvious inter-country differences in the risk difference in diarrhoea between rural and non-rural areas were recorded. In most of the countries, the prevalence of diarrhoea was higher in the rural areas, with exceptions in 20 countries. Based on regions, the largest statistically significant pro-rural inequalities in children having diarrhoea were found in Eastern Africa (in Burundi, Uganda, Rwanda and Kenya), Northern Africa (in Egypt) and Southern Africa (in Namibia and South Africa). There was no significant pro-rural inequality in South-Eastern Asia and Oceania. Moreover, no significant pro-non-rural inequality was recorded for Central Asia, Northern Africa, Southern Africa, Western Asia, Central America and Southern Europe. The inequalities observed across the countries point to the worsening health situation in the rural areas and call for urgent intervention.
In decomposing pro-rural inequality in the prevalence of diarrhoea in LMIC, compositional effects were found for factors such as neighbourhood SES, household wealth index, toilet type, child's age and maternal age, which contribute greatly to the prevalence of diarrhoea across countries. This suggests the need for a thorough investigation of these variables, as it will go a long way in reducing diarrhoea occurrence among children in LMIC. Specifically, in countries such as India, Yemen and Myanmar, diarrhoea is linked to neighbourhood SES, wealth index, maternal education and unimproved toilet types, as supported by several studies [41,51,52]. Many of these countries are densely populated, with a high proportion of women of low socioeconomic status.

Limitations of the study
This study has some key strengths. The use of nationally representative data generated from standardised methods in 57 LMIC gives credibility to the findings in terms of generalizability across countries. The study also presents a clear pattern of diarrhoea prevalence among U5C in LMIC. One drawback of the current study is that diarrhoea morbidity was measured using self-reported information from mothers, which may be subject to recall bias and under-reporting and thus distort data quality. However, the data originators took steps to reduce such errors during data collection. Also, the timing of data collection, which varied across the studied countries, may bias the comparison of diarrhoea prevalence measured at different periods. Besides, the cross-sectional design of the study restricts the ability to establish causality and to determine how rural-non-rural disparities developed over time. Moreover, the definition and categorisation of rural and non-rural areas differ across countries and could limit cross-country comparisons.

Conclusions
Our study shows significant rural-non-rural differences in diarrhoea prevalence in LMIC. The prevalence of diarrhoea was higher among children whose mothers reside in rural areas and has been linked to neighbourhood-level and individual-level factors. We found significant individual-level and community-level factors associated with pro-rural inequalities in many countries. Tackling childhood diarrhoea does not depend on advances in technology but on the adoption of interventions of proven efficacy that would further help reduce the burden of childhood diarrhoea and mortality. Sustainable intervention measures that are tailored to country-specific needs could offer a better approach to closing rural-non-rural gaps in diarrhoea prevalence in LMIC. Nonetheless, the odds of diarrhoea were higher among non-rural children in some countries. Further research is needed in this regard. However, the reasons could be ascribed to poor hygiene and sanitation in non-rural areas. Moreover, some non-rural areas contain slums, which would have been categorised as non-rural. The type of food available to children in non-rural areas is also of concern.
Public health and community efforts should focus on promoting hygiene programmes and interventions such as hand washing, cleaning of toilets and proper disposal of waste, in addition to the provision of employment opportunities for women. This calls for the formulation of effective interventions and policies that recognise the heterogeneity of rural and non-rural communities. There is a need to formulate region-specific policies, rather than generalised measures, to reduce the gap in the rural-non-rural diarrhoea burden. Also, intervention measures that focus on the redistribution of wealth and better access to improved sanitation, among others, will go a long way in reducing regional inequalities in childhood diarrhoea.