Study area
Ethiopia is an East African country with an estimated population of more than 100 million, which makes it the second-most populous country in Africa [29]. Administratively, Ethiopia is divided into regions, regions into zones, and zones into administrative units called districts (weredas). Each district is further subdivided into kebeles, the lowest administrative units. Kebeles are in turn subdivided into enumeration areas (EAs). Ethiopia is located within 3.30°–15°N, 33°–48°E in the northeastern part of Africa and has a total area of 1.1 million square kilometers. Its topography ranges from mountains as high as Ras Dashen, 4550 m above sea level (ASL), to 110 m below sea level in the Afar Depression [4].
Around two-thirds of the country’s territory is favorable for malaria transmission, with malaria primarily associated with altitude and rainfall. Approximately 60% of Ethiopia’s population lives in malarious areas. In most parts of the country, malaria incidence peaks from September to December and from March to May. Children under 5 years of age and pregnant women were estimated to make up 14.6% and 3.3% of the population, respectively [4].
Data source, study design, sampling procedure, and sample size
This study used secondary cross-sectional survey data extracted from the Ethiopian Malaria Indicator Survey (EMIS) 2015. The EMIS 2015 was the third such survey conducted in Ethiopia; a nationwide sample of 13,875 households from 555 EAs was selected.
The EMIS 2015 used a two-stage cluster sampling methodology. In the first stage, 555 EAs were selected. A complete mapping and listing of all households in the selected EAs was then conducted, and 25 households were randomly selected per EA, for a total of 13,875 households. The survey also involved testing for anemia and malaria among under-five children in all selected households [4].
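The two-stage selection described above can be illustrated with a minimal sketch. The sampling frame here (1,000 EAs with 150 listed households each) is hypothetical, and simple random selection at both stages is an assumption made for illustration; the source does not state how EAs were drawn in the first stage.

```python
import random

def two_stage_sample(households_by_ea, n_eas, n_households, seed=0):
    """Stage 1: draw n_eas enumeration areas; stage 2: draw n_households in each."""
    rng = random.Random(seed)
    # Stage 1 (assumed simple random sampling of EAs, for illustration only)
    selected_eas = rng.sample(sorted(households_by_ea), n_eas)
    # Stage 2: random selection of households from the listing of each selected EA
    return {ea: rng.sample(households_by_ea[ea], n_households) for ea in selected_eas}

# Hypothetical frame: 1,000 EAs, each with 150 listed households
frame = {ea: [f"EA{ea}-HH{h}" for h in range(150)] for ea in range(1000)}

sample = two_stage_sample(frame, n_eas=555, n_households=25)
total_households = sum(len(households) for households in sample.values())
print(len(sample), total_households)  # 555 EAs x 25 households = 13875
```

With 555 EAs and 25 households per EA, the sketch reproduces the survey's total of 13,875 sampled households.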
Since microscopic examination is the gold standard for the diagnosis of malaria, children in this study were classified as malaria positive or negative based on the result of this test only. We included all available relevant data for children under 5 years of age from the EMIS 2015. The sample comprised all under-five children who were tested for malaria by microscopy; thus, data from 8301 children were used in this study.
Study variables
Dependent variable
Malaria microscopy test result among under-five children (Yes/No). The dependent variable was the malaria microscopy test result, dichotomized into Yes if the test was positive for Plasmodium falciparum, Plasmodium vivax, or mixed infection (both falciparum and vivax) and No if the test was negative for all species.
Independent variables
The determinants of malaria among under-five children were grouped into individual-level and community-level determinants.
Individual-level predictors included the age of the child, sex of the child, household insecticide-treated net (ITN) ownership, household indoor residual spraying (IRS) status, utilization of nets, number of nets in the household, availability of electricity, housing conditions of the household (floor materials, roof materials, and wall materials), source of drinking water, time to get water, availability of television, availability of radio, and toilet facilities. Community-level predictors included enumeration area, altitude, and region.
Data management and statistical analysis
Sample allocation in the EMIS across regions and across urban and rural areas was not proportional. Sample weights were therefore applied when estimating proportions and frequencies, to adjust for disproportionate sampling and non-response. Since the normality assumption was violated for the continuous variables age and altitude, their medians with interquartile ranges were reported in the descriptive analysis. A natural log transformation was applied to these variables before inclusion in the regression model to reduce this non-normality.
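As an illustration of these data management steps, the following minimal sketch computes a weighted proportion, a median with interquartile range, and a natural log transformation. All values are hypothetical, and the function names are introduced here for illustration; they are not taken from the EMIS data or from the software used in the study.

```python
import math
import statistics

def weighted_proportion(outcomes, weights):
    """Proportion of positive outcomes (coded 1), weighting each child by its sample weight."""
    positive_weight = sum(w for y, w in zip(outcomes, weights) if y == 1)
    return positive_weight / sum(weights)

def median_iqr(values):
    """Median and (Q1, Q3), the summary used for non-normal variables."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)

# Hypothetical records: test result (1 = positive), sample weight, altitude in metres
positive = [1, 0, 0, 0, 1]
weights = [0.5, 1.5, 1.0, 2.0, 1.0]
altitude_m = [1200, 1500, 1800, 2100, 2400]

prop = weighted_proportion(positive, weights)       # (0.5 + 1.0) / 6.0
med, (q1, q3) = median_iqr(altitude_m)
log_altitude = [math.log(a) for a in altitude_m]    # natural log transformation

print(prop, med, (q1, q3))
```

The unweighted proportion in this toy example would be 2/5, whereas the weighted proportion is 0.25, which shows why weights matter when sampling is disproportionate.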
The descriptive analysis was performed using both STATA (version 15) and R (version 3.5.2), and the inferential analysis was done using the Bayesian statistical software WinBUGS (version 1.4.3).
Bayesian multilevel logistic regression model
In classical statistics, the analysis of a multilevel logistic regression model is based on estimating parameters through maximum likelihood estimation (MLE) and relies on its asymptotic properties [30]. In the Bayesian approach, by contrast, estimation of the model parameters is based on their posterior distribution, which gives it an advantage over the classical approach [31].
The EMIS data are hierarchical in nature, so a clustering effect is expected. Therefore, to account for this clustering effect and to obtain unbiased parameter estimates, Bayesian multilevel logistic regression analysis was applied to identify determinants of malaria among under-five children in Ethiopia.
In this study, the basic data structure of the two-level logistic regression is a collection of J groups (enumeration areas) and, within group j (j = 1, 2, …, J), a random sample of nj level-one units (individual children). The outcome variable is defined as:
$$ {Y}_{ij}=\begin{cases}1 & \text{if the } i\text{th child in the } j\text{th enumeration area is positive on the microscopy test}\\ 0 & \text{if the } i\text{th child in the } j\text{th enumeration area is negative on the microscopy test}\end{cases} $$
The probability of being positive for the ith child (i = 1, 2, …, nj) from the jth enumeration area is Pij = P(Yij = 1 | Xij, U0j), and 1 − Pij is the corresponding probability of being negative. Therefore, the model is
$$ \mathrm{logit}\left({P}_{ij}\right)={X}_{ij}\beta +{U}_{0j},\kern1.25em \text{where}\ {U}_{0j}\sim N\left(0,{\sigma}_u^2\right) $$
Xij is the vector of observed predictor values for child i in enumeration area j, β is the corresponding vector of regression coefficients, and U0j is the random effect of enumeration area j.
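To make the random-intercept structure concrete, the sketch below evaluates the probability of a positive test on the logit scale and simulates outcomes for a few enumeration areas. The coefficient values, the random-effect standard deviation, and the single predictor are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math
import random

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def positive_probability(x, beta, u0j):
    """P_ij from logit(P_ij) = X_ij * beta + U_0j."""
    eta = sum(b * xi for b, xi in zip(beta, x)) + u0j
    return inv_logit(eta)

rng = random.Random(1)
beta = [-3.0, 0.4]   # hypothetical intercept and slope
sigma_u = 0.8        # hypothetical standard deviation of the EA random effect

simulated = []
for j in range(3):                          # three hypothetical enumeration areas
    u0j = rng.gauss(0.0, sigma_u)           # U_0j ~ N(0, sigma_u^2)
    for i in range(5):                      # five hypothetical children per EA
        x = [1.0, rng.uniform(0.0, 4.0)]    # constant term and one predictor
        p = positive_probability(x, beta, u0j)
        y = 1 if rng.random() < p else 0    # Bernoulli outcome Y_ij
        simulated.append((j, i, p, y))
```

Children in the same enumeration area share one value of U0j, which is what induces the within-cluster correlation that the multilevel model accounts for.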
The convergence of the algorithm
In this study, the Markov Chain Monte Carlo (MCMC) algorithm was run using the Bayesian statistical software WinBUGS version 1.4.3 [32]. The deviance information criterion (DIC) [33] was used to select the best-fitting model. The empirical results from a given MCMC analysis are not deemed reliable until the chain has reached its stationary distribution. Convergence of an MCMC algorithm refers to whether the algorithm has reached its equilibrium distribution. If it has, the generated sample derives from the true target distribution. Therefore, assessing the convergence of the algorithm is essential for producing results from the posterior distribution of interest.
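For reference, the DIC combines the posterior mean deviance with an effective number of parameters, pD, defined as the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters; smaller values indicate a preferred model. The sketch below applies this definition to hypothetical deviance values for two candidate models. The numbers and model labels are invented for illustration and are not output from the study.

```python
import statistics

def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) where DIC = Dbar + pD and pD = Dbar - D(theta_bar)."""
    d_bar = statistics.fmean(deviance_draws)       # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean       # effective number of parameters
    return d_bar + p_d, p_d

# Hypothetical deviance draws and plug-in deviances for two candidate models
candidates = {
    "model_a": ([100.0, 102.0, 104.0], 99.0),
    "model_b": ([101.0, 103.0, 105.0], 95.0),
}
results = {name: dic(draws, plug_in) for name, (draws, plug_in) in candidates.items()}
best_model = min(results, key=lambda name: results[name][0])
print(results, best_model)
```

In this toy comparison, model_b has more effective parameters without a matching gain in fit, so its DIC is larger and model_a is preferred.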
To assess convergence of the algorithm in our study, we used autocorrelation plots, time series plots, the Gelman-Rubin statistic, and density plots. All the plots showed that the algorithm had reached its equilibrium (target) distribution for all parameters.
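Two of these diagnostics can be sketched numerically. The code below implements the classic form of the Gelman-Rubin potential scale reduction factor and a lag-k sample autocorrelation on simulated chains. This is an illustrative implementation, not the exact computation WinBUGS performs, and the chains are simulated rather than taken from the study.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Classic potential scale reduction factor for two or more equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                       # B
    var_hat = (n - 1) / n * within + between / n                         # pooled estimate
    return math.sqrt(var_hat / within)

def autocorrelation(chain, lag):
    """Sample autocorrelation of a single chain at the given lag."""
    n = len(chain)
    mean = statistics.fmean(chain)
    denom = sum((chain[t] - mean) ** 2 for t in range(n))
    num = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return num / denom

rng = random.Random(42)
# Three simulated chains sampling the same distribution (well mixed)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# The same chains shifted apart, mimicking chains stuck in different regions
stuck = [[x + 5.0 * k for x in chain] for k, chain in enumerate(mixed)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
acf_lag1 = autocorrelation(mixed[0], 1)
print(rhat_mixed, rhat_stuck, acf_lag1)
```

Values of the statistic close to 1 are consistent with convergence, while values well above 1 indicate that the chains have not yet settled on a common distribution.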
Summary statistics from the posterior distribution were used to describe the covariates, and adjusted odds ratios (AOR) with corresponding 95% credible intervals from the multilevel multivariable logistic regression model were used to identify predictors of malaria among under-five children.
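An AOR and its credible interval can be obtained from posterior draws of a regression coefficient by exponentiating each draw and taking percentiles. The sketch below does this with an equal-tailed 95% interval and the posterior median as the point summary; both choices are assumptions made here, since the source does not state which posterior summary was reported. The draws are simulated and do not come from the fitted model.

```python
import math
import random
import statistics

def aor_summary(beta_draws):
    """Posterior median AOR with an equal-tailed 95% credible interval."""
    odds_ratios = [math.exp(b) for b in beta_draws]
    # 39 cut points at 2.5% steps; the first and last are the 2.5th and 97.5th percentiles
    cuts = statistics.quantiles(odds_ratios, n=40, method="inclusive")
    return statistics.median(odds_ratios), cuts[0], cuts[-1]

rng = random.Random(7)
# Hypothetical posterior draws of one log odds ratio
beta_draws = [rng.gauss(0.5, 0.1) for _ in range(5000)]

aor, lower, upper = aor_summary(beta_draws)
print(round(aor, 2), round(lower, 2), round(upper, 2))
```

A covariate whose 95% credible interval for the AOR excludes 1, as in this simulated example, would be read as associated with the outcome.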