Study population and setting
Ghana, located in Sub-Saharan Africa (Figure 1), has a population of more than 27 million people and a population density of 113 persons/km². Approximately 51% of the population resides in urban centres according to the 2010 census [16]. Ghana has ten administrative regions, subdivided into the 170 districts on which the 2010 census was conducted, and has approximately 75 ethnic groups with different socio-cultural practices [16, 17]. This study covered the 170 districts and used malaria incidence for the 2010–2014 period.
Data sources and variables
Clinically diagnosed malaria cases for outpatient visits at all health facilities in Ghana during the study period (2010–2014) were obtained from the Centre for Health Information and Management (CHIM) within the Ghana Health Service (GHS). Routinely, clinical diagnoses of malaria are based on parasitological microscopy and/or rapid diagnostic tests (undertaken at public and private hospitals, clinics and health centres), but mostly on rapid diagnostic tests at Community Health Planning Services programme (CHPS) zones, in accordance with World Health Organization criteria. The CHPS programme improves coverage of malaria ascertainment for underserved communities and villages in rural areas by using trained community health nurses to render basic clinical and public health services, including diagnosis and treatment of malaria. The entire population of Ghana is at risk because malaria is endemic in all parts of the country, with seasonal variations. Shapefiles for the 170 local health administrative districts were obtained from the Survey and Mapping Divisions, Accra and the Geomatic department of KNUST. Data were available nationally at the district level. Sociodemographic characteristics were obtained from the 2010 Population and Housing Census (PHC), which had complete population coverage of the 170 districts and provided information on various aspects of the populations and households. The district-level proportions (expressed as a percentage of the total) of the socio-demographic factors used in the study are briefly described as follows:
- Basic education level: proportion of the population aged 6 years and older who had attended or were currently attending basic school (from elementary to junior high school).
- Illiteracy: proportion of the population aged 15 years and older who could not read and write in any of three languages: a Ghanaian language, English or French.
- Religion: proportion of the population identifying with Christianity, Islam, Traditional religion, another religion or no religion.
- Urbanisation by population size: in Ghana, a locality within a district with a population of ≥ 5,000 people was classified as urban, and one with fewer than 5,000 people as rural.
- Population density: population per square kilometre within the district.
- Inter- or intra-migration: information on place of birth and on the non-Ghanaian population was used to identify intra-migration of the population within Ghana and inter-migration across national boundaries.
- Traditional (unimproved) housing units: proportion of households living in houses whose outer walls, roofing or floors were made of traditional materials such as mud brick/earth, wood, bamboo, thatch/palm leaf, sandcrete/landcrete and stone.
- Household overcrowding index: computed as the sum of five indicators: population per dwelling, single room occupancy, sleeping room, average household size and households per dwelling.
- Dependency ratio: number of dependents (child and old age) per 100 people undertaking paid employment.
- Employment-to-population ratio (EPR): age-specific proportions of the population aged 15 years and over who undertook paid employment.
- Households in agriculture: proportion of households in which at least one person is engaged in any type of farming activity: crop farming, tree growing, livestock rearing or fish farming.
- Household insanitation index: the indicators included were the main source of drinking water, toilet and bathing facilities, and solid and liquid waste disposal. The WHO/UNICEF Joint Monitoring Programme for Water Supply and Sanitation [18] standard method of classifying sanitation facilities and drinking-water sources as “improved (safe)” and “unimproved (unsafe)” was used in this study. The indicators identified as unimproved (unsafe) sanitary conditions were combined into an insanitation index to reflect their relative degree in a district.
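Three of the simpler indicators above can be illustrated with a minimal Python sketch. The function names, the reading of "urbanisation" as the percentage of a district's population living in urban localities, and all numbers are assumptions made for illustration; they are not taken from the census data or from the study's own processing.

```python
URBAN_THRESHOLD = 5_000  # localities with >= 5,000 people are classified as urban


def urban_share(locality_populations):
    """Percentage of a district's population living in urban localities.

    Assumption: urbanisation is summarised as a population share.
    """
    total = sum(locality_populations)
    urban = sum(p for p in locality_populations if p >= URBAN_THRESHOLD)
    return 100 * urban / total


def population_density(population, area_km2):
    """Persons per square kilometre within the district."""
    return population / area_km2


def dependency_ratio(dependents, employed):
    """Dependents (child and old age) per 100 people in paid employment."""
    return 100 * dependents / employed


# Hypothetical district: three localities covering 250 km^2 (made-up values).
localities = [12_000, 3_000, 5_000]
print(urban_share(localities))
print(population_density(sum(localities), 250))
print(dependency_ratio(dependents=6_000, employed=8_000))
```

The composite indices (overcrowding and insanitation) are not sketched here because the text does not state how their component indicators were scaled before being combined.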
Spatial statistical analyses
Malaria incidence rates were estimated first, followed by an assessment of the spatial dependency within and between the health outcome (malaria) and the sociodemographic risk factors. Statistically significant risk factors were then selected for the excess risk and conditioned choropleth maps.
Incidence estimation and spatial weights
The crude district-level annual malaria incidence rate, \( {R}_{Mal_{it}} \), for the i-th district in year t was estimated as
$$ {R}_{Mal_{it}}=\frac{X_{Mal_{it}}}{P_{it}}\times 10,000 $$
(1)
where \( {X}_{Mal_{it}} \) denotes the reported malaria case count in district i (i = 1, 2, … , n = 170) for year t (t = 2010, 2011, … , 2014), and \( {P}_{it} \) denotes the population of district i in year t. The cumulative and five-year average incidence rates were also calculated for each district.
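As an illustration of Eq. (1), the following Python sketch computes annual rates per 10,000 population for a single district. The counts and populations are invented, and averaging the annual rates is only one plausible reading of the "average incidence rate"; the text does not specify how the cumulative and average rates were formed.

```python
def incidence_per_10k(cases, population):
    """Crude incidence rate per 10,000 population, following Eq. (1)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 10_000


# Hypothetical case counts and populations for one district, keyed by year.
cases = {2010: 1200, 2011: 1500, 2012: 900}
population = {2010: 50_000, 2011: 50_000, 2012: 45_000}

annual_rates = {t: incidence_per_10k(cases[t], population[t]) for t in cases}

# Assumption: the multi-year average is the simple mean of the annual rates.
average_rate = sum(annual_rates.values()) / len(annual_rates)

for year, rate in annual_rates.items():
    print(year, round(rate, 1))
print("average", round(average_rate, 1))
```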
A spatial weights matrix was created based on first-order queen polygon contiguity. First-order queen contiguity, which combines rook (shared edge) and bishop (shared corner) contiguity, is sufficient to capture spatial autocorrelation given the size and shape of the districts in Ghana. Because the districts are irregularly shaped, past studies have adopted this contiguity approach to avoid neighbourless districts [19,20,21], and it is suitable for representing malaria transmission. Rook or bishop contiguity alone can leave gaps, which would not represent malaria transmission well. Hence, districts that shared common edges and/or common corners were considered neighbours, and weights were assigned to these identified neighbours. The spatial weights were row-standardized such that the weights in each row sum to one (\( \sum_j w_{ij} = 1 \)), with \( w_{ij} > 0 \) if districts i and j shared a common boundary and \( w_{ij} = 0 \) for non-neighbouring districts. Following standard convention, we excluded “self-influence” by setting \( w_{ii} = 0 \), so that W has a zero diagonal.
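The construction of row-standardised queen weights can be sketched in plain Python. The example below uses a hypothetical 3 × 3 grid of square cells as a stand-in for district polygons, which is purely an assumption for illustration; with real shapefiles the neighbour lists would come from polygon geometry (for example via GeoDa or a spatial library) rather than from grid coordinates.

```python
def queen_neighbours(n_rows, n_cols):
    """Map each grid cell id to the ids of cells sharing an edge or a corner."""
    neighbours = {}
    for r in range(n_rows):
        for c in range(n_cols):
            ids = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is never its own neighbour (w_ii = 0)
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        ids.append(rr * n_cols + cc)
            neighbours[r * n_cols + c] = ids
    return neighbours


def row_standardised_weights(neighbours):
    """Dense weights matrix in which each row with neighbours sums to one."""
    n = len(neighbours)
    W = [[0.0] * n for _ in range(n)]
    for i, ids in neighbours.items():
        for j in ids:
            W[i][j] = 1.0 / len(ids)
    return W


nb = queen_neighbours(3, 3)
W = row_standardised_weights(nb)
print(nb[0])       # neighbours of a corner cell
print(len(nb[4]))  # number of neighbours of the centre cell
print(sum(W[4]))   # row sum for the centre cell
```

On this grid, rook contiguity alone would give the corner cell two neighbours instead of three, which illustrates how the queen criterion adds the corner-sharing neighbour.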
Empirical Bayes Smoothing of incidence rates
We used Empirical Bayes Smoothing, which relies on the principle of shrinkage [1, 22, 23], to stabilise incidence rates for areas with small populations or disease counts. We assumed that, conditional on the relative risk \( {\delta}_i \) of people residing in district i, the disease counts were independently distributed according to a Poisson distribution:
$$ {x}_i\mid {\delta}_i\sim Poisson\left({N}_i{\delta}_i\right) $$
(2)
where \( {x}_i \) is the random variable representing the disease count in district i and \( {N}_i \) is the expected count for the same district. The Empirical Bayes Smoothed (EBS) relative risk of malaria, \( {\widehat{R}}_{Mal_{it}} \), borrows strength from neighbouring district rates to adjust unstable rates, as per the expression:
$$ {\widehat{R}}_{Mal_{it}}={\phi}_i{R}_{Mal_{it}}+\left(1-{\phi}_i\right){m}_{\delta_i} $$
(3)
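where \( {\phi}_i \) is the shrinkage weight (the ratio of the prior variance to the total variance, that is, the prior variance plus the sampling variance of the crude rate), and \( {m}_{\delta_i} \) is the prior mean (weighted sample mean). The final EBS rate remains practically unchanged for districts with relatively large populations or case counts [23].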
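The shrinkage in Eq. (3) can be illustrated with a short Python sketch. It assumes a global method-of-moments estimator, in which the prior mean and prior variance are estimated from all districts at once; this is one common formulation and is not necessarily the exact variant used in the study. A spatial variant would estimate the prior from each district's neighbours instead. All input values are invented.

```python
def eb_smooth(cases, population):
    """Global Empirical Bayes smoothing of crude rates (method of moments).

    Returns (smoothed_rates, shrinkage_weights). Assumption: the prior mean
    and variance are estimated once from all districts.
    """
    n = len(cases)
    total_pop = sum(population)
    prior_mean = sum(cases) / total_pop  # population-weighted mean rate
    rates = [x / p for x, p in zip(cases, population)]

    # Weighted variance of the crude rates minus the expected sampling noise.
    weighted_var = sum(
        p * (r - prior_mean) ** 2 for r, p in zip(rates, population)
    ) / total_pop
    prior_var = max(weighted_var - prior_mean / (total_pop / n), 0.0)

    smoothed, weights = [], []
    for r, p in zip(rates, population):
        sampling_var = prior_mean / p  # larger for small populations
        denom = prior_var + sampling_var
        phi = prior_var / denom if denom > 0 else 0.0
        weights.append(phi)
        smoothed.append(phi * r + (1 - phi) * prior_mean)
    return smoothed, weights


# Hypothetical districts: the first has a very small population.
cases = [5, 400, 900]
population = [100, 10_000, 20_000]
smoothed, weights = eb_smooth(cases, population)
for crude, s, phi in zip((c / p for c, p in zip(cases, population)), smoothed, weights):
    print(round(crude * 10_000, 1), round(s * 10_000, 1), round(phi, 3))
```

In this sketch the weight for the smallest district is close to zero, so its smoothed rate sits near the prior mean, whereas larger districts retain more of their crude rate.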
Measuring spatiotemporal patterns and disease-risk factor associations
We assessed spatiotemporal patterns of the rates from 2010 to 2014 using global and local Moran’s indices. Global Moran’s I was used to determine whether identifiable spatial patterns existed over space and time [4, 5, 22], and Anselin’s local Moran’s Ii (the most widely used Local Indicator of Spatial Association, LISA) was used to identify specific districts exhibiting spatial autocorrelation with their neighbouring districts, as clusters or outliers [23,24,25]. Statistical inference was based on a Monte Carlo randomisation test with 999 permutations, with significance set at a pseudo p-value < 0.05 [4, 5, 19]. Non-spatial correlation was evaluated with the Pearson correlation, while the global bivariate Moran's I was estimated to examine the spatial correlation between the five-year average incidence of malaria and the sociodemographic covariates. The statistically significant sociodemographic determinants were selected for the excess risk and conditioned choropleth maps. Because GeoDa provides excess risk map (ERM) and conditioned choropleth map (CCM) functionality, all spatial statistical maps were generated using GeoDa version 1.12, even though this package has lower cartographic quality than other spatial packages, especially ArcGIS.
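The global statistics and the permutation test can be sketched as follows. This is a minimal Python illustration on a hypothetical 4 × 4 grid with invented values, not a reproduction of GeoDa's implementation. The statistic is computed on z-scores as the cross-product with the spatial lag divided by the sum of the weights; the pseudo p-value convention, (extreme count + 1) / (permutations + 1), taken in the direction of the observed statistic, is an assumption.

```python
import math
import random


def queen_weights(n_rows, n_cols):
    """Row-standardised queen-contiguity weights for a regular grid."""
    n = n_rows * n_cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            ids = [(r + dr) * n_cols + (c + dc)
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)
                   and 0 <= r + dr < n_rows and 0 <= c + dc < n_cols]
            for j in ids:
                W[r * n_cols + c][j] = 1.0 / len(ids)
    return W


def zscores(values):
    """Standardise values (population standard deviation; must not be constant)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def morans_i(x, W, y=None):
    """Global Moran's I of x, or bivariate Moran's I of x against the lag of y."""
    zx = zscores(x)
    zy = zx if y is None else zscores(y)
    n = len(x)
    s0 = sum(map(sum, W))  # sum of all weights
    cross = sum(W[i][j] * zx[i] * zy[j] for i in range(n) for j in range(n))
    return cross / s0


def pseudo_p_value(x, W, y=None, permutations=999, seed=0):
    """Monte Carlo pseudo p-value from random relabelling of locations."""
    rng = random.Random(seed)
    observed = morans_i(x, W, y)
    shuffled = list(x if y is None else y)
    at_least = at_most = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        stat = morans_i(shuffled, W) if y is None else morans_i(x, W, shuffled)
        at_least += stat >= observed
        at_most += stat <= observed
    return (min(at_least, at_most) + 1) / (permutations + 1)


W = queen_weights(4, 4)
incidence = [r + c for r in range(4) for c in range(4)]  # smooth gradient
covariate = [10 - v for v in incidence]                  # perfectly inverse

print(morans_i(incidence, W), pseudo_p_value(incidence, W))
print(morans_i(incidence, W, covariate))
```

The smooth gradient yields a positive univariate Moran's I, and the inverse covariate yields a bivariate statistic of equal magnitude and opposite sign. Local Moran's Ii follows the same logic district by district, using each district's own cross-product term.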
Mapping excess risk ratio as influenced by risk factors
The excess or relative risk is a form of the standardised morbidity or mortality ratio (SMR) often used in public health, estimated as the ratio of the observed rate to the expected rate. The expected rate is the average rate for the entire population at risk, applied to each location and computed as the ratio of the sum of all events in all locations to the sum of all the populations at risk [4, 5]. Using the excess risk map functionality in GeoDa, we produced excess risk maps (ERMs) of malaria incidence (event variable) for each statistically significant socio-demographic covariate (base variable) [4, 5, 26].
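The excess risk calculation described above reduces to a few lines of Python. The sketch below uses invented event and base values; the function name is hypothetical, and the code is not GeoDa's own implementation.

```python
def excess_risk(events, base):
    """Ratio of each location's observed rate to the overall expected rate.

    The expected rate is the sum of all events divided by the sum of the
    base variable across all locations.
    """
    expected_rate = sum(events) / sum(base)
    return [(e / b) / expected_rate for e, b in zip(events, base)]


# Hypothetical districts: event counts and base-variable values.
events = [10, 30, 20]
base = [100, 100, 200]

for district, risk in enumerate(excess_risk(events, base)):
    label = "above expected" if risk > 1 else "at or below expected"
    print(district, round(risk, 2), label)
```

A value above 1 marks a district where the observed rate exceeds what the overall rate would predict for its base value, and a value below 1 marks the opposite.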
Exploring Malaria incidence with Conditioned Choropleth Maps
Both non-spatial Pearson correlation and spatial bivariate Moran's I analyses were performed between every pair of statistically significant risk factors to determine how they might act together or in sequence to influence malaria transmission. These analyses informed the selection of the pairs of risk factors for the conditioned choropleth mapping. We adopted conditioned choropleth mapping using the five-year average incidence rates of malaria as the dependent variable (theme variable) and two strongly correlated significant sociodemographic factors (covariates) to visualise the three variables simultaneously. This resulted in a 3 × 3 panel of nine micromaps, in which the panel columns corresponded to the three categories of one covariate and the rows corresponded to the three categories of the other covariate.
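The assignment of districts to the nine micromaps can be sketched in Python. The text does not state how the three categories of each covariate were defined, so the rank-based terciles used here are an assumption, and the data are invented.

```python
def tercile_bins(values):
    """Assign each value to category 0, 1 or 2 by rank-based terciles.

    Assumption: three equal-count categories; ties are split by input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * 3 // n
    return bins


def conditioned_panels(incidence, row_covariate, col_covariate):
    """Group districts into a 3 x 3 layout keyed by (row category, column category)."""
    row_bin = tercile_bins(row_covariate)
    col_bin = tercile_bins(col_covariate)
    panels = {(r, c): [] for r in range(3) for c in range(3)}
    for district, key in enumerate(zip(row_bin, col_bin)):
        panels[key].append((district, incidence[district]))
    return panels


# Hypothetical values for nine districts.
incidence = [310, 295, 330, 250, 240, 260, 150, 140, 160]
row_covariate = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # e.g. one covariate (%)
col_covariate = [9, 8, 7, 6, 5, 4, 3, 2, 1]   # e.g. another covariate (%)

panels = conditioned_panels(incidence, row_covariate, col_covariate)
for key in sorted(panels):
    print(key, panels[key])
```

Each micromap would then shade only the districts in its panel by their incidence, so that empty or sparsely populated panels reveal how the two covariates co-vary.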