Data sources
We analyzed data from the 2011 Uganda AIDS Indicator Survey (UAIS) and health facility HIV testing data from the national DHIS2 collected during 2011. UAIS data was downloaded from the Measure DHS website (www.measuredhs.com) after obtaining consent from ICF/Macro International, while health facility testing data was extracted from the DHIS2 hosted at MoH after obtaining written permission from MoH. Ethical clearance to conduct this study was obtained from the University of the Witwatersrand Human Research Ethics Committee (HREC) and the Uganda National Council for Science and Technology (UNCST).
Uganda AIDS Indicator Survey
The UAIS is a nationally representative, population-based HIV serological survey designed to provide HIV prevalence estimates at national and regional levels [19]. The survey used a two-stage cluster sampling design. For the 2011 survey, Uganda was divided into 10 geographical regions, each consisting of 8–15 neighboring districts. Clusters were randomly selected from each region with probability proportional to the number of households in the cluster. The estimated numbers of households per cluster were projections from the 2002 National Population and Housing Census (NPHC) [20]. Clusters were enumeration areas from the 2002 NPHC. Sample sizes were allocated equally across the 10 geographical regions. A systematic sample of 25 households was then selected from each cluster using the 2002 NPHC sampling frame. All adults who were present in the selected households and consented to participate in the survey were interviewed [19]. More details about the survey are available from www.measuredhs.com.
For this study, a total of 19,475 individuals (8,532 men and 10,943 women) aged 15–49 years who were tested for HIV during the survey were considered. Variables included in the analysis were (i) at cluster level: area of residence (urban/rural) and region of the country; and (ii) at individual level: respondent's gender, marital status, education level attained, number of sexual partners (including husband/wife) in the 12 months preceding the survey, employment status and distance to the nearest health facility.
A multilevel logistic regression model was fitted to the UAIS data to obtain the respondents’ probability of testing for HIV in a health facility. The model was fitted using a total of 470 clusters. The average number of observations per cluster was 45 (min = 11, max = 64). Unequal sample selection probabilities were accounted for by incorporating scaled sampling weights. Carle’s methodology was applied to scale the sampling weights [21]. The models were fitted using the maximum likelihood method in Stata statistical software, Release 15 [22].
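The weight scaling step can be illustrated with a minimal Python sketch. It assumes the two within-cluster scalings commonly attributed to Carle [21]: scaling so that weights sum to the cluster sample size (labelled "A" here) and scaling so that they sum to the effective cluster sample size (labelled "B"). The text does not state which scaling was used, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def scale_weights(records, method="A"):
    """Scale sampling weights within clusters.

    records: list of (cluster_id, raw_weight) tuples (hypothetical layout).
    method "A": scaled weights sum to the cluster sample size n_j.
    method "B": scaled weights sum to the effective cluster sample size,
                (sum of w)^2 / (sum of w^2).
    Returns the scaled weights in the same order as the input.
    """
    sum_w = defaultdict(float)    # sum of raw weights per cluster
    sum_w2 = defaultdict(float)   # sum of squared raw weights per cluster
    count = defaultdict(int)      # number of respondents per cluster
    for cluster, w in records:
        sum_w[cluster] += w
        sum_w2[cluster] += w * w
        count[cluster] += 1

    scaled = []
    for cluster, w in records:
        if method == "A":
            factor = count[cluster] / sum_w[cluster]
        elif method == "B":
            factor = sum_w[cluster] / sum_w2[cluster]
        else:
            raise ValueError("method must be 'A' or 'B'")
        scaled.append(w * factor)
    return scaled


# Illustrative weights only, not UAIS data.
example = [("c1", 2.0), ("c1", 4.0), ("c2", 1.0)]
print(scale_weights(example, method="A"))
print(scale_weights(example, method="B"))
```

The scaled weights would then be supplied to the multilevel model in place of the raw weights.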
Survey respondents were considered to have tested for HIV at a health facility if they reported that they tested for HIV in a health facility and received their test results in the 12 months preceding the survey. Pregnant or breastfeeding women who tested for HIV during antenatal care attendance and individuals who tested at an HIV care centre such as The AIDS Support Organization (TASO) or the AIDS Information Centre (AIC) were included in the analysis. Health facilities included facilities owned and managed by government (public) and private organizations that reported HIV testing data to the national DHIS2.
Health facility data
Health facility HIV testing data comprised data reported to the national DHIS2. The system was developed to provide accurate, timely and quality routine data for monitoring and planning for the health sector in Uganda [9, 23]. Training and technical support from development partners and MoH have led to improvement in the quality and reliability of data in the system [9]. Aggregated HIV testing data is reported by health facilities to the DHIS2 on a monthly basis. The data includes HIV testing at all inpatient and outpatient departments in health facilities. For the 2011 reporting period, data was disaggregated by age (i.e. 0–14, 15–49 and 50+ years) and gender (male and female). For this study, we considered males and females aged 15–49 years.
Indicators considered for this analysis were the number of individuals who were tested and received their HIV test results (A) and the number of individuals who tested HIV positive (B). For ANC data, we considered the number of pregnant women counseled, tested and given their HIV test results at the first antenatal visit (C) and the number who tested HIV positive (D). The HIV counseling and testing algorithm in Uganda recommends HIV testing for any individual whose most recent negative HIV test was conducted more than 3 months prior to the current visit to the health facility [24]. Some individuals may test multiple times within a year without disclosing this to health workers, resulting in double counting, a key limitation of this study. Furthermore, some pregnant women may test for HIV before seeking antenatal care and test again during antenatal attendance, leading to double counting in the data reported to the national DHIS2.
Variables based on DHIS2 data were defined as follows:
Total number of individuals tested for HIV = (A + C)
Number HIV positive = (B + D)
Total number of males tested for HIV = males in A
Number of males tested HIV positive = males in B
Total number of females tested for HIV = (females in A) + C
Number of females tested HIV positive = (females in B) + D
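As an illustration, the definitions above can be written as a small Python function. The argument names and the counts in the example are hypothetical; only the arithmetic follows the definitions in the text.

```python
def dhis2_totals(a_male, a_female, b_male, b_female, c_anc, d_anc):
    """Combine DHIS2 indicators A-D into the analysis variables.

    a_male, a_female: individuals tested who received results (A), by sex.
    b_male, b_female: individuals who tested HIV positive (B), by sex.
    c_anc: pregnant women tested at first antenatal visit (C).
    d_anc: pregnant women who tested HIV positive at that visit (D).
    """
    return {
        "total_tested": a_male + a_female + c_anc,      # A + C
        "total_positive": b_male + b_female + d_anc,    # B + D
        "males_tested": a_male,                         # males in A
        "males_positive": b_male,                       # males in B
        "females_tested": a_female + c_anc,             # females in A + C
        "females_positive": b_female + d_anc,           # females in B + D
    }


# Illustrative counts only, not DHIS2 data.
print(dhis2_totals(a_male=100, a_female=150, b_male=5, b_female=9,
                   c_anc=50, d_anc=3))
```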
Addressing possible bias in health facility data
Routine facility data collected as part of service delivery consists of individuals who self-select, limiting its use for monitoring health indicators in the general population. To obtain general population indicator estimates, some researchers have used census projections as denominators; however, this approach often results in coverage estimates that are greater than 100% [25]. Population surveys are preferred as a source of health indicator denominators since their design takes into account the population distribution in the country [25,26,27,28]. The UAIS sample comprises two subpopulations, namely individuals who tested for HIV in a health facility in the 12 months preceding the survey (the facility testers) and those who did not test for HIV in a health facility in the same period (the non-facility testers). We assume that the UAIS estimates of HIV prevalence for those who tested for HIV in a health facility and for those who did not are accurate at regional level, since estimates of domain proportions from a multistage survey are unbiased. We apply this assumption to adjust the denominators of the DHIS2 data so that, at the regional level, DHIS2 HIV prevalence estimates are similar to UAIS prevalence estimates. The adjustment process was carried out as follows:
1. We obtained the HIV prevalence \( {\hat{k}}_f \) among health facility testers in each region in the UAIS data.
2. We adjusted the denominators in the DHIS2 data for each region using \( {n}_{adjusted}^r=\frac{n_{pos}}{{\hat{k}}_f} \), where \( {n}_{pos} \) is the observed number of individuals who tested HIV positive in each region in the DHIS2 data.
3. We calculated an adjustment factor \( {\delta}_f \) for each region using \( {\delta}_f=\frac{n_{adjusted}^r}{n_r} \), where \( {n}_r \) is the observed number of individuals who tested for HIV in each region in the DHIS2 data.
4. We applied the adjustment factor \( {\delta}_f \) to obtain \( {n}_{adjusted}^d \), the adjusted number of individuals who tested for HIV in a health facility at district level, using \( {n}_{adjusted}^d={\delta}_f\ast {n}_d \), where \( {n}_d \) is the observed number of individuals tested for HIV at district level.
5. HIV prevalence \( {P}_f \) in the district, based on the adjusted DHIS2 data, was then obtained as the ratio of \( {n}_{pos} \), the total observed positives in the district, to \( {n}_{adjusted}^d \), the adjusted number of individuals who tested for HIV in the district, i.e. \( {P}_f=\frac{n_{pos}}{n_{adjusted}^d} \).
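The five steps can be sketched in Python as follows. The function names and the numbers in the example are hypothetical and serve only to trace the arithmetic; they are not values from the UAIS or DHIS2.

```python
def regional_adjustment(n_pos_region, n_tested_region, k_f):
    """Steps 2-3: adjusted regional denominator and adjustment factor.

    n_pos_region: observed HIV positives in the region (DHIS2).
    n_tested_region: observed number tested in the region (DHIS2), n_r.
    k_f: UAIS prevalence among facility testers in the region (step 1).
    """
    n_adjusted_region = n_pos_region / k_f
    delta_f = n_adjusted_region / n_tested_region
    return n_adjusted_region, delta_f


def district_facility_prevalence(n_pos_district, n_tested_district, delta_f):
    """Steps 4-5: adjusted district denominator and prevalence P_f."""
    n_adjusted_district = delta_f * n_tested_district
    return n_pos_district / n_adjusted_district


# Illustrative numbers only: a region with 600 positives among 10,000
# tested, and a UAIS facility-tester prevalence of 7.5%.
n_adj_r, delta = regional_adjustment(600, 10_000, 0.075)
p_f = district_facility_prevalence(90, 1_000, delta)
print(round(n_adj_r), round(delta, 3), round(p_f, 4))  # 8000 0.8 0.1125
```

In this example the raw regional DHIS2 prevalence (6%) is below the UAIS facility-tester prevalence (7.5%), so the adjustment factor is less than 1 and the district denominator shrinks from 1,000 to 800.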
Hybrid prevalence estimation methodology
We consider the \( n \) individuals in the UAIS to include \( {n}_c \) individuals who tested for HIV at a health facility during the 12 months preceding the survey and know their test result, and \( {n}_{\underset{\_}{c}} \) individuals who did not test for HIV at a health facility and therefore do not know their HIV status, i.e. \( n={n}_c+{n}_{\underset{\_}{c}} \). Using the health facility prevalence computed in step 5 above, we computed district HIV prevalence as a weighted average of the prevalence from DHIS2 data, \( {P}_f \), and the prevalence among individuals who did not test for HIV in a health facility, \( {\hat{P}}_{\underset{\_}{s}} \), estimated from the UAIS data:
$$ \hat{P}={\hat{\pi}}_c{P}_f+\left(1-{\hat{\pi}}_c\right){\hat{P}}_{\underset{\_}{s}} $$
(1)
where \( \hat{P} \) is the HPE (combined estimate), \( {\hat{\pi}}_c \) is the estimated probability of testing for HIV in a health facility, \( {P}_f \) is the adjusted HIV prevalence for individuals tested at a health facility, and \( {\hat{P}}_{\underset{\_}{s}} \) is the HIV prevalence among individuals who were tested during the survey and had not tested for HIV in a health facility in the 12 months preceding the survey. We estimated \( {\hat{\pi}}_c \) from UAIS data using multilevel logistic regression, adjusting for both individual- and cluster-level factors. This model accounts for clustering at cluster level [25]. Although the probability of testing for HIV in a health facility was obtained at individual level, we used the average district-level probability of testing to combine the estimates. Since the average probability of HIV testing is obtained from a survey sample containing both facility and non-facility testers, we estimate the variance and standard error (SE) of the HPE, respectively, as follows:
$$ \operatorname{Var}\left(\hat{P}\right)=\frac{1}{n}\left\{{\hat{P}}_{\underset{\_}{s}}\left(1-{\hat{P}}_{\underset{\_}{s}}\right)\left(1-{\hat{\pi}}_c\right)+\left(1-{\hat{\pi}}_c\right){\left({P}_f-{\hat{P}}_{\underset{\_}{s}}\right)}^2\right\}\quad \mathrm{and}\quad SE=\sqrt{\operatorname{Var}\left(\hat{P}\right)} $$
(2)
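A minimal Python sketch of Eqs. (1) and (2) is given below. It transcribes the variance expression exactly as printed in Eq. (2). The function names and example inputs are hypothetical, and `n` is taken to be the survey sample size in the district, which is an assumption.

```python
import math


def hybrid_prevalence(pi_c, p_f, p_s):
    """Eq. (1): weighted average of facility and non-facility prevalence."""
    return pi_c * p_f + (1.0 - pi_c) * p_s


def hybrid_variance(pi_c, p_f, p_s, n):
    """Eq. (2), transcribed term by term as printed in the text."""
    first_term = p_s * (1.0 - p_s) * (1.0 - pi_c)
    second_term = (1.0 - pi_c) * (p_f - p_s) ** 2
    return (first_term + second_term) / n


def hybrid_se(pi_c, p_f, p_s, n):
    """Standard error: square root of the variance in Eq. (2)."""
    return math.sqrt(hybrid_variance(pi_c, p_f, p_s, n))


# Illustrative inputs only: 40% probability of facility testing, adjusted
# facility prevalence of 10%, non-facility prevalence of 5%, n = 500.
p_hat = hybrid_prevalence(0.4, 0.10, 0.05)
se = hybrid_se(0.4, 0.10, 0.05, 500)
print(round(p_hat, 4), round(se, 5))
```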
We assessed the accuracy of the HPEs relative to survey-based prevalence estimates by computing the percentage change in standard errors. We further assessed agreement between the estimates obtained using the HPE methodology and those from the population survey method (direct population survey estimates) using a Bland-Altman analysis [26, 27].
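Both comparisons can be sketched in Python. The sketch assumes that the percentage change is taken relative to the survey-based standard error and that the Bland-Altman limits of agreement are the conventional mean difference ± 1.96 standard deviations; the text states neither convention. The function names and example values are hypothetical.

```python
import statistics


def percent_change_se(se_hpe, se_survey):
    """Percentage change in SE of the HPE relative to the survey SE.

    Negative values indicate a smaller SE for the HPE (assumed convention).
    """
    return 100.0 * (se_hpe - se_survey) / se_survey


def bland_altman(hpe_estimates, survey_estimates):
    """Mean difference (bias) and 95% limits of agreement.

    Differences are computed as HPE minus survey estimate, per district.
    """
    diffs = [h - s for h, s in zip(hpe_estimates, survey_estimates)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Illustrative district-level estimates only, not study results.
hpe = [0.07, 0.05, 0.09]
survey = [0.06, 0.06, 0.08]
print(bland_altman(hpe, survey))
print(percent_change_se(0.008, 0.010))
```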
All analyses were carried out in Stata statistical software, Release 15 [22], and R version 3.5.0 [28].