
Methods for conducting trends analysis: roadmap for comparing outcomes from three national HIV Population-based household surveys in Kenya (2007, 2012, and 2018)



For assessing the HIV epidemic in Kenya, a series of independent HIV indicator household-based surveys of similar design can be used to investigate the trends in key indicators relevant to HIV prevention and control and to describe geographic and sociodemographic disparities, assess the impact of interventions, and develop strategies. We developed methods and tools to facilitate a robust analysis of trends across three national household-based surveys conducted in Kenya in 2007, 2012, and 2018.


We used data from the 2007 and 2012 Kenya AIDS Indicator surveys (KAIS 2007 and KAIS 2012) and the 2018 Kenya Population-based HIV Impact Assessment (KENPHIA 2018). To assess the design and other variables of interest from each study, variables were recoded to ensure that they had equivalent meanings across the three surveys. After assessing weighting procedures for comparability, we used the KAIS 2012 nonresponse weighting procedure to revise normalized KENPHIA weights. Analyses were restricted to geographic areas covered by all three surveys. The revised analysis files were then merged into a single file for pooled analysis. We assessed distributions of age, sex, household wealth, and urban/rural status to identify unexpected changes between surveys.

To demonstrate how a trend analysis can be carried out, we used continuous, binary, and time-to-event variables as examples. Specifically, temporal trends in age at first sex and having received an HIV test in the last 12 months were used to demonstrate the proposed analytical approach. These were assessed with respondent-specific variables (age, sex, level of education, and marital status) and household variables (place of residence and wealth index). All analyses were conducted in SAS 9.4, but analysis files were created in Stata and R format to support additional analyses.


This study demonstrates trends in selected indicators to illustrate the approach that can be used in similar settings. The incidence of early sexual debut decreased from 11.63 (95% CI: 10.95–12.34) per 1,000 person-years at risk in 2007 to 10.45 (95% CI: 9.75–11.2) per 1,000 person-years at risk in 2012 and to 9.58 (95% CI: 9.08–10.1) per 1,000 person-years at risk in 2018. HIV-testing rates increased from 12.6% (95% CI: 11.6%–13.6%) in 2007 to 56.1% (95% CI: 54.6%–57.6%) in 2012 but decreased slightly to 55.6% (95% CI: 54.6%–56.6%) in 2018. The decrease in incidence of early sexual debut could be convincingly demonstrated between 2007 and 2012 but not between 2012 and 2018. Similarly, there was virtually no difference between HIV-testing rates in 2012 and 2018.


Our approach can be used to support trend comparisons for variables in HIV surveys in low-income settings. Independent national household surveys can be assessed for comparability, adjusted as appropriate, and used to estimate trends in key indicators. Analyzing trends over time can not only provide insights into Kenya’s progress toward HIV epidemic control but also identify gaps.



In 2017, despite the rapid increase in antiretroviral therapy (ART) use over the previous two decades and the corresponding decline in mortality, approximately one-third of people living with HIV in East and Southern Africa and less than half of those living with HIV in West and Central Africa were not receiving any life-saving treatment [1, 2]. By 2017, HIV/AIDS was a major cause of death in sub-Saharan Africa (SSA), where 71% of all people living with HIV resided. Globally, 75% of HIV-related deaths and 65% of all new HIV infections occurred in SSA [3,4,5].

Against this background, it is important to assess whether interventions over the last two to three decades have decreased HIV incidence and to identify geographic regions and sociodemographic groups with high HIV prevalence [3, 6]. HIV data obtained from national population-based surveys play an important role in monitoring the HIV epidemic and response in the general population. These surveys estimate incidence, prevalence, and various parameters related to the HIV pandemic in high-HIV-prevalence countries. These surveys were designed to monitor progress toward ending the AIDS epidemic [6,7,8]. Additionally, they were designed to monitor the UNAIDS 90–90–90 targets by the year 2020: 90% of all HIV-positive people know their HIV status; of these, 90% are receiving sustained ART; and of these, 90% have achieved viral load suppression [9,10,11]. These surveys have also been used to describe associations between high-risk behavior and HIV status and to assess HIV prevention, care, and treatment services.

Unlike in high-income countries where longitudinal studies provide nationally representative trend estimates for health outcomes, for example, the National Health and Nutrition Examination Survey [12, 13], HIV surveys in low-income countries and high-prevalence settings are generally cross-sectional and are independently implemented approximately once every 5 years. Therefore, it is important to develop methods that can be used to assess trends across independent surveys for countries interested in employing similar techniques. We used Kenya to showcase this approach as there had been several HIV population-based surveys conducted, with varying sampling and survey weighting considerations, in the past two decades. Such methods must account for differences in survey design, weighting, coverage, and indicator definitions. Over the past two decades, five national population-based surveys [14,15,16,17,18] have included HIV testing and HIV modules in their algorithms in Kenya.

We present methods that can be used to assess temporal trends in outcome variables of interest as a means to answer such questions as: “Has HIV risk behavior significantly declined over time in Kenya, and if so, in which demographic groups or regions?” and “Has access to HIV testing services increased over time in Kenya?” Our tools also can help HIV programs appropriately analyze trends in recent population-based HIV surveys in Kenya and provide guidance regarding appropriate statistical comparisons between surveys, including tests for trends. These suggestions may also serve as a roadmap for other cross-survey comparison analyses applicable to other countries or indicators. The methods presented here are being utilized to examine trends in specific indicators of interest in other KENPHIA-focused studies. Therefore, the programmatic implications of selected trends comparison presented in this study are not discussed.


Harmonization of survey datasets

Analysis approach

First, we reviewed survey design documents to describe the survey design and weighting procedures used for all three surveys. We compared sampling design and survey weighting procedures across surveys to identify differences that could potentially influence comparisons. We developed an analysis strategy to both facilitate comparisons and minimize the influence of differences in survey design or weighting procedures on comparisons between survey estimates. Once we chose a weighting approach, we developed a list of variables to extract and harmonize across surveys based on perceived importance, availability, and consistency of definitions across surveys. Once extracted, the weighted estimates of these variables were assessed for consistency across surveys. Finally, we used selected variables to identify and describe appropriate statistical methods for comparisons and trend analysis.

Data extraction and manipulation

We reviewed data dictionaries and other survey documentation to identify relevant survey design and analysis variables pertaining to HIV biomarkers and behavioral and demographic variables across the three surveys for inclusion in the analysis.

Survey design

These surveys were originally designed to provide data used by various stakeholders to monitor Kenya’s population and HIV-related health outcomes. This section briefly summarizes the survey design and weighting approaches used in the surveys. All three surveys utilized two-stage, stratified cluster sampling designs based on the National Sample Survey and Evaluation Programme (NASSEP) household-based sample frames created by the Kenya National Bureau of Statistics and revised after each decennial population census.

KAIS 2007 was the first AIDS Indicator Survey conducted in Kenya to monitor progress on key indicators in the national HIV prevention, care, and treatment programs [16]. The survey was designed to obtain a nationally representative sample of persons aged 15–64 years and to provide estimates of HIV-related outcomes stratified by urban/rural residence and the 8 provinces. The first stage included a selection of 415 clusters (70% rural and 30% urban) from the NASSEP IV (based on the 1999 census); the second stage included selecting a sample of 25 households within each cluster.

KAIS 2012 selected 372 clusters from NASSEP V (based on the 2009 census) using a systematic random sampling method. KAIS 2012 sampled 9,300 households within 9 of the 10 National AIDS and STI Control Programme (NASCOP) programmatic regions: Nairobi, Central, Coast, Eastern North, Eastern South, Nyanza, Upper Rift, Lower Rift, and Western regions, designated as either urban or rural. The sampling frame was not available for the North-Eastern region at the time of the survey, and this region (and hence seven NASCOP regions) was excluded from the survey. The target population was persons aged 18 months–64 years. Half of the households were targeted for children aged 18 months–14 years. The survey was designed to provide estimates of HIV-related outcomes for adults aged 15–64 years stratified by urban/rural area and the nine included NASCOP regions.

Like KAIS 2012, KENPHIA 2018 also was based on NASSEP V. KENPHIA was a cross-sectional, household-based survey conducted among persons aged 0–64 years in 800 clusters from 96 urban/rural county strata covering the entire household population of Kenya. In 2012, following the promulgation of the 2010 Constitution of Kenya, these counties became the geographical units of devolved government in place of districts. Survey data collection was conducted from June 2018 to February 2019. Of the 34,610 persons targeted by the survey, 27,897 were adults aged 15–64 years, and 6,713 were children aged 0–14 years. One in three households were targeted for the inclusion of children. The survey was designed to provide estimates for adults aged 15–64 years for all 47 counties in Kenya.

Each of these studies was carried out in accordance with the Declaration of Helsinki.

Table 1 presents detailed summaries of the three surveys.

Table 1 Summary of survey designs and HIV testing for KAIS 2007, KAIS 2012 and KENPHIA 2018

Weighting process


The KAIS 2007 design was stratified by district and residency (urban/rural). Urban areas were further stratified by socioeconomic status. Both KAIS 2012 and KENPHIA designs were stratified by county and residency. Household nonresponse adjustments in KAIS 2007 were computed by province and residency, whereas in KAIS 2012, they were computed by NASCOP region and residency, resulting in the following 17 design strata: Nairobi (Urban), Central (Urban/Rural), Nyanza (Urban/Rural), North Rift (Urban/Rural), South Rift (Urban/Rural), Eastern North (Urban/Rural), Eastern South (Urban/Rural), Western (Urban/Rural), and Coast (Urban/Rural). In KENPHIA, household nonresponse adjustments were computed by county.


The KAIS 2007 and KENPHIA surveys covered the entire national territory, but KAIS 2012 excluded one geographic region, North Eastern. Therefore, to ensure that differences in coverage did not bias trend analyses, this region was omitted from the analysis, leaving 17 NASCOP region/residency strata that were used for stratification across all three surveys.

Survey weighting

To compensate for over- or undersampling of individuals, for disproportionate stratification, and for nonresponse, studies often include several types of survey weights in the datasets that are made available after the survey. Individual, child, and HIV-testing (blood) weights ensure that adults aged 15–64 years, children aged 0–14 years, and individuals selected for HIV testing, respectively, are representative of the population sampled. The survey design and nonresponse weighting approaches for KAIS 2007 and KAIS 2012 were similar, so no adjustments were made to the weights used in these studies. The KENPHIA 2018 survey design weights differed from the KAIS design weights in that no household-level post-stratification adjustments were done, and nonresponse weights were developed using least absolute shrinkage and selection operator regression and chi-square automatic interaction detection methodology rather than the simpler inverse proportional weighting by sex and geographic area variables. All the variables available in KENPHIA 2018, whether household, individual, or blood-draw specific, were used for this purpose [40]. Furthermore, for KENPHIA, post-stratification weights were calibrated to age and sex control totals from the national population projections for 2019. Therefore, to remove potential biases in comparisons resulting from the differing nonresponse and post-stratification weighting approaches, KENPHIA was reweighted to increase comparability between weighted estimates across the surveys.

Revised KENPHIA weights

A primary sampling unit (PSU) or enumeration area (EA) base weight was computed as the inverse of the probability of selection of the EA. No PSU nonresponse adjustment was made, apart from two ineligible EAs whose weights were set to 0. A household’s initial weight was then computed as a product of the PSU base weight and the inverse of the probability of selection of the household within the EA. An unknown eligibility household nonresponse adjustment was computed as a product of the household initial weight and the inverse of the probability of the household having unknown eligibility. The household weight was further adjusted for the eligible household member nonresponse rate.

Adult person-level weights were assumed equal to the household weight since all adults (aged ≥ 15 years) in a household were eligible. In the case of children (aged 0–14 years), only children in every third household were included in KENPHIA 2018. The child weight was therefore computed as three times the household weight. For adults, nonresponse adjustment cells were created by NASCOP region, urban–rural residence, and sex, whereas nonresponse-weighting classes for children were not stratified by sex. The post-stratification cells were produced by NASCOP region and sex. The child weights were not post-stratified.
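The weighting chain described above can be sketched in a few lines. This is a simplified Python illustration, not the SAS program in the supplementary files: the function names and the single generic nonresponse step are assumptions made for the example, and the selection probabilities and response rate in the usage note are hypothetical.

```python
def household_initial_weight(p_ea: float, p_household_given_ea: float) -> float:
    """PSU/EA base weight (1 / P(EA selected)) multiplied by the inverse of
    the probability of selecting the household within the EA."""
    if not (0 < p_ea <= 1 and 0 < p_household_given_ea <= 1):
        raise ValueError("selection probabilities must lie in (0, 1]")
    return (1.0 / p_ea) * (1.0 / p_household_given_ea)


def nonresponse_adjusted(weight: float, response_rate: float) -> float:
    """Inflate a weight by the inverse of the response rate observed in its
    adjustment cell (for adults: NASCOP region x residence x sex)."""
    if not 0 < response_rate <= 1:
        raise ValueError("response rate must lie in (0, 1]")
    return weight / response_rate


def child_weight(household_weight: float, every_nth_household: int = 3) -> float:
    """Children were included in every third household only, so the
    household weight is multiplied by three."""
    return household_weight * every_nth_household
```

For example, with a hypothetical EA selection probability of 0.01 and a within-EA household selection probability of 0.25, `household_initial_weight(0.01, 0.25)` gives an initial weight of 400; a cell response rate of 0.8 raises it to about 500, and the corresponding child weight is three times that.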

A similar approach was used to compute the HIV-testing (blood) weights included in the study.

Data manipulation and merging

Using the three individual survey datasets, we created a dataset that included survey year, the design variables (weights, strata, and cluster), demographic characteristics, and HIV-specific indicators. The stratification variable in the combined dataset consisted of the 17 NASCOP region/residency strata. Each cluster was uniquely identified by the survey year and the cluster identifier in each survey. The weights in the combined dataset were normalized such that the normalized weights summed to the total number of respondents in each survey. The SAS program that combines the three datasets and renames and recodes variables to facilitate comparative analyses is available in Supplementary File 1.
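As a minimal sketch of the pooling step, the Python function below normalizes weights within each survey so that they sum to that survey's number of respondents, and builds a cluster identifier that is unique across surveys. It is not the program in Supplementary File 1; the record layout and the key names (`year`, `cluster`, `weight`, `norm_weight`, `pooled_cluster`) are assumptions chosen for illustration.

```python
from collections import defaultdict


def pool_surveys(records):
    """records: list of dicts with at least 'year', 'cluster' and 'weight'.

    Returns new dicts with two added keys:
      norm_weight    - weight rescaled to sum to the respondent count per survey
      pooled_cluster - survey year plus cluster id, unique across surveys
    """
    weight_totals = defaultdict(float)
    respondent_counts = defaultdict(int)
    for rec in records:
        weight_totals[rec["year"]] += rec["weight"]
        respondent_counts[rec["year"]] += 1

    pooled = []
    for rec in records:
        scale = respondent_counts[rec["year"]] / weight_totals[rec["year"]]
        pooled.append({
            **rec,
            "norm_weight": rec["weight"] * scale,
            "pooled_cluster": f'{rec["year"]}-{rec["cluster"]}',
        })
    return pooled
```

Prefixing the cluster identifier with the survey year matters because cluster numbering restarts in each survey; without it, variance estimation on the pooled file would treat same-numbered clusters from different surveys as one sampling unit.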

To create the combined data file, we combined 2007, 2012, and 2018 files so that the number of respondents in the combined data file was the sum of the respondents from the three individual files. We then ensured that the analysis variables had the same names and values or categories in all three data files. Table 2 illustrates how the variables used in this analysis were redefined. The approach to creating the new set of statistical weights is provided in Supplementary File 2.

Table 2 Variables included in the combined dataset for KAIS 2007, KAIS 2012 and KENPHIA 2018

The study investigators did not interact with human subjects or have access to identifiable data or specimens. This was a secondary data analysis using anonymized data from each of the surveys that were included.

Figure 1 describes our suggested approach for harmonization of variables and datasets to perform trend analysis.

Fig. 1 Preparing a trend analysis across independent surveys

Assessing comparability of reweighted surveys across key population characteristics

Ideally, a set of unchanging population characteristics could be used to assess the comparability of the original and re-weighted datasets before proceeding with trend analyses. In the absence of such ideal variables, several demographic characteristics such as age, sex, marital status, residency, wealth index, and education, which have predictable trends and have been measured in other surveys over time, can be assessed for trends. In this analysis, we assessed the weighted distribution of each of these variables and used survey-weighted logistic regression to assess changes in the selected characteristics over time (Table 3). We found that there was no significant difference (trend) in key demographic variables selected for comparative assessment of original and re-weighted KENPHIA 2018 datasets.
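The comparison of weighted distributions under two sets of weights can be illustrated with a short Python sketch. It assumes one category label and one weight per respondent; the function names are hypothetical, and the sketch covers only the descriptive comparison, not the survey-weighted logistic regression used for the formal assessment.

```python
from collections import defaultdict


def weighted_distribution(categories, weights):
    """Weighted percentage of respondents in each category."""
    if len(categories) != len(weights):
        raise ValueError("categories and weights must have the same length")
    totals = defaultdict(float)
    for category, weight in zip(categories, weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {cat: 100.0 * total / grand_total for cat, total in totals.items()}


def largest_shift(dist_a, dist_b):
    """Largest absolute difference, in percentage points, between two
    distributions over the same categories."""
    categories = set(dist_a) | set(dist_b)
    return max(abs(dist_a.get(c, 0.0) - dist_b.get(c, 0.0)) for c in categories)
```

Calling `weighted_distribution` once with the original weights and once with the revised weights, then passing both results to `largest_shift`, gives a quick check of how far reweighting moved any single category.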

Table 3 Comparison of the distribution of participants in the 2018 survey computed using the revised and the original KENPHIA weights


Illustrative statistical analysis

Once the comparability of the revised and harmonized datasets is established, it is possible to carry out trend analysis on selected indicators. In our analysis, we selected trends in two behavioral indicators relevant to HIV programs: “Age of sexual debut among respondents aged 20–29 years” and “Tested for HIV in the past 12 months among respondents aged 15–64 years.” We selected these example indicators to illustrate trend analysis for continuous, binary, and time-to-event variables (Fig. 2). Trends were assessed visually and through regression methods, including adjustment for demographic variables to control for other changes in the population over time.

Fig. 2 Choosing a statistical method based on type of variable to be analyzed in KAIS 2007, KAIS 2012 and KENPHIA 2018. Abbreviations: GLM, generalized linear model

Characteristics of the study population

Table 4 summarizes the sociodemographic characteristics of study participants. Women were overrepresented in all three surveys with male to female ratios of 1.00:1.33 in KAIS 2007, 1.00:1.38 in KAIS 2012, and 1.00:1.24 in KENPHIA 2018. There was a significant linear decline in the proportion of respondents sampled from within rural settings over time (KAIS 2007, 77.7% [95% confidence interval (CI): 75.1%–80.3%]; KAIS 2012, 62.9% [95% CI: 60.5%–65.3%]; and KENPHIA 2018, 60.7% [95% CI: 58.6%–62.8%]). There were significant variations in the distribution of the respondents by education. Across the three surveys, most respondents had primary education. Marital status varied between surveys. The age structure was generally consistent over time, except for a spike in the 20–24 year age group in 2007, followed by a similar spike in the 25–29 year age group in 2012 and in the 30–34 year age group in 2018. This pattern was consistent with an age cohort moving through the survey populations due to changing fertility or child mortality patterns in the mid-1980s.

Table 4 Demographic characteristics of interviewed study participants age 15–64 years in KAIS 2007, KAIS 2012 and KENPHIA 2018


Sexual debut

Trends in sexual debut were initially assessed visually and through regression methods using SAS PROC SURVEYREG, including analyses adjusted for demographic variables to control for other changes in the population over time. In this case, we assumed that the outcome was continuous and followed a Gaussian distribution. For this specific outcome variable, the analysis was restricted to individuals aged 20–29 years at the time of the survey. An Age-Period-Cohort (APC) analysis approach was used, with two age categories (20–24 years, 25–29 years), three time periods (2007, 2012 and 2018) and five birth cohorts (1975–1979, 1980–1984, 1985–1989, 1990–1994 and 1995–1999).
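The age, period, and cohort categories can be derived from two fields per respondent. The Python sketch below assumes birth year is approximated as survey year minus age in completed years, which is an assumption of this example rather than a rule stated in the survey documentation; the function names are hypothetical.

```python
def age_group(age: int) -> str:
    """Age category used in the analysis (respondents aged 20-29 only)."""
    if 20 <= age <= 24:
        return "20-24"
    if 25 <= age <= 29:
        return "25-29"
    raise ValueError("analysis is restricted to ages 20-29")


def birth_cohort(survey_year: int, age: int, width: int = 5, origin: int = 1975) -> str:
    """Five-year birth cohort, with birth year approximated as
    survey year minus age (assumption)."""
    birth_year = survey_year - age
    start = origin + ((birth_year - origin) // width) * width
    return f"{start}-{start + width - 1}"
```

Under this approximation, respondents aged 20–29 years in the 2007, 2012, and 2018 surveys have birth years between 1978 and 1998, which fall within the five cohorts from 1975–1979 to 1995–1999.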

Table 5 provides an example of how one can present summaries, trends, and regression results for the analysis of a continuous outcome such as age at sexual debut by selected covariates. In general, the median age at sexual debut of the study participants has increased significantly over time. There was a monotonic increase in the median age at sexual debut by age, level of education and wealth index. Age at sexual debut was consistently higher among older and better-educated individuals and individuals from the richest households. Age at sexual debut increased over time among the women, peaking in 2012 and decreasing slightly in 2018. Age at sexual debut was lower among the married respondents and those separated, divorced or widowed compared to those who never married.

Table 5 Trends and regression results for age at first sex by selected covariates in KAIS 2007, KAIS 2012 and KENPHIA 2018

In addition to assessing sexual debut as a continuous outcome variable, we also assessed trends in early sexual debut. Early sexual debut was defined as first vaginal intercourse before 15 years of age [19,20,21,22]. The time taken until first sexual intercourse for anyone who had not had sex by the age of 15 years was considered to be censored. We used the Kaplan–Meier method to compute the survival probability (not having become sexually active by age 15 years) by each age. We used SAS, version 9.4, to produce separate Kaplan–Meier estimates for each level of the covariates of interest. A log-rank test for assessing equality of survival curves is not available for complex survey data, but Cox models are. For our analyses, we used SAS PROC LIFETEST and SAS PROC SURVEYPHREG. The incidence of early sexual debut decreased, although not significantly, from 11.63 (95% CI: 10.95–12.34) per 1,000 person-years at risk in 2007 to 10.45 (95% CI: 9.75–11.2) per 1,000 person-years at risk in 2012 and further decreased significantly to 9.58 (95% CI: 9.08–10.1) per 1,000 person-years at risk in 2018 (Table 6).
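A person-time rate of this kind can be sketched as follows. The Python function assumes that time at risk runs from birth to the reported age at first sex for respondents with early debut, and to the cutoff age of 15 years for everyone else (censored); this accounting rule and the function name are assumptions for illustration, and the sketch does not reproduce the confidence intervals or the survey-weighted Cox model.

```python
def early_debut_rate(respondents, cutoff=15, per=1000):
    """Weighted incidence of early sexual debut per `per` person-years.

    respondents: iterable of (weight, age_at_first_sex), where
    age_at_first_sex is None for respondents who never had sex.
    """
    events = 0.0
    person_years = 0.0
    for weight, age_at_first_sex in respondents:
        if age_at_first_sex is not None and age_at_first_sex < cutoff:
            # Event: debut before the cutoff; at risk until the debut age.
            events += weight
            person_years += weight * age_at_first_sex
        else:
            # Censored at the cutoff age: no early debut observed.
            person_years += weight * cutoff
    if person_years == 0:
        raise ValueError("no person-time at risk")
    return per * events / person_years
```

With four equally weighted hypothetical respondents reporting first sex at ages 14, 16, and 18 years and never, there is one event over 14 + 15 + 15 + 15 = 59 person-years, so the function returns 1000/59 per 1,000 person-years.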

Table 6 Person-time analysis and survey-weighted Cox-regression of age at sexual debut by selected covariates in KAIS 2007, KAIS 2012 and KENPHIA 2018

Tested in the last 12 months

We used SAS PROC SURVEYLOGISTIC to fit a survey-weighted logistic regression model to the binary outcome “Tested for HIV in the last 12 months.” Table 7 presents trends in the rates of HIV testing in the past 12 months among individuals aged 15–64 years. The results suggest a significant increase in the HIV-testing rates over time when adjusting for all the covariates considered. HIV-testing rates increased significantly from 12.6% (95% CI: 11.6%–13.6%) in 2007 to 56.1% (95% CI: 54.6%–57.6%) in 2012 but decreased slightly, although not significantly, to 55.6% (95% CI: 54.6%–56.6%) in 2018. Further, based on the survey-weighted logistic regression, after adjustment for all covariates considered, HIV-testing rates increased substantially over time.
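To show how a survey-weighted proportion and its design-based standard error arise from the weights, strata, and clusters, the Python sketch below applies Taylor series linearization, treating clusters as sampled with replacement within strata. This is a common default for stratified cluster designs, but it is an assumption here that it matches the settings of the SAS procedures used; the function name and input layout are hypothetical.

```python
import math
from collections import defaultdict


def weighted_proportion(rows):
    """rows: iterable of (stratum, cluster, weight, y) with y equal to 0 or 1.

    Returns (p, se): the weighted proportion and its linearized standard
    error for a stratified cluster design.
    """
    rows = list(rows)
    weight_sum = sum(w for _, _, w, _ in rows)
    p = sum(w * y for _, _, w, y in rows) / weight_sum

    # Sum the linearized values w * (y - p) / sum(w) within each cluster.
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for stratum, cluster, w, y in rows:
        cluster_totals[stratum][cluster] += w * (y - p) / weight_sum

    variance = 0.0
    for stratum, clusters in cluster_totals.items():
        n_clusters = len(clusters)
        if n_clusters < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two clusters")
        mean_total = sum(clusters.values()) / n_clusters
        squared_dev = sum((t - mean_total) ** 2 for t in clusters.values())
        variance += n_clusters / (n_clusters - 1) * squared_dev
    return p, math.sqrt(variance)
```

A 95% confidence interval can then be formed as the estimate plus or minus a critical value times the standard error; design-based software commonly takes that critical value from a t distribution with degrees of freedom equal to the number of clusters minus the number of strata.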

Table 7 Logistic regression results for HIV testing in the last 12 months by selected covariates in KAIS 2007, KAIS 2012 and KENPHIA 2018

In Fig. 2, we provide a rubric that can be used to make decisions about the statistical analysis to employ for a given analysis question based on various design considerations. The SAS program used to carry out the crosstabulation, the survey-weighted regression analysis, the survey-weighted logistic regression analysis, and the person-time analysis is available in Supplementary File 3.


We developed an approach for assessing and harmonizing independent population-based surveys to assess trends in HIV-related indicators. After describing the methods used to design and weight each survey, we harmonized stratification, demographic variables, and survey weights to ensure comparability before proceeding with a trend analysis. In this analysis, the survey weights for the latest survey (KENPHIA 2018) were revised to ensure comparability with the previous two surveys. It is important to note that we developed these methods strictly to allow for comparisons between surveys. The methods are not meant to provide revised or improved estimates for the most recent survey analyzed (KENPHIA). The original weights for KENPHIA are optimal and should be used to analyze and present the results of the KENPHIA survey. Similar approaches to making comparisons between surveys are documented elsewhere in the literature [23,24,25,26,27,28,29,30]. For reproducibility, we also provide the analysis codes that demonstrate how the analysis was carried out and how the comparison was done.

The weighted distributions of demographic variables were consistent across surveys with some exceptions. The proportion of the sample residing in urban areas increased, as expected given broad development trends in Kenya. The age structure showed spikes in successive age groups across surveys. Given that the surveys were spaced at approximately 5-year intervals, this pattern is consistent with a cohort effect from reductions in fertility 15–20 years before the KAIS 2007 survey and with the historical fertility reductions observed in the recent Demographic and Health survey [31] and census [13] in Kenya. Other differences are difficult to explain. For example, the sex distribution seemed skewed in 2007, with 42% of the survey population being men, compared to higher proportions in the other two surveys (48%–49%), perhaps indicating coverage issues among men in that survey.

We used two outcome variables expected to change over time (HIV testing and age at sexual debut) to demonstrate a methodology to carry out trend comparisons. For HIV testing in the last 12 months, we highlighted two approaches that can be used to assess trends in dichotomous outcomes. We first computed survey-weighted proportions and plotted the resulting trends over time by selected covariates. We then fitted logistic regression models with the year as a covariate, adjusting for age, sex, residence, marital status, and wealth index. This approach has been employed in several previous studies. Trends in HIV-testing rates have also been discussed extensively in the literature [23,24,25]. Chi-square tests for trend and logistic regression have been used extensively in the literature.

For age at sexual debut, we show two approaches for assessing trends for a continuous variable. Several studies have also used survey data from low-income countries to assess trends in the HIV-related outcomes considered in our analysis. Several studies have treated age at sexual debut as a time-to-event outcome, assessing this outcome variable's trends among different cohorts observed [32, 33]. These studies have used survival analysis-based approaches to assess trends in the outcome variables of interest. In our analysis, we used two approaches where the first ignored censoring in the age at sexual debut and presented a summary and regression-based results as an example of how trends in continuous outcome variables could be assessed [34]. We then used the survival approach and found a decrease in the risk of early sexual debut over time.

Our analysis is subject to several limitations. The trend comparison was based on three time points (2007, 2012, and 2018), so we were only able to make relatively short-term assessments of the trajectory of the indicators considered. Previous studies used Cochran–Armitage chi-square tests or Z-tests to assess the significance of trends [35,36,37,38]. Survey-weighted versions of these statistics were not implemented in our analysis due to limitations in the software we used. Another challenge encountered was the change in the definition of certain variables and indicators over time, adding uncertainty in interpreting the meaning of observed trends. Furthermore, not all questions asked across the three surveys were the same, making it difficult to analyze some of the outcomes across the three surveys. Our study did not address all relevant issues for every conceivable trend analysis that could be conducted with these surveys. For example, changes in HIV-testing algorithms may affect estimates. Great care is needed in interpreting results with potential underlying methodological differences. Finally, there are alternatives to population-based surveys for measuring trends in health conditions. For example, in 2018, Kenya established an HIV Case-Based Surveillance system to measure progress along the HIV care cascade to provide high-quality, timely, and reliable HIV data by population characteristics. Despite current limitations and challenges, this system will provide an opportunity for future assessment of trends based on a census of events rather than population-based sampling, as presented here.

The pooling of data from multi-year surveys aimed at assessing trends in key public health performance indicators is an important part of interrogating the impact of programs and interventions. However, this pooling of data from large complex surveys leads to data sets with large sample sizes which inadvertently increase statistical power [39]. This increased power leads to a tendency of finding statistically significant differences however small they are. Therefore, researchers need to distinguish between statistical difference and scientific difference. The statistically significant difference that arises from larger sample sizes may not be scientifically meaningful.
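The effect of sample size on statistical significance can be made concrete with a two-proportion z statistic. The Python sketch below uses the 2012 and 2018 HIV-testing proportions reported above, but the sample sizes are hypothetical and the calculation assumes simple random sampling with no design effect, so it illustrates the principle rather than re-analyzing the surveys.

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """z statistic for H0: p1 == p2 under simple random sampling
    (assumption: no clustering, stratification, or weighting)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Same half-percentage-point difference, two hypothetical sample sizes.
small = two_proportion_z(0.561, 1_000, 0.556, 1_000)
large = two_proportion_z(0.561, 1_000_000, 0.556, 1_000_000)
print(f"n = 1,000 per survey:     z = {small:.2f}")
print(f"n = 1,000,000 per survey: z = {large:.2f}")
```

The difference of 0.5 percentage points is identical in both calls. With 1,000 respondents per survey the statistic is far below the conventional 1.96 threshold, whereas with one million per survey it is far above it, although the practical size of the difference has not changed.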


We have provided approaches and considerations that can be used to support trend comparisons for various outcome variables in HIV surveys in low-income settings. Our approach has demonstrated that independent national household surveys conducted over time can be assessed for comparability, adjusted as appropriate, and used to estimate trends in key indicators. Analyzing trends over time can not only provide insights into Kenya’s progress toward HIV epidemic control but also identify gaps in key HIV indicators.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.



Abbreviations

KAIS: Kenya AIDS Indicator surveys
KENPHIA: Kenya Population-based HIV Impact Assessment
ART: Antiretroviral therapy
SSA: Sub-Saharan Africa
NHANES: National Health and Nutrition Examination Survey
NASSEP: National Sample Survey and Evaluation Programme
KNBS: Kenya National Bureau of Statistics
NASCOP: National AIDS and STI Control Programme
PSU: Primary sampling unit
EA: Enumeration area
KEMRI: Kenya Medical Research Institute
IRB: Institutional Review Board
CDC: Centers for Disease Control and Prevention
UCSF: University of California, San Francisco
CI: Confidence Interval
OR: Odds ratio
DHS: Demographic and Health survey


  1. Joint United Nations Programme on HIV/AIDS (UNAIDS). 2020 global AIDS update: seizing the moment, tackling entrenched inequalities to end epidemics. 2020.

  2. Joint United Nations Programme on HIV/AIDS (UNAIDS). UNAIDS data 2020. 2020.

  3. Dwyer-Lindgren L, Cork MA, Sligar A, Steuben KM, Wilson KF, Provost NR, Mayala BK, VanderHeide JD, Collison ML, Hall JB. Mapping HIV prevalence in sub-Saharan Africa between 2000 and 2017. Nature. 2019;570(7760):189–93.

  4. James SL, Abate D, Abate KH, Abay SM, Abbafati C, Abbasi N, Abbastabar H, Abd-Allah F, Abdela J, Abdelalim A. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. The Lancet. 2018;392(10159):1789–858.

  5. Roth GA, Abate D, Abate KH, Abay SM, Abbafati C, Abbasi N, Abbastabar H, Abd-Allah F, Abdela J, Abdelalim A. Global, regional, and national age-sex-specific mortality for 282 causes of death in 195 countries and territories, 1980–2017: a systematic analysis for the Global Burden of Disease Study 2017. The Lancet. 2018;392(10159):1736–88.

  6. Joint United Nations Programme on HIV/AIDS (UNAIDS). Monitoring the declaration of commitment on HIV/AIDS: guidelines on construction of core indicators. World Health Organization; 2005.

  7. Ghys PD, Williams BG, Over M, Hallett TB, Godfrey-Faussett P. Epidemiological metrics and benchmarks for a transition in the HIV epidemic. PLoS Med. 2018;15(10):e1002678.

  8. Justman JE, Mugurungi O, El-Sadr WM. HIV population surveys—bringing precision to the global response. N Engl J Med. 2018;378(20):1859–61.

  9. Simbayi L, Zuma K, Zungu N, Moyo S, Marinda E, Jooste S, Mabaso M, Ramlagan S, North A, Van Zyl J. South African National HIV Prevalence, Incidence, Behaviour and Communication Survey, 2017: towards achieving the UNAIDS 90–90–90 targets. Human Sciences Research Council (HSRC); 2019.

  10. Akullian A, Morrison M, Garnett GP, Mnisi Z, Lukhele N, Bridenbecker D, Bershteyn A. The effect of 90–90–90 on HIV-1 incidence and mortality in eSwatini: a mathematical modelling study. The Lancet HIV. 2020;7(5):e348–58.

  11. Green D, Tordoff DM, Kharono B, Akullian A, Bershteyn A, Morrison M, Garnett G, Duerr A, Drain PK. Evidence of sociodemographic heterogeneity across the HIV treatment cascade and progress towards 90–90–90 in sub-Saharan Africa: a systematic review and meta-analysis. J Int AIDS Soc. 2020;23(3):e25470.

  12. Johnson CL, Paulose-Ram R, Ogden CL, Carroll MD, Kruszan-Moran D, Dohrmann SM, Curtin LR. National health and nutrition examination survey: analytic guidelines, 1999–2010. 2013.

  13. 2019 Kenya Population and Housing Census. Volume I: Population by County and Sub-County. 2019.

  14. KDHS. Kenya demographic and health survey. In: Central Bureau of Statistics, Ministry of Planning and National Development. 2003.

  15. KNBS, ICF Macro. Kenya Demographic and Health Survey 2008–09. Calverton, Maryland: Kenya National Bureau of Statistics and ICF Macro; 2010.

  16. Maina WK, Kim AA, Rutherford GW, Harper M, K’Oyugi BO, Sharif S, Kichamu G, Muraguri NM, Akhwale W, De Cock KM. Kenya AIDS Indicator Surveys 2007 and 2012: implications for public health policies for HIV prevention and treatment. J Acquir Immune Defic Syndr (1999). 2014;66(Suppl 1):S130.

  17. Waruiru W, Kim AA, Kimanga DO. The Kenya AIDS indicator survey 2012: rationale, methods, description of participants, and response rates. J Acquir Immune Defic Syndr (1999). 2014;66(Suppl 1):S3.

  18. Johnson K, Way A. Risk factors for HIV infection in a national adult population: evidence from the 2003 Kenya Demographic and Health Survey. J Acquir Immune Defic Syndr. 2006;42(5):627–36.

  19. Gómez AM, Speizer IS, Reynolds H, Murray N, Beauvais H. Age differences at sexual debut and subsequent reproductive health: Is there a link? Reprod Health. 2008;5(1):1–8.

  20. Magnusson B, Crandall A, Evans K. Early sexual debut and risky sex in young adults: The role of low self-control. BMC Public Health. 2019;19(1):1–8.

  21. Magnusson BM, Masho SW, Lapane KL. Early age at first intercourse and subsequent gaps in contraceptive use. J Womens Health. 2012;21(1):73–9.

  22. O’Donnell L, O’Donnell CR, Stueve A. Early sexual initiation and subsequent sex-related risks among urban minority youth: the reach for health study. Fam Plann Perspect. 2001;33(6):268–75.

  23. Ante-Testard PA, Benmarhnia T, Bekelynck A, Baggaley R, Ouattara E, Temime L, Jean K. Temporal trends in socioeconomic inequalities in HIV testing: an analysis of cross-sectional surveys from 16 sub-Saharan African countries. Lancet Glob Health. 2020;8(6):e808–18.

  24. Okeafor CU, Okeafor IN. Trends in sexual risk behavior, HIV knowledge and testing among reproductive-aged women in Nigeria: DHS 2003–2013. HIV/AIDS Rev Int J HIV Related Problems. 2017;16(2):107–11.

  25. Achia TN, Obayo E. Trends and correlates of HIV testing amongst women: lessons learnt from Kenya. Afr J Prim Health Care Fam Med. 2013;5(1).

  26. Davis WW. Examining trends and averages using combined cross-sectional survey data from multiple years. Los Angeles, California: UCLA Center for Health Policy Research; 2007.

  27. Lee S, Davis W, Nguyen H, McNeel T, Brick J, Flores-Cervantes I. Examining trends and averages using combined cross-sectional survey data from multiple years. CHIS Methodology Paper. 2007:1–24.

  28. Eaton JW, Rehle TM, Jooste S, Nkambule R, Kim AA, Mahy M, Hallett TB. Recent HIV prevalence trends among pregnant women and all women in sub-Saharan Africa: implications for HIV estimates. JAIDS. 2014;28(4):S507.

  29. Kimanga DO, Ogola S, Umuro M. Prevalence and incidence of HIV infection, trends, and risk factors among persons aged 15–64 years in Kenya: results from a nationally representative study. J Acquir Immune Defic Syndr. 2014;66(Suppl 1):S13.

  30. Yuen CM, Weyenga HO, Kim AA, Malika T, Muttai H, Katana A, Nganga L, Cain KP, De Cock KM. Comparison of trends in tuberculosis incidence among adults living with HIV and adults without HIV–Kenya, 1998–2012. PLoS ONE. 2014;9(6):e99880.

  31. ICF Macro. Kenya demographic and health survey 2014. 2014.

  32. Marston M, Slaymaker E, Cremin I, Floyd S, McGrath N, Kasamba I, Lutalo T, Nyirenda M, Ndyanabo A, Mupambireyi Z. Trends in marriage and time spent single in sub-Saharan Africa: a comparative analysis of six population-based cohort studies and nine Demographic and Health Surveys. Sex Transm Infect. 2009;85(Suppl 1):i64–71.

  33. Slaymaker E, Bwanika JB, Kasamba I, Lutalo T, Maher D, Todd J. Trends in age at first sex in Uganda: evidence from Demographic and Health Survey data and longitudinal cohorts in Masaka and Rakai. Sex Transm Infect. 2009;85(Suppl 1):i12–9.

  34. Zaba B, Boerma T, Pisani E, Baptiste N. Estimation of levels and trends in age at first sex from surveys using survival analysis. MEASURE Evaluation Project, Carolina Population Center Working Paper. 2002(0251).

  35. Kharsany AB, Frohlich JA, Yende-Zuma N, Mahlase G, Samsunder N, Dellar RC, Zuma-Mkhonza M, Karim SSA, Karim QA. Trends in HIV prevalence in pregnant women in rural South Africa. J Acquir Immune Defic Syndr (1999). 2015;70(3):289.

  36. Asamoah-Odei E, Calleja JMG, Boerma JT. HIV prevalence and trends in sub-Saharan Africa: no decline and large subregional differences. Lancet. 2004;364(9428):35–40.

  37. Hessol NA, Eng M, Vu A, Pipkin S, Hsu LC, Scheer S. A longitudinal study assessing differences in causes of death among housed and homeless people diagnosed with HIV in San Francisco. BMC Public Health. 2019;19(1):1–12.

  38. Torres TS, Cardoso SW, Velasque LdS, Marins LMS, Oliveira MSd, Veloso VG, Grinsztejn B. Aging with HIV: an overview of an urban cohort in Rio de Janeiro (Brazil) across decades of life. Brazilian J Infect Dis. 2013;17(3):324–31.

  39. Suresh KP, Chandrashekara S. Sample size estimation and power analysis for clinical research studies. J Hum Reprod Sci. 2012;5(1):7.

  40. Population-based HIV Impact Assessment (PHIA) Data Use Manual. New York, NY. April 2021.


The authors thank the KENPHIA and KAIS field teams for their contribution during KENPHIA and KAIS data collection and all the children and families who participated in these national surveys.


The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the funding agencies.

Attribution of support

This analysis is based on data from the national data warehouse, which is supported by the President's Emergency Plan for AIDS Relief (PEPFAR) through the Centers for Disease Control and Prevention (CDC) under the terms of Grant Number U2GGH001226.


All the work described in the paper was funded by the Centers for Disease Control and Prevention (CDC). The authors are wholly responsible for the design of the study, conduct of the analyses, interpretation of the results, and writing of the manuscript.

Author information

Authors and Affiliations



T.A., P.Y., and S.M. designed the study protocol and performed statistical analysis and interpretation. T.A. drafted the manuscript. T.A. and P.Y. participated in the data cleaning and statistical analysis. All authors read and approved the final manuscript.

Authors’ information

Not applicable.

Corresponding author

Correspondence to Thomas Achia.

Ethics declarations

Ethics approval and consent to participate

The KAIS 2007 and KAIS 2012 protocols were approved by the Scientific Steering Committee and the Ethical Review Committee at the Kenya Medical Research Institute (KEMRI) and by the Institutional Review Board (IRB) at the U.S. Centers for Disease Control and Prevention (US CDC) and the Committee on Human Research of the University of California, San Francisco (UCSF).

The KENPHIA 2018 survey protocol was approved by the US CDC IRB, the Columbia University Medical Center IRB, and the Scientific and Ethical Review Unit (SERU) at KEMRI (KEMRI/RES/7/3/1). It was reviewed in accordance with CDC human research protection procedures and was determined to be research, but CDC investigators did not interact with human subjects or have access to identifiable data or specimens. All adults aged 18 years or older provided written informed consent; for children younger than 18 years, written informed consent from a parent or guardian and the minor’s assent were obtained before participation. The study was carried out in accordance with the Helsinki Declaration.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary file 1.

SAS code that merges data from the three national HIV population-based household surveys in Kenya (2007, 2012, and 2018).

Additional file 2: Supplementary File 2.

SAS code for the analysis of the outcome variable Age at first sex.

Additional file 3: Supplementary File 3.

Proposal for the reweighting of variables in the 2018 Kenya Population-based HIV Impact Assessment (KENPHIA) survey data to match the 2012 Kenya AIDS Indicator Survey approach.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Achia, T., Cervantes, I.F., Stupp, P. et al. Methods for conducting trends analysis: roadmap for comparing outcomes from three national HIV Population-based household surveys in Kenya (2007, 2012, and 2018). BMC Public Health 22, 1337 (2022).


  • HIV
  • Trends
  • Survey design
  • Stratification
  • Survey weights
  • Clustering
  • Multistage sampling