Skip to content


  • Research article
  • Open Access
  • Open Peer Review

Neighborhood clustering of non-communicable diseases: results from a community-based study in Northern Tanzania

  • 1, 2, 3, 5Email author,
  • 2,
  • 2, 4,
  • 1, 2, 5,
  • 1, 2, 3, 5 and
BMC Public HealthBMC series – open, inclusive and trusted201616:226

  • Received: 8 October 2015
  • Accepted: 1 March 2016
  • Published:
Open Peer Review reports



In order to begin to address the burden of non-communicable diseases (NCDs) in sub-Saharan Africa, high quality community-based epidemiological studies from the region are urgently needed. Cluster-designed sampling methods may be most efficient, but designing such studies requires assumptions about the clustering of the outcomes of interest. Currently, few studies from Sub-Saharan Africa have been published that describe the clustering of NCDs. Therefore, we report the neighborhood clustering of several NCDs from a community-based study in Northern Tanzania.


We conducted a cluster-designed cross-sectional household survey between January and June 2014. We used a three-stage cluster probability sampling method to select thirty-seven sampling areas from twenty-nine neighborhood clusters, stratified by urban and rural. Households were then randomly selected from each of the sampling areas, and eligible participants were tested for chronic kidney disease (CKD), glucose impairment including diabetes, hypertension, and obesity as part of the CKD-AFRiKA study. We used linear mixed models to explore clustering across each of the samplings units, and we estimated absolute-agreement intra-cluster correlation (ICC) coefficients (ρ) for the neighborhood clusters.


We enrolled 481 participants from 346 urban and rural households. Neighborhood cluster sizes ranged from 6 to 49 participants (median: 13.0; 25th–75th percentiles: 9–21). Clustering varied across neighborhoods and differed by urban or rural setting. Among NCDs, hypertension (ρ = 0.075) exhibited the strongest clustering within neighborhoods followed by CKD (ρ = 0.440), obesity (ρ = 0.040), and glucose impairment (ρ = 0.039).


The neighborhood clustering was substantial enough to contribute to a design effect for NCD outcomes including hypertension, CKD, obesity, and glucose impairment, and it may also highlight NCD risk factors that vary by setting. These results may help inform the design of future community-based studies or randomized controlled trials examining NCDs in the region particularly those that use cluster-sampling methods.


  • Chronic kidney disease
  • Cluster design
  • Design effect
  • Epidemiology
  • Intra-cluster correlation
  • Non-communicable disease
  • Sub-Saharan Africa
  • Variance


Non-communicable diseases (NCDs) are a growing global epidemic that disproportionately affect low- and middle-income countries (LMICs) [1]. In sub-Saharan Africa, they are now one of the leading causes of death among adults, and in order to begin to address this burden, high quality community-based epidemiological studies from the region are urgently needed [25]. Additionally, outcomes-related research either through observational cohort studies or randomized-controlled trials (RCTs) will be an important component of the public health response moving forward [6].

Nonetheless, many challenges exist in carrying out these studies. Poor infrastructure and a lack of resources in many of the sub-Saharan African countries limit rigorous studies, in part due to inadequate methodological capabilities. Physical addresses, phonebooks, and reliable census data are often unavailable for many populations in the region which means that representative community-based samples often require labor-intensive, prospective household surveys. In this context, cluster-designed sampling methods offer an efficient, practical, and cost-effective means of obtaining a representative sample from the population of interest [7, 8].

However, studies that use cluster sampling methods require extra considerations in their design and analyses, and cluster-designed studies in sub-Saharan Africa continue to inadequately address many of these considerations [9]. Because study participants or households are drawn from clusters, which serve as the primary sampling unit, they can demonstrate more homogeneity than would otherwise be expected from a simple, random sample. For NCDs, similar lifestyles, environmental risks, economic stress, and genetic backgrounds may all increase homogeneity within clusters, and consequently, this increased homogeneity within clusters, or intra-cluster correlation (ICC), can significantly affect the precision of population parameter estimates [10, 11]. The ICC is typically quantified by the ICC coefficient, and although the ICC coefficient can be calculated post hoc during the analysis stage, this method may not be preferable or ethical in many sub-Saharan African settings due to cost and limited resources. Accounting for the design effect beforehand allows for more accurate estimations of sample size, budget requirements, and logistical needs; however, for NCD-related research, few ICCs have been reported in the region [9, 10].

The Comprehensive Kidney Disease Assessment for Risk Factors, epidemiology, Knowledge, and Attitudes (CKD-AFRiKA) study is an ongoing project in northern Tanzania with the goal of understanding and addressing the health burden of chronic kidney disease (CKD) and CKD-related NCDs. As part of the study, we conducted a cluster-designed, community-based epidemiologic survey. In the design stage, we were unable to identify any comparable ICCs for health outcomes related to CKD or CKD-related NCDs, and we had to extrapolate them from data derived from high-income settings. To fill this gap, we report here the observed intra-cluster correlations for multiple NCD-related factors from a community-based, sub-Saharan African setting [12].


Ethics, consent, and permissions

The study protocol was approved by Duke University Institutional Review Board (#Pro00040784), the Kilimanjaro Christian Medical College Ethics Committee (EC#502), and the National Institute for Medical Research in Tanzania. Written informed consent (by signature or thumbprint) was obtained from all participants.

Study setting

We conducted a stratified, cluster-designed cross-sectional household survey between January and June 2014 in the Kilimanjaro Region of Tanzania, which has an adult population of more than 900,000 people [13, 14]. The region comprises seven districts, and our study was conducted in two of these districts, Moshi Urban and Moshi Rural, which served as strata for our sampling scheme. Within these districts, there are 21 and 31 administrative wards respectively that range in size from 1500 to 25,000 people. Each ward is then further sub-divided into neighborhoods (also known as streets). Neighborhoods are the most basic governmental administrative unit in Tanzania, and they range in population size from 500 to 5000 people. The 65 urban neighborhoods have a median population size of 2000 people and a median area of 0.50 km2. The 165 rural neighborhoods have a median population size of 2200 people and a median area of 4.00 km2. In total, there are 230 neighborhoods/streets in the Moshi Urban and Moshi Rural districts [14].

Sampling methods

We used a three-stage cluster probability sampling method stratified by urban and rural. We used a random-number generator to select twenty nine neighborhoods within the Moshi Urban and Moshi Rural districts. We based the random neighborhood selection on probability proportional to size sampling according to the 2012 national census [14]. From the twenty-nine neighborhoods, we then randomly selected the starting point for each sampling area (37 in total) using geographic coordinates randomly generated by Arc Global Information Systems (ArcGIS), v10.2.2 (Environmental Systems Research Institute, Redlands, CA). From the randomly-selected geographic point, we then chose households based on a coin-flip and die-rolling technique (Appendix 1). All non-pregnant adults (age ≥ 18 years old) living in the selected households were recruited. A neighborhood cluster, therefore, included a group of individuals living in geographically-related households within the boundaries of an administrative neighborhood.

We targeted an enrollment between 15 and 25 participants per sampling area based on the requirements of the CKD AFRiKA study. The total sample size was designed to estimate the community prevalence of CKD with a precision of 5 % when accounting for the cluster-design effect, assuming a CKD prevalence up to 20 % and an ICC coefficient of 0.05. To reduce non-response rates, we attempted a minimum of two additional visits during off-hours (evenings and weekends) and multiple phone calls using mobile phone numbers.

Data collection

Our data collection methods have been previously described in detail [12]. In brief, participants were tested for CKD and CKD-related conditions including diabetes and hypertension, and anthropomorphic data (including height, weight, and body mass index) were recorded for each participant.

CKD was defined as the presence of albuminuria (≥30 mg/dL; confirmed by repeat assessment) and/or a reduction in the estimated glomerular filtration rate (eGFR) ≤60 ml/min/1.73 m2 according to the Modification of Diet in Renal Disease equation without the race factor [15]. Hypertension was defined as a single blood pressure measurement of greater than 160/100 mmHg, a two-time average measurement of greater than 140/90 mmHg, or the ongoing use of anti-hypertensive medications. Glucose impairment was defined as an HbA1C >6.0 % in the presence or absence of ongoing treatment with anti-hyperglycemic medications. Diabetes mellitus was defined as an HbA1c level was ≥7.0 % or current known use of anti-hyperglycemic medications for the purpose of treating diabetes. Participants with an HbA1C between 6.0 % and 6.0 % in the absence of treatment with anti-hyperglycemic medications were considered to have pre-diabetes. Overweight was defined as a body mass index (BMI) greater than 25 kg/m2 and obesity was defined as a BMI greater than 30 kg/m2.

Data analysis

We used STATA version 13 (STATA Corp., College Station, TX) for all data analyses. Continuous variables were summarized by the mean and standard deviation (SD) or median and inter-quartile range (IQR). Categorical variables were summarized using counts and percentages. To address potential non-response bias, mean and prevalence estimates were sample-adjusted using age- and gender-weights based on the 2012 urban and rural district-level census data [14]. To estimate the level of clustering in health outcome variables at the household level, the sampling area level, and the neighborhood cluster level, we first fitted a mixed effect model with separate random intercepts for neighborhood, sampling area, and household for each of the outcomes of interest. In these models, after accounting for neighborhood, very little clustering (<15 %) remained at the sampling area level and household level indicating that most of the variation in these outcomes was explained at the individual and neighborhood cluster-levels. As such, we estimated the ICC for the neighborhood clusters only.

To estimate the absolute-agreement ICC coefficient for neighborhood clusters (ρ) we used a one-way, random effects analysis of variance (ANOVA) estimator which has been shown to perform well for both binary and continuous outcomes across a wide range of ρ and cluster sizes [1619]. These estimations were performed in STATA using the ‘loneway’ command which uses the F statistic to calculate ρ as described by Hayes and Moulton. Although alternative estimators are available for binary outcomes, given that the ANOVA estimator has been shown to perform well for binary outcomes, we chose to present all estimates based on the common, easily implementable approach as described above [17, 18].

We calculated ρ for the social characteristics, self-reported medical histories, physical and laboratory measurements, and measured health outcomes. Negative values were truncated at zero, and our reporting of ρ is in accordance with the guidelines suggested by Campbell et al. [20].

Variance estimation was based on asymptotic theory, as implemented in the ‘loneway’ command, which accommodates differing cluster sizes. The 95 % confidence intervals for each ICC coefficient were derived from the asymptotic standard error, which has been shown to provide good coverage probabilities for a wide range of parameter combinations including clusters, cluster sizes, and ρ [18, 21, 22]. Confidence intervals with negative values were truncated at zero.


Between January 2014 and June 2014, we enrolled 481 participants from 346 households from a total of 37 sampling areas (30 urban and 7 rural) within 29 neighborhoods (23 Urban and 6 rural) (Table 1). These 29 neighborhoods were located within 18 wards (13 urban and 5 rural). The mean age was 46.9 years (SD 15.1). The household non-response rate was 15.0 %. Men (p < 0.001) and adults 18–39 years old (p = 0.001) were more likely to be non-responders. The median neighborhood cluster size was 13.0 participants (IQR 9–21), and neighborhood cluster size ranged from 6 to 49 participants (Appendix 2).
Table 1

Unweighted proportions for demographic, social characteristics, self-reported medical histories, health outcomes, and design parameters stratified by setting; N = 481 (CKD-AFRiKA, 2014)

Variable (n, %)




(n = 370)

(n = 111)

(n = 481)

Gender (female)

278 (75.1 %)

80 (72.1 %)

358 (74.4 %)



 18–39 years old

138 (37.3 %)

34 (30.6 %)

172 (35.8 %)

 40–59 years old

145 (39.2 %)

46 (41.5 %)

191 (39.7 %)

 60+ years old

87 (23.5 %)

31 (27.9 %)

118 (24.5 %)




230 (62.2 %)

58 (52.3 %)

288 (59.9 %)


35 (9.5 %)

31 (27.9 %)

66 (13.7 %


18 (4.9 %)

9 (8.1 %)

27 (5.6 %)


87 (23.5 %)

13 (11.7 %)

100 (20.8 %)




27 (7.3 %)

4 (3.6 %)

31 (6.4 %)


253 (68.4 %)

96 (86.5 %)

349 (72.6 %)


64 (17.3 %)

10 (9.0 %)

74 (15.4 %)


26 (7.0 %)

1 (0.9 %)

27 (5.6 %)




71 (19.2 %)

3 (2.7 %)

74 (15.4 %)

 Farmer/wage earner

114 (30.8 %)

85 (76.6 %)

199 (41.4 %)

 Small business/vendors

143 (38.6 %)

15 (13.5 %)

158 (32.8 %)


42 (11.4 %)

8 (7.2 %)

50 (10.4 %)

Social characteristics


 Ongoing tobacco use

34 (9.2 %)

16 (14.4 %)

50 (10.4 %)

 Ongoing alcohol use

146 (39.5 %)

52 (46.9 %)

198 (41.2 %)

Self-reported medical history



53 (14.3 %)

8 (7.2 %)

61 (12.7 %)


113 (30.6 %)

21 (19.1 %)

134 (28.0 %)


6 (1.6 %)

2 (1.8 %)

8 (1.7 %)

 Heart diseased

17 (4.6 %)

1 (0.9 %)

18 (3.7 %)


10 (2.7 %)

0 (0 %)

10 (2.1 %)


12 (3.2 %)

2 (1.8 %)

14 (2.9 %)


329 (88.9 %)

98 (88.3 %)

427 (88.8 %)


6 (1.6 %)

0 (0 %)

6 (1.3 %)


23 (6.2 %)

2 (1.8 %)

25 (5.2 %)


20 (5.4 %)

1 (0.9 %)

21 (4.4 %)

 Kidney disease

14 (3.8 %)

0 (0 %)

14 (2.9 %)

Health condition



112 (30.3 %)

37 (33.3 %)

149 (31.0 %)


116 (31.4 %)

22 (19.8 %)

138 (28.7 %)

 Glucose impairment

102 (27.6 %)

27 (24.3 %)

129 (26.8 %)


63 (17.0 %)

21 (18.9 %)

84 (17.5 %)


39 (10.5 %)

6 (5.4 %)

45 (9.4 %)

 Chronic kidney disease

54 (14.6 %)

3 (2.70 %)

57 (11.9 %)

Design parameters


 Neighborhood clusters (k)




 Median cluster size (IQR)

12.0 (7.5–19.5)

16.0 (15–26)

13.0 (9–21)

 Cluster size range




 Participants enrolled




aOther tribal ethnicities represented in our groups include Luguru, Kilindi, Kurya, Mziguwa, Mnyisanzu, Rangi, Jita, Nyambo, and Kaguru

bIncluded housewives and students

cProfessional included any salaried position (e.g. nurse, teacher, government employee, etc.) and retired persons

dHeart disease included coronary disease, heart failure, or structural diseases

The majority of participants lived in an urban setting (n = 370; 77.0 %), were women (n = 358; 74.4 %), ethnically Chagga (n = 288; 59.9 %), and had obtained a primary school level of education (n = 349; 72.6 %), and most participants were occupied as farmers or daily wage-earners (n = 199; 41.4 %) (Table 1). Many participants reported an ongoing use of alcohol (n = 198; 41.2 %) and many reported a history of malaria (n = 427; 88.8 %), diabetes (n = 61; 12.7 %), or hypertension (n = 134; 28.0 %). Few reported a history of stroke, heart disease, tuberculosis, hepatitis, HIV/AIDS, COPD/asthma, cancer or kidney disease. From our assessment of NCD-related health outcomes, 149 participants (31.0 %) had hypertension, 138 (28.7 %) were obese, 57 (11.9 %) had CKD, and 129 (26.8 %) had glucose impairment of which 84 (17.5 %) had pre-diabetes and 45 (9.4 %) had diabetes.

Clustering varied across neighborhoods and differed by urban or rural setting. Overall ICC coefficients ranged from 0.00 to 0.125 with a mean value of 0.30 (SD 0.033) (Table 2). In the rural setting, ICC coefficients ranged from 0.000 to 0.331, and in the urban setting, ICC coefficients ranged from 0.000 to 0.109. Ongoing alcohol use exhibited the strongest neighborhood clustering (ρ = 0.125), which was most prominent in rural neighborhoods (ρ = 0.331). Ongoing tobacco use exhibited modest neighborhood clustering in both rural (ρ = 0.022) and urban settings (ρ = 0.042). Neighborhood clustering of self-reported medical histories was most significant for diabetes (ρ = 0.045), hypertension (ρ = 0.100), HIV (ρ = 0.054), and CKD (ρ = 0.020).
Table 2

Population-based intra-cluster correlation coefficients (ρ) for neighborhood clustering; N = 481 (CKD-AFRiKA, 2014)








Mean (SD) or prevalence (%)b

ρ (95 % CI)

Mean (SD) or prevalence (%)

ρ (95 % CI)

Mean (SD) or prevalence (%)

ρ (95 % CI)

Social characteristics


 Ongoing tobacco use

15.2 %

0.022 (0.000–0.073)

19.6 %

0.042 (0.000–0.159)

18.0 %

0.028 (0.000–0.075)

 Ongoing alcohol use

35.4 %

0.069 (0.000–0.145)

45.8 %

0.331 (0.007–0.654)

41.9 %

0.125 (0.034–0.216)

Self-reported medical history



9.68 %

0.035 (0.000–0.094)

5.51 %


7.06 %

0.045 (0.000–0.100)


20.7 %

0.109 (0.012–0.207)

16.1 %

0.014 (0.000–0.101)

17.8 %

0.100 (0.020–0.181)


11.8 %

0.000 (0.000–0.039)

2.07 %

0.000 (0.000–0.070)

17.4 %

0.000 (0.000–0.033)

 Heart disease

2.64 %

0.000 (0.000–0.039)

0.72 %

0.047 (0.000–0.169)

1.44 %

0.000 (0.000–0.033)


2.92 %

0.000 (0.000–0.039)

0.00 %


1.09 %

0.000 (0.000–0.033)


2.48 %

0.000 (0.000–0.039)

1.17 %

0.000 (0.000–0.070)

1.66 %

0.000 (0.000–0.033)


87.8 %

0.000 (0.000–0.039)

89.0 %

0.001 (0.000–0.073)

88.6 %

0.000 (0.000–0.033)


1.04 %

0.009 (0.000–0.039)

0.00 %


0.39 %

0.000 (0.000–0.033)


3.39 %

0.000 (0.000–0.039)

1.31 %

0.000 (0.000–0.071)

2.08 %

0.000 (0.000–0.033)


5.19 %

0.048 (0.000–0.113)

0.59 %

0.010 (0.000–0.093)

0.99 %

0.054 (0.006–0.114)

 Kidney disease

3.29 %

0.010 (0.000–0.054)

0.00 %


1.22 %

0.020 (0.000–0.063)

Health outcome



19.4 %

0.056 (0.000–0.125)

33.2 %

0.167 (0.000–0.398)

28.0 %

0.075 (0.001–0.126)


25.1 %

0.035 (0.000–0.094)

15.6 %

0.009 (0.000–0.091)

19.1 %

0.040 (0.000–0.093)

 Glucose impairment

24.0 %

0.025 (0.000–0.078)

20.3 %

0.110 (0.000–0.293)

21.7 %

0.039 (0.000–0.918)


15.8 %

0.001 (0.000–0.040)

16.1 %

0.149 (0.000–0.367)

16.0 %

0.031 (0.000–0.079)


8.21 %

0.000 (0.000–0.039)

4.20 %

0.000 (0.000–0.070)

5.70 %

0.000 (0.000–0.033)

 Chronic kidney disease

15.2 %

0.036 (0.000–0.094)

2.03 %

0.000 (0.000–0.070)

7.00 %

0.044 (0.000–0.096)

Physical and laboratory measurements


 SBP (mmHg)

124.0 (24.3)

0.024 (0.000–0.077)

132.8 (26.4)

0.207 (0.000–0.467)

129.5 (24.3)

0.064 (0.000–0.129)

 DBP (mmHg)

76.1 (12.3)

0.025 (0.000–0.077)

78.5 (12.1)

0.199 (0.000–0.453)


0.056 (0.000–0.116)

 BMI (kg/m2)

26.2 (8.23)

0.049 (0.000–0.115)

25.6 (13.7)

0.031 (0.000–0.136)

25.8 (12.0)

0.032 (0.000–0.081)

 HbA1C (%)

5.98 (1.28)

0.000 (0.000–0.039)

6.83 (12.7)

0.025 (0.000–0.123)

6.51 (10.1)

0.012 (0.000–0.050)

 Serum creatinine (μmol/L)

68.2 (27.2)

0.000 (0.000–0.040)

61.8 (13.6)

0.049 (0.000–0.174)

64.2 (20.0)

0.000 (0.000–0.034)


14.1 %

0.015 (0.000–0.063)

1.31 %

0.014 (0.000–0.100)

6.08 %

0.038 (0.000–0.090)

 eGFR (mg/L/min)(MDRD)

104.7 (24.4)

0.078 (0.000–0.161)

128.5 (203.2)

0.001 (0.000–0.073)

119.7 (161.9)

0.005 (0.000–0.041)

 eGFR (mg/L/min) (CKD-EPI)

119.0 (26.6)

0.048 (0.000–0.117)

146.1 (224.3)

0.000 (0.000–0.073)

136.5 (222.1)

0.000 (0.000–0.036)

SD standard deviation, ρ intra-cluster correlation coefficient, COPD chronic obstructive pulmonary disease, SBP systolic blood pressure, DBP diastolic blood pressure, BMI body mass index, eGFR estimated glomerular filtration rate, MDRD modification of diet in renal disease equation (without the race factor) for eGFR, CKD-EPI CKD epidemiology collaboration equation (without the race factor) for eGFR

aToo few positive events/outcomes were observed in these categories, bAge-and gender-weighted estimates

Among the NCDs, neighborhood clustering varied with ρ ranging from 0.000 to 0.075. Hypertension (ρ = 0.075) exhibited the strongest clustering within neighborhoods followed by CKD (ρ = 0.440), obesity (ρ = 0.040), and glucose impairment (ρ = 0.039) (Fig. 1). Among those with glucose impairment, neighborhood clustering was more significant for pre-diabetes (ρ = 0.031) than for diabetes (ρ = 0.000). Neighborhood clustering for physical and laboratory measurements paralleled the NCD outcomes. Both systolic (ρ = 0.064) and diastolic (ρ = 0.056) blood pressures exhibited strong neighborhood clustering. Clustering for albuminuria was modest (ρ = 0.038), but it accounted for most of the neighborhood clustering observed for CKD when compared to serum creatinine or eGFR measurements. Similar to obesity and glucose impairment, clustering of BMI was more significant in urban neighborhoods (ρ = 0.049) while clustering of HbA1C was more significant in rural neighborhoods (ρ = 0.025).
Fig. 1
Fig. 1

Neighborhood clustering of non-communicable diseases in northern Tanzania. Intra-cluster correlation coefficients, presented by prevalence, for CKD, obesity, glucose impairment, and hypertension


In northern Tanzania, prevalence of NCDs, including hypertension, CKD, obesity, and glucose impairment, exhibited clustering by neighborhood. This clustering varied across urban and rural settings, and for NCD prevalence, it was most significant for hypertension and CKD. Based on the ICC coefficients that we observed, cluster-designed studies examining NCDs in the region should account for the design effect on precision or variance caused by clustering. In a region where the NCD burden is quickly growing, these results will be valuable in designing such studies, including cluster RCTs [5, 12, 23].

The urban and rural differences in neighborhood clustering of NCDs may highlight important environmental and lifestyle risk factors for the development of hypertension, glucose impairment, obesity, and CKD. The neighborhood clustering for hypertension and glucose impairment was most pronounced in the rural settings where families tend to remain more environmentally clustered, share meals, and work in similar agricultural jobs which may all contribute to the development of such NCDs that are known to be highly associated with lifestyle [2426]. On the other hand, obesity and CKD were most clustered in the urban neighborhoods. For obesity, this urban clustering highlights the importance that urban lifestyles, which may be clustered within neighborhoods on the basis of socioeconomic status, transportation, or occupation, play in the development of obesity. In the context of CKD, living in an urban setting has been shown to be a significant risk factor, yet specific etiologies associated with the urban environment remain unknown [12]. The clustering of CKD within urban neighborhoods that we observed may be important in highlighting causes of CKD, and it further stresses that public health efforts targeting CKD must take a broad approach that includes urban planning with sanitation improvement, safe drinking water, pollution reduction, and infection control.

Among all measured variables, ongoing alcohol use, hypertension, a self-reported history of hypertension, and a self-reported history of HIV were most highly correlated among cluster-sampled individuals, and the latter two variables may reflect an increased awareness and/or prevalence of these conditions within certain neighborhoods. In northern Tanzania, alcohol is commonly homemade and shared among households which may in part explain the significant clustering that we observed.

To our knowledge, this is the first community-based, household-level survey to report on the neighborhood clustering of NCDs in East Africa. As such, these are the first ICC coefficients reported for hypertension, CKD, obesity, and glucose impairment in the region, and compared to reports of ICC coefficients in high-income countries there are significant differences in several of the physical and laboratory variables [2729]. Because we also measured clustering in both an urban and rural settings we were able to demonstrate important differences which may help inform future studies examining the demographic transition of NCDs in sub-Saharan Africa where rapid urbanization is occurring [30].

Despite these strengths, we also noted a few limitations. Caution must be taken when applying these estimates to other populations and settings. Although the paucity of data currently available for NCD-related measurements and outcomes may make these results useful to researchers more broadly across the region, differences in prevalence and risk factors for NCDs, particularly those that are geographic or environmental-based, mean that even NCDs can cluster at different rates within villages, neighborhoods, or households. Additionally, although we used sample-balancing approaches to address potential non-response bias, the effect of participant non-response upon these estimates is not fully known. Finally, some results, such as self-reported medical history, rely upon the subjective response of individual participants, and as such, they may be prone to recall or response bias.


In conclusion, we have reported on the observed neighborhood clustering for several NCDs from a community-based study in northern Tanzania. The neighborhood clustering, which varied by urban or rural setting, was substantial enough to contribute to a design effect for NCD outcomes including hypertension, CKD, obesity, and glucose impairment, and it may also highlight NCD risk factors that vary by setting. These results may help inform the design of future community-based studies or randomized controlled trials examining NCDs in the region particularly those that use cluster-sampling methods.



We would like to thank Professors G Ralph Corey and John Bartlett and all the staff of the KCMC-Duke Collaboration in Moshi, Tanzania for all of their efforts. We give a special thanks to Carol Sangawe, Cynthia Asiyo, and Nicola West for their integral role implementing the study and Jeffrey Hawley and Audrey Brown of the Duke Office of Clinical Research for their help in data management. We are also grateful to Estomih Mduma and his team for their help with translation needs. This study was supported by an NIH Research Training Grant (#R25 TW009337) funded by the Fogarty International Center and the National Institute of Mental Health; a Research and Prevention Grant funded by the International Society of Nephrology Global Outreach Committee; and a Master’s of Science research stipend from the Duke Global Health Institute. There was no involvement by the funding sources in the design, analysis, or writing of this study. All authors had full access to the data and had final responsibility for the decision to publish.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

Department of Medicine, Duke University, DUMC Box 3182, Durham, NC 27710, USA
Duke Global Health Institute, Duke University, Durham, NC 27710, USA
Duke Clinical Research Institute, Duke University, DUMC Box 3646, Durham, NC 27710, USA
Department of Biostatistics and Bioinformatics, Duke University, DUMC Box 2721, Durham, NC 27710, USA
Duke University Medical Center, Box 3182, Durham, NC 27710, USA


  1. Fuster V. Cardiovascular disease and the UN millennium development goals: a serious concern. Nat Clin Pract Cardiovasc Med. 2006;3(8):401.View ArticlePubMedGoogle Scholar
  2. Stanifer JW, Jing B, Tolan S, Helmke N, Mukerjee R, Naicker S, et al. The epidemiology of chronic kidney disease in sub-Saharan Africa: a systematic review and meta-analysis. Lancet Global Health. 2014;2(3):e174–81.View ArticlePubMedGoogle Scholar
  3. Renzaho AM. The post-2015 development agenda for diabetes in sub-Saharan Africa: challenges and future directions. Global Health Act. 2015;8:27600.Google Scholar
  4. Abegunde DO, Mathers CD, Adam T, Ortegon M, Strong K. The burden and costs of chronic diseases in low-income and middle-income countries. Lancet. 2007;370(9603):1929–38.View ArticlePubMedGoogle Scholar
  5. Unwin N, Setel P, Rashid S, Mugusi F, Mbanya JC, Kitange H, et al. Noncommunicable diseases in sub-Saharan Africa: where do they feature in the health research agenda? Bull World Health Organ. 2001;79(10):947–53.PubMedPubMed CentralGoogle Scholar
  6. Holmes MD, Dalal S, Volmink J, Adebamowo CA, Njelekela M, Fawzi WW, et al. Non-communicable diseases in sub-Saharan Africa: the case for cohort studies. PLoS Med. 2010;7(5), e1000244.View ArticlePubMedPubMed CentralGoogle Scholar
  7. Organization WH. Training for mid-level managers: The EPI coverage survey. Geneva: WHO Expanded Programme on Immunization; 1991.Google Scholar
  8. Luman ET, Worku A, Berhane Y, Martin R, Cairns L. Comparison of two survey methodologies to assess vaccination coverage. Int J Epidemiol. 2007;36(3):633–41.View ArticlePubMedGoogle Scholar
  9. Isaakidis P, Ioannidis JP. Evaluation of cluster randomized controlled trials in sub-Saharan Africa. Am J Epidemiol. 2003;158(9):921–6.View ArticlePubMedGoogle Scholar
  10. Donner A, Birkett N, Buck C. Randomization by cluster. Sample size requirements and analysis. Am J Epidemiol. 1981;114(6):906–14.PubMedGoogle Scholar
  11. Donner A, Koval JJ. Design considerations in the estimation of intraclass correlation. Ann Hum Genet. 1982;46(Pt 3):271–7.View ArticlePubMedGoogle Scholar
  12. Stanifer JW, Maro V, Egger J, Karia F, Thielman N, Turner EL, et al. The epidemiology of chronic kidney disease in Northern Tanzania: a population-based survey. PLoS One. 2015;10(4), e0124506.View ArticlePubMedPubMed CentralGoogle Scholar
  13. United Republic of Tanzania. Education sector performance report, 2010–2011. Dar es Salaam: Education Sector Development Committee; 2011.Google Scholar
  14. United Republic of Tanzania. 2012 Population and housing census. Dar Es Salaam: Central Census Office and National Bureau of Statistics; 2013.Google Scholar
  15. Wyatt CM, Schwartz GJ, Owino Ong’or W, Abuya J, Abraham AG, Mboku C, et al. Estimating kidney function in HIV-infected adults in Kenya: comparison to a direct measure of glomerular filtration rate by iohexol clearance. PLoS One. 2013;8(8), e69601.View ArticlePubMedPubMed CentralGoogle Scholar
  16. Donner A, Koval JJ. The large sample variance of an intraclass correlation. Biometrika. 1980;67(3):719–22.View ArticleGoogle Scholar
  17. Wu S, Crespi CM, Wong WK. Comparison of methods for estimating the intraclass correlation coefficient for binary responses in cancer prevention cluster randomized trials. Contemp Clin Trials. 2012;33(5):869–80.View ArticlePubMedPubMed CentralGoogle Scholar
  18. Zou G, Donner A. Confidence interval estimation of the intraclass correlation coefficient for binary outcome data. Biometrics. 2004;60(3):807–11.View ArticlePubMedGoogle Scholar
  19. Hayes R, Moulton L. Cluster randomised controlled trials. New York: Chapman and Hall/CRC Press; 2009.View ArticleGoogle Scholar
  20. Campbell MK, Grimshaw JM, Elbourne DR. Intracluster correlation coefficients in cluster randomized trials: empirical insights into how should they be reported. BMC Med Res Methodol. 2004;4:9.View ArticlePubMedPubMed CentralGoogle Scholar
  21. Donner A. A review of inference procedures for the intraclass correlation coefficient in the one-way random effects model. Int Stat Rev. 1986;54(1):67–82.View ArticleGoogle Scholar
  22. Mian IU, Shoukri MM. Statistical analysis of intraclass correlations from multiple samples with applications to arterial blood pressure data. Stat Med. 1997;16(13):1497–514.View ArticlePubMedGoogle Scholar
  23. Eldridge SM, Ashby D, Kerry S. Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method. Int J Epidemiol. 2006;35(5):1292–300.View ArticlePubMedGoogle Scholar
  24. Kuate DB. Demographic, epidemiological, and health transitions: are they relevant to population health patterns in Africa? Global Health Act. 2014;7:22443.Google Scholar
  25. Addo J, Smeeth L, Leon DA. Hypertension in sub-saharan Africa: a systematic review. Hypertension. 2007;50(6):1012–8.View ArticlePubMedGoogle Scholar
  26. Assah FK, Ekelund U, Brage S, Mbanya JC, Wareham NJ. Urbanization, physical activity, and metabolic health in sub-Saharan Africa. Diabetes Care. 2011;34(2):491–6.View ArticlePubMedPubMed CentralGoogle Scholar
  27. Parker DR, Evangelou E, Eaton CB. Intraclass correlation coefficients for cluster randomized trials in primary care: the cholesterol education and research trial (CEART). Contemp Clin Trials. 2005;26(2):260–7.View ArticlePubMedGoogle Scholar
  28. Smeeth L, Ng ES. Intraclass correlation coefficients for cluster randomized trials in primary care: data from the MRC Trial of the Assessment and Management of Older People in the Community. Control Clin Trials. 2002;23(4):409–21.View ArticlePubMedGoogle Scholar
  29. Singh J, Liddy C, Hogg W, Taljaard M. Intracluster correlation coefficients for sample size calculations related to cardiovascular disease prevention and management in primary care practices. BMC Research Notes. 2015;8:89.View ArticlePubMedPubMed CentralGoogle Scholar
  30. Agyei-Mensah S, de-Graft Aikins A. Epidemiological transition and the double burden of disease in Accra, Ghana. J Urban Health. 2010;87(5):879–97.View ArticlePubMedPubMed CentralGoogle Scholar


© Stanifer et al. 2016