The development and validation of an urbanicity scale in a multi-country study
© Novak et al.; licensee BioMed Central Ltd. 2012
Received: 18 January 2012
Accepted: 26 June 2012
Published: 20 July 2012
Although urban residence is consistently identified as one of the primary correlates of non-communicable disease (NCD) in low- and middle-income countries, it is not clear why or how urban settings predispose individuals and populations to NCD, or how this relationship could be modified to slow the spread of NCD. The urban–rural dichotomy used in most population health research lacks the nuance and specificity necessary to understand the complex relationship between urbanicity and NCD risk. Previous studies have developed and validated quantitative tools to measure urbanicity continuously along several dimensions, but all have been confined to a single country. The purposes of this study were: 1) to assess the feasibility and validity of a multi-country urbanicity scale; 2) to report some of the considerations that arise in applying such a scale in different countries; and 3) to assess how this scale compares with previously validated scales of urbanicity.
Household and community-level data from the Young Lives longitudinal study of childhood poverty in 59 communities in Ethiopia, India and Peru collected in 2006/2007 were used. Household-level data include parents’ occupations and education level, household possessions and access to resources. Community-level data include population size, availability of health facilities and types of roads. Variables were selected for inclusion in the urbanicity scale based on inspection of the data and a review of literature on urbanicity and health. Seven domains were constructed within the scale: Population Size, Economic Activity, Built Environment, Communication, Education, Diversity and Health Services.
The scale ranged from 11 to 61 (mean 35) with significant between-country differences in mean urbanicity: Ethiopia (30.7), India (33.2), Peru (39.4). Construct validity was supported by factor analysis, and high corrected item-scale correlations suggest good internal consistency. High agreement was observed between the urban–rural dichotomy and the urbanicity scale (Kappa 0.76 for a dichotomized version of the scale; Spearman’s rank-correlation coefficient 0.84; p < 0.0001). Linear regression of socioeconomic indicators on the urbanicity scale supported construct validity in all three countries (p < 0.05).
This study demonstrates and validates a robust multidimensional, multi-country urbanicity scale. It is an important step on the path to creating a tool to assess complex processes like urbanization. This scale provides the means to understand which elements of urbanization have the greatest impact on health.
Urbanization and non-communicable disease
In 1960, 22% of people living in developing countries lived in urban areas; by the year 2000, that figure had nearly doubled to 40%. The trend is expected to continue: the UN estimates that by 2030, 60% of the world’s population will live in urban areas. The pace of urbanization in the developing world is accelerating as globalization changes the patterns of industry and trade. With urbanization come significant changes in nutrition, physical activity and tobacco consumption patterns. The aspects of urbanization which encourage this change in lifestyle are unclear.
Accompanying these shifts, nutritional and epidemiologic transitions are also leading to a dramatic increase in incidence of non-communicable disease (NCD). It is estimated that by 2020, 69% of mortality in developing countries will be due to NCD. Rates of diabetes are on the rise, and are expected to double between 2000 and 2025. The spread of NCD in developing countries departs from the patterns seen during the periods of transition experienced by previously industrialized countries during the 18th and early 19th centuries. The crucial difference between previous and current waves of urbanization is that for those urbanizing rapidly in the 21st century, the burden of NCD will fall primarily on the poor, and will affect people at younger ages than in developed countries.
Although urban residence is consistently identified as one of the primary correlates of non-communicable disease (NCD) [1, 5, 6], the nature of the association remains poorly understood. It is not yet clear why or how urban settings predispose individuals and populations to NCD, or how this relationship could be modified to slow the spread of NCD in the developing world. Progress in understanding the impact of urbanization on NCD has been hindered by the lack of robust methodological tools to define, measure, and compare degrees of urbanization across settings. Currently, most studies rely on a simple urban–rural dichotomy determined by a limited set of factors, such as population size and density, administrative definitions (e.g. living in the capital city), or measures of economic activity (e.g. the percentage of population involved in agriculture). The concept of “urbanicity”, or the presence of conditions such as population density, commercial activity, and transportation infrastructure “that are particular to urban areas or present to a much greater extent than in nonurban areas” is helpful in developing nuanced tools. Several authors have called for the development of a quantitative tool to measure urbanicity continuously along several dimensions so that its relationship to NCD may be better understood [1, 7, 8].
Use of quantitative tools to measure urbanicity as an exposure for NCD
To date, four authors have developed and applied urbanicity scales or urbanization indices for the evaluation of urbanicity as an exposure for NCD [1, 7, 9, 10]. All authors draw on community-level data to measure various dimensions of urbanicity, such as population size and density, access to markets, communications infrastructure, transportation infrastructure and educational and health facilities. While these urbanicity scales have made important strides in operationalizing various dimensions of urban life, the method needs extensive refinement before it can be a reliable tool in public health. It remains unclear whether the same urbanicity scale can be used in multiple settings, and in which ways it can better inform understanding of NCD risk and, ultimately, the potential for intervention and prevention of NCD.
This study addresses three research questions:
1) Is it possible to create and validate an urbanicity scale that can be used across multiple countries (Ethiopia, India and Peru)?
2) What considerations arise in applying such a scale in different countries?
3) How does a new scale compare with previously validated scales of urbanicity?
Selection and description of participants
This paper uses community-level data from the Young Lives project, a longitudinal study of childhood poverty in Peru, Ethiopia, Vietnam and India. Young Lives is based on a holistic understanding of poverty, collecting multidimensional data on children’s health, education, and social, emotional, and psychological well-being. It aims to inform both the development and implementation of policies that will reduce and alleviate the effects of childhood poverty. The Young Lives data are particularly well suited to this study because individual-level, household-level and community-level data are available for all study participants.
The Young Lives study includes 20 communities in each of four countries: Ethiopia, India, Peru and Vietnam. Due to incomplete community-level data, the Vietnamese data were not available for this particular analysis. In all countries, communities were chosen with a pro-poor sampling framework. Ethiopia and India utilized a “sentinel site surveillance system,” whereby the study sites (“sentinel sites”) were selected purposively to ensure a balanced representation of regional diversity as well as rural/urban differences [11, 12]. In Peru, the Young Lives team used multi-stage, cluster-stratified, random sampling. Rather than purposively selecting sites to represent the diversity of the region, the Peruvian team randomly selected 20 of the country’s 1818 districts to serve as sentinel sites. The sample is considered pro-poor because it excludes the 5% wealthiest districts as determined by the Peruvian national poverty map.
Data collection was administered in conjunction with local research partners in each country: the Ethiopian Development Research Institute, the Centre for Economic and Social Studies and Sri Padmavati Mahila Visvavidyalam (Women's University) in India, and Grupo de Análisis para el Desarrollo and the Instituto de Investigación Nutricional in Peru. Research partners trained data collectors extensively, including pilot-testing the survey in several phases. A final two-week pilot test was executed in all countries and overseen by a crew from Oxford to ensure consistency in data collection between countries. This analysis uses data from the second round of the study conducted in late 2006 and early 2007. The total sample sizes from each country were 1856 (Ethiopia), 1778 (India), and 1963 (Peru). Young Lives data are available through the UK Public Data Archive at <http://www.esds.ac.uk/international/access/I33379.asp>.
This study uses household-level and community-level Young Lives data. Household-level data include parents’ occupations and education level, household possessions and access to resources. Data were collected via extensive questionnaires administered to the head of household. Community-level data, such as population size, availability of health facilities, or types of roads available, were collected by field supervisors from community leaders in each sentinel site.
Scale Construction: Variable Selection
Variables were selected for inclusion in the urbanicity scale based on inspection of the data and a review of literature on urbanicity and health. Various aspects of urbanicity that have been demonstrated to affect health are identified in the literature, including population composition, the social environment (including social and economic inequality), the physical environment, access to health and social services, markets, and government and civic society. This framework, along with a review of previous urbanicity scales, was applied to the Young Lives data. Incomplete variables were eliminated and remaining variables were divided into preliminary concept domains. Within each domain, collinear variables were identified and principal component analysis was used to identify conceptually related variables. These analyses were used to narrow each domain to a manageable set of contributing variables.
The seven domains included in the urbanicity scale were Population Size, Economic Activity, Built Environment, Communication, Education, Diversity and Health Services. The variables used to measure each domain were:
Population Size: Population of locality.
Economic Activity: Proportion of population listing agriculture as their primary occupation.
Built Environment: Road type in the locality, availability and utilization of sewage services in locality, and availability and utilization of electricity in locality.
Communication: Proportion of houses with television and with mobile phone; availability of communication services (public internet, movie theatre, public telephone) in locality.
Education: Types of educational facilities in the locality, average education of mothers in the locality.
Diversity: Variance in housing quality index, variance in years of education among mothers.
Health Services: Types of health facilities available in the locality, types of health workers present in locality.
Summary of urbanicity indicators, by dimension and country [table values not reproduced; the ranges 388 - 61740 and 2835 - 40101 survive without row or country labels]. Indicators listed: number of communities; % of community in agriculture; paved road (number of communities); unpaved road for motor traffic; % of community with electricity; % of community with flush toilet; theatre (number of communities); % of community owning a mobile phone; % of community owning a TV; nursery or preschool (number of communities); average years of mother’s education; variance in housing quality index*; variance in mother’s education*; hospital (number of communities); village health worker.
Further household-level variables used for data validation include two indices calculated by the Young Lives project: the housing quality index and the consumer durables index. The housing quality index is calculated using the number of rooms in the household, the number of household members, presence of a finished floor and the presence of an iron, concrete, or slate roof. The consumer durables index is constructed from data on the ownership of a radio, bicycle, motorbike or scooter, motorized vehicle, landline telephone and a modern bed or table. The classification of the locality according to the rural–urban dichotomy was also used for validation purposes.
Table 2 Complete scale algorithm
Population Size:
Approximately how many people (including children) live in this locality?
Economic Activity:
Proportion of population involved in agriculture (primary occupation): 10 points - 10 × (proportion of population involved in agriculture)
Built Environment:
Types of road in locality: unpaved road for motor traffic
Proportion of households with flush toilet: 2 points × proportion
Electricity in community
Proportion of households with electricity: 2 points × proportion
Communication:
Proportion of households with television: 2 points × proportion
Proportion of households with mobile phone: 2 points × proportion
Communication services in locality
Education:
Educational facilities in locality: nursery and/or preschool
Average education of mothers in community (years)
Diversity:
Variance in housing quality index
Variance in mother’s education
Health Services:
Health facilities available: hospital (public or private); health center (public or private)
Health workers available: village health worker
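The point rules that survive in the algorithm above can be sketched in a few lines of Python. This is an illustration only: it covers just the items whose formulas are stated (10 points minus 10 times the proportion in agriculture, and 2 points times the proportion for flush toilet, electricity, television and mobile phone). The function names, dictionary keys and example values are hypothetical, and the items whose point values are not given are deliberately left out.

```python
def _check_proportion(p: float) -> None:
    """Reject values outside the closed interval [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"proportion must be between 0 and 1, got {p}")


def agriculture_points(prop_agriculture: float) -> float:
    """Economic Activity: 10 points - 10 x (proportion in agriculture)."""
    _check_proportion(prop_agriculture)
    return 10.0 - 10.0 * prop_agriculture


def proportional_points(proportion: float, max_points: float = 2.0) -> float:
    """Items scored as max_points x proportion (2 points in the algorithm)."""
    _check_proportion(proportion)
    return max_points * proportion


def partial_scores(community: dict) -> dict:
    """Score only the items whose formulas are stated in the algorithm.

    The keys of `community` are hypothetical names, not Young Lives
    variable names.
    """
    return {
        "agriculture": agriculture_points(community["prop_agriculture"]),
        "flush_toilet": proportional_points(community["prop_flush_toilet"]),
        "electricity": proportional_points(community["prop_electricity"]),
        "television": proportional_points(community["prop_television"]),
        "mobile_phone": proportional_points(community["prop_mobile_phone"]),
    }


# Hypothetical community, for illustration only.
example = {
    "prop_agriculture": 0.25,
    "prop_flush_toilet": 0.5,
    "prop_electricity": 0.75,
    "prop_television": 0.5,
    "prop_mobile_phone": 0.25,
}
scores = partial_scores(example)
```

Because every input is a proportion between 0 and 1, each of these items stays within its stated bounds (0 to 10 for agriculture, 0 to 2 for the others) without any explicit clipping.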
Scale Properties, Reliability and Validity
Factor analysis was used to determine whether or not the domains measured one latent construct, urbanicity. The scale was assessed for construct validity by comparing it to factors known to vary with urbanicity, such as material wealth. Criterion-related validity is typically assessed by comparing the scale in question to a “gold standard” measurement. As there is not yet a “gold standard” for urbanicity, the scale was compared to the current standard, the urban–rural dichotomy. Corrected item-scale correlations were also calculated.
Summary of scale domains and total, overall and by country [table values not reproduced; columns: Mean (St. Dev.), overall and for each country]
Scale validation results
Literature on scale development recommends various tests to ensure that a scale accurately measures the latent construct (in this case, urbanicity) that it claims to measure. Among these is a test for unidimensionality to ensure that the various components of the scale (in this case, the seven dimensions of urbanicity) actually measure a single construct. Unidimensionality can be assessed by a factor analysis; the scale is considered unidimensional if only one dimension has an eigenvalue greater than 1. The factor analysis to test for unidimensionality of the urbanicity scale resulted in the first factor having an eigenvalue of 3.9 and all subsequent factors having eigenvalues of 0.9 or lower, which suggests that the scale does indeed measure a latent unidimensional construct which we presume to be urbanicity.
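The eigenvalue-greater-than-1 rule can be illustrated with a short sketch. It assumes NumPy is available and uses simulated scores, not the Young Lives data. It also takes the eigenvalues of the correlation matrix of the seven domain scores, which is a common stand-in for this check and not necessarily the factor extraction method used in the study. All function and variable names are hypothetical.

```python
import numpy as np


def correlation_eigenvalues(scores: np.ndarray) -> np.ndarray:
    """Eigenvalues of the item correlation matrix, largest first.

    `scores` has one row per community and one column per domain.
    """
    corr = np.corrcoef(scores, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
    return eigenvalues[::-1]


def count_eigenvalues_above(eigenvalues: np.ndarray, threshold: float = 1.0) -> int:
    """Number of eigenvalues above the threshold (1 in the rule above)."""
    return int(np.sum(eigenvalues > threshold))


# Simulated data: 59 communities, 7 domains driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(59, 1))
noise = rng.normal(size=(59, 7))
domain_scores = latent + 0.5 * noise

eigenvalues = correlation_eigenvalues(domain_scores)
n_factors = count_eigenvalues_above(eigenvalues)
```

With a single strong latent factor in the simulated data, only the first eigenvalue exceeds 1, which is the pattern the text reports for the urbanicity scale (3.9 followed by values of 0.9 or lower).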
Corrected item-scale correlations of domains of urbanicity [table values not reproduced; column: corrected item-scale correlation]
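A corrected item-scale correlation is the correlation between one domain and the sum of the remaining domains, so that a domain is not correlated with itself. The sketch below shows the calculation in plain Python on a small hypothetical data set of five communities and three domains; the names and values are illustrative and are not taken from the study.

```python
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_correlations(rows: list) -> list:
    """Correlate each item with the total of all the other items.

    `rows` holds one list of domain scores per community.
    """
    n_items = len(rows[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in rows]
        rest_of_scale = [sum(row) - row[j] for row in rows]
        correlations.append(pearson(item, rest_of_scale))
    return correlations


# Hypothetical domain scores: 5 communities x 3 domains.
domain_scores = [
    [2, 3, 1],
    [4, 4, 3],
    [5, 6, 4],
    [7, 6, 6],
    [9, 8, 8],
]
correlations = corrected_item_scale_correlations(domain_scores)
```

High values for every domain, as in this toy example, are what the text interprets as good internal consistency.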
Criterion-related validity of the scale can be assessed by comparing the scale with a standard measurement. We compared the scale to the current best standard, the urban–rural dichotomy, to ensure that the scale did not diverge markedly from the general pattern measured by the dichotomy. The scale was dichotomized into high- and low-urbanicity and compared to the classification of each community as urban or rural (done by the Young Lives staff). The Kappa statistic for agreement beyond chance can be used to test whether the two measures agree in their assessment of urbanicity. A Kappa statistic of 1 would indicate perfect agreement; values upwards of 0.6 typically indicate good agreement. The Kappa statistic for agreement beyond chance between the urban–rural dichotomy and a dichotomized version of the urbanicity scale was 0.76 (expected agreement: 49.8%; observed agreement: 88.1%; p < 0.0001).
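The reported Kappa can be checked from the two agreement figures given above, using the standard definition of Kappa as observed agreement beyond chance divided by the maximum possible agreement beyond chance. The function name below is hypothetical; the two inputs are the values quoted in the text.

```python
def kappa_from_agreement(observed: float, expected: float) -> float:
    """Kappa = (observed - expected) / (1 - expected).

    Both arguments are proportions of agreement between 0 and 1.
    """
    if not 0.0 <= expected < 1.0:
        raise ValueError("expected agreement must be in [0, 1)")
    if not 0.0 <= observed <= 1.0:
        raise ValueError("observed agreement must be in [0, 1]")
    return (observed - expected) / (1.0 - expected)


# Observed agreement 88.1%, expected agreement 49.8% (from the text).
kappa = kappa_from_agreement(observed=0.881, expected=0.498)
print(round(kappa, 2))  # prints 0.76
```

The result matches the reported value of 0.76, which lies above the 0.6 threshold the text cites for good agreement.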
Spearman’s rank-correlation coefficient, another tool used to compare two measures, was also calculated. Unlike the Kappa statistic, Spearman’s rank-correlation coefficient does not require that the two measures have the same format, i.e. it allows for the comparison of a continuous variable (the urbanicity scale) with a dichotomous variable (the urban–rural dichotomy). A Spearman’s rank-correlation coefficient of 1 indicates a perfectly monotonic relationship between the two measures, whereas a coefficient of 0 would indicate no agreement. The calculated coefficient for the comparison of the urbanicity scale with the urban–rural dichotomy was 0.84 (p < 0.0001). These statistics indicate that the scale does not depart significantly from the divisions made by the urban–rural dichotomy, i.e. “urban” localities tend to have higher urbanicity scores than “rural” ones. However, as a continuous measure, the urbanicity scale has the potential to improve upon the urban–rural dichotomy by providing more complex information.
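To make the comparison of a continuous score with a dichotomous classification concrete, the following sketch computes Spearman’s coefficient as the Pearson correlation of ranks, with tied values given their average rank (which matters here because the urban–rural variable has only two values). The scores and classifications are hypothetical and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values: list) -> list:
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: list, y: list) -> float:
    """Spearman's coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


# Hypothetical communities: urbanicity score and urban (1) / rural (0) label.
urbanicity_scores = [18, 22, 27, 31, 40, 44, 52, 58]
urban_flags = [0, 0, 0, 1, 0, 1, 1, 1]
rho = spearman(urbanicity_scores, urban_flags)
```

In this toy example one rural community scores higher than one urban community, so the coefficient is positive but below 1.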
Linear regression of urbanicity scale and housing quality index and urbanicity scale and consumer durables index, by country [table values not reproduced; columns: urbanicity of site, for each country; rows: average housing quality index of site, average consumer durables index of site]
The coefficients are positive and statistically significant for both variables in all three countries, which means that housing quality and consumer durables are good predictors of the proposed urbanicity scale in each set of data. This supports the expectation that the scale behaves as an urbanicity scale should.
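A one-predictor regression of this kind can be sketched with ordinary least squares. The example treats site urbanicity as the outcome and the average housing quality index as the predictor, following the description above; the data are hypothetical, the function name is illustrative, and significance testing is omitted.

```python
def simple_linear_regression(x: list, y: list) -> tuple:
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sites: average housing quality index and urbanicity score.
housing_quality = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
urbanicity = [18.0, 24.0, 29.0, 37.0, 41.0, 50.0]
slope, intercept = simple_linear_regression(housing_quality, urbanicity)
```

A positive slope, as in this toy data, corresponds to the pattern reported in the text: sites with better average housing quality have higher urbanicity scores.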
This study demonstrates that a robust multidimensional, multi-country urbanicity scale can be created and validated. Drawing on both urbanicity literature and preliminary analysis of community and household-level data, the scale captured a broad range of aspects of urbanicity and performed well in tests of unidimensionality, construct validity, and criterion-related validity. It is an important step on the path to creating a helpful tool to assess complex processes like urbanization.
This analysis builds on previous literature on urbanicity and health risks by creating a new and validated scale of urbanicity that can be used to assess the relationship between urbanicity and health. It is innovative in that it is the first scale to be created and validated using data from multiple countries. Finding that a single scale performs well in three economically, geographically and culturally diverse countries is an important step in the project of creating a standard continuous measure of urbanicity.
Previous urbanicity scales drew on data from the Philippines; Tamil Nadu, India; and China [1, 10] and included seven [7, 9] or twelve domains or dimensions of urbanicity. All studies drew on community-level data to develop their urbanicity scales, but the dimensions chosen to define urbanicity varied depending on study context and the data to which the authors had access. The scale discussed in this paper includes several domains that were identified for use in these studies. Methods of validation varied between studies, but where they can be compared this scale performs as well as Jones-Smith and Popkin’s scale in tests of unidimensionality, item-scale correlation and criterion-related validity.
This analysis differs from previous analyses because it is the first to use urbanicity as a continuous variable, while previous analyses divided the scale into quintiles or tertiles [7, 9] for analysis. Using the scale as a continuous measure allows for the detection of more complex effects and better represents the concept of urbanicity as a continuous spectrum.
The Young Lives dataset does not include data on two key aspects of urbanicity: population density and markets. Population density is one of the primary variables used to denote urbanicity [2, 10]. In relation to NCD risk, available markets, especially food markets, are likely to be an important aspect of urbanicity [7, 10]. However, the scale does include data on population size, and the availability of markets is likely to be correlated with many of the other dimensions included in this scale. The pro-poor sample is not representative of all communities in the study country, so the validity of the scale in this sample may not reflect validity on a broader scale.
There is the potential for ceiling and floor effects since the scale algorithm (Table 2) does not allow the score to be greater than 10 or less than 0 in a single dimension, and does not allow the total score to be greater than 70 or less than 0. Furthermore, although the scale algorithm was created with the best possible input from existing literature on urbanicity and on analysis of the available data, it still contains a certain degree of arbitrariness that could be concerning to a sceptical audience. However, several authors [7, 10] make a convincing argument that a literature-based scale is preferable to a data-driven scale development method.
In order for an urbanicity scale to become a standard epidemiological tool it will be important to explore the applicability and generalizability of this scale to other contexts and data sources. Although this scale performed well on tests of validity even when applied in three different countries, this does not necessarily imply that urbanization progresses the same way in all contexts. More studies will be needed to confirm which aspects of urbanicity are most consistent across settings and are also readily measurable in standard surveys. Future research is also needed to examine the predictive ability of this validated scale against known chronic disease risks, for example nutritional status indicators such as BMI, overweight or underweight. It has been suggested that urbanicity scales of this type could be useful not only for understanding the spread of NCD but also for other economic, demographic and social research.
This paper presents a validated tool that provides a continuous measure of urbanicity in a number of contexts. In future analyses, such a scale could be used as a predictor variable to better illuminate the nature of the relationship between urbanization and NCD risk in developing countries. Ultimately, urbanicity scales such as this one may provide insight into which particular aspects of urbanization have the greatest impact on health and shed light on potential policy interventions to stem the spread of NCD in developing countries.
The data used in this publication come from Young Lives, a 15-year survey investigating the changing nature of childhood poverty in Ethiopia, India (Andhra Pradesh), Peru and Vietnam (http://www.younglives.org.uk). Young Lives is core-funded by UK aid from the Department for International Development (DFID). The views expressed here are those of the authors. They are not necessarily those of Young Lives, the University of Oxford, DFID or other funders.
Funding and other source(s) of support: SA and PS receive funding from the British Heart Foundation to contribute to this study. NN received funding from the Rhodes Trust.
1. Mendez MA, Popkin BM: Globalization, urbanization and nutritional change in the developing world. Electron J Agric Dev Econ. 2004, 1: 220-241.
2. Vlahov D, Galea S: Urbanization, urbanicity, and health. J Urban Health. 2002, 79 (4): S1-S12.
3. Boutayeb A, Boutayeb S: The burden of non communicable diseases in developing countries. Int J Equity Health. 2005, 4: 2. 10.1186/1475-9276-4-2.
4. Zimmet P, Alberti K, Shaw J: Global and societal implications of the diabetes epidemic. Nature. 2001, 414: 782-787. 10.1038/414782a.
5. Ezzati M, Vander Hoorn S, Lawes CMM, Leach R, James WPT, et al: Rethinking the “diseases of affluence” paradigm: global patterns of nutritional risks in relation to economic development. PLoS Med. 2005, 2: e133. 10.1371/journal.pmed.0020133.
6. Mohan V, Mathur P, Deepa R, Deepa M, Shukla DK, et al: Urban rural differences in prevalence of self-reported diabetes in India: The WHO-ICMR Indian NCD risk factor surveillance. Diabetes Res Clin Pr. 2008, 80 (1): 159-168. 10.1016/j.diabres.2007.11.018.
7. Dahly DL, Adair LS: Quantifying the urban environment: a scale measure of urbanicity outperforms the urban–rural dichotomy. Soc Sci Med. 2006, 64: 1407-1419.
8. Allender S, Foster C, Hutchinson L, Arembepola C: Quantification of urbanization in relation to chronic diseases in developing countries: a systematic review. J Urban Health. 2008, 85: 938-951. 10.1007/s11524-008-9325-4.
9. Allender S, Lacey B, Webster P, Rayner M, Deepa M, Scarborough P, Arembepola C, Manjula D, Mohan V: Level of urbanization and noncommunicable disease risk factors in Tamil Nadu, India. Bull World Health Organ. 2009, 87: 297-304.
10. Jones-Smith JC, Popkin BM: Understanding community context and adult health changes in China: Development of an urbanicity scale. Soc Sci Med. 2010, 71: 1436-1446. 10.1016/j.socscimed.2010.07.027.
11. Outes-Leon I, Sanches A: An assessment of the Young Lives sampling approach in Ethiopia. Young Lives Technical Note No. 1. 2008, March.
12. Kumra N: An assessment of the Young Lives sampling approach in India. Young Lives Technical Note No. 2. 2008, March.
13. Escobar J, Flores E: An assessment of the Young Lives sampling approach in Peru. Young Lives Technical Note No. 3. 2008, March.
14. Young Lives Method Guide. Piloting: Testing Instruments and Training Field Teams. 2011, http://www.younglives.org.uk/files/methods-guide/methods-guide-piloting-and-training
15. Fotso J: Urban–rural differentials in child malnutrition: trends and socioeconomic correlates in sub-Saharan Africa. Health Place. 2007, 13: 205-223. 10.1016/j.healthplace.2006.01.004.
16. Van de Poel E, O’Donnell O, Van Doorslaer E: Are urban children really healthier? Evidence from 47 developing countries. Soc Sci Med. 2007, 65: 1986-2003. 10.1016/j.socscimed.2007.06.032.
17. National Research Council, Committee on Population, Division of Behavioral and Social Sciences and Education. Cities Transformed: Demographic Change and Its Implications in the Developing World. Edited by: Montgomery MR, Stren R, Cohen B, Reed HE. 2003, Washington, DC: National Academies Press.
18. DeVellis RF: Scale Development: Theory and Applications. 2003, Thousand Oaks, CA: Sage Publications, 2nd edition.
19. Netemeyer RG, Bearden WO, Sharma S: Scaling Procedures: Issues and Applications. 2003, Thousand Oaks, CA: Sage Publications.
20. Altman DG: Practical Statistics for Medical Research. 1991, London: Chapman and Hall.
21. McDade TW, Adair LS: Defining the “urban” in urbanization and health: a factor analysis approach. Soc Sci Med. 2001, 53: 55-70. 10.1016/S0277-9536(00)00313-0.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2458/12/530/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.