This article has Open Peer Review reports available.
Spatial variations of pulmonary tuberculosis prevalence co-impacted by socio-economic and geographic factors in People’s Republic of China, 2010
© Li et al.; licensee BioMed Central Ltd. 2014
Received: 5 December 2013
Accepted: 14 March 2014
Published: 17 March 2014
The report of the fifth national tuberculosis (TB) epidemiological survey in P. R. China, 2010, showed only roughly that pulmonary TB (PTB) prevalence was higher in western China than in central and eastern China. However, accurately estimating the continuous spatial variations of PTB prevalence and clearly understanding the factors behind them are important for allocating the limited resources of the national TB programme (NTP) in P. R. China.
Using the ArcGIS Geostatistical Wizard (ESRI, Redlands, CA), we evaluated which kriging and cokriging methods, combined with different types of detrending, semivariogram models, anisotropy and covariables (socio-economic and geographic factors), most accurately construct the spatial distribution surface of PTB prevalence from data sampled in the fifth national TB epidemiological survey in P. R. China, 2010. The evaluation results were then used to explore the factors behind the spatial variations.
Global cokriging with socio-economic and geographic factors as covariables proved to be the best geostatistical method for accurately estimating the spatial distribution surface of PTB prevalence. The final continuous surfaces showed that PTB prevalence was lower in Beijing, Tianjin, Shanghai and the southeastern coast of China, higher in western and southwestern China, and alternated between low and high in central China.
The predicted continuous surface clearly illustrated the spatial variations of PTB prevalence, which were co-impacted by socio-economic and geographic factors, and it can be used to better allocate the always limited resources of the NTP in P. R. China.
In 2010, the Disease Control Bureau of the Ministry of Health, People’s Republic of China (P. R. China), and the Chinese Center for Disease Control and Prevention implemented the fifth national tuberculosis (TB) epidemiological survey. Because of logistical and financial limitations, the survey was conducted only by sampling a limited number of point locations throughout the country, and it found roughly that the prevalence of active, Mycobacterium positive and smear positive pulmonary TB (PTB) was higher in western China than in central and eastern China. However, which factors have significant impacts on these spatial variations of PTB prevalence in P. R. China is not clear. Accurately estimating the continuous surface of TB prevalence and clearly understanding the factors behind its spatial variations are important for allocating the limited resources of the national TB programme (NTP) and for prioritizing the areas with the most serious TB prevalence. Therefore, it is necessary to understand the patterns of spatial heterogeneity of PTB prevalence using data sampled from the fifth national TB epidemiological survey and to explore the factors behind this heterogeneity in P. R. China.
To understand patterns of spatial heterogeneity, spatial data analysis methods can be used to estimate values at unobserved locations from observations at nearby locations. Most studies in spatial data analysis fall into two branches: the model-driven approach, e.g. spatial regression analysis, and the data-driven approach, e.g. kriging methods. One study found that kriging provided smaller error measures than the multiple linear regression model, the spatial lag model and the spatial error model, and concluded that kriging has a clear advantage over spatial regression analysis for spatial data analysis.
Kriging is an interpolation method that applies regionalized variables and describes spatial dependencies between the instances of random variables by using semivariograms. A semivariogram is a graphical display of the variance of measurements as a function of the distance between measurement sites. If spatial dependencies exist, the variance between observations at two points normally increases with distance until it reaches a maximum value at a specific range. Considered to be the most sophisticated geostatistical method, kriging can potentially provide the most accurate continuous surface estimates and has been increasingly used for epidemiological mapping of infectious diseases such as TB, schistosomiasis, malaria, cholera, dysentery and influenza-like illness. However, kriging has seldom been applied to TB control and prevention in P. R. China.
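The semivariogram described above can be illustrated with a minimal sketch of the classical empirical estimator, in which half the mean squared difference between paired observations is computed for each distance bin. The function name, the `lag_width` and `n_lags` parameters and the simple binning rule are assumptions made for illustration; they are not the procedure used by the ArcGIS Geostatistical Wizard.

```python
import math


def empirical_semivariogram(points, values, lag_width, n_lags):
    """Return [(mean_distance, semivariance, n_pairs), ...] for non-empty lag bins.

    points: list of (x, y) coordinates; values: measurements at those points.
    Bin k holds the pairs whose separation h satisfies
    k * lag_width <= h < (k + 1) * lag_width.
    """
    sq_diff_sums = [0.0] * n_lags  # accumulated squared differences per bin
    dist_sums = [0.0] * n_lags     # accumulated pair distances per bin
    counts = [0] * n_lags          # number of pairs per bin
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            k = int(h // lag_width)
            if k < n_lags:
                sq_diff_sums[k] += (values[i] - values[j]) ** 2
                dist_sums[k] += h
                counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:
            # Semivariance is half the mean squared difference in the bin.
            result.append((dist_sums[k] / counts[k],
                           sq_diff_sums[k] / (2 * counts[k]),
                           counts[k]))
    return result
```

When spatial dependence is present, the semivariance values returned for successive bins would be expected to rise with distance and then level off near the range.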
Kriging uses only data on the variable being estimated and cannot exploit spatial correlations between secondary-data control points and the primary attribute to be estimated. Cokriging requires the same conditions to be satisfied as kriging does, but it can also take advantage of the covariance between two or more related regionalized variables, which one study showed to be beneficial for estimating map values. As an important public health problem, TB prevalence is influenced worldwide by both socio-economic and geographic factors. For example, a study in Brazil showed that TB incidence and socio-economic status had a significant curvilinear relationship, and another study in Mexico found that altitude had a strong inverse relationship to PTB notification rates. However, it is not clear whether socio-economic and geographic attributes affect TB prevalence in P. R. China, or whether cokriging with these factors as covariables estimates the continuous surface of TB prevalence more accurately than kriging.
In this study, using the dataset of the fifth national TB epidemiological survey in 2010, we tested kriging and cokriging with different trend removal options, anisotropy settings and semivariogram models, and with socio-economic and geographic attributes as cokriging covariables, to find the methods that provide the most accurate distribution estimates of PTB prevalence in P. R. China. Based on the selected methods, we evaluated the socio-economic and geographic factors affecting the spatial variations of PTB prevalence and generated the spatial distribution of PTB prevalence, which can help allocate the limited resources of the NTP in P. R. China.
Data sources of TB prevalence
Origin of socio-economic covariable
The human development index (HDI) was used as the socio-economic covariable for cokriging in this study. Published by the United Nations Development Programme, the HDI is a composite statistic of life expectancy, education and income indices that reflects human development, a concept of well-being based on the capability approach. By concentrating on aspects beyond income and treating income as a proxy for a decent standard of living, the HDI provides a more comprehensive picture of human life than income alone. The HDI is therefore an appropriate indicator of socio-economic attributes. The HDIs by province in P. R. China for 1999, 2003, 2005 and 2008 were collected [15–18], and the values of the 4 years were averaged by province to increase the stability of the data and minimize bias. The averaged values were converted into an ESRI Geodatabase format for the calculations in this study. Figure 1A shows the averaged HDI values across the country.
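The averaging step can be sketched as follows. This is only an illustration of taking the mean of several yearly HDI values per province; the function name and the record layout are assumptions, and no published HDI values are reproduced here.

```python
from collections import defaultdict


def average_by_province(records):
    """Average HDI values per province.

    records: iterable of (province, year, hdi) tuples (hypothetical layout).
    Returns a dict mapping each province to the mean of its HDI values.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for province, _year, hdi in records:
        totals[province] += hdi
        counts[province] += 1
    return {province: totals[province] / counts[province] for province in totals}
```

Each province would contribute four records, one for each of 1999, 2003, 2005 and 2008, and the resulting means would then be joined to the provincial boundaries before conversion to the geodatabase.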
Origin of geographic covariable
The Digital Elevation Model (DEM) was used as the geographic covariable for cokriging in this study. It has a spatial resolution of 200 m and was obtained from the website of the Data Sharing Infrastructure of Earth System Science (http://www.geodata.cn). Figure 1B illustrates the elevation gradients across P. R. China. Elevation, as one geographic attribute, has been shown to correlate closely with TB prevalence in many countries, such as Mexico, Kenya, Peru and Turkey [11, 19–22]. Elevation was therefore considered a suitable covariable for estimating TB prevalence. All digital datasets, including the TB prevalence of survey sites, the HDIs by province and the DEM, were transformed to the same cartographic projection.
Testing of kriging and cokriging
The geostatistical method used to generate maps of PTB prevalence distribution was selected by comparing the cross-validation measures of each output surface. Four cross-validation prediction error parameters were taken into account: root-mean-square (RMS), mean standardized (MeanStan), root-mean-square standardized (RMSStan) and average standard error (ASE). A better geostatistical method satisfies the following conditions at the same time: RMS is smaller, MeanStan is close to 0, RMSStan is close to 1, and ASE is close to RMS.
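The four parameters can be computed from leave-one-out cross-validation output as in the sketch below, which follows the definitions commonly given for these statistics: errors are predicted minus observed values, and standardized errors are errors divided by the kriging standard error at each site. The function name and argument layout are assumptions for illustration, not ArcGIS code.

```python
import math


def cross_validation_stats(observed, predicted, std_errors):
    """Compute RMS, MeanStan, RMSStan and ASE from cross-validation output.

    observed, predicted, std_errors: equal-length sequences giving, for each
    survey site, the measured value, the value predicted with that site left
    out, and the prediction standard error (which must be non-zero).
    """
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "RMS": math.sqrt(sum(e * e for e in errors) / n),
        "MeanStan": sum(standardized) / n,
        "RMSStan": math.sqrt(sum(z * z for z in standardized) / n),
        "ASE": math.sqrt(sum(s * s for s in std_errors) / n),
    }
```

An RMSStan above 1 would suggest that the standard errors underestimate the true prediction variability, and a value below 1 that they overestimate it, which is why values close to 1 are preferred.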
Table: Evaluation of ordinary kriging and ordinary cokriging with various combinatorial approaches (evaluation results of the top 10 methods for each class of PTB prevalence). The table body was not preserved; the surviving entries are the column header "Type of detrending", the row groups for smear positive, Mycobacterium positive and active PTB prevalence, and the covariable combination "HDI + Elevation".
Maps showing the predicted spatial distribution of PTB prevalence and the prediction standard errors, which show the uncertainty of the predicted values, were created with the Geostatistical Wizard in ArcGIS (ArcGIS 10; ESRI Inc., Redlands, CA, USA), and the Natural Breaks (Jenks) method was used to classify the predicted values and their standard errors. ArcGIS was also used to convert datasets in other formats into an ESRI Geodatabase format, to transform all digital datasets to the same cartographic projection, and to evaluate geostatistical methods with different parameter combinations. Apart from the type of detrending, the type of semivariogram model, anisotropy and covariables, the other parameters of kriging and cokriging (nugget, partial sill, etc.) were estimated using an iterative cross-validation technique to optimize the semivariogram models in ArcGIS.
Results of cross-validation
Distribution estimate of smear positive PTB prevalence
Figure 5 shows the prediction map and the prediction standard error map of smear positive PTB prevalence (both at 1 × 1 km spatial resolution) produced by the best geostatistical method. The predicted prevalence ranged from 0 to 426 per 100,000 population in P. R. China. The predicted values increased gradually from eastern through central to western China but showed interlocked distributions in some pockets of the country. The prevalence in Beijing, Tianjin, Hebei, Shanxi, Neimenggu, Liaoning, Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Shandong and Ningxia (0 to 70 per 100,000 population) was lower than in other provinces. The prevalence in Jilin, Heilongjiang, Jiangxi, Henan, Hubei, Guangdong, northern Sichuan, Shaanxi, Gansu, eastern Qinghai and northern Xinjiang showed interlocked distributions between 0 and 137 per 100,000 population. In Hunan, Guangxi, Hainan, Chongqing, southern Sichuan, Guizhou, Yunnan, Tibet, western Qinghai and southern Xinjiang, the prevalence increased gradually from 97 to 426 per 100,000 population.
Distribution estimate of Mycobacterium positive PTB prevalence
Figure 6 shows the prediction map and the prediction standard error map of Mycobacterium positive PTB prevalence (both at 1 × 1 km spatial resolution) produced by the best geostatistical method. The predicted prevalence ranged from 0 to 849 per 100,000 population in P. R. China. The predicted values increased gradually from eastern through central to western China but showed interlocked distributions in some pockets of the country. The prevalence in Beijing, Tianjin, Hebei, Shanxi, Neimenggu, Liaoning, Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Shandong and Ningxia (0 to 136 per 100,000 population) was lower than in other provinces. The prevalence in Jilin, Heilongjiang, Jiangxi, Henan, Hubei, Hunan, Guangdong, Guangxi, Hainan, Chongqing, Sichuan, Shaanxi, Gansu, Qinghai and northern Xinjiang showed interlocked distributions between 0 and 512 per 100,000 population. In Guizhou, Yunnan, Tibet and southern Xinjiang, the prevalence increased gradually from 317 to 849 per 100,000 population.
Distribution estimate of active PTB prevalence
Figure 7 shows the prediction map and the prediction standard error map of active PTB prevalence (both at 1 × 1 km spatial resolution) produced by the best geostatistical method. The predicted prevalence ranged from 0 to 4,751 per 100,000 population in P. R. China. The predicted values increased gradually from eastern through central to western China but showed interlocked distributions in some pockets of the country. The prevalence in Beijing, Tianjin, Hebei, Liaoning, Shanghai, Jiangsu, Zhejiang, Fujian, Shandong and Ningxia (0 to 491 per 100,000 population) was lower than in other provinces. The prevalence in Shanxi, Neimenggu, Jilin, Heilongjiang, Anhui, Jiangxi, Henan, Hubei, Hunan, Guangdong, eastern Guangxi, Hainan, Chongqing, northern Sichuan, Shaanxi, Gansu, eastern Qinghai and northern Xinjiang showed interlocked distributions between 0 and 1,117 per 100,000 population. In western Guangxi, southern Sichuan, Guizhou, Yunnan, Tibet, western Qinghai and southern Xinjiang, the prevalence increased gradually from 725 to 4,751 per 100,000 population.
Obtaining an accurate prediction is the ultimate aim of most studies that use kriging or cokriging. To improve accuracy, many studies either selected a kriging or cokriging method they considered suitable, or compared two or more kriging or cokriging methods to find the most suitable one [5–8, 25]. However, it is difficult to find the method that provides the most accurate prediction, because when many methods are compared a single method can hardly meet the requirements for all four cross-validation prediction error parameters at the same time. To solve this problem, we developed a comprehensive determination criterion in this study, which rapidly identified the methods whose four cross-validation prediction error parameters best met the requirements at the same time among 264 combinations of geostatistical input parameters for both kriging and cokriging for each class of PTB prevalence. We therefore have good reason to believe that the final cokriging methods selected in this study ensured considerable accuracy of spatial prediction, because we compared the largest number of methods in a single study so far.
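The details of the comprehensive determination criterion are not reproduced in this text, so the following is only one plausible way to combine the four requirements, offered as an assumption and not as the authors' criterion. Each method is ranked separately on RMS, on the distance of MeanStan from 0, on the distance of RMSStan from 1 and on the gap between ASE and RMS, and the ranks are summed. The function name and the dictionary layout are hypothetical.

```python
def rank_methods(methods):
    """Order candidate methods from best to worst by summed ranks.

    methods: dict mapping a method name to a dict with the keys
    "RMS", "MeanStan", "RMSStan" and "ASE" (hypothetical layout).
    Assumed rule, not the published criterion: lower is better for each of
    RMS, |MeanStan|, |RMSStan - 1| and |ASE - RMS|.
    """
    def deviations(m):
        return (
            m["RMS"],
            abs(m["MeanStan"]),
            abs(m["RMSStan"] - 1.0),
            abs(m["ASE"] - m["RMS"]),
        )

    names = list(methods)
    rank_sum = {name: 0 for name in names}
    for k in range(4):
        # Ties are broken by input order because sorted() is stable.
        ordered = sorted(names, key=lambda name: deviations(methods[name])[k])
        for rank, name in enumerate(ordered, start=1):
            rank_sum[name] += rank
    return sorted(names, key=lambda name: rank_sum[name])
```

A rank sum treats the four conditions as equally important; a weighted variant would be needed if, for example, RMS were considered to matter more than the others.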
The cross-validation results in this study showed that global cokriging with HDI and elevation as covariables was the best geostatistical method, which suggests that HDI and elevation as covariables increased the accuracy of spatial prediction for TB prevalence. More fundamentally, this reflects that socio-economic and geographic factors can affect TB prevalence in P. R. China, which confirms our hypotheses based on previous studies conducted in other countries [10, 11, 19–22, 26–28]. Therefore, in addition to adopting socio-economic measures to control and prevent TB in P. R. China, the impacts of geographic factors on TB control and prevention should be evaluated, and interventions suited to geographic features should also be adopted.
The continuous surface estimates of PTB prevalence in this study demonstrated that sputum smear positive, sputum Mycobacterium positive and active PTB prevalence were lower in Beijing, Tianjin, Shanghai and the southeastern coast of China, and higher in western and southwestern China, which is consistent with the report on the fifth national TB epidemiological survey. However, the distributions of PTB prevalence were complex in central China, where low and high PTB prevalence were interlocked. This situation would increase the complexity and difficulty of TB control and prevention in these areas and would slow the progress of the NTP, given that 53% of the total population of the country lives in these areas. Consequently, to achieve the goal of the NTP on schedule, central China should be made a priority area for TB control and prevention, while the current level is maintained in eastern China and efforts are further strengthened in western China.
Although we considered the spatial prediction of PTB prevalence in this study to be considerably accurate, we found that the uncertainty of the predicted values along the border of Heilongjiang and Neimenggu, in Tibet and in western Qinghai was larger than in other areas. Survey sites were clearly sparser in the areas with higher uncertainty of predicted values. Guimaraes et al. advised that, to improve the accuracy of an estimate using kriging, it would be necessary to obtain data with better location and spatial distribution of the information collected in the fieldwork. However, only probability proportionate to population size was considered when sampling survey sites in the fifth national TB epidemiological survey in P. R. China, so survey sites were sparser in the vast, sparsely populated areas. Therefore, to obtain accurate and stable surface estimates from sampling surveys in P. R. China in the future, we need to consider not only the proportion of the population when sampling survey sites but also their rational spatial distribution.
In conclusion, cokriging proved to be a suitable tool for accurately estimating the continuous surface of TB prevalence in P. R. China when socio-economic and geographic factors were used as covariables, which suggests that these factors affect regional differences in TB prevalence. The predicted surface of TB prevalence clearly demonstrated that sputum smear positive, sputum Mycobacterium positive and active PTB prevalence were lower in Beijing, Tianjin, Shanghai and the southeastern coast of China, higher in western and southwestern China, and alternated between low and high in central China. These findings can be used to better allocate the always limited resources of the NTP.
We thank all the people who participated in the fifth national tuberculosis epidemiological survey in 2010, and we thank the provincial tuberculosis dispensaries of Shandong, Henan, Guangdong, Hainan, Sichuan, Gansu, Ningxia and Xinjiang for providing datasets of pulmonary tuberculosis prevalence in provincial survey sites.
We also acknowledge the support of the National Science and Technology Major Program (grant no. 2012ZX10004-220).
- Disease Control Bureau of the Ministry of Health, Chinese Center for Disease Control and Prevention: Report on the 5th National Tuberculosis Epidemiological Survey in China-2010. 2011, Beijing, China: Military Medical Science Press.
- Calderón GF-A: Spatial regression analysis vs. kriging methods for spatial estimation. Int Adv Econ Res. 2009, 15: 44-58.
- Fraczek W, Bytnerowicz A, Arbaugh MJ: Application of the ESRI Geostatistical Analyst for determining the adequacy and sample size requirements of ozone distribution models in the Carpathian and Sierra Nevada Mountains. Sci World J. 2001, 1: 836-854.
- Martinez HZ, Suazo FM, Cuador Gil JQ, Bello GC, Anaya Escalera AM, Marquez GH, Casanova LG: Spatial epidemiology of bovine tuberculosis in Mexico. Veterinaria Italiana. 2007, 43 (3): 629-634.
- Guimaraes RJ, Freitas CC, Dutra LV, Felgueiras CA, Drummond SC, Tibirica SH, Oliveira G, Carvalho OS: Use of indicator kriging to investigate schistosomiasis in Minas Gerais State, Brazil. J Tropical Med. 2012, 2012: 837428.
- Gething P, Atkinson P, Noor A, Gikandi P, Hay S, Nixon M: A local space-time kriging approach applied to a national outpatient malaria dataset. Comput Geosci. 2007, 33 (10): 1337-1350.
- Ali M, Goovaerts P, Nazia N, Haq MZ, Yunus M, Emch M: Application of Poisson kriging to the mapping of cholera and dysentery incidence in an endemic area of Bangladesh. Int J Health Geograph. 2006, 5: 45.
- Carrat F, Valleron AJ: Epidemiologic mapping using the “kriging” method: application to an influenza-like illness epidemic in France. Am J Epidemiol. 1992, 135 (11): 1293-1300.
- Yalçin E: Cokriging and its effect on the estimation precision. J S Afr Inst Min Metall. 2005, 105: 223-228.
- Maciel EL, Pan W, Dietze R, Peres RL, Vinhas SA, Ribeiro FK, Palaci M, Rodrigues RR, Zandonade E, Golub JE: Spatial patterns of pulmonary tuberculosis incidence and their relationship to socio-economic status in Vitoria, Brazil. Int J Tuberc Lung Dis. 2010, 14 (11): 1395-1402.
- Vargas MH, Furuya ME, Perez-Guzman C: Effect of altitude on the frequency of pulmonary tuberculosis. Int J Tuberc Lung Dis. 2004, 8 (11): 1321-1324.
- Ministry of Health P. R. China: WS 288–2008, Diagnostic Criteria for Pulmonary Tuberculosis. 2008, Beijing, China: People’s Medical Publishing House.
- Li XX, Zhang H, Jiang SW, Liu XQ, Fang Q, Li J, Li X, Wang LX: Geographical distribution for prevalence of pulmonary tuberculosis in China in 2010. Zhonghua liu xing bing xue za zhi. 2013, 34 (10): 980-984.
- United Nations Development Programme: Global Human Development Report 1999. 1999, New York: Oxford University Press.
- Stockholm Environment Institute in collaboration with United Nations Development Programme China: China Human Development Report 2002: Making Green Development a Choice. 2002, New York: Oxford University Press.
- China Development Research Foundation in collaboration with United Nations Development Programme China: China Human Development Report 2005: Towards Human Development with Equity. 2005, Beijing, China: China Translation and Publishing Corporation.
- United Nations Development Programme: China Human Development Report: 2007–2008: Basic Public Services Benefiting 1.3 Billion Chinese People. 2008, Beijing, China: China Translation and Publishing Corporation.
- United Nations Development Programme: China Human Development Report: 2009/10: China and a Sustainable Future: Towards a Low Carbon Economy and Society. 2010, Beijing, China: China Translation and Publishing Corporation.
- Mansoer JR, Kibuga DK, Borgdorff MW: Altitude: a determinant for tuberculosis in Kenya?. Int J Tuberc Lung Dis. 1999, 3 (2): 156-161.
- Olender S, Saito M, Apgar J, Gillenwater K, Bautista CT, Lescano AG, Moro P, Caviedes L, Hsieh EJ, Gilman RH: Low prevalence and increased household clustering of Mycobacterium tuberculosis infection in high altitude villages in Peru. Am J Trop Med Hyg. 2003, 68 (6): 721-727.
- Saito M, Pan WK, Gilman RH, Bautista CT, Bamrah S, Martin CA, Tsiouris SJ, Arguello DF, Martinez-Carrasco G: Comparison of altitude effect on Mycobacterium tuberculosis infection between rural and urban communities in Peru. Am J Trop Med Hyg. 2006, 75 (1): 49-54.
- Tanrikulu AC, Acemoglu H, Palanci Y, Dagli CE: Tuberculosis in Turkey: high altitude and other socio-economic risk factors. Public Health. 2008, 122 (6): 613-619.
- Davis BM: Uses and abuses of cross-validation in geostatistics. Math Geol. 1987, 19 (3): 241-248.
- Stein ML: Interpolation of Spatial Data: Some Theory for Kriging. 1999, New York: Springer-Verlag.
- Asmarian NS, Ruzitalab A, Amir K, Masoud S, Mahaki B: Area-to-area Poisson kriging analysis of mapping of county-level esophageal cancer incidence rates in Iran. Asian Pacif J Cancer Prev. 2013, 14 (1): 11-13.
- Alvarez-Hernandez G, Lara-Valencia F, Reyes-Castro PA, Rascon-Pacheco RA: An analysis of spatial and socio-economic determinants of tuberculosis in Hermosillo, Mexico, 2000–2006. Int J Tuberc Lung Dis. 2010, 14 (6): 708-713.
- Froggatt K: Tuberculosis: spatial and demographic incidence in Bradford, 1980–2. J Epidemiol Community Health. 1985, 39 (1): 20-26.
- Munch Z, Van Lill SW, Booysen CN, Zietsman HL, Enarson DA, Beyers N: Tuberculosis transmission patterns in a high-incidence area: a spatial analysis. Int J Tuberc Lung Dis. 2003, 7 (3): 271-277.
- Population Census Office under the State Council, Department of Population and Employment Statistics of National Bureau of Statistics: Tabulation on the 2010 Population Census of People’s Republic of China. 2012, Beijing, China: China Statistics Press.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2458/14/257/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.