Assessing local determinants of neural tube defects in the Heshun Region, Shanxi Province, China
© Wang et al. 2010
Received: 18 August 2009
Accepted: 2 February 2010
Published: 2 February 2010
Skip to main content
© Wang et al. 2010
Received: 18 August 2009
Accepted: 2 February 2010
Published: 2 February 2010
Neural tube defect (NTD) prevalence in northern China is among the highest worldwide. Dealing with the NTD situation is ranked as the number one task in China's scientific development plan in population and health field for the next decade. Physical and social environments account for much of the disease's occurrence. The environmental determinants and their effects on NTD vary across geographical regions, whereas factors that play a significant role in NTD occurrence may be buried by global statistics analysis to a pooled dataset over the entire study area. This study aims at identification of the local determinants of NTD across the study area and exploration of the epidemiological implications of the findings.
NTD prevalence rate is represented in terms of the random field theory, and Rushton's circle method is used to stabilize NTD rate estimation across the geographical area of interest; NTD determinants are represented by their measurable proxy variables and the geographical weighted regression (GWR) technique is used to represent the spatial heterogeneity of the NTD determinants.
Informative maps of the NTD rates and the statistically significant proxy variables are generated and rigorously assessed in quantitative terms.
The NTD determinants in the study area are investigated and interpreted on the basis of the maps of the proxy variables and the relationships between the proxy variables and the NTD determinants. No single determinant was found to dominate the NTD occurrence in the study area. Villages where NTD rates are significantly linked to environmental determinants are identified (some places are more closely linked to certain environmental factors than others). The results improve current understanding of NTD spread in China and provide valuable information for adequate disease intervention planning.
Birth defects --in particular, neural tube defects (NTD) - refer to any anomaly (functional or structural) that occurs in infancy or later in life. These are birth defects primarily of the brain and spinal cord (NTDs are comprised mainly of anencephaly, spina bifida, and encephalocele ). NTD is commonly induced by events preceding birth, inherited or acquired (usually between the 3rd and 4th weeks of the gestational age), and varying from minor cosmetic irregularities to life-threatening disorders. NTD are a leading cause of infant mortality and disability worldwide . In rural areas of China, the cost for the health care system is high and is considered a prime poverty-causing factor in these areas.
Important birth defect factors include heredity, environment (physical and psychological conditions, socioeconomic status, health etc.) and their interactions. The impact of a risk factor varies with the type of NTD and the presence or absence of other defects. Several studies have investigated the role that genetic and environmental factors play in triggering NTD cases . One of these factors is life-style. The psychosocial and emotional stress of the mother during pregnancy may increase NTD risk . The views concerning the role of socioeconomic status in NTD risk vary, with some studies suggesting that the risk is higher among families of lower socioeconomic status, and some others not supporting this suggestion [5–7]. There are also racial and ethnic differences in NTD risk, which may be due to differences in genetic susceptibility, culture and diet . Fortunately, many of the birth defect situations are preventable [9–11].
China has been acknowledged as a geographical region with high NTD occurrences. Based on data collected from a hospital-based surveillance system, the average NTD prevalence rate during 1987 was calculated to be approximately 27.4 per 10,000 births (with considerably higher rates observed in certain regions). Infant deaths caused by birth defects have increased constantly in China since the end of the last century . Evidence obtained from urban and rural investigations clearly shows that since the 1990's the infant mortality rate (IFM) caused by birth defects in cities increased from 21 to 30% . Accordingly, dealing with the NTD situation is ranked as the number one task in the country's scientific development plan for the next decade , a fact that amply demonstrates the high priority of NTD studies for China. The Shanxi province (north part of China) exhibits the highest rate of birth defects with 105.5 per 10,000 births in 1987; also, during the period 1996-2002 Shanxi had the highest rate with 60.88 per 10,000 births .
Taking into consideration the fact that the Heshun county (Shanxi province, China) is one of the areas with the highest NTD prevalence in the world, the main objective of this paper is to accurately map the geographical distribution of NTD cases in that county and identify the corresponding NTD determinants. Important issues to consider include the assessment of the specific factors causing NTD in a given region or population group in the Heshun county, and the investigation whether the NTD determinants apply globally or locally in space [16, 17]. As far as the quantitative study of the disease is concerned, some of the relevant factors may be implicit in classical and global statistics analyses, whereas some others may possess a spatial dimension that needs to be determined by spatial statistics methods [18, 19]. Furthermore, some NTD determinants may be leveraged out in a global statistics analysis (pooled dataset) but they may stand out in a localized analysis.
The above considerations make it appropriate that in the present Heshun NTD study we implement an adequate synthesis of quantitative techniques including spatial NTD analysis. The results of the spatial NTD analysis could be used to accurately identify intervention targets and offer valuable input to the systematic development of prevention strategies. This is an important matter, since it is widely accepted that the accurate identification of NTD determinants allows early intervention, which is a crucial component of any effort to minimize the consequences of birth defects.
Let the geographical distribution of the epidemic attribute (number of NTD cases) be represented mathematically by the spatial random field (SRF), Y s , in the sense of Christakos . The vector s = (s 1, s 2) denotes the location of the attribute, where s 1, s 2 are the associated spatial coordinates of the location. Also, let m s = E [Y s ] be a non-random quantity that represents the average value of all possible SRF realizations at the location s, where the E [·] denotes stochastic expectation. The SRF formulation properly represents the fact that NTD prevalence shows significant variation in terms of geographic location (within regions and between countries [[5, 22], and ]. Two quantitative expressions of prevalence rate can be defined in terms of the SRF, as follows.
where f y, s is the probability density function of Y s , and ψ s denotes a realization of Y s at s (for the underlying mathematical details the readers are referred to Christakos and Hristopulos ).
The OPR is directly observable and expresses the "here-and-now" crude disease rate, which makes r ℜ a useful study parameter when the objectives include the study of infectious disease outbreaks and the assessment of emergency health services. The SPR, on the other hand, expresses an essential property of epidemiological phenomena , which makes m ℜ a useful tool in the study of the relationship between an epidemic and its determinants. The SPR is rarely observed directly, but it can be approximated in terms of the available observations and by incorporating neighboring samples in a Bayesian context . Accordingly, m ℜ will be the prime focus of the present study, which means that in the following the term "NTD rate" refers to SPR values.
In spatial analysis one assumes that events are spatially correlated (dependent); see [27–29]. Accordingly, we introduced a regular lattice of grid points to serve as the spatial centers of the disease distribution. Then we constructed a series of circles that covered the entire Heshun area. Since the average NTD rates were calculated within each one of these circle using the available data, the circles were called "NTD_rate circle" (meaning that each circle had its own NTD rate). The socio-economic activities of Heshun residents usually take place between 6.2 and 9.3 Kms , which is why we set the radius of these circles to 3 Kms. Moreover, in order to cover all village points in the Heshun county, the least distance between any two centers should be 4.2 (3 ) Kms.
The Rushton circle  and the hierarchical Bayesian model  are efficient tools to correct the instability of the estimated rate for small probability events by borrowing strength from neighbors. Nevertheless, we still doubt that the number of unreliable values in the neighbors may play a more negative than positive role in estimation. Therefore, we set an artificial threshold of five live born of any village; villages bigger than this were included in the NTD prevalence rate calculation, then the values were used as sample in the subsequent statistical inference. Theoretically, there may exist a balance between (a) using larger samples to reduce the estimation variance [30, 31], but at a risk of negative impact due to some highly unreliable values among the sample, and (b) using reliable but small samples in the estimation of a regional attribute. The matter deserves further investigation in the future.
Three prime types of factors are suspected to cause NTD: (1) environment (physical, social, economic etc.); (2) hereditary (genetic, pre-existing conditions etc.); and (3) synthetic (interaction between (1) and (2)). Recent studies show that most NTD cases are the result of environment-gene interactions [32, 33]. The NTD etiology includes not only physical, chemical and biological agents, but also social and cultural determinants, where the latter affect body's immunity through their impact on psychological and mental health. Many of the direct NTD determinants are difficult or expensive to access. Actually, the direct etiological NTD factors act through at least four geographical layers, which are easier implementable in terms of a geographical information system, GIS [34, 35]. In particular, these layers may be grouped are as follows:
Physical NTD determinants that are spatially distributed. Potential NTD hazards include surface and subsurface water contaminated by insufficiently oxygenized ancient geological media; also, radiation emissions from certain rocks or along faults [36, 37].
Man-made pollution that is spatially distributed. Hazards of this kind include pesticides and chemical fertilizes spread over crop fields. Also, polluted air and water emission from workshops and electromagnetic radiation in the workplace [38–41].
Nutrition processes that are spatially distributed. For example, nutrition strongly depends on spatially varying residential income. Hence, it is usually proportional to the GDP that is regularly surveyed across space and published in the government's annual statistics/census reports .
Heredity and habits that are spatially distributed. Ethnic groups have specific genetically transmitted habits and behavioral patterns (e.g., related to food consumption), some of which are hazardous to health . Health determinants may be detected when the disease cases and the ethnic characteristics share similar spatial patterns; for example, when the shape and size of spatial disease clusters are consistent with these of the citizens' daily activities, it could suggest that heredity is relevant to the regional NTD .
The explicit physical and human geographical proxies of the NTD determinants are collected: elevation, accessibility (e.g., road buffer), geological background (fault buffer), water conditions (e.g., river buffer), per-capita income (per-capita net income), medical conditions (e.g., number of doctors), crop yield (e.g., vegetable and fruit production), agricultural chemical exposure (fertilizer and pesticide use), land cover, lithology, watershed and soil conditions in every village. The socioeconomic factors were measured in terms of averaged annual levels during the period 1998-2005. Fumonisins in maze or other grains could be an important NTD factor [44, 45]. However, the north of China where our pilot study was conducted is very dry during throughout the year, so the climate is not suitable for Fumonisins growth; and we have not found any report on Fumonisins in Shanxi province. In addition, the study area is hilly and is not the main maize production area, so we did not test for Fumonisins in this study.
When implementing the GWR technique, categorical variables (including land cover, lithology, watershed and soil conditions) should be distinguished from explanatory variables. Typically, the former variables appear in terms of the Ordinary Least Squares (OLS) technique by introducing dummy variables. However, this would result in what is technically termed a "sever model design" in GWR analysis (i.e., an explanatory variable is perfectly collinear with the intercept, especially when a village and all its neighbors have the same values for one or more explanatory variables). As a consequence, we only regressed non-categorical variables and NTD rates using GWR. The categorical variables were checked by comparing the spatial patterns of the GWR outputs vs. those of the categorical variables.
In standard regression applications, the elastic coefficients of the NTD factors are assumed to be constant over space, which is the case of "one model fits all". The GWR technique, however, properly extends the traditional regression framework by allowing local rather than global parameters to be estimated. In this study, therefore, we implemented the GWR technique to identify local relationships between NTD and environmental factors. The GWR software tool used was GWR3.0 ; the ArcGIS9.0i was also used to map the variables and the numerical results obtained.
where w ij represents the weight of any datum at point s j (j = 1, 2, ..., n) on the calibration of the prediction model at point s i . Normally, each w ij is a continuous function of d ij , the distance between s i and s j (i.e. d ij = |s i - s j |). One possible choice is , where h is called the bandwidth, and its specification depends on the situation under consideration.
The village centroids in the Heshun county are distributed unevenly: some are densely distributed, whereas some others are sparsely distributed. This means that local regression may rely on relatively few data points in areas where these points are sparsely distributed. To address this potential problem we used a spatially adaptive weighting technique, which involves the experimental calculation of the bandwidth rather than assigning it directly in the GWR context. The bandwidths are relatively small in areas where the data points are densely distributed and they are rather large in areas where the data points are sparely distributed. Better results, measured in terms of the global R 2 of GWR, were obtained when more points were involved in bandwidth calculation.
Global performance of GWR.
The series of maps in Figs. 7 ("NTD-proxies" links) together proxies and Fig. 4 (relationship between determinants and their proxies) provide a considerable amount of information concerning the relationship between NTD, its possible determinants and the proxies in the Heshun county (China). The maps accurately identify areas where NTD is sensitive to the various proxies; associate significant NTD determinants with the appropriate sites (using the inference framework of Fig. 4); and suggest potential sites for efficient intervention in order to effectively reduce the number of NTD cases.
The local GWR coefficients of most environment factors exhibit significant positive and negative associations with the NTD rates at different sites simultaneously. This cannot be explained by assuming a linear relationship between NTD and its possible determinants (which is the case, since GWR is essentially a linear regression technique). The situation may imply that a nonlinear association exists between NTD and the relevant environmental factors or that not a single determinant dominates the NTD occurrence in the study area.
NTD rates increase with increasing elevation in two village aggregations (the green color denotes significant positive association; Fig. 7a), whereas the rates decrease with increasing elevation in three other village aggregations (the purple color denotes significant negative association). All five places with significant coefficients of elevation are located in high NTD rate areas (Fig. 5c) --the reverse may not be true.
As is shown in Fig. 7b, in three village aggregates the NTD rates show significant negative association with distance from the faults, that is, the NTD reduces with increasing distance from the faults. Hence, the faults (in particular, the radiation and ancient water released from the faults) are the most probable significant NTD determinants in these three village aggregations.
The sites with significant road buffer coefficients (Fig. 7c) are consistent to those with high NTD rates (Fig. 5c). Furthermore, sites with significant positive coefficients (Fig. 7c) are located in the valley, whereas sites with negative coefficients (Fig. 7c) are found in the mountains. The situation in the valley is consistent with the common-sense expectation that people living near a road network usually have better life conditions than those living far away from it (which explains the lower NTD rates). Remarkably, the inverse situation is valid for people living in the mountains: the nearer to the road network they live, the higher the NTD rates. The epidemiological implication of the latter situation is complicated . Distance from rivers. There is a relative large village cluster with significant positive river buffer coefficients (Fig. 7d), which may imply that in the case of the Heshun county the river plays a significant positive role in reducing the number of NTD cases. This role needs to be further investigated.
Fig. 7e displays two purple-colored village aggregations, both in mountainous areas. In these areas, where health service is extremely rare, even a small improvement could lead to a significant reduction in NTD rates. These areas would be the most efficient places for NTD intervention (e.g., in terms of an increasing number of doctors), meaning that the highest intervention contribution in NTD reduction could be achieved in these areas.
Fig. 7f shows that the net income distribution does not behave as is commonly expected --i.e. in relation to nutrition, thus becoming a significant determinant of NTD in the specific area. A possible interpretation is that in most villages of the Heshun county, the net income is not high or low enough to effect NTD rates; or the net income is not transformed into food or nutrition consumed by the local people. Note that three villages in the west county exhibit a highly positive association between NTD rates and net income (which probably means that confounding factors should be present in these areas).
At its current level, fruit production (Fig. 7g) seems to be rather irrelevant to NTD. This does not imply that different levels of fruit production and the associated fruit consumption are irrelevant to NTD intervention, in general. Actually, fruit production and consumption in the Heshun county is very low, due to limitations imposed by the steep mountain and the barren soil in that county.
As one can see in Fig. 7h, there exists a village aggregation where NTD rates are positively associated with vegetable production. That is contrary to the common belief that the consumption of vegetables, which are rich in folio, will substantially reduce the risk of NTD. This is an abnormal village aggregation that deserves further investigation. Pesticides and fertilizers. The two factors (Fig. 7i and 7j) are believed to have negative impact on health . There are several villages that present negative association between NTD rate and these two factors. This strongly suggests that the agriculture products (on which pesticides and fertilizers have been used) are not consumed locally (a common practice in rural China is that the local farmers use lower amounts of pesticide and chemical fertilizer for the products they plan to consume themselves).
The intercept (Fig. 7k) reflects the baseline of NTD rates, assuming that all variables in the equation are zero. Interestingly, the spatial pattern of significant intercept coefficients is consistent with that of road buffer (Fig. 7c) and elevation (Fig. 7a), but with an opposite sign in the case of elevation. Elevation and road affect human communication, which means that the above "NTD baseline-road-elevation" pattern consistency can play a significant role in NTD occurrences in the mountainous areas of the Heshun county. Intermarriage usually falls within the definition of social activities and the male and female usually have similar socioeconomic status when they get married .
The categorical variables are visually compared with Fig. 7. The most of NTD cases happened in areas with soil type 7 (infant cinnamon soil). In contrast, the land cover types are distributed very dispersedly, so that one cannot detect any regular patterns. Also, the lithology of Q4 (Quaternary) is to some extent consistent with the negative coefficient for the fault buffer.
In summary, the local statistics approach identifies the villages where NTD rates are significantly linked to environmental determinants. In several village aggregations the NTD are found to be significantly associated with the proxy variables of radiation and ancient water released from the faults. Soil and lithlogy, river and road, health service, food production, pesticides and fertilizer are significantly related with NTD in some places, which can be interpreted by etiology or by social behaviors. Some places are more active than others as far as coefficient significance of the GWR is concerned, whereas most villages are always insignificant with respect to the different variables. This means that the NTD situation in these places may be more complicated than the variables and the linearity assumption considered by the GWR technique. In such cases, a composite space-time analysis involving nonlinear predictors [50–52] may be appropriate, which will be the subject of future NTD investigations.
neural tube birth defects
neural tube birth defect rate
geographically weighted regression
spatial random field
observed population rate
average root variance
geographical information system
This study was supported by the NSF China (40471111, 70571076), MOST, China (2006AA12Z215, 2007DFC20180; 2007AA12Z233), CAS, China (KZCX2-YW-308), and the California Air Resources Board, USA (55245A).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.