Exploring genetic variants predisposing to diabetes mellitus and their association with indicators of socioeconomic status

Background: The relevance of disease-related genetic variants for the explanation of social inequalities in complex diseases is unclear and empirical analyses are largely missing. The aim of our study was to examine whether genetic variants predisposing to diabetes mellitus are associated with socioeconomic status in a population-based cohort.
Methods: We genotyped 11 selected diabetes-related single nucleotide polymorphisms in 4655 participants (age 45-75 years) of the Heinz Nixdorf Recall study. Diabetes status was self-reported or defined by blood glucose levels. Education, income and paternal occupation were assessed as indicators of socioeconomic status. Multiple regression analyses were used to examine the association of socioeconomic status and diabetes by estimating sex-specific and age-adjusted prevalence ratios and their corresponding 95% confidence intervals. To explore the relationship between individual single nucleotide polymorphisms and socioeconomic status, sex- and age-adjusted odds ratios were computed. We adjusted the alpha-level for multiple testing of 11 single nucleotide polymorphisms using Bonferroni's method (α_BF ≈ 0.005). In addition, we explored the association of a genetic risk score with socioeconomic status.
Results: Social inequalities in diabetes were observed for all indicators of socioeconomic status. However, there were no significant associations between individual diabetes-related risk alleles and socioeconomic status, with odds ratios ranging from 0.87 to 1.23. Similarly, the genetic risk score analysis revealed no evidence for an association.
Conclusions: Our data provide no evidence for an association between 11 diabetes-related risk alleles and different indicators of socioeconomic status in a population-based cohort, suggesting that the explored genetic variants do not contribute to health inequalities in diabetes.


Background
Indicators of socioeconomic status (SES) are strongly related to health conditions with groups of low SES showing higher prevalence and incidence for almost all diseases [1][2][3]. Despite the evidence that differences in working and living conditions, health behaviors and psychosocial factors are important determinants underlying these health inequalities [4][5][6], twin and other types of family studies suggest a contribution of genetic factors due to selection effects [7][8][9]. However, on the molecular level the relevance of disease-related genetic variants for the explanation of social inequalities in health is still unclear and theoretical approaches as well as empirical analyses incorporating genetic data are largely missing. Learning more about the possible role of genetic factors in health inequalities is considered to further improve the understanding of population health in general and of health inequalities in particular.
If differences in genetic predisposition of a certain disease had an impact on health inequalities, it would be expected that the disease itself would have an influence on SES through intra- and intergenerational processes of social mobility, allowing for a higher frequency of disease-related risk alleles in lower SES groups [10]. This assumption is described by the hypothesis of direct health selection supposing that individuals with good health are more likely to move upward in SES than individuals with poor health and vice versa [11]. Incorporating genetic factors in the hypothesis of direct health selection would give disease-related risk alleles an impact on SES through a mediating effect of disease. However, direct health selection cannot be regarded as the main explanation for health inequalities because processes of social mobility are scarce in older ages when most diseases arise [11,12]. Particularly, if focusing on late onset diseases and measures of SES representing early life conditions or conditions across the life span (e.g., parental SES, education), reverse causation is unlikely.
With regard to common complex diseases it is supposed that, next to environmental factors, a large number of genetic variants contribute to disease etiology. In the recent past, disease-related risk alleles were primarily identified by genome-wide association studies (GWAS) following the common disease-common variants hypothesis [13]. These GWAS-based gene variants usually have small to modest individual effects [14]. However, strong relationships between common variants and complex traits would be a further prerequisite for the supposed differences in allele frequencies between SES groups [15].
Hence, there is little reason to assume that differences in genetic predisposition to complex diseases play a role in health inequalities. To reconsider this argument by empirical analysis on the molecular level, the aim of this study was to examine whether there are associations between different SES indicators and genetic variants predisposing to diabetes mellitus as an example of a late onset complex disease. The inverse relationship between indicators of SES and diabetes is well reported across different populations [1,16,17,18]. In addition, a large number of genomic loci robustly associated with diabetes were identified through GWAS in the recent past [19][20][21]. To date, no studies have examined the association of SES and diabetes-related genetic variants to explore whether SES differences in genetic predisposition contribute to health inequalities in diabetes.

Study population
Data were used from the baseline examination of the Heinz Nixdorf Recall (Risk Factors, Evaluation of Coronary Calcium, and Lifestyle) Study, a prospective population-based cohort study. The rationale and design of the study were described elsewhere [22]. A random sample derived from mandatory citizen registries of three large cities (Bochum, Essen, Mülheim/Ruhr) in an urban region in the western part of Germany was used to recruit 4814 women and men aged 45-75 years. Baseline examination took place from 2000 to 2003 and the baseline response proportion was 55.8% [23]. Written informed consent was obtained from all participants. The study was approved by the institutional ethics committee of the University Hospital Essen and comprises extended quality management procedures, including a certification according to DIN ISO 9001:2000.

Data assessment
Diabetes was defined by any of the following criteria: reported history of diabetes, taking glucose-lowering drugs, having fasting blood glucose levels of greater than 125 mg/dL, or having nonfasting glucose levels of 200 mg/dL or greater. Overall, 23 participants reported early disease onset, suggesting type 1 rather than type 2 diabetes. Analyses excluding these 23 participants yielded results virtually identical to those presented below. To be consistent with a previous study using the same data [24] we decided to include the 23 participants in the analyses.
Based on a literature search, the following 11 single nucleotide polymorphisms (SNPs) related to 8 genetic loci (in parentheses) derived from GWAS for diabetes in European-origin populations were selected: rs4402960 (insulin-like growth factor-binding protein 2  ). These SNPs include common variants with some of the highest effects on diabetes risk reported to date [19][20][21]. The literature search took place in January 2009 and was previously described in detail [24]. Genotyping was performed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry-based iPLEX Gold assay at the Department of Genomics, Life and Brain Center, Bonn, Germany.
Education, income and paternal occupation were collected as SES indicators by standardized interviews. Paternal occupation was classified referring to the International Standard Classification of Occupations (ISCO-88) [25] and categorized into four groups (unskilled employees/workers; qualified (skilled) employees/workers; technicians and associate professionals; managers and professionals).
Education was defined by combining school and vocational training as total years of formal education according to the International Standard Classification of Education [26] and categorized into three groups: the lowest educational group with 10 or fewer years (equivalent to a basic school degree with no vocational training), the medium educational group with 11 to 13 years (equivalent to upper secondary educational degrees or a combination of lower secondary education and vocational training) and the highest educational group with 14 or more years of education (equivalent to a vocational training including additional qualification or a university degree). In previous analyses of the same study population no further differentiation between university degrees and other types of higher education was made because of the small number of diabetes cases in the respective group. This small sample size would have caused problems in conducting multivariate analyses.
Income was measured as the monthly household equivalent income calculated by dividing the total household net income by a weighting factor for each household member [27]. Income was included in the analyses either as a continuous variable or divided into four groups using sex-specific quartiles.
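The equivalent income calculation can be illustrated with a minimal sketch. It assumes that the divisor is the sum of the per-member weights; the function name and the example weights are hypothetical and are not the weighting scheme of the study, which is given in [27].

```python
def equivalent_income(household_net_income, member_weights):
    """Household equivalent income: net income divided by the summed member weights.

    Assumption: the per-member weighting factors are summed to form the divisor.
    """
    total_weight = sum(member_weights)
    if total_weight <= 0:
        raise ValueError("sum of member weights must be positive")
    return household_net_income / total_weight


# Hypothetical three-person household; the weights are illustrative only.
example_income = equivalent_income(3000.0, [1.0, 0.5, 0.5])
print(example_income)
```

With weights summing to 2.0, a net income of 3000 corresponds to an equivalent income of 1500 per weighted household member.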

Statistical analyses
The analyses were conducted with 4655 participants who had information on diabetes status, indicators of SES and genetic data. Some observations on paternal occupation (n = 258), education (n = 14) and income (n = 296) were missing. As the SES indicators were analyzed separately, these participants were excluded only from the respective analyses. No correlations between missing SES measures and diabetes status were observed. All analyses were performed using the R statistical package version 2.14.0 [28] and PLINK (v1.07) for Windows [29].
First, log-binomial regression models were fitted to assess the association of SES indicators and diabetes status by estimating sex-specific and age-adjusted prevalence ratios (PR) and their corresponding 95%-confidence intervals (95%-CIs). Education, income and paternal occupation were entered separately as categorical predictors by coding dummy variables with the highest category as reference.
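To make the prevalence ratio concrete, the following sketch computes an unadjusted PR with a Wald-type 95% confidence interval from group counts. It is a simplification, not the age-adjusted log-binomial model used in the study: for a single binary comparison without covariates the crude ratio is what such a model would estimate. The function name and the counts are hypothetical.

```python
import math


def prevalence_ratio(cases_exposed, n_exposed, cases_ref, n_ref, z=1.96):
    """Unadjusted prevalence ratio with a Wald confidence interval on the log scale."""
    prevalence_exposed = cases_exposed / n_exposed
    prevalence_ref = cases_ref / n_ref
    pr = prevalence_exposed / prevalence_ref
    # Standard error of log(PR) from the delta method.
    se_log_pr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_ref - 1 / n_ref
    )
    lower = math.exp(math.log(pr) - z * se_log_pr)
    upper = math.exp(math.log(pr) + z * se_log_pr)
    return pr, lower, upper


# Hypothetical counts: 60 of 300 in the lowest SES group vs. 40 of 400 in the reference group.
pr, lower, upper = prevalence_ratio(60, 300, 40, 400)
print(f"PR {pr:.2f} (95%-CI: {lower:.2f}-{upper:.2f})")
```

Age adjustment, as applied in the study, requires a regression model with age as a covariate and is not covered by this sketch.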
Second, logistic regression models were fitted to check the association of the GWAS-based SNP alleles with diabetes status. To this end, sex- and age-adjusted odds ratios (OR) and 95%-CIs were estimated under a (log-)additive genetic model for each SNP, as suggested in previous studies [19,20]. In addition, a genetic risk score was developed by adding the risk alleles (0/1/2) of the diabetes-related SNPs for each participant to explore the association between the sum of risk alleles and diabetes status. For missing genotype information expected values were imputed based on the risk allele frequency of the respective SNP in the study population. SNP pruning for the genetic risk score was performed using pairwise linkage disequilibrium of r² > 0.8 as cut-off to account for correlated effects, resulting in the exclusion of rs1094639. The calculated effect size estimators are to be interpreted as average effects for one additional risk allele.
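The construction of the genetic risk score can be sketched as follows. The sketch assumes that the "expected value" imputed for a missing genotype is twice the risk allele frequency observed among participants with genotype data for that SNP; the function names and the toy genotype matrix are hypothetical.

```python
def risk_allele_frequency(genotypes):
    """Risk allele frequency from 0/1/2 risk allele counts, ignoring missing values (None)."""
    observed = [g for g in genotypes if g is not None]
    return sum(observed) / (2 * len(observed))


def genetic_risk_scores(genotype_matrix):
    """Sum of risk alleles per participant (rows = participants, columns = SNPs).

    Missing genotypes (None) are replaced by the expected allele count 2 * RAF
    of the respective SNP (assumed imputation rule).
    """
    n_snps = len(genotype_matrix[0])
    expected_counts = [
        2 * risk_allele_frequency([row[j] for row in genotype_matrix])
        for j in range(n_snps)
    ]
    return [
        sum(expected_counts[j] if g is None else g for j, g in enumerate(row))
        for row in genotype_matrix
    ]


# Toy data: four participants, three SNPs, two missing genotypes.
toy_genotypes = [
    [0, 1, 2],
    [1, None, 2],
    [2, 1, 0],
    [1, 2, None],
]
print(genetic_risk_scores(toy_genotypes))
```

The score is unweighted, in line with the description above: each risk allele contributes one unit regardless of its reported effect size.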
Third, for the primary research question the relationship between each SNP and SES was explored by computing sex- and age-adjusted ORs and 95%-CIs under a (log-)additive, dominant and recessive genetic model. Again, categorical education, income and paternal occupation were used separately as indicators of SES. For each SES category a binary outcome variable with the highest category as reference was entered in a logistic regression model. Income was also used as a continuous variable in a linear regression model to estimate standardized effect sizes and 95%-CIs for each SNP. For this analysis income was log_e-transformed to normalize the distribution. To our knowledge, no association of the selected SNPs with SES has been demonstrated previously. Hence, it was decided to control the family-wise error rate of the primary research question at α = 0.05. We adjusted the alpha-level for multiple testing of 11 SNPs using Bonferroni's method (α_BF ≈ 0.005). In addition, the association between the genetic risk score and SES was explored using the SES indicators as outcome.
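The Bonferroni-adjusted threshold follows directly from the family-wise error rate and the number of tested SNPs. A minimal sketch of the arithmetic (function names are illustrative):

```python
def bonferroni_alpha(family_wise_alpha, n_tests):
    """Per-test significance threshold under Bonferroni's method."""
    return family_wise_alpha / n_tests


def is_significant(p_value, threshold):
    """True if a p-value falls below the adjusted threshold."""
    return p_value < threshold


alpha_bf = bonferroni_alpha(0.05, 11)  # family-wise alpha of 0.05, 11 SNPs
print(round(alpha_bf, 4))  # 0.0045
```

The exact value 0.05/11 is about 0.0045, which corresponds to the rounded threshold of 0.005 reported in the text.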

Results
Characteristics of the study population are shown in Table 1. Of the 4655 participants 13.6% (n = 634) had diabetes, with women having a lower prevalence (9.8%) than men (17.4%). Differences between women and men in the distribution of two SES indicators were observed: women reported fewer years of formal education and showed a lower median income.
Inequalities in all SES indicators were related to diabetes status for both women and men (Figures 1, 2 and 3). The comparison of men in the lowest category of paternal occupation (unskilled employees/workers) to those in the highest category (managers and professionals) showed an age-adjusted PR of 1.5 (95%-CI: 1.0-2.3) for the occurrence of diabetes in the study population. The respective PR for women was 1.6 (95%-CI: 0.9-3.0). The analyses for education (≤10 years of education vs. ≥14 years; women: PR 2.3, 95%-CI: 1.4-3.9; men: PR 1.4, 95%-CI: 1.0-2.0) and income (lowest sex-specific quartile vs. highest sex-specific quartile; women: PR 2.0, 95%-CI: 1.3-3.2; men: PR 1.2, 95%-CI: 0.9-1.5) showed similar results. There were gender differences, with women showing stronger associations with diabetes status across all SES indicators. Table 2 shows the estimated ORs of the logistic regression models for the 11 selected SNPs and diabetes status.
The results for the logistic regression models for the 11 SNPs and paternal occupation under a (log-)additive genetic model are shown in Table 3. No statistically significant associations at α_BF ≈ 0.005 were observed. The estimated ORs were small to modest, ranging from 0.87 to 1.23 for each respective diabetes risk allele. The logistic regression models for the outcomes education (Table 4) and income (Table 5) revealed similar results with no statistically significant associations at α_BF ≈ 0.005 and ORs ranging from 0.88 to 1.16 (education) and 0.87 to 1.14 (income). Under a dominant and recessive genetic model no deviant results to those under a (log-)additive model were obtained (results not shown). Furthermore, no statistically significant associations were observed for the analysis using log_e-transformed income (results not shown). This was consistent with the observation for income as a categorized outcome. Table 6 shows the results using the genetic risk score in logistic regression models for all SES indicators. The estimated ORs are close to 1.0 and statistically significant results at a nominal α level of 0.05 were observed only for income comparing the 3rd quartile with the highest quartile (OR 0.95, 95%-CI: 0.91-0.99).

Discussion
The presented data showed an association between all indicators of SES and diabetes status. Magnitude and trend of the associations across different SES groups are similar to those reported in the literature [1,17,18].

Stronger associations of SES differences in diabetes for women have been reported before as well [30]. With the selected SES indicators different but related aspects of social inequalities are measured representing different stages in the life course [18,31]. As associations for all of the explored SES indicators were found, our results give supporting evidence that health inequalities in diabetes are affected by diverse social conditions during different stages in life [18,32]. As expected, considering the given sample size and the relatively small number of diabetes cases in the study population, only 5 of the 11 selected diabetes-related SNPs (5 of the 8 genetic loci) were replicated with nominal statistically significant results for their association with diabetes status. However, all SNP alleles showed directionally consistent effects when compared with those reported in the literature [19][20][21].

Figure 1
Paternal occupation and diabetes: age-adjusted prevalence ratios and 95% confidence intervals for the association of paternal occupation and diabetes for women (white) and men (black) (by groups; 'managers and professionals' as reference).

Figure 2
Education and diabetes: age-adjusted prevalence ratios and 95% confidence intervals for the association of education and diabetes for women (white) and men (black) (by groups; '≥14 years of education' as reference).

Figure 3
Income and diabetes: age-adjusted prevalence ratios and 95% confidence intervals for the association of income and diabetes for women (white) and men (black) (by sex-specific quartiles; highest quartile as reference).

The main finding of the study is the lack of evidence for the contribution of 11 selected diabetes-related SNPs representing 8 genetic loci to the observed health inequalities in diabetes. There were no statistically significant associations between the individual SNPs and SES indicators after conservatively controlling for multiple testing by the Bonferroni method. Even with an uncorrected level of significance (α = 0.05) the number of estimators with p < α does not exceed the number expected by chance.
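The number of nominally significant results expected by chance can be approximated as the number of tests multiplied by α. The sketch below uses the analysis for paternal occupation as an example: 11 SNPs and three comparisons against the reference group (four occupational groups) give 33 tests. Treating these 33 tests as the relevant family is an assumption made for illustration.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of tests with p < alpha if no true association exists."""
    return n_tests * alpha


n_snps = 11
n_comparisons = 3  # four paternal occupation groups, highest group as reference
expected = expected_false_positives(n_snps * n_comparisons)
print(f"{expected:.2f}")  # 1.65
```

Under this assumption, one or two nominally significant ORs per SES indicator would be compatible with chance alone.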
In general, no clear inverse trend of the calculated ORs could be found across the different SES groups. There may be a few exceptions for markers related to FTO (for paternal occupation), PPARG, CDKAL1 and CDKN2A/2B (for education). As the differences between the respective ORs are small and statistically significant results were missing, they have to be interpreted as random trends. Furthermore, we observed many ORs < 1.0, which we would not expect if the risk alleles of the selected SNPs generally had an impact on the observed health inequalities in diabetes. In addition, the highest ORs, which are only of small to modest magnitude, were not exclusively present for the lowest compared with the highest SES group. The results of the genetic risk score analyses support the observations for the individual SNPs: the sum of selected diabetes risk alleles does not increase with decreasing SES.
To our knowledge, this is the first study investigating social inequalities in diabetes and simultaneously exploring the impact of selected SNPs robustly associated with diabetes. There are just a few studies that have investigated supposed differences in risk allele frequencies to estimate their contribution to health inequalities, with varying results [33,34]. Only Holzapfel et al. [34] analyzed the relationship between SES, body mass index (BMI) and the SNP rs9935401 within FTO, which is also associated with diabetes through its effect on BMI. They reported no association with education and income for rs9935401. This is in line with our observations for the FTO marker rs8050136, which is strongly correlated with rs9935401 (r² = 1.0) within the HapMap CEU population.

Table 3
Genetic association analyses for paternal occupation: odds ratios, 95% confidence intervals and p-values for the genetic association analyses for paternal occupation (by groups; 'managers and professionals' as reference) using SNPs (additive genetic model), sex and age in a logistic regression model (CHR, chromosome).

Our study suggests that there is no contribution of diabetes-related SNPs to health inequalities in terms of differences in risk allele frequencies between SES groups. The question remains how genetic factors could adequately be integrated in explanations of health inequalities. One possible approach is offered by the life course perspective [35][36][37], which describes the interplay of biological, environmental and social factors and their impact on health over the life course. Thus, the life course perspective is not tied to the assumption of a single causal direction as described by, e.g., the hypothesis of direct health selection. Following this approach, a more plausible picture of genetic risk factors interacting with environmental and social factors can be drawn to give consideration to the complex chain of risks that produces health inequalities.

Table 4
Genetic association analyses for education: odds ratios, 95% confidence intervals and p-values for the genetic association analyses for education (by groups; '≥14 years of education' as reference) using SNPs (additive genetic model), sex and age in a logistic regression model (CHR, chromosome).

The following limitations of the study need to be considered: First, due to the given sample size the statistical power to confirm the reported genetic associations with diabetes and to detect associations between the SNPs and the SES indicators (especially in the analyses with categorized SES indicators) is limited. To address this limitation and to increase the detection power we also used a genetic risk score for our analyses.
Second, over 66 genomic loci related to diabetes are already described which account for approximately 6% of variance in diabetes susceptibility suggesting that there still may be some unexplained genetic variance [38]. Thus, the present investigation of the relationship between diabetes-related genomic loci and SES indicators is far from being comprehensive. Given that the more recently discovered loci yield smaller effects on diabetes than the SNPs explored here, our analysis should be regarded as a first step to address the relationship of diabetes-related genomic loci with SES indicators.
Third, as we do not have information on diabetes status of earlier life stages it was not possible to check for the causal direction of the association between the SES indicators and diabetes. However, especially for education and paternal occupation reverse causation is unlikely as the former is a stable indicator of socioeconomic status across the life course and the latter of the participants' childhood. For income, reverse causation cannot be ruled out.
Fourth, the validity of the SES indicators is restricted. For education and income this is due to their age dependency. The distribution of educational degrees varies by age groups with the higher groups showing lower variance and income generally declines in relation to retirement. Therefore, these indicators of SES may be more valid in younger age groups.

Conclusions
Despite the mentioned limitations, our study confirms social inequalities in diabetes for different indicators of SES and provides no evidence that a selection of common genetic variants with the largest reported diabetes effects plays a role in the observed health inequalities. However, replication of our results, assessment of larger genetic marker panels and empirical analyses for other types of diseases are needed to further challenge the claims that differences in genetic predisposition could explain social inequalities in health.

Table 6
Genetic association analyses using a genetic risk score: odds ratios, 95% confidence intervals and p-values for the genetic association analyses for paternal occupation and education (both by groups; highest group as reference) using SNPs (genetic risk score), sex and age in a logistic regression model as well as for income (by sex-specific quartiles; highest quartile as reference) using SNPs (genetic risk score) and age in a logistic regression model.