Study sample
The participants were members of the Chinese National Twin Registry (CNTR), the first and largest population-based twin registry in China, which is described in detail elsewhere [22].
From 2011 to 2012, twins were recruited through in-person questionnaire interviews in 9 provinces or cities: Jiangsu, Zhejiang, Sichuan, Heilongjiang and Qinghai provinces, and Tianjin, Beijing, Qingdao and Shanghai cities. Only cigarette smoking and alcohol drinking data from male twins were used in this study because the prevalence of both behaviors was negligible in female twins. Twin pairs were eligible for the present study if (1) they were male twins aged 18 to 79 years, (2) weight, height, age and zygosity information was available for both twins of the pair, and (3) they were free of cardiovascular heart disease, stroke, type 2 diabetes and cancer. Among the eligible twin pairs, 129 pairs reared apart were excluded from the analyses (following the SATSA definition, twins were considered reared apart if they had been separated for at least 1 year before the age of 11).
The final study population comprised 6121 complete male twin pairs (4122 MZ and 1999 DZ pairs). Zygosity was determined from age, gender, and questionnaire items on whether the twins' appearance was confused by strangers and on previously perceived zygosity. This method has been validated against DNA genotyping in 192 pairs of same-gender twins and was found to have an AUC of 89.03 % [23].
All participants provided informed consent, and the Biomedical Ethics Committee at Peking University, Beijing, China, approved the study protocol.
Measures
Explanatory variables
Cigarette smoking was coded into four categories (nonsmoker, former smoker, current light smoker, and current moderate to heavy smoker). Nonsmokers were those who gave negative answers to ‘Do you smoke?’. Those who responded ‘I have quit smoking for one month or more’ were defined as former smokers. Current smokers were those who gave affirmative responses to ‘Do you smoke?’. Current light smokers and current moderate to heavy smokers were defined as current smokers who smoked 1 to 9 and 10 or more cigarettes daily, respectively. A continuous measure of cigarette smoking (cigarette pack-years) was also calculated for current smokers; one pack-year is 20 cigarettes smoked per day for one year [24]. Alcohol drinking status was defined similarly from responses to ‘Do you drink alcohol?’. Those who gave affirmative responses were defined as current drinkers; former drinkers were those who had previously drunk alcohol but had quit for one month or more, and nondrinkers were those who had never drunk alcohol.
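The smoking definitions above can be illustrated with a minimal sketch. The function and argument names are hypothetical and not taken from the study's questionnaire or code. The sketch also assumes that the number of years smoked is available, and that a current smoker reporting fewer than one cigarette per day is counted as a light smoker; the text does not address either point.

```python
def smoking_category(smokes_now: bool,
                     quit_one_month_or_more: bool,
                     cigarettes_per_day: float = 0.0) -> str:
    """Return the four-level smoking status described in the text."""
    if quit_one_month_or_more:
        return "former smoker"
    if not smokes_now:
        return "nonsmoker"
    # Current smokers: 10 or more cigarettes daily is moderate to heavy.
    if cigarettes_per_day >= 10:
        return "current moderate to heavy smoker"
    # Assumption: anything below 10 per day is treated as light.
    return "current light smoker"


def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """One pack-year is 20 cigarettes smoked per day for one year."""
    return cigarettes_per_day / 20.0 * years_smoked


if __name__ == "__main__":
    print(smoking_category(True, False, 5))
    print(pack_years(10, 20))
```

Under this definition, smoking 10 cigarettes a day for 20 years corresponds to 10 pack-years.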
Outcome variable
Self-reported height and weight were used to calculate BMI. Height and weight were reported to the nearest centimeter and kilogram, respectively. BMI was defined as weight in kilograms divided by the square of height in meters. The reliability of self-reported height and weight was assessed in a subsample of these twins who participated in a follow-up study in July 2014, in which body weight and height were measured by health-care professionals. The intraclass correlations between measured and self-reported values were 0.89 for weight and 0.94 for height, which suggested good reliability of self-reported BMI.
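As a small illustration of the BMI definition, the sketch below converts a height reported in centimeters to meters before applying the formula. The function name and the example values are hypothetical.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """BMI = weight in kilograms divided by the square of height in meters."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical respondent: 70 kg and 175 cm.
    print(round(bmi(70, 175), 1))
```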
Assessment of covariates
Potential covariates included age (18–79 y), region (‘south’: Jiangsu Province, Zhejiang Province, Shanghai City and Qingdao City; ‘north’: Heilongjiang Province, Tianjin City and Beijing City; ‘west’: Sichuan Province and Qinghai Province), zygosity (MZ, DZ), educational attainment (illiterate or primary education, secondary education, tertiary education), marital status (married, living alone) and regular physical activity, defined as at least 20 min on 5 days a week (yes, no, unclear).
Statistical methods
We compared epidemiological characteristics between MZ and DZ twins. P-values were corrected for the correlation between co-twins using multinomial logistic regression for categorical variables and mixed-effects models for continuous variables. Regression models and gene-environment interaction models were used to examine the associations of cigarette smoking and alcohol drinking status with BMI.
Linear regression models
Mixed-effects linear regression models with a random intercept for each twin pair, to account for twin clustering [25], were fitted to estimate the associations of cigarette smoking and alcohol drinking with BMI in the whole population, treating twins as individuals. Covariates included age, region, zygosity, alcohol drinking, cigarette smoking, marital status, educational attainment and regular physical activity. Further, we applied fixed-effects models separately to MZ and DZ twins to estimate the within-pair effects of cigarette smoking and alcohol drinking on BMI, treating twins as pairs. Examining twins overall gives an average relationship between exposure and outcome across the twin population. If the association persists in within-pair comparisons, we can infer that factors unique to each twin, rather than factors common to both twins, contribute to it. Conversely, attenuation of the association within MZ or DZ twin pairs indicates that it is confounded by shared familial factors. Analyses were performed using Stata 11.2 (Stata Corp, College Station, TX). P-values were two-sided and statistical significance was assumed at P < 0.05.
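The logic of the within-pair comparison can be sketched in a few lines. For pairs of two, a pair fixed-effects estimate for a single exposure is equivalent to regressing the within-pair difference in outcome on the within-pair difference in exposure, without an intercept. The sketch below is a simplified, covariate-free illustration of that idea and not the Stata models used in the study; the function name and the toy data are hypothetical.

```python
def within_pair_slope(pairs):
    """Within-pair (fixed-effects) slope for one exposure.

    `pairs` is an iterable of ((x1, y1), (x2, y2)), where x is the
    exposure (e.g. 1 = current smoker, 0 = nonsmoker) and y is BMI
    for twin 1 and twin 2 of the same pair.
    """
    num = 0.0
    den = 0.0
    for (x1, y1), (x2, y2) in pairs:
        dx = x1 - x2  # exposure difference within the pair
        dy = y1 - y2  # BMI difference within the pair
        num += dx * dy
        den += dx * dx
    if den == 0.0:
        # Exposure-concordant pairs carry no within-pair information.
        raise ValueError("no exposure-discordant pairs")
    return num / den


if __name__ == "__main__":
    # Toy data: two discordant pairs and one concordant pair.
    toy_pairs = [
        ((1, 25.0), (0, 24.0)),
        ((0, 22.0), (1, 23.5)),
        ((1, 27.0), (1, 26.0)),
    ]
    print(within_pair_slope(toy_pairs))
```

Because everything shared by the two twins of a pair cancels in the differences, only exposure-discordant pairs contribute to the estimate, which is why attenuation in this analysis points to shared familial confounding.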
Gene-environment interaction models
We used a univariate structural equation model to estimate the genetic and environmental influences on BMI variance. The total variance of a phenotype is assumed to decompose into three sources of influence: an additive genetic component (A), a common environmental component (C) and a unique environmental component (E). MZ twins share 100 % of their genes (at the sequence level), whereas DZ twins share on average 50 % of their segregating genes. Accordingly, the within-pair correlations of A and C are 1.0 and 1.0 for MZ twins, and 0.5 and 1.0 for DZ twins, respectively. The proportion of variance explained by additive genetic factors is commonly termed narrow-sense heritability. We first estimated the variance of BMI explained by the A, C and E components. A nested model in which C was fixed to zero was also fitted, and the Akaike Information Criterion (AIC) was used to compare the goodness of fit of the models [26].
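The assumptions above imply the following expected variances and covariances. The notation is introduced here for illustration only: a, c and e are assumed path coefficients for A, C and E, P_1 and P_2 are the phenotypes of the two twins in a pair, and h^2 is narrow-sense heritability.

```latex
\begin{align}
\operatorname{Var}(P) &= a^2 + c^2 + e^2 \\
\operatorname{Cov}_{\mathrm{MZ}}(P_1, P_2) &= 1.0\,a^2 + 1.0\,c^2 \\
\operatorname{Cov}_{\mathrm{DZ}}(P_1, P_2) &= 0.5\,a^2 + 1.0\,c^2 \\
h^2 &= \frac{a^2}{a^2 + c^2 + e^2}
\end{align}
```

In the nested AE model, c is fixed to zero, so the MZ and DZ covariances reduce to a^2 and 0.5 a^2.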
Assessing whether the genetic variance of BMI depends on certain lifestyles offers insight into possible gene-lifestyle interactions on BMI. Based on the best-fitting model, we fitted a gene-environment interaction model [18] (Fig. 1) with moderate to heavy smoking (nonsmoking as reference) and current alcohol drinking (nondrinking as reference) as moderation factors (denoted as M). These factors can affect BMI directly (βm) but can also modify the additive genetic (βa), common environmental (βc) and unique environmental (βe) influences on BMI. Moderation of the genetic and/or environmental variance of BMI was considered evident when the interaction parameters (βa, βc, βe) differed significantly from zero. BMI was standardized as a z-score with a mean of 0 and a variance of 1.
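To make the role of the interaction parameters concrete, the sketch below assumes a widely used parameterisation of moderation models in which each path coefficient varies linearly with the moderator, for example a + βa·M for the genetic path. That form is an assumption here and is not spelled out in the text; the baseline coefficients a, c and e, the function name and the numeric values are likewise hypothetical.

```python
def moderated_variance_components(a, c, e, beta_a, beta_c, beta_e, m):
    """Variance components of the outcome at moderator value m.

    Assumes each path coefficient is linear in the moderator:
    (a + beta_a * m), (c + beta_c * m), (e + beta_e * m).
    """
    va = (a + beta_a * m) ** 2
    vc = (c + beta_c * m) ** 2
    ve = (e + beta_e * m) ** 2
    total = va + vc + ve
    return {"A": va, "C": vc, "E": ve, "total": total, "h2": va / total}


if __name__ == "__main__":
    # Hypothetical values: M = 0 is the reference group (e.g. nonsmokers),
    # M = 1 is the exposed group (e.g. moderate to heavy smokers).
    for m in (0, 1):
        parts = moderated_variance_components(0.8, 0.0, 0.6, -0.1, 0.0, 0.0, m)
        print(m, round(parts["A"], 3), round(parts["h2"], 3))
```

With a negative βa in this toy setup, the genetic variance is smaller in the exposed group than in the reference group. When βa, βc and βe are all zero, the components do not change with M.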
All model fitting and maximum-likelihood parameter estimation were performed in OpenMx (version 1.4), and all variance components were estimated with age and region included as covariates in the models.