Study populations
The NHS cohort was established in 1976 among 121,700 US female registered nurses, ages 30 to 55 years; NHSII was established in 1989 among 116,430 female registered nurses, ages 25 to 42 years. All women completed an initial questionnaire about their lifestyle factors, health behaviors, and medical history, and have been followed biennially by questionnaire. From 1989 to 1990, 32,826 NHS participants (ages 43 to 70 years) provided blood samples and completed a short questionnaire [15]. Blood was processed and separated into plasma, red blood cell, and white blood cell components. From 2002 to 2004, buccal cell collection kits were received from 33,040 NHS women (ages 54 to 84 years) who had not previously provided a blood sample and had completed the 2000 questionnaire. DNA was extracted and purified upon sample receipt.
Between 1996 and 1999, 29,611 NHSII participants (ages 32 to 54 years) provided blood samples and completed a short questionnaire [16]. Briefly, premenopausal women provided either a luteal blood sample 7 to 9 days before the anticipated start of their next cycle (n = 18,521) or a single 30-mL untimed blood sample (n = 11,090). NHSII samples were processed identically to the NHS samples. All study participants provided informed consent. This study was approved by the Committee on the Use of Human Subjects in Research at the Brigham and Women’s Hospital and the Harvard School of Public Health (Boston, MA).
The current analysis includes women with available DNA who were controls from 7 nested case–control studies of IGF-1 and IGFBP-3 SNPs and risk of various chronic diseases, including benign breast disease [17], breast cancer [5], endometrial cancer [18], myeloma [19], and ovarian cancer [20] (N = 4,567).
Body size and covariate information
Body size and covariate information was obtained from the questionnaire completed at sample collection and from the biennial study questionnaires. Birthweight was collected in 1992 (NHS) and 1991 (NHSII). In the NHS, the correlation between the participant’s self-reported birthweight and that reported by her mother was 0.77 [2]. In 1988 (NHS) and 1989 (NHSII), women were asked to choose one of nine diagrams (somatotypes) [21] that best depicted their body fatness at ages 5 and 10, with higher levels indicating larger body size. Among older women (aged 71 to 76 years) in another study population, the correlations between recalled somatotype and measured BMI were 0.57 at age 5 and 0.70 at age 10; the correlations were similar after controlling for current BMI [22]. BMI at blood draw and BMI at age 18 (weight at age 18 was asked in 1980 for NHS and in 1989 for NHSII) were calculated as self-reported weight in kilograms divided by the square of self-reported height in meters (height was collected at baseline).
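The BMI definition above can be written as a minimal sketch. The function names and the pound/inch conversion helper are illustrative assumptions; the text specifies only weight in kilograms divided by height in meters squared and does not state the units in which weight and height were reported.

```python
# Illustrative sketch only: names and the unit-conversion helper are assumptions,
# not part of the study's analysis code.

LB_TO_KG = 0.45359237   # exact definition of the international pound
IN_TO_M = 0.0254        # exact definition of the inch


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_from_us_units(weight_lb: float, height_in: float) -> float:
    """Convert pounds and inches to metric units, then compute BMI.

    Assumes (hypothetically) that values were self-reported in US units.
    """
    return bmi(weight_lb * LB_TO_KG, height_in * IN_TO_M)


if __name__ == "__main__":
    print(round(bmi(70.0, 1.75), 2))
    print(round(bmi_from_us_units(150.0, 65.0), 2))
```

The same function would apply to BMI at blood draw and BMI at age 18, with the weight taken from the corresponding questionnaire and height from baseline.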
We considered a woman to be premenopausal at sample collection if (1) she gave a luteal sample (NHSII only), (2) her periods had not ceased, or (3) she had at least one ovary and was 47 years or younger (nonsmokers) or 45 years or younger (smokers). We considered a woman to be postmenopausal if (1) her natural menstrual periods had ceased permanently, (2) she had a bilateral oophorectomy, or (3) she had at least one ovary and was 56 years or older (nonsmokers) or 54 years or older (smokers). The age cutoffs represent the age when 90% of women with intact ovaries in the cohorts were premenopausal or postmenopausal, respectively. The remaining women, most of whom had a simple hysterectomy and were 48 to 55 years old, were considered to be of unknown menopausal status.
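The classification rules above can be sketched as a single function. The argument names, the three-level coding of menstrual status, and the order in which the rules are checked are assumptions made for illustration; the text gives the criteria but not their precedence or the variable coding.

```python
# Illustrative sketch of the menopausal status rules described in the text.
# Variable names, the coding of `periods`, and rule precedence are assumptions.

def menopausal_status(luteal_sample: bool,
                      periods: str,
                      bilateral_oophorectomy: bool,
                      has_ovary: bool,
                      age: float,
                      smoker: bool) -> str:
    """Return 'premenopausal', 'postmenopausal', or 'unknown'.

    `periods` is a hypothetical coding with three values:
      'ongoing'          - periods had not ceased
      'ceased_naturally' - natural menstrual periods ceased permanently
      'ceased_other'     - periods ceased for another reason (e.g., hysterectomy)
    """
    # Premenopausal criteria (1) and (2)
    if luteal_sample or periods == "ongoing":
        return "premenopausal"
    # Postmenopausal criteria (1) and (2)
    if periods == "ceased_naturally" or bilateral_oophorectomy:
        return "postmenopausal"
    # Criterion (3) for both groups: age cutoffs among women with >= 1 ovary
    if has_ovary:
        pre_cutoff, post_cutoff = (45, 54) if smoker else (47, 56)
        if age <= pre_cutoff:
            return "premenopausal"
        if age >= post_cutoff:
            return "postmenopausal"
    # Remaining women (mostly simple hysterectomy, ages 48 to 55)
    return "unknown"
```

For example, under these assumptions a 50-year-old nonsmoker with a simple hysterectomy and at least one ovary would be classified as unknown, whereas the same woman at age 56 would be classified as postmenopausal.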
SNP selection and genotyping
SNP selection and genotyping have been described previously [5, 17, 19, 20]. Briefly, haplotype tagging SNPs (htSNPs) were identified by the Breast and Prostate Cancer Cohort Consortium (BPC3). A total of 154 SNPs in IGF-1 and 56 SNPs in IGFBP-1/IGFBP-3 were genotyped in a panel representing several racial groups. Of these, 64 IGF-1 and 36 IGFBP-1/IGFBP-3 SNPs passed quality control and were confirmed to be SNPs. From these remaining SNPs, the expectation-maximization algorithm was used to select 14 IGF-1 and 12 IGFBP-1/IGFBP-3 SNPs that were predicted to tag the common haplotypes in Caucasian populations (r_h^2 > 0.85). Four additional SNPs in IGFBP-1/IGFBP-3 were included in the BPC3 genotyping, in which the NHS and NHSII nested case–control samples for breast cancer were included.
DNA extraction and genotyping were performed at the Dana–Farber Cancer Institute/Harvard Cancer Center High Throughput Genotyping Core, a unit of the Harvard–Partners Genotyping Facility. DNA was extracted using a QIAamp 96 DNA Blood Kit (Qiagen). Genotyping assays for the 22 SNPs were performed by the 5′ nuclease assay (TaqMan) on the Applied Biosystems Prism 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA). TaqMan primers, probes, and conditions for genotyping assays are available upon request. Laboratory personnel were blinded to case–control status, and duplicate samples (10% of sample size) were inserted to validate genotyping procedures. More than 95% of the samples were successfully genotyped for each polymorphism, and there were no discordant quality control sets.
Statistical analysis
We used ordinal logistic regression models to analyze the association between IGF-1 and IGFBP-1/IGFBP-3 SNPs and body size (birthweight, somatotypes at ages 5 and 10, and average somatotype at ages 5 and 10). Birthweight was categorized as <5.5 lbs, 5.5–6.9 lbs, 7–8.4 lbs, 8.5–9.9 lbs, or 10+ lbs, based on the questionnaire categories. As a secondary analysis, we examined the associations with low birthweight (<5.5 lbs) vs. all others using logistic regression. Somatotypes at ages 5 and 10 were categorized as diagram 1, 2, 3, 4, and 5+. To assess the fit of the ordinal logistic regression models, we conducted the score test for proportional odds. Because BMI at age 18 was a continuous variable, we used linear regression to assess the associations between IGF variability and this outcome. SNPs were included in the model as ordinal terms with values from 0 to 2 (i.e., as one variable coded 0 for the homozygous major allele genotypes, 1 for heterozygous genotypes, and 2 for homozygous variant genotypes). Additionally, we combined SNPs using two methods. First, 11 SNPs (rs1520220, rs35767, rs7965399, rs2195239, rs2946834, rs2854744, rs2854746, rs3110697, rs2270628, rs2960436, rs2132570) previously associated with circulating IGF-1 or IGFBP-3 levels [5, 7, 23] were combined to make a SNP score. The allele that was associated with higher levels was considered the “risk” allele. Having one copy of the risk allele added one to the score; two copies added two to the score. Second, we used reduced rank regression [24] to construct a score of IGF-1 and IGFBP-3 SNPs that explained variability in measured plasma IGF-1 and IGFBP-3 in the NHS and NHSII participants who had both plasma levels and genotypes. We applied the scores to all women with genotypes and assessed associations with both scores using ordinal logistic regression (for birthweight and somatotypes) or linear regression (for BMI at age 18) as described above.
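The ordinal (0/1/2) SNP coding and the first, allele-counting SNP score can be illustrated with a short sketch. The SNP labels, genotypes, and risk alleles below are placeholders invented for the example; they are not the study SNPs or their actual risk alleles, and the handling of missing genotypes is not specified in the text.

```python
# Illustrative sketch: SNP labels, genotypes, and risk alleles are hypothetical.
from typing import Dict


def additive_code(genotype: str, variant_allele: str) -> int:
    """Number of copies of the variant allele in a two-letter genotype.

    0 = homozygous major allele, 1 = heterozygous, 2 = homozygous variant.
    """
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles, e.g. 'AG'")
    return genotype.count(variant_allele)


def snp_score(genotypes: Dict[str, str], risk_alleles: Dict[str, str]) -> int:
    """Sum of risk-allele copies across all SNPs in `risk_alleles`.

    One copy of a risk allele adds one to the score; two copies add two.
    Raises KeyError if a genotype is missing (an assumption of this sketch).
    """
    return sum(additive_code(genotypes[snp], allele)
               for snp, allele in risk_alleles.items())


if __name__ == "__main__":
    # Hypothetical participant with three placeholder SNPs
    participant = {"snp1": "AA", "snp2": "AG", "snp3": "GG"}
    risk = {"snp1": "A", "snp2": "G", "snp3": "T"}
    print(snp_score(participant, risk))
```

With 11 SNPs, a score built this way would range from 0 to 22. The reduced rank regression score is a different construction and is not represented here.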
All models were adjusted for age at blood draw or cheek cell collection, menopausal status at collection and reference date, postmenopausal hormone use at collection and reference date, and DNA source (blood vs. cheek). The two cohorts were analyzed separately and combined using random effects meta-analysis. All p-values were two-sided and considered statistically significant if ≤0.05. Ordinal logistic regression analyses were conducted in Stata 11.0 (StataCorp, College Station, TX). Linear regression analyses and meta-analyses were conducted using SAS version 9.1 (SAS Institute, Cary, NC).
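As an illustration of how cohort-specific estimates can be pooled, the sketch below implements the DerSimonian-Laird random effects estimator. The choice of this particular estimator is an assumption; the text states only that random effects meta-analysis was used. The input values in the usage example are invented and are not study results.

```python
# Illustrative sketch: DerSimonian-Laird random effects pooling.
# The estimator choice and the example inputs are assumptions, not study output.
import math
from typing import Sequence, Tuple


def random_effects_meta(estimates: Sequence[float],
                        std_errors: Sequence[float]
                        ) -> Tuple[float, float, float, float]:
    """Pool estimates (e.g., log odds ratios or regression betas).

    Returns (pooled estimate, its standard error, tau^2, two-sided p-value).
    """
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching standard errors")

    # Inverse-variance (fixed effect) weights and pooled estimate
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * bi for wi, bi in zip(w, estimates)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance
    q = sum(wi * (bi - fixed) ** 2 for wi, bi in zip(w, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * bi for wi, bi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))

    # Two-sided p-value from the standard normal distribution
    z = pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, se_pooled, tau2, p_value


if __name__ == "__main__":
    # Hypothetical per-cohort betas and standard errors
    print(random_effects_meta([0.0, 0.4], [0.1, 0.1]))
```

When the cohort estimates agree closely, the between-cohort variance is estimated as zero and the result coincides with the fixed effect (inverse-variance) pooled estimate.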