
Total Exposure Study Analysis consortium: a cross-sectional study of tobacco exposures



The Total Exposure Study was a stratified, multi-center, cross-sectional study designed to estimate levels of biomarkers of tobacco-specific and non-specific exposure and of potential harm in U.S. adult current cigarette smokers (≥ one manufactured cigarette per day over the last year) and tobacco product non-users (no smoking or use of any nicotine-containing products over the last 5 years). The study was designed and sponsored by a tobacco company and implemented by contract research organizations in 2002–2003. Multiple analyses of smoking behavior, demographics, and biomarkers were performed. Study data and banked biospecimens were transferred from the sponsor to the Virginia Tobacco and Health Research Repository in 2010, and then to SRI International in 2012, for independent analysis and dissemination.


We analyzed biomarker distributions overall, and by biospecimen availability, for comparison with existing studies, and to evaluate generalizability to the entire sample. We calculated genome-wide statistical power for a priori hypotheses. We performed clinical chemistries, nucleic acid extractions and genotyping, and report correlation and quality control metrics.


Vital signs, clinical chemistries, and laboratory measures of tobacco specific and non-specific toxicants are available from 3585 current cigarette smokers, and 1077 non-users. Peripheral blood mononuclear cells, red blood cells, plasma and 24-h urine biospecimens are available from 3073 participants (2355 smokers and 719 non-users). In multivariate analysis, participants with banked biospecimens were significantly more likely to self-identify as White, to be older, to have increased total nicotine equivalents per cigarette, decreased serum cotinine, and increased forced vital capacity, compared to participants without. Effect sizes were small (Cohen’s d-values ≤ 0.11). Power for a priori hypotheses was 57 % in non-Hispanic Black (N = 340), and 96 % in non-Hispanic White (N = 1840), smokers. All DNA samples had genotype completion rates ≥97.5 %; 68 % of RNA samples yielded RIN scores ≥6.0.


Total Exposure Study clinical and laboratory assessments and biospecimens comprise a unique resource for cigarette smoke health effects research. The Total Exposure Study Analysis Consortium seeks to perform molecular studies in multiple domains and will share data and analytic results in public repositories and the peer-reviewed literature. Data and banked biospecimens are available for independent or collaborative research.



The Total Exposure Study (TES) was designed by a tobacco company sponsor in the 1990s with the primary objectives of estimating exposure of current U.S. adult cigarette smokers to cigarette smoke constituents and of investigating relationships between FTC tar categories and cigarette smoke exposure. Other objectives included investigating associations of smoking behavior and biomarkers of exposure (BOE), comparing BOE in adult smokers and non-users, and investigating relationships between BOE and biomarkers of potential harm (BOPH) [1]. From 2002 to 2003, internationally recognized contract research organizations (CROs), under contract to the tobacco company sponsor, collected questionnaire data, clinical data, and biological samples from 3585 smokers and 1077 non-users at 39 clinical sites in 31 U.S. states and performed clinical chemistry, laboratory and statistical analyses [1–3]. TES participants were recruited using Institutional Review Board-approved advertisements [1, 2], with defined inclusion and exclusion criteria (Additional file 1). The study was approved by an Institutional Review Board at each clinical site and conducted in accordance with Good Clinical and Laboratory Practices and principles of the Declaration of Helsinki. Using blood and urine biospecimens and mass spectrometry-based and clinical chemistry-based analyses, the CROs determined levels of BOE and BOPH in smokers and non-users. Additional blood and urine biosamples were collected from consenting subjects for possible future analyses.

The Virginia Tobacco Health Research Repository (VTHRR) was formed in 2010 as a Virginia non-profit, non-stock corporation by authorization of the Virginia BioTechnology Research Partnership Authority Board, a political subdivision of the Commonwealth of Virginia. The VTHRR received TES data and biospecimens as a contribution from the tobacco company sponsor. The mission of the VTHRR is to make the TES data and banked biospecimens available to scientists, research institutions, regulatory agencies and industry for research to increase the scientific knowledge base of the health effects of cigarette smoking [4].

Under a 2012 Asset Transfer Agreement between the VTHRR and SRI International (SRI), an independent, non-profit research institute incorporated in 1946 in the state of California, TES data and biospecimens were transferred to SRI. The agreement between SRI and VTHRR provides SRI with complete independence to pursue valid scientific objectives. The principal intended result of any analysis of TES data or biospecimens is the generation of knowledge related to smoking and health that is shared in the scientific peer-reviewed literature and in appropriate databases. SRI will independently maintain, curate, and make both data and biospecimens available to the research community for this purpose.

To optimize the validity and utility of the TES data and banked biospecimens, and to support their full use by the global public health research community, the resource needs thoughtful, objective scientific analysis. The purpose of this analysis was to review TES data and biospecimens; investigate distributions of self-reported, clinical and laboratory measures of exposure and potential harm (biomarkers); evaluate potential differences in biomarker levels between participants with banked biospecimens and those without; calculate statistical power for genomic analyses; and perform analyses of plasma and peripheral blood monocyte analytes.


We obtained ethical approval from the SRI International Human Subjects Committee to conduct these analyses of TES data and biospecimens.

Each study site elected to use either its own site-specific IRB or a central IRB contracted by the primary clinical and laboratory CRO responsible for the conduct of the study. TES participants were recruited, provided informed consent and were screened in a two-visit, multicenter process as current cigarette smokers, stratified by their regular cigarette’s Federal Trade Commission (FTC) tar level (≤ 2.9, 3.0–6.9, 7.0–12.9, and ≥ 13 mg), and as non-users [1, 2]. Inclusion and exclusion criteria are described in Additional file 1. Participants were paid up to 300 U.S. dollars for completion of all study components. Recruitment sites were distributed in 31 States over four regions [Midwest (19.7 %), Northeast (13.0 %), South (37.8 %) and West (29.5 %)] and among urban (68.5 %) and non-urban (31.5 %) areas.

All participants provided vital signs (at both visits), medical history and concomitant medication data (at the first visit), and completed a questionnaire survey regarding smoking history and attitudes and preferences regarding smoking (in current smokers), demographics, lifestyle and environmental exposures (at the second visit). Between the first and second visit, smokers collected cigarette butts over a 24-h period and recorded smoking topography information using a portable instrument that measured the number of puffs, the length of puffs and the length of the inter-puff interval. Both smokers and non-users collected their urine over 24 h. At the second visit, lung function tests were performed and blood was collected for processing, biomarker assays and, under a separate consent for future research, for banking.

Four tubes of whole blood [two 10 ml potassium ethylenediaminetetraacetic acid (KEDTA) and two 8.5 ml acid citrate dextrose solution A (ACDA) tubes] were obtained from each participant at the second visit after a minimum 6 h fast and processed for plasma, red blood cells and monocytes [1]. The TES biospecimen aliquots in SRI’s possession include approximately: a) 6000 peripheral blood mononuclear cell (PBMC) samples; b) 7000 red blood cell samples; c) 5000 24-h urine samples; and d) 3000 plasma samples. TES biospecimens have been stored at −80 °C by the VTHRR and SRI.

We examined TES publications and accessed TES-related documents on the University of California San Francisco Legacy Tobacco Documents Library (UCSF LTDL) website [5] to 1) compare them with documents we had received from the VTHRR and 2) learn more about the study design and analysis goals of the TES. We also engaged with colleagues regarding the potential value of the TES for tobacco research. We reviewed data collection, sample preservation, and laboratory assay protocols followed by the CROs that conducted the TES. We inspected TES clinical and biospecimen data and labeled biospecimens to confirm that the dataset was deidentified.

We queried the TES clinical data to assess the distribution of participant data among the analysis strata (age, sex, and BMI) among the four smoking categories defined by the smoker’s usual cigarette FTC tar level, and among non-users. We evaluated the distributions of analysis strata among all participants, by banked biospecimen availability, and by biospecimen type. We analyzed additional behavioral, demographic, biomarker and tobacco product variable distributions among participants, and compared distributions between participants with and without banked biospecimens.

We constructed logistic regression models predicting the availability of biospecimens in self-identified non-Hispanic Black and White smokers using individuals with complete data in three increasingly complex models. Model 1 comprised BMI and demographic covariates, Model 2 added BOE to the covariates in model 1, and Model 3 added BOPH to Model 2. We imputed missing data for each model and repeated analyses with the larger sample sizes. To determine the extent to which random variability was responsible for the ability of the demographic variables and biomarkers to predict biospecimen availability, we randomly permuted the variable indicating the availability of biospecimens and determined a 95 % confidence interval for the percent reduction in the variance of this randomly permuted variable attributable to the covariates.
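The permutation step can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the analysis code used for the TES: it uses a single covariate and the squared Pearson correlation as the explained-variance measure instead of a multivariable logistic model, the data are synthetic, and the function names `r_squared` and `permutation_null` are hypothetical. The interval returned is a percentile interval of the permuted r2 values; the exact interval construction used in the study is not described in the text.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between covariate x and outcome y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def permutation_null(x, y, n_perm=500, seed=0):
    """Mean and 95 % percentile interval of r2 after shuffling the outcome."""
    rng = random.Random(seed)
    shuffled = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any real link between x and y
        null.append(r_squared(x, shuffled))
    null.sort()
    lo = null[int(0.025 * n_perm)]
    hi = null[int(0.975 * n_perm) - 1]
    return sum(null) / n_perm, lo, hi

# Synthetic data (hypothetical): one covariate and a binary availability flag
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [1 if xi + rng.gauss(0, 1) > 0 else 0 for xi in x]

observed = r_squared(x, y)
null_mean, null_lo, null_hi = permutation_null(x, y)
print(observed, null_mean, null_lo, null_hi)
```

Comparing the mean permuted r2 with the observed r2 indicates how much of the apparent explanatory power could arise from random variability alone.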

Plasma biospecimens were randomly selected (women and men, aged 35–49 years, with BMI < 25 kg/m2, both current smokers and non-users) and sent to the SRI Clinical Analysis Laboratory (CAL). Six clinical laboratory assays were performed on 47 plasma samples to measure levels of glucose, aspartate and alanine aminotransferases, total bilirubin, albumin, and total cholesterol. We estimated the correlation between SRI CAL plasma and original CRO serum analyte values.

PBMCs were randomly selected from TES participants (N = 30, ~1 % of participants with available biospecimens) within defined strata (ages 35–49 years and BMI < 25 kg/m2). The resulting sample was 37 % female, 70 % self-identified White and 20 % self-identified Black, and 67 % current smokers. Initially, we performed DNA extraction from a limited number of pellets using Gentra Puregene reagents (Qiagen). To conserve biospecimen resources, we reviewed several multiple analyte protocols, and selected a protocol for simultaneous DNA and RNA extraction (NORGEN 48700 kit with Proteinase K). We adjusted lysis buffer amounts according to available white blood cell count data and extracted ~1 × 10⁶ cells from each lysed pellet. DNA was sent to the Rutgers University Cell and DNA resource for genotyping with the Smokescreen® Array [6]. RNA quality (RNA integrity score, RIN) was analyzed on the Agilent 2100 BioAnalyzer using the Eukaryote Total RNA Nano assay.

Statistical analyses were performed using SAS version 9.2 (Cary, North Carolina) and STATA SE version 12.0 (Stata Corp, College Station, Texas). Except where specified, the alpha used for statistical significance was 0.05. We evaluated power to detect genetic variants for serum cotinine at genome-wide significance using Quanto [7].


Review of the published TES literature

Scientists employed by the tobacco company sponsor have published analyses in peer-reviewed scientific journals using data from the TES pilot study [8] and the TES main study [2, 3, 9–16]. Analyses included population estimates of BOE levels for smokers and non-users [2], estimates of levels of BOPH in smokers and non-users [3], the relationships between machine-derived tar yields of cigarette products and BOE in smokers [9], models of BOPH [11], the impact of menthol-containing cigarettes on selected BOE in White and Black smokers [10], and the relationships between selected BOE and BOPH in smokers [14]. These scientists have also reported on the relationships between BOE and nicotine dependence [13] and between nicotine and carbon monoxide BOE and other factors, including smoking topographical variables [12]. These authors utilized TES data to examine the relationships between smoking mentholated cigarettes or non-mentholated cigarettes and glucuronide metabolite ratios [15], and with measures of nicotine dependence [16]. We review Roethig et al. [2] and Frost-Pineda et al. [3] here to introduce TES BOE (Additional file 2: Table S1) and BOPH (Additional file 3: Table S2).

Roethig et al. published estimates of BOE (Additional file 2: Table S1) in smokers and non-users and, within smokers, within different age, sex, BMI, and self-identified racial strata [2]. Mean levels of BOE were weighted by age, sex and BMI variance estimates from the U.S. Behavioral Risk Factor Surveillance System (BRFSS), an annual telephone-based behavioral survey established in 1984 [17], to produce weighted estimates of BOE reported and described by Roethig et al. as population estimates [2]. The BRFSS used post-stratification weighting based on United States Census data from the 1980s until 2011 [17]. Lee and Messiah criticized the application of weights extracted from a nationally representative sample to a sample for which inclusion rates at recruitment sites were not known or not reported [18]. In response, Sarkar and Liang noted that the weighted means were similar to or unchanged from unadjusted means [19]. Weighted estimates of tobacco-specific biomarkers [nicotine, cotinine and trans-3′-hydroxycotinine and their glucuronides (nicotine equivalents, NE), serum cotinine, and total 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol and glucuronide (total NNAL)] suggested that the younger participants (21–34 years) and female participants had the lowest tobacco-specific exposures, and that individuals with BMIs < 25 kg/m2 compared with individuals with BMIs ≥ 25 kg/m2 had higher serum cotinine levels and lower total NNAL levels, suggesting reduced cotinine clearance and NNK metabolism in heavier individuals [2]. Significant differences in serum cotinine by BMI similar to those reported in the TES have been previously observed in the National Health and Nutrition Examination Survey [20]. In the TES, self-identified White smokers smoked significantly more cigarettes per day and had greater NE and total NNAL exposure over 24 h, but lower NE and total NNAL exposure per cigarette, and lower serum cotinine exposure, than self-identified Black smokers [2]. It has previously been observed that White smokers smoke more cigarettes per day than Black smokers and that nicotine intake per cigarette measured by serum cotinine is higher in Black smokers than in White smokers [21, 22], which is related to significantly reduced nicotine clearance in Blacks compared to Whites [23, 24].

Frost-Pineda et al. [3] reported mean values for 29 BOPH (Additional file 3: Table S2) in both smokers and non-users. The BOPH represented various physiological functions: cardiovascular, endothelial, hematologic, inflammation, lipid, hepatic, renal, respiratory, metabolic, and oxidative stress [3]. The effects of multiple BOE [cigarettes per day (CPD), NE, and smoking duration] on the BOPH in current smokers versus non-users were evaluated in two stepwise regression models (model A with CPD and smoking duration, and model B with NE and smoking duration) with age, sex, BMI and self-identified race as additional independent variables [3]. The three most elevated mean BOPH in current smokers versus non-users were those reflecting oxidative stress, platelet activation and inflammation. The oxidative stress biomarker 8-epi-prostaglandin F exhibited the largest difference between smokers and non-users (+42 %), while BMI and age, and BMI and NE, were the most important correlates in models A and B, respectively. The platelet activation biomarker 11-dehydrothromboxane B2 exhibited the second largest difference between smokers and non-users (+29 %), and sex and BMI, and sex and NE were the most important correlates in models A and B, respectively. The inflammation biomarker white blood cell count exhibited the third largest difference between smokers and non-users (+19 %) and BMI and self-identified race were the most important correlates in both models. Overall, BMI and sex were the first and second most common significant correlates reported by Frost-Pineda et al. [3].

Our search of the UCSF LTDL identified multiple documents we had received from the VTHRR, including: the Amended Final Research Protocol, dated 19 August 2002, that describes the clinical protocol, laboratory testing and biospecimen banking procedures to be conducted by the primary clinical and laboratory CRO [1]; the TES Adult Smoker Survey [25]; and the TES Adult Non-Smoker Survey [26]. We found no differences between the documents we had received from VTHRR and those available on the UCSF LTDL. We also identified summary documents that provided information on the design and analysis goals of the TES, including: a Statement of Work for data management and analysis to be conducted by the primary data analysis CRO dated 30 April 2004 [27]; a draft version of the Statistical Analysis Plan dated 1 September 2004 [28]; and a PowerPoint presentation dated 7 February 2005 that presented TES pilot results, and design and initial analyses of the TES [29]. Review of these documents enriched our understanding of the design and conduct of the study and confirmed study parameters, e.g., numbers of individuals recruited within design strata. With the assistance of a UCSF Industry Documents Digital Librarian, we identified SAS datasets available in the Philip Morris collection but these refer to an unrelated study [30].

TES recruitment and analysis strata

The original enrollment goal of the TES was 1000 smokers among four strata defined by FTC tar levels of the smoker’s usual cigarette (≤ 2.9, 3–6.9, 7–12.9, and ≥ 13 mg), and 1000 non-users [1]. The distribution of evaluable subjects in the five categories (504, 953, 1066, and 1062 smokers, and 1077 non-users) was significantly different from the design (Pearson χ² with 4 d.f. = 159.9, P < 0.0001). The distribution of participants with clinical data by enrollment strata and by demographic strata is shown in Table 1.

Table 1 TES participants with clinical data, by recruitment variables and by previously utilized analysis strata

TES demographics and smoking status

The demographic composition of TES participants with clinical data (N = 4662) was 57.9 % female, with mean (standard deviation, SD) age 42.1 (13.2) years and mean (SD) BMI 27.9 (6.7) kg/m2 (Table 2). Self-identified race distributions were 77.1 % “Caucasian or White”, 16.5 % “African American or Black”, and four other self-identified race categories comprising 6.5 % of participants. Only a small fraction of TES participants self-identified as Hispanic ethnicity (3.8 % of total participants). Most (76.9 %) TES participants were current smokers with mean (SD) CPD of 16.0 (8.9). Age, self-identified race, and education distributions differed significantly by smoking status (current smokers were significantly more likely to be older, self-identify as Black, and significantly less likely to have a college degree), while sex, BMI, and self-identified ethnicity (Hispanic versus Not Hispanic) did not differ by smoking status.

Table 2 TES demographics and smoking status, overall and among those with and without banked biospecimens

TES banked biospecimen availability and smoking status

Two-thirds (66 %) of TES participants have banked biospecimens. PBMCs are the most common biospecimen type while urine is the least common (Table 3). Among participants with banked biospecimens, and among the four biospecimen types, there are no significant differences in sex, age and BMI proportions, but there are significant differences in smoking status (Table 4). Compared to participants with banked PBMC biospecimens, participants with banked urine biospecimens are significantly more likely to be smokers (OR = 1.26, 95 % CI 1.11–1.44, P = 0.0004).

Table 3 TES banked biospecimen aliquots, by biospecimen type
Table 4 TES banked biospecimen availability, by biospecimen type, and by strata previously used for analysis

TES demographics

Participants with banked biospecimens are significantly older (age, continuous or categorical), have significantly higher BMI (continuous), and are more likely to self-identify as White than those without (Table 2). When stratified by ethnicity and race, self-identified non-Hispanic Black participants with banked biospecimens are significantly older and have significantly higher BMI than those without biospecimens [mean (SD) age 40.8 (10.7) vs 38.8 (11.1) years, t = 2.45, P = 0.0144, N = 755; mean (SD) BMI 30.5 (8.0) vs 28.5 (6.9) kg/m2, t = 3.58, P = 0.0004, N = 755, data not shown]. Significant differences in age and BMI among all participants and stratified by self-identified ethnicity and race are small (Cohen’s d-values = 0.10, 0.09, 0.18 and 0.27, respectively). Smoking duration, CPD (continuous and categorical), and usual cigarette FTC tar level (categorical) are significantly increased in those with banked biospecimens compared to those without, overall, and when stratified by self-identified race (Table 5). Significant differences are small; d-values for smoking duration overall, and among self-identified non-Hispanic Blacks and Whites are 0.13, 0.22 and 0.08, respectively, and d-values for CPD among self-identified non-Hispanic smokers, and among self-identified non-Hispanic White smokers, are 0.14 and 0.09, respectively.
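As an illustration of how the effect sizes above relate to the reported means and SDs, the following sketch computes Cohen's d for the non-Hispanic Black age and BMI comparisons. It assumes the pooled SD is the root mean square of the two group SDs (equal weighting), because per-group sample sizes are not given in the text; the function name `cohens_d` is ours.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with the pooled SD taken as the root mean square of both SDs.

    Assumption: equal weighting of the two groups, since per-group
    sample sizes are not reported in the text.
    """
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Non-Hispanic Black participants, with vs without banked biospecimens
d_age = cohens_d(40.8, 10.7, 38.8, 11.1)  # age, years
d_bmi = cohens_d(30.5, 8.0, 28.5, 6.9)    # BMI, kg/m2
print(round(d_age, 2), round(d_bmi, 2))   # prints: 0.18 0.27
```

Under this equal-weighting assumption the sketch reproduces the d-values of 0.18 and 0.27 quoted for this stratum.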

Table 5 TES self-reported BOE, non-Hispanic current smokers by self-identified race, and by banked biospecimens


Most tobacco-specific (NE, serum cotinine and total NNAL) and non-specific BOE are significantly higher in smokers with banked biospecimens than in smokers without, except for serum cotinine, 4-ABP and MHBMA (Table 6). Metabolites of acrolein and 1,3-butadiene are significantly greater in non-users with banked biospecimens than in non-users without. All statistically significant differences in BOE by banked biospecimen availability have small effect sizes, ranging from 0.10 to 0.24. When stratified by self-identified ethnicity and race, more BOE differ significantly by biospecimen availability among non-Hispanic Whites than among non-Hispanic Blacks (Tables 7 and 8). The effect sizes of the two BOE differences in self-identified non-Hispanic Black smokers are small, and the effect size of the one BOE difference in self-identified non-Hispanic Black non-users is medium (d = 0.47). Among self-identified non-Hispanic White smokers, NE, total NNAL, carboxyhemoglobin and an acrolein metabolite, and among self-identified non-Hispanic White non-users, a 1,3-butadiene metabolite, exhibit significant differences. All these significant differences are of small effect size.

Table 6 TES laboratory-based BOE among smokers and non-users, and among those with and without banked biospecimens
Table 7 TES non-Hispanic Black laboratory-based BOE, by smoking status, and by banked biospecimen availability
Table 8 TES non-Hispanic White laboratory-based BOE, by smoking status, and by banked biospecimen availability


The distribution of BOPH by banked biospecimen availability is shown in Table 9. Six of 29 BOPH measures have nominally significantly higher levels in TES participants with available banked biospecimens versus those without, while the respiratory function measure FVC and hemoglobin remain significantly different after false discovery rate correction (q-values = 0.0128 and 0.0496, respectively) [31]. After excluding individuals with implausible FEV1 values < 35 % or > 125 % of predicted, as suggested by Frost-Pineda et al. [3], and then stratifying by self-identified ethnicity and race, and then by smoking status, we observed that self-identified non-Hispanic White smokers with banked biospecimens have significantly increased FVC compared to those without [93.9 (23.2) vs 90.8 (17.9), t = 3.81, P = 0.0001, N = 2584]. The statistically significant increase in % predicted FVC among self-identified non-Hispanic White smokers with available biospecimens is unexpected because multiple BOE are significantly increased in self-identified non-Hispanic White smokers with banked biospecimens and lung function is expected to be reduced in individuals with increased measures of exposure. Evidence for the influence of current smoking on longitudinal decline in FEV1 and FVC suggests that current smoking influences longitudinal FEV1 decline more than FVC [32], though this would not explain an increase in FVC. We constructed another regression model including education and household income, but these potential confounders [33] had no effect on the observed differences in FVC within non-Hispanic White smokers (data not shown). Further analyses of lung function measures and other variables in the TES may identify possible explanatory factors or confounders. 
After stratifying by self-identified race and ethnicity, and then by smoking status, we observed that self-identified non-Hispanic White non-users exhibit a significant difference in hemoglobin by banked biospecimen availability [14.50 (1.42) vs 14.31 (1.25), t = 1.84, P = 0.033, N = 828]. Statistically significant differences in FVC and hemoglobin in these strata are small (d-values are 0.15 and 0.14, respectively).
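To make the false discovery rate step concrete, here is a minimal sketch of one common procedure, the Benjamini-Hochberg adjustment. This is an assumption for illustration: the text cites [31] for the correction without naming the procedure, and the p-values below are hypothetical, not the 29 TES BOPH results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical nominal p-values, NOT the TES results
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
print(q)
```

In this toy example the third and fourth tests are nominally significant at 0.05 but have q-values just above 0.05, which mirrors how several nominally significant BOPH differences can fail to survive correction.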

Table 9 TES participant BOPH, by banked biospecimen availability

TES participant usual cigarette brand

Information on participant’s usual cigarette brand is available from 606 and 1336 self-identified non-Hispanic Black and non-Hispanic White smokers, respectively. The top 20 brands account for 66.0 and 49.4 % of the brand information available from self-identified non-Hispanic Black and non-Hispanic White smokers, respectively (Table 10). Usual cigarette brand distributions do not differ significantly among self-identified non-Hispanic Black or among non-Hispanic White smokers by the presence or absence of banked biospecimens (Table 10).

Table 10 TES participant usual cigarette brand, by self-identified race/ethnicity and by banked biospecimen availability

Modeling banked biospecimen availability by demographics, BOE and BOPH

Sample sizes among self-identified non-Hispanic Black and White smokers with complete data and with imputed data for the progressively more complex models were 3236 and 3318 (2.5 % of participants had missing data in Model 1), 2317 and 3318 (30.2 % of participants had missing data in Model 2), and 1090 and 3053 (64.3 % had missing data in Model 3), respectively. However, although a large fraction of participants were missing one or more variable values, on average each was missing only a single value out of a large number of independent variables. The number of missing values that were imputed was relatively small; 0.3 % of all values required imputation in Model 1, 1.9 % in Model 2, and 2.5 % in Model 3. Significant demographic variables, BOE and BOPH in progressively more complex multivariate models of banked biospecimen availability with imputed data were: Model 1) BMI, self-identified race, age and age squared; Model 2) self-identified race, age, age squared, and NE/24 h; and Model 3) self-identified race, age, age squared, NE/24 h, serum cotinine, MHBMA and FVC (Table 11). The mean (SD) predicted probabilities of banked biospecimen availability, in progressively more complex multivariate models without and with imputed data are: Model 1) 0.647 (0.069) and 0.665 (0.060); Model 2) 0.643 (0.077) and 0.668 (0.070); and Model 3) 0.625 (0.102) and 0.669 (0.095). Explanatory power estimates (r2) of the anthropometric, demographic, BOE and BOPH variables in progressively more complex multivariate models with imputed data among self-identified non-Hispanic Black and White smokers to predict banked biospecimen availability are 0.018, 0.024, and 0.037, respectively. In permutation analyses of self-identified non-Hispanic Black and White smokers with imputed data in Model 3, the mean (95 % confidence interval) r2 was 0.020 (0.016–0.023), suggesting that about half of the explanatory power of the variables is due to random variability (0.020/0.037 = 0.54).

Table 11 Multivariate model of banked biospecimen availability, self-identified non-Hispanic Black and non-Hispanic White smokers

Correlations of clinical chemistry results in 47 plasma samples from the SRI CAL (2013) and those from serum reported by the CRO (2002–2003) were high and statistically significant [glucose (0.922), aspartate aminotransferase (0.993), alanine aminotransferase (0.997), total bilirubin (0.960), albumin (0.702), and total cholesterol (0.913), all p-values < 0.001] (Table 12 and Fig. 1). The lower correlation for blood albumin may be due to the two different matrices, the increased variance of some albumin clinical chemistry analysis methods [34], or the use of different methods in the clinical analyzers in the two different clinical chemistry laboratories.

Table 12 Comparison of six circulating analytes in TES participant plasma and serum
Fig. 1 Comparison of six circulating analytes in TES plasma (CAL) and serum (CRO)

Mean (SD) DNA and RNA yields from ~1 M cells were 4.63 (1.63) µg and 2.10 (0.62) µg, respectively. We sent four DNA samples from Gentra Puregene extraction and 27 DNA samples from NORGEN extraction for Smokescreen Array genotyping at the Rutgers University Infinite Biologics facility. All DNA samples had genotype completion rates ≥ 97.5 % and passed the 97 % rate threshold; the mean genotype completion rate was 99.4 %. Mean (SD) RNA Integrity (RIN) scores from 28 RNA samples analyzed were 6.4 (2.2); 68 % of RIN scores were ≥ 6.0, a standard used in RNA sequence analysis [35]. There were no significant differences in sex, race, smoking status, or total nicotine equivalents between RNA samples with RIN ≥ 6.0 and < 6.0 (all p-values > 0.12). Thus, from PBMC pellets frozen at ultralow temperatures for over a decade, DNA quality and genotyping results were excellent, while RNA quality was good but requires evaluation using transcriptome-wide methods.

Finally, we assessed statistical power to detect a priori genetic loci of interest, using one example from a large-scale (1000s) candidate gene association scan and one example from a locus nominated by genome-wide association scans (GWAS), with genome-wide significance (GWS) as the statistical threshold. For self-identified non-Hispanic Black current smokers, we selected rs11187065 as an example, identified in the insulin-degrading enzyme gene as the gene-centric SNP most significantly associated with serum cotinine in the Coronary Artery Risk Development in Young Adults (CARDIA) study by Hamidovic et al. [36]. The influence of rs11187065 on serum cotinine was substantial, with a β of −85.1 ng/ml and a mean (SE) of 236.5 (8.1) ng/ml from 365 African American smokers [36]. Mean (SD) CPD in the CARDIA sample was 10.5 (7.4), similar to that of the TES (Table 4). Using the sample size of self-identified non-Hispanic Black smokers with banked PBMCs in the TES (N = 340), there is 57 % power to detect the locus at genome-wide significance (and 83 % power to detect this locus at the original study’s Bonferroni adjustment level of 2.3 × 10−6) using an additive model, a one-sided test, the mean (SD) of serum cotinine among self-identified non-Hispanic Black smokers (Table 6), the rs11187065 minor allele frequency of 0.083 in the HapMap [37] African Americans in the Southwest sample, and the effect size from Hamidovic et al.

For assessing power to detect a priori loci of interest among self-identified non-Hispanic White current smokers, we selected as an example rs1051730 in the nicotinic acetylcholine receptor (nAChR) subunit gene cluster on chromosome 15q25.1, associated with smoking intensity and related phenotypes [38], including cotinine level [39]. In an analysis of 2932 smokers with serum or plasma cotinine estimates, Munafo et al. estimated that each minor allele contributed to a mean increase in the unadjusted level of cotinine in European ancestry samples of 138.72 nmol/L [95 % CI 97.91–179.53 nmol/L, P = 2.7 × 10−11] [39], or 24.42 ng/mL, although this was reduced 18 % upon adjustment for self-reported CPD. Using the sample size of self-identified non-Hispanic White smokers with banked PBMCs in the TES (N = 1840), there is 70–96 % power to detect this locus at a genome-wide significance level (5 × 10−8) using an additive model, a one-sided test, the mean (SD) of serum cotinine among self-identified non-Hispanic White smokers (Table 7), the rs1051730 minor allele frequency of 0.385 in the HapMap Utah Residents with Northern and Western European Ancestry sample, and the estimated allele effect sizes of Munafo et al. (adjusted and unadjusted for CPD).
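The ingredients of these power estimates (sample size, minor allele frequency, per-allele effect, trait SD, significance threshold, additive model, one-sided test) can be combined in a short calculation. The sketch below is a normal-approximation illustration only, not the software or exact method used for the figures above: it assumes Hardy-Weinberg proportions and unrelated participants, the function name `additive_qt_power` is hypothetical, and the trait SD is a placeholder because the values from Tables 6 and 7 are not reproduced in this section. Its output should therefore not be expected to match the reported power.

```python
from math import sqrt
from statistics import NormalDist


def additive_qt_power(n, maf, beta, trait_sd, alpha, one_sided=True):
    """Approximate power to detect an additive SNP effect on a quantitative trait.

    Normal approximation; assumes Hardy-Weinberg proportions and unrelated subjects.
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must be strictly between 0 and 1")
    geno_var = 2.0 * maf * (1.0 - maf)   # variance of the 0/1/2 allele count under HWE
    explained = beta ** 2 * geno_var     # trait variance attributable to the SNP
    residual_var = trait_sd ** 2 - explained
    if residual_var <= 0.0:
        raise ValueError("effect would explain all of the trait variance")
    se_beta = sqrt(residual_var / (n * geno_var))  # approximate SE of the per-allele effect
    ncp = abs(beta) / se_beta                      # expected test statistic (z scale)
    std_normal = NormalDist()
    if one_sided:
        crit = std_normal.inv_cdf(1.0 - alpha)
        return std_normal.cdf(ncp - crit)
    crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(ncp - crit) + std_normal.cdf(-ncp - crit)


# N, MAF and effect size for rs11187065 are taken from the text.
# TRAIT_SD is a placeholder (assumption); the value used in the paper is in Table 6.
TRAIT_SD = 150.0  # hypothetical serum cotinine SD, ng/mL
for label, alpha in [("genome-wide (5e-8)", 5e-8), ("Bonferroni (2.3e-6)", 2.3e-6)]:
    power = additive_qt_power(n=340, maf=0.083, beta=-85.1,
                              trait_sd=TRAIT_SD, alpha=alpha)
    print(f"{label}: power = {power:.2f}")
```

Even in this simplified form, the calculation shows why power is higher at the less stringent Bonferroni threshold than at genome-wide significance, and why the larger non-Hispanic White sample and the more common rs1051730 allele yield higher power despite a smaller per-allele effect.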


TES research opportunities

TES data and banked biospecimens, together with current biotechnologies, offer opportunities for the tobacco research community to identify behavioral, clinical, environmental and molecular factors that may influence cigarette smoke exposures (susceptibility model) and identify molecular factors that may be modulated by cigarette smoke exposures (response model). In particular, the TES can provide existing BOE and BOPH data from a large sample of generally healthy individuals, as well as banked biospecimens for the generation of novel BOE and BOPH. We will conduct biomarker research in the context of an Analysis Consortium that will enhance the TES by adding novel biomarkers and biomarker analyses to elucidate relationships between cigarette smoke exposures and health effects. We will share data with other collaborations engaged in the analyses and meta-analyses of susceptibility and response models. We will deposit data with the repositories designed for genome-wide data per Federal guidance or journal practice and conditional on Human Subjects Committee approval.

Specifically, GWAS using TES PBMC DNA may contribute to the elucidation of relationships between germline variation and self-report and laboratory measures of exposures [38, 39], including genetic loci influencing the non-nicotine tobacco-specific BOE NNAL. Analyses of TES PBMC mitochondrial DNA (mtDNA) via copy number and deletion analysis [40] may enhance knowledge of the factors that influence mtDNA damage [41, 42]. Analysis of PBMC DNA and RNA will provide additional data to examine the effects of cigarette smoking on the PBMC epigenome [43, 44] and transcriptome [45]. Analyses of the plasma and urine proteome [46, 47] and metabolome [48–50] may contribute to the developing literature on the impact of tobacco and other exposures defining the exposome, an integrated approach to biomarker discovery for exposure and disease paradigms [51]. Validation, integration and extension of these susceptibility and response models can be conducted in independent datasets and in meta-analyses, and may contribute to the development of biomarker panels for diagnostic, prognostic and therapeutic research in tobacco-attributable disease.

There are a number of differences in the landscape of smoking behaviors, tobacco/nicotine products and tobacco control between the time in which the TES was conducted and the present day [52]. These differences include: 1) the prevalence of cigarette use in U.S. adults has declined from ~21 to ~18 %; 2) the regular use of electronic cigarettes has increased in prevalence from 0 to almost 3 %; 3) the annual spending on advertising of tobacco products in the U.S. has declined from an all-time high of $15.4 billion in 2003 to $9.6 billion in 2012; 4) there has been a substantial increase in restrictions on smoking in public places due to increased recognition of harm associated with exposure to second and third-hand smoke; and 5) the passage in 2009 of the Family Smoking Prevention and Tobacco Control Act which prohibited the use of terms in advertising related to “light” cigarettes and created a regulatory framework by which the FDA can evaluate new tobacco products prior to their marketing to the public. Even with these temporal differences, there are several similarities concerning the cigarettes themselves that are of most relevance to the present investigation of cigarette smoking and its impact on BOE and BOPH. These include: 1) despite various changes in cigarette design over the past 12 years, there is no evidence that any of these have resulted in a “safer” cigarette; 2) the amount of tar and nicotine in cigarettes has remained relatively stable since 1993; 3) the most popular brands of cigarettes smoked (see Table 10) remain the same (Marlboro, Camel, and Newport); 4) the effects of exposure to combustible tobacco products both with respect to BOE and BOPH remain the same; and 5) the health consequences of exposure to cigarette smoke (either mainstream or sidestream) including cancer, cardiovascular disease, and respiratory disease remain the same. 
Since the primary focus of the present investigation is on BOE and BOPH that reside within pathways resulting in negative health outcomes, the TES remains as relevant today as in 2003.

TES biospecimens provide a sample of current smokers powered at GWS to identify the chr15q25.1 nAChR loci associated with BOE (cotinine levels [39, 53, 54], and NNAL [55]). These biospecimens may provide data for future meta-analyses of BOE in both European ancestry and African ancestry samples. TES participants who are current smokers, have smoking topography data, BOE and banked biospecimens are suitable subjects for pharmacogenetic or pharmacometabolic research, e.g., to identify drug metabolizing enzyme and transporter gene associations with existing tobacco-specific BOE, or with as yet undetermined metabolic profiles in 24-h urine. TES biospecimens and data can be used to identify or replicate novel susceptibility or response models, especially in collaborative meta-analyses. Such results may be validated in larger datasets focused on the analysis of tobacco exposures, such as the Population Assessment of Tobacco and Health (PATH) study [56].

Limitations to the resource

The TES was a multi-site, cross-sectional study with collection sites distributed across the U.S. The sample has limited numbers of individuals with self-identified race other than Black or White, and has limited numbers of individuals with self-identified ethnicity of Hispanic. The diversity in geographical collection is an opportunity to evaluate region as a covariate in both cigarette smoke exposure susceptibility and response-to-tobacco models, e.g., comparing BOE by region or state. However, regional diversity also represents a challenge for future analyses due to potential confounding. Some potential confounders can be measured at a molecular level and used as a covariate in analyses, e.g., principal components of population genetic variation [57] can be evaluated by region or by state.

TES participants with banked biospecimens exhibit small statistically significant differences in demographics and biomarkers compared to TES participants without banked biospecimens. With respect to differences in demographics, participants with banked biospecimens were significantly older and more likely to self-identify as White. The smaller proportion of Black TES participants with banked biospecimens compared to White TES participants with banked biospecimens is consistent with contemporaneous observations in epidemiologic cohorts of reduced willingness to provide consent for future genetic testing in the National Health and Nutrition Examination Survey of 1999–2000 [58], and reduced willingness to provide consent for storage of DNA for future genetic testing in the Baltimore Epidemiological Catchment Area study of 2004–2005 [59], even though the TES was not a representative population-based survey based on national or local sampling. With respect to differences in exposure, smokers with banked biospecimens had increased NE per 24 h and reduced serum cotinine, consistent with the differences observed in demographic characteristics. Despite these small statistically significant differences in demographics and exposure between TES participants with and without biospecimens, TES participants with banked biospecimens can be selected by specific clinical and laboratory criteria to create defined datasets for molecular analyses.

Use and availability of TES data and biospecimens

The principal intended result of any analysis of the TES is the generation of knowledge related to smoking and health that is shared with the public health community and in the scientific peer-reviewed literature. SRI and the VTHRR agreed on the following principles regarding use of TES data and biospecimens. First, maintain the integrity of the data and samples, i.e., establish infrastructure to track and make the data and biospecimens secure. Second, ensure that potential users of the TES data and/or biospecimens are scientific researchers or organizations focused on the intended analysis goals of the TES, as assessed by education, experience, or by publication track record. Third, include terms in Material Transfer Agreements requiring recipients of data and/or samples to make reasonable efforts to publish the results of studies approved after scientific advisory committee review in the peer-reviewed scientific literature. Under data-sharing guidance for researchers using Federal (e.g., NIH) funds [60–62] and an Office of Science and Technology Policy memorandum [63], scientists who generate molecular data using array-based or high-throughput genomic technologies are obligated to submit both phenotype and molecular data to qualifying databases.

This is the first time that TES data and biospecimens will be made available to independent scientists in any life sciences area. There is a need for careful, objective scientific analysis of the resource. Consistent with the 2012 recommendation of the U.S. Institute of Medicine to incorporate an independent Tobacco Research Governance Entity [64], SRI has engaged leading experts to form a TES Scientific Advisory Board. This board will provide oversight, review and adjudication of research applications to use the TES data and biorepository resources.

Due to the large size of the TES research resource and the possibilities for integrative analyses, we emphasize our interest in collaborating with individual investigators, groups of investigators, institutions and/or sponsors. Analysis of multiple domains of molecular signatures from TES biospecimens will elucidate the contribution of the genome to exposure susceptibility and the subsequent response of multiple –omic domains to cigarette smoke exposure. Investigators interested in collaborative or independent investigations using the TES data and biospecimens are encouraged to contact the SRI authors.


The TES research resource represents a sample of 4662 current cigarette smokers and tobacco product and nicotine non-users and includes: behavioral and demographic data; cigarette product characteristics; self-reported clinical data and laboratory-based BOE and BOPH; and banked biospecimens suitable for molecular analyses from >3000 participants. We identified small but statistically significantly greater self-reported measures of cigarette consumption and NE in participants who had consented to contribute biospecimens for banking and future analysis, primarily in self-identified non-Hispanic White smokers, compared to those not contributing biospecimens. The sample of TES participants with biospecimens is statistically powered to provide information on existing susceptibility biomarkers in self-identified Blacks and in self-identified Whites, and represents a well-powered resource to identify novel biomarkers of susceptibility and response to cigarette smoke exposures. The TESAC will seek support to enable research efforts to generate and contribute –omic data to research consortia and to public databases, and findings to the peer-reviewed literature. Such findings will contribute to the understanding of the relationship between cigarette smoke exposures and attributable disease.



Abbreviations

TES: Total Exposure Study
BOE: Biomarkers of exposure
BOPH: Biomarkers of potential harm
U.S.: United States
VTHRR: Virginia Tobacco and Health Research Repository
BMI: Body mass index
FTC tar: Federal Trade Commission machine-rated tar
Clinical research organizations
TESAC: TES Analysis Consortium
Potassium ethylenediaminetetraacetic acid
ACD-A: Acid citrate dextrose solution A
PBMC: Peripheral blood mononuclear cell
UCSF: University of California San Francisco
HIPAA: Health Insurance Portability and Accountability Act
Clinical analysis laboratory
NE: Total nicotine equivalents
LC-MS/MS: Liquid chromatography-mass spectrometry-mass spectrometry
3-HPMA: 3-hydroxy-propylmercapturic acid
GC-MS: Gas chromatography–mass spectrometry
MHBMA: Monohydroxyl-butenylmercapturic acid
DHBMA: Dihydroxy-butyl-mercapturic acid
BRFSS: Behavioral Risk Factor Surveillance System
EIA: Enzyme immunoassay
WBC: White blood cell
hs-CRP: High-sensitivity C-reactive protein
HDL: High-density lipoprotein
LDL: Low-density lipoprotein
BP: Blood pressure
FEV1: Forced expiratory volume in 1 second
FVC: Forced expiratory vital capacity
CPD: Cigarettes per day
N: Total number
SD: Standard deviation
CARDIA: Coronary Artery Risk Development in Young Adults
GWAS: Genome-wide association study
NHANES: National Health and Nutrition Examination Survey
nAChR: Nicotinic acetylcholine receptor
PATH: Population Assessment of Tobacco and Health
IOM: Institute of Medicine


  1. Amended final research protocol: a multi-center study to determine the exposure of adult U.S. smokers to cigarette smoke. Philip Morris USA Clinical Evaluation Study No TESMC/01/02, WSA Project No PM-1337, Covance CRU Study No 12226–8451. 2002. []. Accessed 13 Aug 2013.

  2. Roethig HJ, Munjal S, Feng S, Liang Q, Sarkar M, Walk RA, et al. Population estimates for biomarkers of exposure to cigarette smoke in adult U.S. cigarette smokers. Nicotine Tob Res. 2009;11(10):1216–25.

  3. Frost-Pineda K, Liang Q, Liu J, Rimmer L, Jin Y, Feng S, et al. Biomarkers of potential harm among adult smokers and nonsmokers in the total exposure study. Nicotine Tob Res. 2011;13(3):182–93.

  4. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature. 2012;489(7417):519–25.

  5. University of California San Francisco Legacy Tobacco Documents Library [].

  6. Smokescreen® Genotyping Array. 2015. []. Accessed 24 Aug 2015.

  7. Gauderman WJ, Morrison JM. QUANTO: A computer program for power and sample size calculations for genetic-epidemiology studies. Version 1.2; 2006.

  8. Zedler BK, Kinser R, Oey J, Nelson B, Roethig HJ, Walk RA, et al. Biomarkers of exposure and potential harm in adult smokers of 3–7 mg tar yield (Federal Trade Commission) cigarettes and in adult non-smokers. Biomarkers. 2006;11(3):201–20.

  9. Mendes P, Liang Q, Frost-Pineda K, Munjal S, Walk RA, Roethig HJ. The relationship between smoking machine derived tar yields and biomarkers of exposure in adult cigarette smokers in the US. Regul Toxicol Pharmacol. 2009;55(1):17–27.

  10. Wang J, Roethig HJ, Appleton S, Werley M, Muhammad-Kah R, Mendes P. The effect of menthol containing cigarettes on adult smokers’ exposure to nicotine and carbon monoxide. Regul Toxicol Pharmacol. 2010;57(1):24–30.

  11. Warner JH, Liang Q, Sarkar M, Mendes PE, Roethig HJ. Adaptive regression modeling of biomarkers of potential harm in a population of U.S. adult cigarette smokers and nonsmokers. BMC Med Res Methodol. 2010;10:19.

  12. Muhammad-Kah R, Liang Q, Frost-Pineda K, Mendes PE, Roethig HJ, Sarkar M. Factors affecting exposure to nicotine and carbon monoxide in adult cigarette smokers. Regul Toxicol Pharmacol. 2011;61(1):129–36.

  13. Muhammad-Kah RS, Hayden AD, Liang Q, Frost-Pineda K, Sarkar M. The relationship between nicotine dependence scores and biomarkers of exposure in adult cigarette smokers. Regul Toxicol Pharmacol. 2011;60(1):79–83.

  14. Liu J, Liang Q, Frost-Pineda K, Muhammad-Kah R, Rimmer L, Roethig H, et al. Relationship between biomarkers of cigarette smoke exposure and biomarkers of inflammation, oxidative stress, and platelet activation in adult cigarette smokers. Cancer Epidemiol Biomarkers Prev. 2011;20(8):1760–9.

  15. Sarkar M, Wang J, Liang Q. Metabolism of nicotine and 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) in menthol and non-menthol cigarette smokers. Drug Metabolism Letters. 2012;6(3):198–206.

  16. Frost-Pineda K, Muhammad-Kah R, Rimmer L, Liang Q. Predictors, indicators, and validated measures of dependence in menthol smokers. J Addict Dis. 2014;33(2):94–113.

  17. Centers for Disease Control and Prevention. Methodologic changes in the Behavioral Risk Factor Surveillance System in 2011 and potential effects on prevalence estimates. MMWR Morb Mortal Wkly Rep. 2012;61(22):410–3.

  18. Lee DJ, Messiah A. Population biomarker estimates and tobacco exposure: comment on the article by Roethig et al. Nicotine Tob Res. 2010;12(5):540. author reply 541–542.

  19. Sarkar M, Liang Q. Explanation of the design of the total exposure study. Nicotine Tob Res. 2010;12(5):541–2.

  20. Jain RB, Bernert JT. Effect of body mass index and total blood volume on serum cotinine levels among cigarette smokers: NHANES 1999–2008. Clin Chim Acta. 2010;411(15–16):1063–8.

  21. Muscat JE, Djordjevic MV, Colosimo S, Stellman SD, Richie Jr JP. Racial differences in exposure and glucuronidation of the tobacco-specific carcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK). Cancer. 2005;103(7):1420–6.

  22. Rostron B. NNAL exposure by race and menthol cigarette use among U.S. smokers. Nicotine Tob Res. 2013;15(5):950–6.

  23. Benowitz NL, Perez-Stable EJ, Fong I, Modin G, Herrera B, Jacob 3rd P. Ethnic differences in N-glucuronidation of nicotine and cotinine. J Pharmacol Exp Ther. 1999;291(3):1196–203.

  24. Perez-Stable EJ, Herrera B, Jacob 3rd P, Benowitz NL. Nicotine metabolism and intake in black and white smokers. JAMA. 1998;280(2):152–6.

  25. TES Adult Smoker Survey.pdf. 2006. []. Accessed 13 Aug 2013.

  26. TES Adult Non-Smoker Survey.pdf. 2006. []. Accessed 13 Aug 2013.

  27. ATTACHMENT A STATEMENT OF WORK. 2004. []. Accessed 3 Mar 2014.

  28. 20041012 final SAP v15. 2004. []. Accessed 3 Mar 2014.

  29. TOTAL EXPOSURE STUDY. 2005. []. Accessed 3 Mar 2014.

  30. SAS DATASETS. 2004. []. Accessed 26 Sep 2013.

  31. Hochberg Y, Benjamini Y. More powerful procedures for multiple significance testing. Stat Med. 1990;9(7):811–8.

  32. Bosse R, Sparrow D, Garvey AJ, Costa Jr PT, Weiss ST, Rowe JW. Cigarette smoking, aging, and decline in pulmonary function: A longitudinal study. Arch Environ Health. 1980;35(4):247–52.

  33. Van Sickle D, Magzamen S, Mullahy J. Understanding socioeconomic and racial differences in adult lung function. Am J Respir Crit Care Med. 2011;184(5):521–7.

  34. Carfray A, Patel K, Whitaker P, Garrick P, Griffiths GJ, Warwick GL. Albumin as an outcome measure in haemodialysis in patients: the effect of variation in assay method. Nephrol Dial Transplant. 2000;15(11):1819–22.

  35. GTEx Portal [] (2015). Accessed 24 Aug 2015.

  36. Hamidovic A, Goodloe RJ, Bergen AW, Benowitz NL, Styn MA, Kasberger JL, et al. Gene-centric analysis of serum cotinine levels in African and European American populations. Neuropsychopharmacology. 2012;37(4):968–74.

  37. International HapMap Consortium. The International HapMap Project. Nature. 2003;426(6968):789–96.

  38. Thorgeirsson TE, Geller F, Sulem P, Rafnar T, Wiste A, Magnusson KP, et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008;452(7187):638–42.

  39. Munafo MR, Timofeeva MN, Morris RW, Prieto-Merino D, Sattar N, Brennan P, et al. Association between genetic variants on chromosome 15q25 locus and objective measures of tobacco exposure. J Natl Cancer Inst. 2012;104(10):740–8.

  40. Phillips NR, Sprouse ML, Roby RK. Simultaneous quantification of mitochondrial DNA copy number and deletion ratio: a multiplex real-time PCR assay. Scientific Reports. 2014;4:3887.

  41. Ballinger SW, Bouder TG, Davis GS, Judice SA, Nicklas JA, Albertini RJ. Mitochondrial genome damage associated with cigarette smoking. Cancer Res. 1996;56(24):5692–7.

  42. Masayesva BG, Mambo E, Taylor RJ, Goloubeva OG, Zhou S, Cohen Y, et al. Mitochondrial DNA content increase in response to cigarette smoking. Cancer Epidemiol Biomarkers Prev. 2006;15(1):19–24.

  43. Zeilinger S, Kuhnel B, Klopp N, Baurecht H, Kleinschmidt A, Gieger C, et al. Tobacco smoking leads to extensive genome-wide changes in DNA methylation. PLoS One. 2013;8(5):e63812.

  44. Shenker NS, Polidoro S, van Veldhoven K, Sacerdote C, Ricceri F, Birrell MA, et al. Epigenome-wide association study in the European Prospective Investigation into Cancer and Nutrition (EPIC-Turin) identifies novel genetic loci associated with smoking. Hum Mol Genet. 2013;22(5):843–51.

  45. Verdugo RA, Zeller T, Rotival M, Wild PS, Munzel T, Lackner KJ, et al. Graphical modeling of gene expression in monocytes suggests molecular mechanisms explaining increased atherosclerosis in smokers. PLoS One. 2013;8(1):e50888.

  46. Bortner Jr JD, Richie Jr JP, Das A, Liao J, Umstead TM, Stanley A, et al. Proteomic profiling of human plasma by iTRAQ reveals down-regulation of ITI-HC3 and VDBP by cigarette smoking. J Proteome Res. 2011;10(3):1151–9.

  47. Airoldi L, Magagnotti C, Iannuzzi AR, Marelli C, Bagnati R, Pastorelli R, et al. Effects of cigarette smoking on the human urinary proteome. Biochem Biophys Res Commun. 2009;381(3):397–402.

  48. Hsu PC, Zhou B, Zhao Y, Ressom HW, Cheema AK, Pickworth W, et al. Feasibility of identifying the tobacco-related global metabolome in blood by UPLC-QTOF-MS. J Proteome Res. 2013;12(2):679–91.

  49. Benowitz NL, Hukkanen J, Jacob P. Nicotine chemistry, metabolism, kinetics and biomarkers. Handb Exp Pharmacol. 2009;192:29–60.

  50. McGuffey JE, Wei B, Bernert JT, Morrow JC, Xia B, Wang L, et al. Validation of a LC-MS/MS Method for Quantifying Urinary Nicotine, Six Nicotine Metabolites and the Minor Tobacco Alkaloids-Anatabine and Anabasine-in Smokers’ Urine. PLoS One. 2014;9(7):e101816.

  51. Rappaport SM. Biomarkers intersect with the exposome. Biomarkers. 2012;17(6):483–9.

  52. Smoking & Tobacco Use [] (2014). Accessed 23 Jul 2015.

  53. Timofeeva MN, McKay JD, Smith GD, Johansson M, Byrnes GB, Chabrier A, et al. Genetic polymorphisms in 15q25 and 19q13 loci, cotinine levels, and risk of lung cancer in EPIC. Cancer Epidemiol Biomarkers Prev. 2011;20(10):2250–61.

  54. Keskitalo-Vuokko K, Pitkaniemi J, Broms U, Heliovaara M, Aromaa A, Perola M, et al. Associations of nicotine intake measures with CHRN genes in Finnish smokers. Nicotine Tob Res. 2011;13(8):686–90.

  55. Le Marchand L, Derby KS, Murphy SE, Hecht SS, Hatsukami D, Carmella SG, et al. Smokers with the CHRNA lung cancer-associated variants are exposed to higher levels of nicotine equivalents and a carcinogenic tobacco-specific nitrosamine. Cancer Res. 2008;68(22):9137–40.

  56. Overview of the Population Assessment of Tobacco and Health (PATH) Study [] (2013). Accessed 4 Oct 2013.

  57. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.

  58. McQuillan GM, Porter KS, Agelli M, Kington R. Consent for genetic research in a general population: the NHANES experience. Genet Med. 2003;5(1):35–42.

  59. Mezuk B, Eaton WW, Zandi P. Participant characteristics that influence consent for genetic research in a population-based survey: the Baltimore epidemiologic catchment area follow-up. Community Genetics. 2008;11(3):171–8.

  60. Policy for Sharing of Data Obtained in NIH Supported or Conducted Genome-Wide Association Studies (GWAS). 2007. []. Accessed 29 Sep 2013.

  61. Development of Data Sharing Policy for Sequence and Related Genomic Data. 2009. []. Accessed 29 Sep 2013.

  62. Tabak LA. Draft NIH Genomic Data Sharing Policy Request for Public Comments. National Institutes of Health. Federal Register. 2013;78:57860–57865.

  63. Holdren JP. Increasing Access to the Results of Federally Funded Research. Washington, DC: Office of Science and Technology Policy, Executive Office of the President; 2013.

  64. Institute of Medicine (IOM) Committee. Scientific Standards for Studies on Modified Risk Tobacco Products. Washington, DC: National Academies Press; 2012: 350.


We thank the following individuals for helping us get to this stage: Robert T Skunda, Krishna Kodukula, Jocelyn To, Walter Moos, Ian Colrain, Joe Rogers, Laleh Shayesteh, Naseem Chini, Denise Nishita, Vinu Rathee, Jennifer Miller, Gabrielle Leblanc, Joe Perrone, Greg Stauber, Hua Lin, Tom Shaler, Lauren Haberland and Rachel Taketa. The Center for Advanced Drug Research, now SRI Shenandoah Valley, houses TES biospecimens and was established with support from the Commonwealth of Virginia to SRI. We acknowledge funding from SRI International. Andrew W Bergen acknowledges funding from the National Institute on Drug Abuse (DA033813, PI: Bergen). SRI International and the National Institute on Drug Abuse played no role in the design, collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to submit the manuscript for publication. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Andrew W. Bergen.

Additional information

Competing interests

AWB, RK and HSJ disclose employment at SRI as a financial competing interest. SRI is financing the cost of the article-processing charge. GES declares employment at SRI within the last 5 years as a financial competing interest. MDL declares no competing interests in the data reported in this communication. JWB discloses employment at BioRealm LLC as a financial competing interest. XC declares no competing interests. LM and BZ were previously employed by the tobacco company sponsor of the TES and involved in aspects of the original study execution and analysis.

Authors’ contributions

AWB directed statistical analyses of TES data and laboratory analyses of biospecimens, performed power analyses, proposed the goals of a future analysis consortium, and drafted and revised the manuscript. RK performed univariate analyses of TES data. HSJ performed multivariate modeling and permutation of TES data. GES reviewed TES documents and publications and helped to draft and revise the manuscript. MDL suggested the concept for the manuscript and contributed to the goals of a future analysis consortium. JWB contributed to the goals of a future analysis consortium, provided comments on the manuscript, and performed genome-wide genotyping on TES DNA samples. XC contributed to the goals of a future analysis consortium. LM provided comments on the manuscript and provided background on the TES. BZ made extensive contributions to the manuscript, provided background on the TES, and contributed to the goals of a future analysis consortium. All authors approved submission of the final manuscript.

Additional files

Additional file 1:

Total Exposure Study inclusion and exclusion criteria. (PDF 75 kb)

Additional file 2: Table S1.

Biomarkers of exposure in the Total Exposure Study. (PDF 353 kb)

Additional file 3: Table S2.

Biomarkers of potential harm in the Total Exposure Study. (PDF 191 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Bergen, A.W., Krasnow, R., Javitz, H.S. et al. Total Exposure Study Analysis consortium: a cross-sectional study of tobacco exposures. BMC Public Health 15, 866 (2015).
