Table 2 Demographic information and geometric mean of biomarkers for the potential harm classification model

From: Binary classification of users of electronic cigarettes and smokeless tobacco through biomarkers to assess similarity with current and former smokers: machine learning applied to the population assessment of tobacco and health study

   

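| Variable | Statistic | Training data: Full CS1 | Training data: CS2 | Training data: ExSM | Test data: test-CS2 | Test data: test-ExSM | Products: dual EPRODS | Products: EPRODS | Products: dual SMKLS | Products: SMKLS |
|---|---|---|---|---|---|---|---|---|---|---|
| Participants (N) | | 2538 | 343 | 343 | 85 | 85 | 82 | 176 | 44 | 198 |
| Age | Mean | 39.5 | 38.8 | 29.5 | 37.1 | 30.0 | 39.0 | 39.2 | 33.4 | 42.2 |
| | SD | 15.0 | 14.5 | 13.7 | 14.0 | 13.5 | 15.2 | 14.9 | 16.1 | 16.8 |
| Gender (N) | Male | 1217 | 166 | 205 | 40 | 45 | 47 | 74 | - | 187 |
| | Female | 1321 | 177 | 138 | 45 | 40 | 35 | 102 | - | 11 |
| Ethnicity (N) | White | 1975 | 277 | 236 | 62 | 65 | 71 | 154 | - | 172 |
| | Black | 329 | 40 | 53 | 13 | 12 | 4 | 11 | - | 11 |
| | Other | 234 | 26 | 54 | 10 | 8 | 7 | 11 | - | 15 |
| Alcohol (N)3 | | 2280 | 306 | 316 | 78 | 82 | 73 | 165 | 38 | 183 |
| Urban (N)3 | | 2263 | 306 | 336 | 71 | 81 | 74 | 167 | 29 | 164 |
| High blood pressure (N)3 | | 659 | 80 | 49 | 20 | 6 | 18 | 28 | 11 | 65 |
| High cholesterol (N)3 | | 471 | 50 | 50 | 12 | 6 | 11 | 35 | 5 | 47 |
| Diabetes (N)3 | | 337 | 50 | 21 | 8 | - | 10 | 17 | - | 26 |
| Cardiovascular disease (N)4 | | 272 | 34 | 17 | 10 | 5 | 9 | 13 | - | 16 |
| Respiratory diseases (N)4 | | 635 | 100 | 61 | 16 | 14 | 18 | 39 | 6 | 28 |
| FIB (mg/dL) | Mean | 341.5 | 347.5 | 304.6 | 340.0 | 307.8 | 319.0 | 317.6 | 317.6 | 310.4 |
| | SD | 103.1 | 109.8 | 93.1 | 101.0 | 80.6 | 88.6 | 102.2 | 99.8 | 89.9 |
| hsCRP (mg/L) | Mean | 4.0 | 5.2 | 2.5 | 4.2 | 2.1 | 3.1 | 3.2 | 2.4 | 3.2 |
| | SD | 7.8 | 10.9 | 4.2 | 7.0 | 3.0 | 4.1 | 5.1 | 3.7 | 6.6 |
| IL6 (pg/mL) | Mean | 2.3 | 2.5 | 1.7 | 2.0 | 1.4 | 2.2 | 1.8 | 2.5 | 2.1 |
| | SD | 2.1 | 2.4 | 1.8 | 1.9 | 1.4 | 2.3 | 1.8 | 2.7 | 2.0 |
| sICAM (ng/mL) | Mean | 299.0 | 307.4 | 217.3 | 285.7 | 218.9 | 298.6 | 254.5 | 272.0 | 242.0 |
| | SD | 113.7 | 125.8 | 69.0 | 119.1 | 56.5 | 103.0 | 90.0 | 88.6 | 85.3 |
| 8PGFT (ng/g*CRE) | Mean | 688.1 | 682.3 | 391.1 | 605.2 | 417.1 | 680.6 | 541.7 | 560.7 | 485.9 |
| | SD | 403.4 | 441.2 | 200.9 | 232.7 | 242.4 | 460.9 | 348.3 | 246.7 | 516.0 |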

Abbreviations: 8PGFT 8-isoprostane, CS cigarette smoker, CRE creatinine, dual-EPRODS current smoker and user of electronic cigarettes, dual-SMKLS current smoker and user of smokeless tobacco, ExSM former smoker, EPRODS user of electronic cigarettes, FIB fibrinogen, hsCRP high-sensitivity C-reactive protein, IL6 interleukin 6, sICAM soluble intercellular adhesion molecule, SD standard deviation, SMKLS user of smokeless tobacco, test-CS test dataset of current smokers, test-ExSM test dataset of former smokers

Footnotes (referenced by the trailing digits 1 to 4 on column and row labels):

  1. Full data of all current smokers
  2. Random selection from data of all current smokers to match the number of former smokers for machine learning
  3. Number of participants who answered “Yes” in the questionnaire
  4. Number of participants who answered “Yes” in the relevant disease questionnaire

A dash (-) marks a suppressed cell: in accordance with ICPSR data release rules, tables with cell sizes smaller than the threshold for the specific dataset will not be released
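
The random selection described in footnote 2 (drawing current smokers so that their number matches the former smokers) can be illustrated with a minimal sketch. It assumes simple random sampling without replacement; the function name `undersample_to_match`, the integer participant identifiers, and the fixed seed are hypothetical and not taken from the study. Whether the authors drew the training and test subsets together or separately is not stated in this table, so only the training-set sizes (2538 and 343) are used.

```python
import random


def undersample_to_match(majority_ids, minority_count, seed=0):
    """Randomly pick `minority_count` items from `majority_ids` without replacement."""
    majority_ids = list(majority_ids)
    if minority_count > len(majority_ids):
        raise ValueError("cannot draw more samples than the majority group contains")
    # A fixed seed (an assumption, not from the source) keeps the draw reproducible.
    rng = random.Random(seed)
    return rng.sample(majority_ids, minority_count)


# Hypothetical participant identifiers; the group sizes come from the table.
full_cs = range(2538)  # all current smokers (Full CS)
n_exsm = 343           # former smokers in the training data (ExSM)

cs_balanced = undersample_to_match(full_cs, n_exsm)
print(len(cs_balanced))  # 343
```

Balancing the two classes this way means a classifier cannot reach high accuracy simply by always predicting the larger group.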