Table 2 Performance metrics of random forest models based on combinations of personal, neighborhood and environmental characteristics

From: Predicting self-perceived general health status using machine learning: an external exposome study

| Exposure types | 2012 Dataset: N variables | 2012 Dataset: AUC<sup>d</sup> (95% CI) | 2016 Model Dataset: N variables | 2016 Model Dataset: AUC (95% CI) |
| --- | --- | --- | --- | --- |
| Personal characteristics<sup>a</sup> | 29 | 0.875 (0.862 – 0.888) | 34 | 0.898 (0.894 – 0.902) |
| Neighborhood characteristics<sup>b</sup> | 23 | 0.563 (0.544 – 0.582) | 25 | 0.561 (0.553 – 0.569) |
| Environmental characteristics<sup>c</sup> | 29 | 0.572 (0.556 – 0.588) | 32 | 0.550 (0.540 – 0.558) |
| All | 81 | 0.864 (0.852 – 0.876) | 91 | 0.890 (0.885 – 0.895) |

  1. <sup>a</sup> Variables from the Public Health Monitor on sociodemographic characteristics, lifestyle, loneliness, and financial status
  2. <sup>b</sup> Neighborhood variables on social benefits, high- and low-income households, neighborhood demographics, etc.
  3. <sup>c</sup> Air pollution, noise levels, distance to urban green, area biodiversity levels, and neighborhood water surface area variables
  4. <sup>d</sup> Mean AUC of five-fold cross-validated random forest models
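
Footnote d reports each AUC as a mean over five cross-validation folds. The sketch below illustrates that averaging step only, under stated assumptions: the held-out labels and scores in `folds` are hypothetical toy values, not data from the study, and the helper names `auc` and `mean_fold_auc` are invented for illustration. It does not train a random forest and does not reproduce how the confidence intervals in the table were derived, since the table does not say.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_fold_auc(fold_results):
    """fold_results: one (labels, scores) pair per held-out fold.
    Returns the mean AUC and the list of per-fold AUCs."""
    per_fold = [auc(labels, scores) for labels, scores in fold_results]
    return mean(per_fold), per_fold


# Hypothetical held-out predictions for five folds (toy values only).
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
    ([1, 0, 1, 0], [0.6, 0.7, 0.8, 0.1]),
    ([1, 1, 0, 0], [0.5, 0.9, 0.5, 0.3]),
    ([0, 1, 0, 1], [0.1, 0.8, 0.6, 0.4]),
    ([1, 0, 0, 1], [0.3, 0.6, 0.2, 0.9]),
]

mean_auc, per_fold_auc = mean_fold_auc(folds)
print(per_fold_auc)
print(mean_auc)
```

In a real pipeline, each fold's scores would come from a model fitted on the other four folds; the averaging shown here stays the same.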