Table 2. Model performance (with 95% confidence intervals) by number of features used

From: Prediction of metabolic and pre-metabolic syndromes using machine learning models with anthropometric, lifestyle, and biochemical factors from a middle-aged population in Korea

 

| Model | F1-score (Original) | F1-score (SMOTE) | Accuracy (Original) | Accuracy (SMOTE) | Sensitivity (Original) | Sensitivity (SMOTE) | Specificity (Original) | Specificity (SMOTE) | AUC (Original) | AUC (SMOTE) |
|---|---|---|---|---|---|---|---|---|---|---|
| 4 features (demographic and anthropometric features) | | | | | | | | | | |
| Decision Tree | 0.711 (0.66–0.76) | 0.758 (0.71–0.80) | 0.711 (0.66–0.76) | 0.758 (0.71–0.80) | 0.573 (0.52–0.63) | 0.758 (0.71–0.80) | 0.782 (0.74–0.83) | 0.758 (0.71–0.80) | 0.677 (0.63–0.73) | 0.758 (0.71–0.80) |
| Gaussian NB | 0.789 (0.75–0.83) | 0.780 (0.74–0.82) | 0.790 (0.75–0.83) | 0.780 (0.74–0.82) | 0.684 (0.63–0.73) | 0.790 (0.75–0.83) | 0.844 (0.80–0.88) | 0.769 (0.72–0.81) | 0.764 (0.72–0.81) | 0.780 (0.74–0.82) |
| KNN | 0.774 (0.73–0.82) | 0.783 (0.74–0.83) | 0.777 (0.73–0.82) | 0.783 (0.74–0.83) | 0.619 (0.57–0.67) | 0.826 (0.79–0.87) | 0.859 (0.82–0.90) | 0.740 (0.69–0.79) | 0.739 (0.69–0.79) | 0.783 (0.74–0.83) |
| XGBoost | 0.771 (0.73–0.82) | 0.802 (0.76–0.84) | 0.773 (0.73–0.82) | 0.802 (0.76–0.85) | 0.626 (0.57–0.68) | 0.812 (0.77–0.85) | 0.848 (0.81–0.89) | 0.792 (0.75–0.84) | 0.737 (0.69–0.78) | 0.802 (0.76–0.85) |
| RF | 0.772 (0.73–0.82) | 0.813 (0.77–0.86) | 0.774 (0.73–0.82) | 0.814 (0.77–0.86) | 0.628 (0.58–0.68) | 0.832 (0.79–0.87) | 0.850 (0.81–0.89) | 0.795 (0.75–0.84) | 0.739 (0.69–0.79) | 0.814 (0.77–0.86) |
| Logistic R | 0.777 (0.73–0.82) | 0.783 (0.74–0.83) | 0.787 (0.74–0.83) | 0.784 (0.74–0.83) | 0.558 (0.50–0.61) | 0.799 (0.76–0.84) | 0.904 (0.87–0.94) | 0.768 (0.72–0.81) | 0.731 (0.68–0.78) | 0.784 (0.74–0.83) |
| SVM | 0.787 (0.74–0.83) | 0.785 (0.74–0.83) | 0.795 (0.75–0.84) | 0.785 (0.74–0.83) | 0.585 (0.53–0.64) | 0.809 (0.77–0.85) | 0.903 (0.87–0.93) | 0.762 (0.72–0.81) | 0.744 (0.70–0.79) | 0.786 (0.74–0.83) |
| MLP | 0.785 (0.74–0.83) | 0.770 (0.72–0.82) | 0.792 (0.75–0.84) | 0.772 (0.73–0.82) | 0.607 (0.55–0.66) | 0.735 (0.69–0.78) | 0.887 (0.85–0.92) | 0.809 (0.77–0.85) | 0.747 (0.70–0.79) | 0.772 (0.73–0.82) |
| 1D-CNN | 0.779 (0.73–0.82) | 0.783 (0.74–0.83) | 0.782 (0.74–0.83) | 0.784 (0.74–0.83) | 0.657 (0.61–0.71) | 0.784 (0.74–0.83) | 0.846 (0.81–0.88) | 0.784 (0.74–0.83) | 0.752 (0.71–0.80) | 0.784 (0.74–0.83) |
| 12 features (lifestyle-related features added) | | | | | | | | | | |
| Decision Tree | 0.722 (0.67–0.77) | 0.765 (0.72–0.81) | 0.724 (0.68–0.77) | 0.765 (0.72–0.81) | 0.570 (0.52–0.62) | 0.776 (0.73–0.82) | 0.803 (0.76–0.85) | 0.755 (0.71–0.80) | 0.686 (0.64–0.74) | 0.765 (0.72–0.81) |
| Gaussian NB | 0.775 (0.73–0.82) | 0.766 (0.72–0.81) | 0.774 (0.73–0.82) | 0.766 (0.72–0.81) | 0.685 (0.64–0.74) | 0.773 (0.73–0.82) | 0.820 (0.78–0.86) | 0.759 (0.71–0.81) | 0.753 (0.71–0.80) | 0.766 (0.72–0.81) |
| KNN | 0.738 (0.69–0.78) | 0.780 (0.73–0.82) | 0.743 (0.70–0.79) | 0.782 (0.74–0.83) | 0.551 (0.50–0.60) | 0.879 (0.84–0.91) | 0.842 (0.80–0.88) | 0.685 (0.63–0.73) | 0.696 (0.65–0.75) | 0.782 (0.74–0.83) |
| XGBoost | 0.778 (0.73–0.82) | 0.834 (0.79–0.87) | 0.782 (0.74–0.83) | 0.834 (0.79–0.87) | 0.622 (0.57–0.67) | 0.837 (0.80–0.88) | 0.863 (0.83–0.90) | 0.832 (0.79–0.87) | 0.743 (0.70–0.79) | 0.834 (0.79–0.87) |
| RF | 0.791 (0.75–0.83) | 0.838 (0.80–0.88) | 0.795 (0.75–0.84) | 0.838 (0.80–0.88) | 0.635 (0.58–0.69) | 0.850 (0.81–0.89) | 0.876 (0.84–0.91) | 0.826 (0.79–0.87) | 0.756 (0.71–0.80) | 0.838 (0.80–0.88) |
| Logistic R | 0.785 (0.74–0.83) | 0.779 (0.73–0.82) | 0.792 (0.75–0.84) | 0.779 (0.73–0.82) | 0.595 (0.54–0.65) | 0.791 (0.75–0.83) | 0.893 (0.86–0.93) | 0.767 (0.72–0.81) | 0.744 (0.70–0.79) | 0.779 (0.73–0.82) |
| SVM | 0.790 (0.75–0.83) | 0.783 (0.74–0.83) | 0.797 (0.75–0.84) | 0.783 (0.74–0.83) | 0.605 (0.55–0.66) | 0.796 (0.75–0.84) | 0.894 (0.86–0.93) | 0.770 (0.72–0.82) | 0.750 (0.70–0.80) | 0.783 (0.74–0.83) |
| MLP | 0.772 (0.73–0.82) | 0.797 (0.75–0.84) | 0.778 (0.73–0.82) | 0.798 (0.75–0.84) | 0.619 (0.57–0.67) | 0.790 (0.75–0.83) | 0.859 (0.82–0.90) | 0.806 (0.76–0.85) | 0.739 (0.69–0.79) | 0.798 (0.75–0.84) |
| 1D-CNN | 0.771 (0.73–0.82) | 0.770 (0.72–0.82) | 0.776 (0.73–0.82) | 0.774 (0.73–0.82) | 0.635 (0.58–0.69) | 0.861 (0.82–0.90) | 0.848 (0.81–0.89) | 0.688 (0.64–0.74) | 0.742 (0.69–0.79) | 0.775 (0.73–0.82) |
| 20 features (biochemical measurements added) | | | | | | | | | | |
| Decision Tree | 0.743 (0.70–0.79) | 0.777 (0.73–0.82) | 0.743 (0.70–0.79) | 0.778 (0.73–0.82) | 0.631 (0.58–0.68) | 0.797 (0.75–0.84) | 0.801 (0.76–0.84) | 0.758 (0.71–0.80) | 0.716 (0.67–0.76) | 0.778 (0.73–0.82) |
| Gaussian NB | 0.786 (0.74–0.83) | 0.759 (0.71–0.81) | 0.795 (0.75–0.84) | 0.762 (0.72–0.81) | 0.577 (0.52–0.63) | 0.646 (0.59–0.70) | 0.906 (0.87–0.94) | 0.878 (0.84–0.91) | 0.741 (0.69–0.79) | 0.762 (0.72–0.81) |
| KNN | 0.748 (0.70–0.79) | 0.787 (0.74–0.83) | 0.756 (0.71–0.80) | 0.788 (0.74–0.83) | 0.540 (0.49–0.59) | 0.871 (0.83–0.91) | 0.866 (0.83–0.90) | 0.705 (0.66–0.75) | 0.703 (0.65–0.75) | 0.788 (0.74–0.83) |
| XGBoost | 0.801 (0.76–0.84) | 0.851 (0.81–0.89) | 0.804 (0.76–0.85) | 0.851 (0.81–0.89) | 0.662 (0.61–0.71) | 0.859 (0.82–0.90) | 0.877 (0.84–0.91) | 0.843 (0.80–0.88) | 0.769 (0.72–0.81) | 0.851 (0.81–0.89) |
| RF | 0.815 (0.77–0.86) | 0.843 (0.80–0.88) | 0.818 (0.78–0.86) | 0.844 (0.80–0.88) | 0.690 (0.64–0.74) | 0.857 (0.82–0.89) | 0.883 (0.85–0.92) | 0.831 (0.79–0.87) | 0.786 (0.74–0.83) | 0.844 (0.80–0.88) |
| Logistic R | 0.812 (0.77–0.85) | 0.804 (0.76–0.85) | 0.818 (0.78–0.86) | 0.804 (0.76–0.85) | 0.638 (0.59–0.69) | 0.812 (0.77–0.85) | 0.910 (0.88–0.94) | 0.796 (0.75–0.84) | 0.774 (0.73–0.82) | 0.804 (0.76–0.85) |
| SVM | 0.811 (0.77–0.85) | 0.810 (0.77–0.85) | 0.817 (0.78–0.86) | 0.810 (0.77–0.85) | 0.636 (0.58–0.69) | 0.831 (0.79–0.87) | 0.909 (0.88–0.94) | 0.790 (0.75–0.83) | 0.773 (0.73–0.82) | 0.810 (0.77–0.85) |
| MLP | 0.807 (0.76–0.85) | 0.811 (0.77–0.85) | 0.812 (0.77–0.85) | 0.812 (0.77–0.85) | 0.638 (0.59–0.69) | 0.836 (0.80–0.88) | 0.901 (0.87–0.93) | 0.787 (0.74–0.83) | 0.770 (0.72–0.81) | 0.812 (0.77–0.85) |
| 1D-CNN | 0.799 (0.76–0.84) | 0.814 (0.77–0.86) | 0.803 (0.76–0.85) | 0.815 (0.77–0.86) | 0.662 (0.61–0.71) | 0.807 (0.76–0.85) | 0.875 (0.84–0.91) | 0.822 (0.78–0.86) | 0.768 (0.72–0.81) | 0.815 (0.77–0.86) |
  1. Results are shown before (Original) and after (SMOTE) applying the synthetic minority oversampling technique.
  2. Abbreviations: AUC, area under the receiver operating characteristic curve; Gaussian NB, Gaussian naïve Bayes classifier; KNN, k-nearest neighbors; XGBoost, extreme gradient boosting; Logistic R, logistic regression; RF, random forest; SVM, support vector machine; MLP, multilayer perceptron; 1D-CNN, one-dimensional convolutional neural network.
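
For readers who want to relate the columns to one another, the following is a minimal Python sketch of how sensitivity, specificity, accuracy, and F1 follow from confusion-matrix counts. The function names (`confusion_metrics`, `hard_label_auc`) and the example counts are hypothetical and do not come from the source. The sketch also illustrates a general fact: when ROC AUC is computed from hard class labels rather than predicted probabilities, it reduces to the mean of sensitivity and specificity. Several rows of the table appear consistent with that relation, though the source table does not state how AUC, F1 averaging, or the confidence intervals were computed, so this is an observation rather than a description of the authors' method.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts.

    F1 here is the binary F1 for the positive class; the source table
    does not say which averaging it used, so this is an assumption.
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def hard_label_auc(sensitivity, specificity):
    """ROC AUC of a classifier that outputs only hard labels.

    The ROC curve then has a single interior point
    (1 - specificity, sensitivity), and the area under the two line
    segments through (0, 0) and (1, 1) is (sensitivity + specificity) / 2.
    """
    return (sensitivity + specificity) / 2


# Hypothetical counts, for illustration only.
example = confusion_metrics(tp=40, fp=10, tn=40, fn=10)

# (label, sensitivity, specificity, reported AUC) copied from the table.
table_rows = [
    ("KNN, 4 features, Original", 0.619, 0.859, 0.739),
    ("Gaussian NB, 4 features, Original", 0.684, 0.844, 0.764),
    ("RF, 20 features, SMOTE", 0.857, 0.831, 0.844),
]

# Gap between the reported AUC and the mean of sensitivity and specificity.
auc_gaps = [
    abs(hard_label_auc(sens, spec) - auc) for _, sens, spec, auc in table_rows
]

if __name__ == "__main__":
    print(example)
    for (label, _, _, auc), gap in zip(table_rows, auc_gaps):
        print(f"{label}: reported AUC {auc:.3f}, gap {gap:.4f}")
```

For the three rows checked, the gap is within the rounding of the reported three-decimal values. This relation also explains why accuracy and AUC tend to coincide in the SMOTE columns: when the two classes are balanced, accuracy is likewise the mean of sensitivity and specificity.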