Machine-learning model predicting quality of life using multifaceted lifestyles in middle-aged South Korean adults: a cross-sectional study

Abstract

Background

In the context of population aging and advances in healthcare technology, healthy aging and higher quality of life (QOL) have become a central focus in public health, particularly among middle-aged adults.

Methods

This study presented an optimal prediction model for QOL among middle-aged South Korean adults (N = 4,048; aged 30–55 years) using a machine-learning technique. Community-based South Korean population data were sampled through multistage stratified cluster sampling. Twenty-one variables related to individual factors and various lifestyle patterns were surveyed. QOL was assessed using the Short Form Health Survey (SF-12) and categorized into total QOL, physical component score (PCS), and mental component score (MCS). Seven machine-learning algorithms were used to predict QOL: decision tree, Gaussian Naïve Bayes, k-nearest neighbor, logistic regression, extreme gradient boosting, random forest, and support vector machine. Data imbalance was resolved with the synthetic minority oversampling technique (SMOTE). Random forest was used to compare feature importance and visualize the importance of each variable.

Results

For predicting QOL deterioration, the random forest method showed the highest performance. The random forest algorithm using SMOTE showed the highest area under the receiver operating characteristic curve (AUC) for total QOL (0.822), PCS (0.770), and MCS (0.786). Applying SMOTE to the data enhanced model performance by up to 0.111 AUC. Although feature importance differed across the three QOL indices, stress and sleep quality were identified as the most potent predictors of QOL. Random forest generated the most accurate prediction of QOL among middle-aged adults; the model showed that stress and sleep quality management were essential for improving QOL.

Conclusion

The results highlighted the need to develop a health management program for middle-aged adults that enables multidisciplinary management of QOL.

Introduction

Quality of life (QOL) is a broad and complicated construct encompassing important domains of daily functioning and subjective experiences; these include physical and social role functioning, physical sensation, and subjective well-being [1]. The importance of QOL (pursuing good health and not living with major illnesses) is highlighted by the 6.6-year increase in global average life expectancy and the 1.3% decrease in premature, preventable mortality from non-communicable diseases [2, 3]. Moreover, aging well comprises minimizing physical and mental deterioration, preserving QOL, and retaining the ability to enjoy a meaningful life [4].

QOL is influenced by various factors, including demographic and socioeconomic factors and comorbidities [5, 6]. Many clinical and epidemiological studies have presented evidence of the positive impact of healthy lifestyle practices on QOL. A path analysis model showed that multiple health practices (comfortable sleep, adequate physical activity, and fruit and vegetable intake) are associated with overall QOL. Adherence to a systematic lifestyle modification program leads to improvements in QOL at the one-year follow-up [7, 8]. A meta-analysis of randomized controlled trials of lifestyle interventions in patients with metabolic syndrome showed that such interventions produce marked improvements in physical and mental QOL compared with regular care [9]. In addition, healthy lifestyle practices and positive QOL reduce chronic disease burden and mortality risk among middle-aged adults [10, 11]. However, healthy lifestyle practice among middle-aged adults was much lower than that among younger and older adults, which consequently impaired their QOL. This finding highlights the need to design age group-specific health behavior interventions that effectively improve QOL [7].

In recent years, machine learning (ML) has been widely used to manage big clinical data. This trend is important for understanding the complexity of, and nonlinear relations among, multiple factors [12]. A multidimensional analysis of QOL calls for sophisticated techniques that enable automated analyses. One study identified five significant predictors of health-related QOL among older adults with chronic diseases and confirmed that stepwise logistic regression produces an effective QOL prediction model [13]. A study on depression, which is strongly associated with QOL, attempted to establish an ML model to predict QOL using demographic and psychometric data [14]. It found that ML algorithms show superior predictive performance to conventional logistic regression and shed light on the potential of ML in individualized mental health management [14]. In Korea, a study on QOL prediction among older adults using the 36-item Short Form Survey (SF-36) conducted an elastic net-based analysis and reported that grip strength is strongly associated with older adults’ QOL [15]. These studies have demonstrated the effectiveness of ML in utilizing clinical data, and more active research on this topic is anticipated. In this study, we established an optimal ML model for predicting QOL, considering individual factors and multifaceted lifestyle factors of middle-aged adults in Korea. We also examined the importance and effects of various factors influencing overall, physical, and mental QOL.

Materials and methods

Study population and sampling

This cross-sectional study used data from the 2017–2019 Korean Medicine Daejeon Citizen Cohort study [16]. The inclusion criteria were residents of Daejeon—a city in South Korea—aged 30–55 years. The exclusion criteria were diagnosis of cancer (malignant tumor) or cardiovascular disease (e.g., myocardial infarction, angina, stroke/apoplexy) and difficulty responding to the questionnaire.

We performed multistage stratified cluster sampling. In the first stage, we divided Daejeon into five administrative units (gu), based on the resident registration population (approximately 610,000) aged 30–55, defined as middle-aged in this study [16, 17]. We then identified the sample size and survey point for each unit using probability proportional to size. In the second stage, we allocated samples proportional to sex (men, women) and age (30–39 years, 40–49 years, and 50–55 years) by unit. If the survey could not be performed at the identified survey point, the survey site was moved to another location within the same stratum. All participants were selected randomly at the survey point. We used a structured questionnaire in Korean that included questions on demographic factors (4 items), physical measurement and cold–heat pattern factors (4 items), lifestyle factors (13 items), and QOL, administered via one-on-one interviews.

A total of 4,063 participants completed the survey. After excluding 15 with missing data for significant variables, we analyzed data from 4,048 participants, consisting of 1,751 men and 2,297 women (ratio = 1:1.31).

Measurements

Demographic factors

Individual demographic factors were sex, age, marital status, household income, and disease history. Marital status was divided into married and single (including never married, divorced, or widowed). Monthly household income was divided into ≤ 2.99, 3.00–4.99, and ≥ 5 million KRW. Chronic conditions were determined based on disease history (physician’s diagnosis of hypertension, diabetes mellitus, and hyperlipidemia) and obesity (high BMI). Based on the presence of these four conditions, chronic condition was categorized as 0 or ≥ 1 [18].

Physical measurement and cold–heat pattern

Individual physical measurement and cold–heat pattern factors were height, weight, BMI, and cold–heat pattern identification. BMI was calculated by dividing self-reported weight (kg) by height squared (m²); with reference to a cutoff of 25 kg/m², participants were categorized as being of normal weight or obese [19]. Cold–heat pattern identification is a Korean medicine pattern identified for each person, based on their preference for or sensitivity to cold or hot temperatures and the temperature of their hands and feet. In traditional East Asian medicine, this pattern is used to provide health management for patients [20]. We analyzed the cold and heat pattern scores using the cold–heat pattern identification questionnaire, which consists of eight items for the cold pattern and seven items for the heat pattern [21]. Each pattern score was calculated as the sum of the items in that pattern, each rated on a five-point response scale from 1 to 5, with higher scores indicating a stronger cold or heat pattern. The questionnaire was acceptably reliable (Cronbach’s α coefficient = 0.75) and valid (72.9–82.8% agreement, compared with the examinations of two professionals) [21].
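The two calculations above can be sketched in Python as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, and treating a BMI of exactly 25 kg/m² as obese is an assumption, since the text only names 25 kg/m² as the reference value.

```python
from typing import Sequence


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float, cutoff: float = 25.0) -> str:
    """Two-group split at the cutoff (assumption: cutoff itself counts as obese)."""
    return "obese" if value >= cutoff else "normal"


def pattern_score(item_responses: Sequence[int]) -> int:
    """Sum of 1-5 responses for one pattern (8 cold items or 7 heat items)."""
    if any(r < 1 or r > 5 for r in item_responses):
        raise ValueError("each response must be between 1 and 5")
    return sum(item_responses)
```

With eight items, the cold pattern score therefore ranges from 8 to 40; with seven items, the heat pattern score ranges from 7 to 35.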

Lifestyles

For alcohol consumption, we calculated the average volume of alcohol per day (g/day) based on drinking frequency (times/day), the volume of alcohol per sitting (drinks/sitting), and alcohol content (g/drink) for different types of alcohol in the past year. Using sex-specific criteria for the average volume of alcohol per day, we divided participants into non-drinkers, responsible drinkers, hazardous drinkers, and harmful drinkers [22]. Smoking status was assessed using the questions “Have you smoked more than 100 cigarettes in your lifetime?” and “Do you currently smoke?” Based on the responses, participants were categorized as current, past, or non-smokers.
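The daily alcohol volume described above is a product of three quantities summed over beverage types. The sketch below shows that arithmetic; the function name and the example beverage values are hypothetical, and the sex-specific category thresholds from [22] are not reproduced because the text does not state them.

```python
def daily_alcohol_grams(beverages):
    """Average alcohol intake in g/day.

    beverages: iterable of (times_per_day, drinks_per_sitting, grams_per_drink),
    one tuple per beverage type.
    """
    return sum(freq * drinks * grams for freq, drinks, grams in beverages)


# Hypothetical respondent: one beverage twice a week (4 drinks of 8 g each)
# and another once a week (2 drinks of 10 g each).
example = [(2 / 7, 4, 8.0), (1 / 7, 2, 10.0)]
average = daily_alcohol_grams(example)
```

For this hypothetical respondent the result is 64/7 + 20/7 = 12 g/day.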

Night snacking was assessed with the question, “Do you frequently eat snacks after dinner or before bed?” The responses were divided into 0–1, 2–3, and ≥ 4 times a week. The eating index was assessed using a semi-quantitative food frequency questionnaire covering the frequency and average intake of 34 food groups. The eating index was composed of 14 components covering adequacy, moderation, and balance; the total score, ranging from 0 to 100, was obtained by adding up the component scores following the previously reported calculation method of the Korean Healthy Eating Index [23]. A higher eating index represents healthier eating.

Sleep in the past month was assessed using the 19-item Pittsburgh Sleep Quality Index Korean version (PSQI-K), which has high reliability and validity (Cronbach’s α coefficient = 0.84) [24]. The PSQI score ranges from 0 to 21: the seven component scores, each ranging from 0 to 3, are summed, with higher scores indicating worse sleep quality. We used sleep duration (hours) and sleep quality (PSQI score); sleep quality was divided into two groups based on a cutoff of 5 (good sleeper vs. poor sleeper) [25].
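As a small illustration of the PSQI scoring just described, the sketch below sums seven component scores and applies the cutoff. The function names are hypothetical, and classifying a total above 5 as a poor sleeper follows the convention of the original PSQI paper [25]; the text itself states only that the cutoff was 5.

```python
def psqi_total(components):
    """Global PSQI score: sum of seven component scores, each 0-3 (range 0-21)."""
    if len(components) != 7:
        raise ValueError("PSQI has exactly seven components")
    if any(c not in (0, 1, 2, 3) for c in components):
        raise ValueError("each component score must be 0, 1, 2, or 3")
    return sum(components)


def sleeper_group(total, cutoff=5):
    """Assumption: a total above the cutoff is a poor sleeper, otherwise good."""
    return "poor" if total > cutoff else "good"
```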

Physical activity was assessed using the Korean Global Physical Activity Questionnaire (GPAQ) developed by the World Health Organization (WHO) [26]. According to the GPAQ analysis guidebook, we calculated each domain-related physical activity (minutes/week) for work (high, moderate intensity), transport (walking or riding a bike), and recreation (high, moderate intensity), as well as sedentary time (minutes/day). We used the metabolic equivalent of task (MET), which represents the intensity of physical activity, to calculate physical activity in MET-minutes/week. Finally, we assessed stress levels using the 18-item Psychosocial Well-being Index – Short Form, which has high internal consistency (Cronbach’s α coefficient = 0.90) [27]. Each item was evaluated on a four-point response scale from 0 to 3, with the total score ranging from 0 to 54 and a higher score indicating greater stress.
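The conversion to MET-minutes/week can be sketched as below. The MET weights (8 for high intensity, 4 for moderate intensity and for transport) are taken from the WHO GPAQ analysis guide and are not stated in the text above, so they should be read as an assumption about what the authors applied; the function and parameter names are hypothetical.

```python
# MET weights from the WHO GPAQ analysis guide (assumed, not stated in the text).
MET_VIGOROUS = 8.0
MET_MODERATE = 4.0  # also applied to walking or cycling for transport


def met_minutes_per_week(vigorous_work=0.0, moderate_work=0.0, transport=0.0,
                         vigorous_recreation=0.0, moderate_recreation=0.0):
    """Total physical activity in MET-minutes/week.

    Every argument is the time spent in that domain in minutes/week.
    """
    return (MET_VIGOROUS * (vigorous_work + vigorous_recreation)
            + MET_MODERATE * (moderate_work + transport + moderate_recreation))
```

For example, 60 minutes of moderate work, 150 minutes of transport, and 30 minutes of high-intensity recreation per week give 4 × 210 + 8 × 30 = 1,080 MET-minutes/week.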

Health-related QOL

We assessed QOL using the Short Form Health Survey (SF-12), widely used to measure physical and mental health [28]. Although it is a short form of a 36-item survey, it is useful for clinical research and in measuring the overall impact of disease on a patient’s life [29]. This instrument consists of a physical component score (PCS) and a mental component score (MCS). The PCS includes physical functioning, role-physical, bodily pain, and general health. The MCS includes mental health, role-emotional, social functioning, and vitality. We applied norm-based scoring to the calculated PCS and MCS to convert them to a score with an average of 50 and a standard deviation of 10. The total score for each index ranges from 0 to 100, and higher scores indicate better QOL [30].
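The norm-based step mentioned above is a linear rescaling to a mean of 50 and a standard deviation of 10. The sketch below shows only that rescaling; the reference mean and standard deviation are hypothetical placeholders, and the full SF-12 algorithm (item weights and published norms) is not reproduced here.

```python
def norm_based_score(raw, reference_mean, reference_sd):
    """Rescale a raw component score so the reference population has mean 50, SD 10."""
    if reference_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    z = (raw - reference_mean) / reference_sd  # standardized (z) score
    return 50.0 + 10.0 * z
```

A raw score equal to the reference mean maps to 50, and a raw score one standard deviation above it maps to 60.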

This study used three QOL indices (PCS, MCS, and total QOL [sum of PCS and MCS]). We defined a poor QOL group to establish a prediction model. Each of the three scores was divided into terciles, and a score below the lowest tercile was classified as low QOL, with any scores at or above the lowest tercile classified as high QOL. The cutoff for the lowest tercile was 49.69 for PCS, 48.46 for MCS, and 98.40 for total QOL.
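The dichotomization described above can be sketched as follows, using the cutoffs reported in the text. The function names are hypothetical, and `lower_tercile` uses Python's default quantile method, which is an assumption because the text does not say how the tercile boundary was computed.

```python
import statistics

# Lowest-tercile cutoffs reported in the text.
CUTOFFS = {"PCS": 49.69, "MCS": 48.46, "total": 98.40}


def lower_tercile(scores):
    """First of the two cut points that split the scores into three groups."""
    # Assumption: default method of statistics.quantiles; the study's is unstated.
    return statistics.quantiles(scores, n=3)[0]


def qol_group(score, index):
    """'low' for a score below the cutoff, 'high' for a score at or above it."""
    return "low" if score < CUTOFFS[index] else "high"
```

Because only the lowest third is labeled low QOL, the two classes are expected to be in a ratio of roughly 2:1.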

Data analysis

Data are expressed as mean and standard deviation, and frequency and percentage. We compared the general characteristics of participants between the high and the low QOL groups using Fisher’s exact or chi-squared test for categorical variables and independent t-tests for continuous variables.

We used supervised ML to build prediction models for low QOL. The algorithms used to develop the models were decision tree, Gaussian Naïve Bayes, k-nearest neighbor, logistic regression, XGBoost, random forest, and support vector machine. Min-max normalization was applied to each variable. The dataset was split into training and validation sets using six-fold cross-validation to compare model performance, giving a training-to-validation ratio of 5:1. Of the 4,048 records, about 3,373 and 675 were used for training and validation, respectively, in each fold. In addition, the 2:1 ratio of the high and low QOL groups was kept the same in the training and validation sets. To address the class imbalance (1,336 records in the low QOL group vs. 2,712 in the high QOL group), we applied oversampling using the synthetic minority oversampling technique (SMOTE). SMOTE generates synthetic minority-group records by interpolating between an existing record and one of its nearest neighbors, determined by Euclidean distance [31]. Because the synthetic data resemble the existing data, we compared model performance before and after applying SMOTE. The random forest model, which showed high performance, was used to analyze the importance of variables in each of the three QOL prediction models (total QOL, PCS, and MCS).
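To make the preprocessing concrete, the following standard-library sketch shows min-max normalization and the core idea of SMOTE: each synthetic record lies on the line segment between a minority record and one of its nearest minority neighbors. It is a simplified illustration, not the implementation used in the study; the function names, the default of five neighbors, and the seed are assumptions.

```python
import math
import random


def min_max_scale(column):
    """Rescale one variable to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: map every value to 0.0
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class rows."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic
```

With the class counts given above, balancing the groups would require 2,712 − 1,336 = 1,376 synthetic low-QOL records. A common precaution is to oversample only the training folds so that validation data remain untouched; the text does not state which approach was taken.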

We assessed the performance of the QOL prediction models using five indices: F1-score, accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). Data were analyzed using Python 3.8.10 (Python Software Foundation, PSF) with the scikit-learn library. For analysis and comparison, we built each model using default parameters. Finally, the explanatory power of the prediction model was analyzed using Shapley additive explanations (SHAP) [32]. SHAP analysis was performed with the SHAP library in Python, using a tree-based model.
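The five indices can be written out from their definitions as in the sketch below. It assumes that low QOL is the positive class (label 1) and computes the F1-score for that class only; the text does not say which averaging the authors used, and the function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 for binary labels (1 = low QOL)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall for the positive class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and makes explicit that AUC depends on the ranking of predicted scores rather than on a single classification threshold.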

Results

General characteristics

Table 1 shows participants’ general characteristics and the differences between the high and low total QOL groups. The two groups significantly differed in all variables, except marital status and alcohol consumption.

Table 1 Participants’ general characteristics

Comparison of machine-learning models for predicting QOL

As shown in Table 2, we created total QOL, PCS, and MCS prediction models and compared their performances. For total QOL, the AUC of the models ranged from 0.688 to 0.747, and the XGBoost model had the highest performance. Regarding the F1-score, the random forest and logistic regression models showed high performance. For PCS, the AUC of the models ranged from 0.610 to 0.658, and the random forest model had high performance. Regarding F1-score and accuracy, the logistic regression model achieved 0.721 and 0.739, respectively. For MCS, the AUC ranged from 0.615 to 0.698, and XGBoost showed high performance. Regarding F1-score and accuracy, the random forest model showed high performance. Although the results varied by target QOL index, the tree-based random forest model showed good performance overall.

Table 2 Comparison of ML model performance for predicting total QOL, PCS, and MCS

The results before (Original) and after (SMOTE) SMOTE application were included. SMOTE, synthetic minority oversampling technique; AUC, area under the receiver operating characteristic curve; GaussianNB, Gaussian Naïve Bayes classifier; KNN, K-nearest neighbor; XGBoost, extreme gradient boosting; SVM, support vector machine; PCS, Physical component score; MCS, Mental component score.

Performance with and without the synthetic minority oversampling technique

Table 2 also shows the performance comparison after applying SMOTE. When total QOL was predicted after applying SMOTE, the random forest and XGBoost models had an AUC of 0.822, with the former having a higher F1-score (0.829). When PCS was predicted after applying SMOTE, the random forest model showed good performance, with an AUC of 0.770 and an F1-score of 0.778. Finally, when MCS was predicted after applying SMOTE, the random forest model showed good performance in terms of AUC (0.786), F1-score (0.794), and accuracy (0.796). Figure 1 shows the ROC curves, by cross-validation fold, for predicting the three types of QOL after applying SMOTE.

Fig. 1 ROC curve for QOL prediction. ROC curves by fold for the random forest model for predicting total QOL (a), PCS (b), and MCS (c). PCS, Physical component score; MCS, Mental component score; ROC curve, Receiver Operating Characteristic curve

Key factors in predicting QOL

Figure 2 illustrates the degree of importance of each variable for predicting QOL. For total QOL, stress and sleep quality were significant features of high importance, followed by BMI, cold pattern score, eating index, physical activity, and sleep. Stress and sleep quality were significant features of high importance for predicting PCS and MCS. The order of importance of each variable for predicting PCS and MCS was similar to that for total QOL, but the degree of influence differed.

Fig. 2 Feature importance for predicting QOL. Feature importance (x-axis) in a random forest model for total QOL (a), PCS (b), and MCS (c). Features on the y-axis are listed in order of importance for total QOL. PCS, Physical component score; MCS, Mental component score; BMI, body mass index; PA, physical activity

Visualization of feature importance

Figure 3 visualizes the influence of 21 variables on predicting the three types of QOL, using SHAP. Each row plots the influence of each feature on the validation data as dots. The greater the absolute SHAP value, the more important the feature is in predicting QOL. Stress was the most potent predictor of total QOL and MCS, followed by sleep quality and physical activity. The most potent predictor of PCS was sleep quality, followed by stress and age. Further, QOL decreased with increasing overall stress and poor sleep quality, and this was expressed as red dots (high), indicating the degree of prediction of low QOL.

Fig. 3 Visualization of feature importance using SHAP. Summary plot where features appear in order of their sum of SHAP value magnitudes for total QOL (a), PCS (b), and MCS (c). Feature ranking (y-axis) is the order of importance of a feature in a prediction model. SHAP values (x-axis) indicate the predictive power of the prediction model. Each row plots a feature’s influence on each validation record as a dot. Red dots (high) represent the degree of prediction of “low QOL,” and blue dots (low) represent the degree of prediction of “high QOL.” PCS, Physical component score; MCS, Mental component score; PA, physical activity; BMI, body mass index

Discussion

This study generated QOL prediction models for use among middle-aged adults in South Korea using ML and analyzed the differences in the influence of each variable. We predicted QOL deterioration using individual factors linked to QOL and modifiable multifaceted lifestyle practices. A random forest model showed good performance. Although the feature importance varied depending on the target index, stress and sleep quality were major features. Further, applying SMOTE enhanced the performance of QOL prediction models, particularly the random forest. Our study utilizes explainable artificial-intelligence (XAI) techniques to provide detailed evidence on the effects of positive lifestyle changes on the quality of life of middle-aged adults—an age group at an important stage in life in terms of aging well. We expect that our findings will contribute to developing a health management model that induces changes by prioritizing the influence of various daily lifestyle practices to promote better QOL, such as promoting self-management and establishing systems to detect health change.

In our study, the random forest model showed good performance in predicting three QOL health indices. Models using 21 variables without applying SMOTE showed good performance with an F1-score ranging from 0.724 to 0.792 for total QOL, 0.653–0.719 for PCS, and 0.674–0.750 for MCS. These results were similar to the performance reported in previous studies. A study that predicted the QOL of older adults using ML also reported a model with an accuracy of 0.93 and an F-score of 0.49 [13]. The differences in performance indicators across studies are presumed to be attributable to differences in participant data and QOL indices chosen as targets for prediction with ML models.

In terms of feature importance, stress and sleep quality were identified as significant features for predicting QOL. Although feature importance varied depending on the target index (PCS, MCS, total QOL), these two features were consistently identified as important. A study that analyzed older adults with chronic diseases using nationally representative survey data showed that monthly income, diagnosis of chronic disease, depression, discomfort, and perceived health status are five significant factors associated with health-related QOL among older adults [13]. Another study reported that confidence, level of self-care, and acceptance of chronic disease are some factors that predict health status and, ultimately, QOL [33]. Further, a study on low perceived QOL among medical students reported that poor sleep quality is associated with poor QOL in a multiple linear regression analysis using questionnaire data [34]. Although these studies differ from ours, which used data collected from middle-aged adults, the findings show the importance of sleep quality.

ML algorithms are useful in diagnosing various health conditions, and active research seeks to enhance their diagnostic performance [35]. We performed data synthesis via SMOTE to resolve data imbalance and confirmed that SMOTE could improve model performance. A study that predicted diabetes mellitus using healthcare data applied SMOTE to resolve data imbalances; the performance improved from 0.027 to 0.667 for the probabilistic neural network and from 0.215 to 0.726 for the decision tree [36]. A study that used demographic, lifestyle, and blood test parameters to predict metabolic syndrome reported that the AUC improved by up to 0.091 after using SMOTE [37]. In our study, the F1-score for the random forest algorithm rose from 0.792 to 0.827 for total QOL, 0.719 to 0.777 for PCS, and 0.752 to 0.802 for MCS. Considering that clinical data frequently exhibit imbalance across classes, oversampling techniques such as SMOTE will be useful for developing diagnostic techniques, such as predicting QOL.

QOL is also associated with multiple factors. Our results showed that low QOL is associated with female sex, old age, high BMI, low income, presence of a chronic condition, non-smoking status, high cold–heat pattern score, high night snacking frequency, low eating index, low sleep quality, high physical activity, low sedentary lifestyle, moderate-/high-intensity work, high transport time, low leisure time, and high stress. These results are somewhat consistent with previous findings [5, 7]. Although the percentage of non-smokers was high in the low QOL group, this seems to pertain to the high percentage of women (74.4%) among non-smokers. Our study also confirmed the effects of cold–heat pattern identification—an individual characteristic—on QOL, and observed that it is an important factor in health [38].

Our study has some limitations and points for improvement. A previous survey of QOL reported that physical functioning indices, such as grip strength, are correlated with QOL [39]. Therefore, adding physical functioning parameters to the basic demographic factors and lifestyle factors could enhance model performance. In addition, several studies are underway to interpret the results of ML model predictions, and such research is particularly valuable for clinical decision support systems [40]. As with our study, much research is being conducted on explainable AI; however, more research is needed to enhance the reliability of interpretation [41]. Finally, we analyzed the major predictors of QOL using cross-sectional data, so our analysis cannot infer the temporal order between the various factors and QOL. Despite these limitations, the representativeness of our study data was ensured by sex-, age-, and region-based population stratification, and the accuracy of assessment of various lifestyle factors was improved using standardized instruments with established validity and reliability.

Conclusions

This study compared the performance of several QOL prediction models, using individual and lifestyle factors among middle-aged adults in South Korea. A tree-based ML algorithm, random forest, was found to have high accuracy in predicting the three QOL indices of total QOL, PCS, and MCS. The performance of the models improved when SMOTE was applied. The XAI analysis identified stress and sleep quality as two major predictors of QOL among middle-aged adults. Interest in QOL is increasing amid population aging and advances in healthcare technology. It is important to develop health management programs that enhance QOL for middle-aged adults from a multidisciplinary perspective, and artificial intelligence technologies such as XAI will be useful for this purpose.

Data Availability

The datasets are not available owing to confidentiality and ethical concerns. Further inquiries can be directed to the corresponding author or Korea Medicine Data Center (www.kdc.kiom.re.kr).

References

  1. Kempen GI, Ormel J, Brilman EI, Relyveld J. Adaptive responses among Dutch elderly: the impact of eight chronic medical conditions on health-related quality of life. Am J Public Health. 1997;87(1):38–44. https://doi.org/10.2105/AJPH.87.1.38.

  2. Martinez R, Lloyd-Sherlock P, Soliz P, Ebrahim S, Vega E, Ordunez P, et al. Trends in premature avertable mortality from non-communicable diseases for 195 countries and territories, 1990–2017: a population-based study. Lancet Glob Health. 2020;8(4):e511–23. https://doi.org/10.1016/S2214-109X(20)30035-8.

  3. World Health Organization, Geneva. Global Health estimates: life expectancy and leading causes of death and disability. WHO; 2020. https://www.who.int/data/gho/data/themes/theme-details/GHO/mortality-and-global-health-estimates.

  4. Bowling A, Dieppe P. What is successful ageing and who should define it? BMJ. 2005;331(7531):1548–51. https://doi.org/10.1136/bmj.331.7531.1548.

  5. Giannouli P, Zervas I, Armeni E, Koundi K, Spyropoulou A, Alexandrou A, et al. Determinants of quality of life in Greek middle-age women: a population survey. Maturitas. 2012;71(2):154–61. https://doi.org/10.1016/j.maturitas.2011.11.013.

  6. Makovski TT, Schmitz S, Zeegers MP, Stranges S, Van den Akker M. Multimorbidity and quality of life: systematic literature review and meta-analysis. Ageing Res Rev. 2019;53:100903. https://doi.org/10.1016/j.arr.2019.04.005.

  7. Tan SL, Storm V, Reinwand DA, Wienert J, De Vries H, Lippke S. Understanding the positive associations of sleep, physical activity, fruit and vegetable intake as predictors of quality of life and subjective health across age groups: a theory based, cross-sectional web-based study. Front Psychol. 2018;9:977. https://doi.org/10.3389/fpsyg.2018.00977.

  8. Lidin M, Ekblom-Bak E, Rydell Karlsson M, Hellénius ML. Long-term effects of a Swedish lifestyle intervention programme on lifestyle habits and quality of life in people with increased cardiovascular risk. Scand J Public Health. 2018;46(6):613–22. https://doi.org/10.1177/1403494817746536.

  9. Marcos-Delgado A, Hernández-Segura N, Fernández-Villa T, Molina AJ, Martín V. The effect of lifestyle intervention on health-related quality of life in adults with metabolic syndrome: a meta-analysis. Int J Environ Res Public Health. 2021;18(3):887. https://doi.org/10.3390/ijerph18030887.

  10. Colpani V, Baena CP, Jaspers L, Van Dijk GM, Farajzadegan Z, Dhana K, et al. Lifestyle factors, cardiovascular disease and all-cause mortality in middle-aged and elderly women: a systematic review and meta-analysis. Eur J Epidemiol. 2018;33(9):831–45. https://doi.org/10.1007/s10654-018-0374-z.

  11. Phyo AZZ, Freak-Poli R, Craig H, Gasevic D, Stocks NP, Gonzalez-Chica DA, et al. Quality of life and mortality in the general population: a systematic review and meta-analysis. BMC Public Health. 2020;20(1):1–20. https://doi.org/10.1186/s12889-020-09639-9.

  12. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS ONE. 2017;12(4):e0174944. https://doi.org/10.1371/journal.pone.0174944.

  13. Soo-Kyoung L, Youn-Jung S, Jeongeun K, Hong-Gee K, Jae-ll L, Bo-Yeong K, et al. Prediction model for health-related quality of life of elderly with chronic diseases using machine learning techniques. Healthc Inform Res. 2014;20(2):125–34. https://doi.org/10.4258/hir.2014.20.2.125.

  14. Hatton CM, Paton LW, McMillan D, Cussens J, Gilbody S, Tiffin PA. Predicting persistent depressive symptoms in older adults: a machine learning approach to personalized mental healthcare. J Affect Disord. 2019;246:857–60. https://doi.org/10.1016/j.jad.2018.12.095.

  15. Lee SH, Choi I, Ahn WY, Shin E, Cho SI, Kim S, et al. Estimating quality of life with biomarkers among older Korean adults: a machine-learning approach. Arch Gerontol Geriatr. 2020;87:103966. https://doi.org/10.1016/j.archger.2019.103966.

  16. Baek Y, Seo B, Jeong K, Yoo H, Lee S. Lifestyle, genomic types and non-communicable diseases in Korea: a protocol for the Korean Medicine Daejeon Citizen Cohort study (KDCC). BMJ Open. 2020;10(4):e034499. https://doi.org/10.1136/bmjopen-2019-034499.

  17. Daviglus ML, Liu K, Pirzada A, Yan LL, Garside DB, Feinglass J, et al. Favorable cardiovascular risk profile in middle age and health-related quality of life in older age. Arch Intern Med. 2003;163(20):2460–8. https://doi.org/10.1001/archinte.163.20.2460.

  18. Grundy SM, Cleeman JI, Daniels SR, Donato KA, Eckel RH, Franklin BA, et al. Diagnosis and management of the metabolic syndrome: an American Heart Association/National Heart, Lung, and Blood Institute scientific statement. Circulation. 2005;112(17):2735–52. https://doi.org/10.1161/CIRCULATIONAHA.105.169404.

  19. World Health Organization. The Asia-Pacific perspective: redefining obesity and its treatment. 2000. https://apps.who.int/iris/handle/10665/206936.

  20. World Health Organization. WHO international standard terminologies on traditional medicine in the western pacific region. 2007. https://apps.who.int/iris/handle/10665/206952.

  21. Bae KH, Jang ES, Park K, Lee Y. Development on the questionnaire of cold-heat pattern identification based on usual symptoms: reliability and validation study. J Physiol & Pathol Korean Med. 2018;32(5):341–6.

  22. Ezzati M, Lopez AD, Rodgers AA, Murray CJL. Comparative quantification of health risks: global and regional burden of disease attributable to selected major risk factors. World Health Organization; 2004.

  23. Yook SM, Park S, Moon HK, Kim K, Shim JE. Development of Korean healthy eating index for adults using the Korea national health and nutrition examination survey data. J Nutr Health. 2015;48(5):419–28. https://doi.org/10.4163/jnh.2015.48.5.419.

  24. Sohn SI, Kim HD, Lee MY, Cho YW. The reliability and validity of the Korean version of the Pittsburgh Sleep Quality Index. Sleep Breath. 2012;16(3):803–12. https://doi.org/10.1007/s11325-011-0579-9.

  25. Buysse DJ, Reynolds CF, Monk TH, Berman SR, Kupfer DJ. The Pittsburgh Sleep Quality Index: a new instrument for psychiatric practice and research. Psychiatry Res. 1989;28(2):193–213. https://doi.org/10.1016/0165-1781(89)90047-4.

  26. Armstrong T, Bull F. Development of the world health organization global physical activity questionnaire (GPAQ). J Public Health. 2006;14(2):66–70. https://doi.org/10.1007/s10389-006-0024-x.

  27. Chang S. Standardization of collection and measurement for health data. Seoul: Kyechukmunhwasa; 2000. pp. 121–59.

  28. Gandek B, Ware JE, Aaronson NK, Apolone G, Bjorner JB, Brazier JE, et al. Cross-validation of item selection and scoring for the SF-12 Health Survey in nine countries: results from the IQOLA Project. J Clin Epidemiol. 1998;51(11):1171–8. https://doi.org/10.1016/s0895-4356(98)00109-7.

  29. Pezzilli R, Bini R, Fantini L, Baroni L, Campana E, Tomassetti D. Quality of life in chronic pancreatitis. World J Gastroenterol. 2006;12(39):6249. https://doi.org/10.3748/wjg.v12.i39.6249.

  30. Ware JE Jr. SF-36 health survey update. Spine. 2000;25(24):3130–9. https://doi.org/10.1097/00007632-200012150-00008.

  31. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321–57. https://doi.org/10.1613/jair.953.

  32. Wang K, Tian J, Zheng C, Yang H, Ren J, Liu Y, et al. Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP. Comput Biol Med. 2021;137:104813. https://doi.org/10.1016/j.compbiomed.2021.104813.

  33. Lee D, Bin S. Structure relationships for diseased and health-related quality of life in the elderly. J Korea Content Assoc. 2011;11(1):216–24. https://doi.org/10.5392/JKCA.2011.11.1.216.

  34. Miguel AQC, Tempski P, Kobayasi R, Mayer FB, Martines MA. Predictive factors of quality of life among medical students: results from a multicentric study. BMC Psychol. 2021;9(1):1–13. https://doi.org/10.1186/s40359-021-00534-5.

  35. Ishaq A, Sadiq S, Umer M, Ullah S, Mirjalili S, Rupapara V, et al. Improving the prediction of heart failure patients’ survival using SMOTE and effective data mining techniques. IEEE Access. 2021;9:39707–16. https://doi.org/10.1109/access.2021.3064084.

  36. Ramezankhani A, Pournik O, Shahrabi J, Azizi F, Hadaegh F, Khalili D. The impact of oversampling with SMOTE on the performance of 3 classifiers in prediction of type 2 diabetes. Med Decis Making. 2016;36(1):137–44. https://doi.org/10.1177/0272989X14560647.

  37. Kim J, Mun S, Lee S, Jeong K, Baek Y. Prediction of metabolic and pre-metabolic syndromes using machine learning models with anthropometric, lifestyle, and biochemical factors from a middle-aged population in Korea. BMC Public Health. 2022;22(1):1–10. https://doi.org/10.1186/s12889-022-13131-x.

  38. Bae KH, Lee Y, Go HY, Kim SJ, Lee SW. The relationship between cold hypersensitivity in the hands and feet and health-related quality of life in Koreans: a nationwide population survey. Evid Based Complement Alternat Med. 2019;2019:6217036. https://doi.org/10.1155/2019/6217036.

  39. Chun SW, Kim W, Choi KH. Comparison between grip strength and grip strength divided by body weight in their relationship with metabolic syndrome and quality of life in the elderly. PLoS ONE. 2019;14(9):e0222040. https://doi.org/10.1371/journal.pone.0222040.

  40. Antoniadi AM, Du Y, Guendouz Y, Wei L, Mazo C, Becker BA, et al. Current challenges and future opportunities for XAI in machine learning-based clinical decision support systems: a systematic review. Appl Sci. 2021;11(11):5088. https://doi.org/10.3390/app11115088.

  41. Antoniadi AM, Galvin M, Heverin M, Hardiman O, Mooney C. Prediction of caregiver quality of life in amyotrophic lateral sclerosis using explainable machine learning. Sci Rep. 2021;11(1):1–13. https://doi.org/10.1038/s41598-021-91632-2.

Acknowledgements

We thank all clinical research staff and participants of the study for their hard work.

Funding

This work was supported by the Development of Korean Medicine Original Technology for Preventive Treatment based on Integrative Big Data grant from the Korea Institute of Oriental Medicine [grant number: KSN1731121].

Author information

Contributions

JK and YB analyzed and interpreted the data regarding QOL and prediction. SL was a major contributor to writing the manuscript. KJ conducted quality control of the data and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Younghwa Baek.

Ethics declarations

Ethics approval and consent to participate

The study was designed and conducted in line with the Declaration of Helsinki and was approved by the Ethics Committee of the Korea Institute of Oriental Medicine (codes: I-1703/002-002, I-1904/002-001). Written informed consent was obtained from all participants.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Kim, J., Jeong, K., Lee, S. et al. Machine-learning model predicting quality of life using multifaceted lifestyles in middle-aged South Korean adults: a cross-sectional study. BMC Public Health 24, 159 (2024). https://doi.org/10.1186/s12889-023-17457-y

  • DOI: https://doi.org/10.1186/s12889-023-17457-y

Keywords