The diabetes mellitus multimorbidity network in hospitalized patients over 50 years of age in China: data mining of medical records

Objective Many diabetes mellitus (DM) patients suffer from multimorbidity. Understanding the DM multimorbidity network should be given priority. The purpose of this study is to characterize the DM multimorbidity network in people over 50 years. Methods Data on 75 non-communicable diseases (NCDs) were extracted from electronic medical records of 309,843 hospitalized patients older than 50 years who had at least one NCD. Association rules analysis was used as a novel classification method and combined with Chi-square tests to identify associations between NCDs and DM. Result A total of 12 NCDs were closely related to DM; {cholelithiasis, DM} was an unexpected combination. {dyslipidemia, DM} and {gout, DM} had the largest lift in the male and female groups, respectively. The negatively related group included 7 NCDs. There were 9 NCDs included in the strong association rules. Most combinations differed by age and sex. In males, the strongest rule was {peripheral vascular disease (PVD), dyslipidemia, DM}, while {hypertension, dyslipidemia, chronic liver disease (CLD), DM} was the strongest in females. In patients younger than 70 years, hypertension, CLD, and dyslipidemia were the most dominant NCDs in the DM multimorbidity network. In patients 70 years or older, chronic kidney disease (CKD), CVD, CHD, and heart disease (HD) frequently co-occurred with DM. Conclusion Future primary healthcare policies for DM should be formulated based on age and sex. In patients younger than 70 years, more attention to hypertension, CLD, and dyslipidemia is required, while attention to CKD, CVD, CHD and HD is needed in patients older than 70 years.
Supplementary Information The online version contains supplementary material available at 10.1186/s12889-024-18887-y.


Background
The global prevalence of DM in adults is on the rise: in 2017 it was 8.8% and it is expected to rise to 9.9% by 2045. In addition to diagnosed DM, approximately 352 million people worldwide are at risk of developing DM or pre-DM, and that number is expected to rise to 532 million by 2045 [1]. China has one of the largest DM populations in the world. In 2017, there were 425 million adults with DM worldwide, of which 114 million (more than a quarter) were from China. The number of adults with DM in China is expected to rise to 120 million in 2045 [2]. Moreover, many DM patients suffer from at least one additional disease, a condition called multimorbidity, in which two or more NCDs co-occur in a patient [3]. Multimorbidity affects more than half of the elderly population and almost all hospitalized geriatric patients [4]. The coexistence of NCDs in DM patients is more than a random event. Typically, it is due to a causal relationship between some NCDs or a shared pathogenic factor [5,6]. Therefore, prevention and treatment of DM with multimorbidity are very important. Some studies aimed to identify multimorbidity patterns in patients and confirmed the existence of clinically plausible multimorbidity patterns that evolve over time [7,8]. Unfortunately, recommended management approaches for multimorbidity in patients with DM are lacking in most practice guidelines [9].
There is a difference in the etiological analysis of patients with a single NCD and those with multimorbidity [10]. Most clinical practice guidelines and healthcare training and delivery focus on a single NCD, leading to care that is sometimes inadequate [11] and often results in an increase of intervention measures, such as numerous hospital visits and polypharmacy [12,13]. Consequently, the current healthcare systems fail to appropriately address the healthcare needs of patients with multimorbidity.
Although multimorbidity has been introduced in policy and practice in developed countries, developing countries have not considered it a matter of public health urgency [14]. Improving the health status and quality of life of people affected by multimorbidity requires a new integrated and innovative treatment model [15]. We aimed to explore the interrelationships in the DM multimorbidity network. This may help to address the challenge and provide new insights for interventions in DM multimorbidity.

Objectives
We presented the DM multimorbidity network in middle-aged and older adults and used association rules mining (ARM) to explore the relationship between 74 NCDs and DM. We focused on examining patterns that are present in people with DM. The first step was to investigate whether there were any associations between 74 NCDs and DM. The lift generated by the ARM was used as a classification indicator to identify the relationship between 74 NCDs and DM. The Chi-square test was used to test the statistical significance of the associations between the NCDs and DM. The second step was to explore the DM multimorbidity network and assess the variations in these patterns by age and sex. Based on the ARM algorithm, which can fully consider the importance and correlation strength between the NCDs and DM, we can obtain the multimorbidity patterns of DM.

Data source
The original data were obtained from the homepages of inpatient medical records through the Shenzhen National Health Information Platform, a data center that collects medical case information from all medical institutions in Shenzhen. All inpatient records from January 1, 2017 to December 31, 2018 were included. Data include demographic characteristics of hospitalized patients (sex and age), information on inpatient diagnoses of NCDs and information on personal identifiers. All diagnoses were coded according to the International Classification of Disease version 10 (ICD-10). First, we extracted a total of more than 3 million records with age ≥ 49 years in 2017 and age ≥ 50 years in 2018. Second, according to ICD-10, patients were excluded if they had not been diagnosed with at least one of the NCDs. Patients were also excluded if they had incomplete information, such as sex, age and personal identification. After matching information on personal identifiers, if the same personal identifier appeared in both 2017 and 2018, the age in 2018 was used. The 49-year-old patients who only appeared in 2017 were deleted. Then records with the same personal identifier were merged. The NCDs were selected based on those most frequently mentioned in previous studies of multimorbidity [16,18,19,21]. In addition, NCDs with a proportion larger than 0.001 were selected, as they were considered to have a significant impact on the long-term management and quality of life of middle-aged and older Chinese individuals. Finally, we included data from a total of 309,843 participants aged ≥ 50 years with at least one of the 75 NCDs. Supplementary Material 1 lists all NCDs included and their corresponding ICD-10 codes.
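The record-consolidation steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the field names (`pid`, `year`, `age`, `sex`, `ncds`), the helper `merge_records`, and the toy records are all hypothetical.

```python
def merge_records(records):
    """Merge inpatient records per patient (hypothetical field names)."""
    merged = {}
    for rec in records:
        # Exclude records with incomplete sex, age, or personal identifier.
        if any(rec.get(key) is None for key in ("pid", "sex", "age")):
            continue
        pid = rec["pid"]
        cur = merged.setdefault(
            pid,
            {"pid": pid, "sex": rec["sex"], "age": rec["age"],
             "year": rec["year"], "ncds": set()},
        )
        # Pool all NCD diagnoses recorded for the same identifier.
        cur["ncds"] |= set(rec["ncds"])
        # If the patient appears in both years, keep the age from the later year.
        if rec["year"] > cur["year"]:
            cur["age"], cur["year"] = rec["age"], rec["year"]
    # Keep patients with at least one NCD and a final age of 50 or more
    # (this drops 49-year-olds who appear only in 2017).
    return [p for p in merged.values() if p["ncds"] and p["age"] >= 50]


# Invented toy records, for illustration only.
toy_records = [
    {"pid": "a", "year": 2017, "age": 49, "sex": "M", "ncds": ["DM"]},
    {"pid": "a", "year": 2018, "age": 50, "sex": "M", "ncds": ["hypertension"]},
    {"pid": "b", "year": 2017, "age": 49, "sex": "F", "ncds": ["DM"]},
    {"pid": "c", "year": 2018, "age": 63, "sex": None, "ncds": ["gout"]},
    {"pid": "d", "year": 2018, "age": 71, "sex": "F", "ncds": []},
]
patients = merge_records(toy_records)
```

In this toy input only patient `a` survives: `b` is 49 and seen only in 2017, `c` has a missing sex, and `d` has no NCD diagnosis.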

Defining multimorbidity
Multimorbidity was defined as concurrently suffering from two or more NCDs [13]. NCDs were identified if they had been documented using inpatient ICD-10 codes in individual recent medical records from January 1, 2017 to December 31, 2018. To explore the DM multimorbidity network more comprehensively, we included 75 NCDs to study the DM multimorbidity network based on previous studies and the current data of this study.

Descriptive analysis
Patients were categorized into four subgroups according to age (50–59, 60–69, 70–79, and ≥ 80 years). Sex was categorized into two subgroups and cross-combined with age into eight age-sex subgroups. Descriptive statistics, including number and proportion (%), were used to describe the study population. The Chi-square test was performed to compare differences in the characteristics of patients with and without DM. Age was presented as median (interquartile range; IQR). The 10 most frequent dyads, triads and quartets of NCDs combined with DM by sex and age were evaluated. P < 0.05 was considered statistically significant. All descriptive statistical analyses in this study were performed using R 3.4.0 (The R Foundation for Statistical Computing, Vienna, Austria). ARM was performed using R 3.4.0 with the arules package.
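The eight age-sex subgroups can be illustrated with a short sketch. The study itself used R; the Python below is an assumption-laden illustration in which the function names, the label format, the use of integer ages, and the toy patients are all hypothetical.

```python
from collections import Counter

# Age bands as described in the text; ages are assumed to be whole years.
AGE_BANDS = [(50, 59, "50-59"), (60, 69, "60-69"), (70, 79, "70-79")]


def age_group(age):
    """Map an age in years to one of the four age bands."""
    if age < 50:
        raise ValueError("study population is restricted to age >= 50")
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return ">=80"


def age_sex_subgroup(age, sex):
    """Cross-combine sex ("M"/"F") with the age band."""
    return f"{sex}/{age_group(age)}"


# Invented (age, sex) pairs, for illustration only.
toy_patients = [(52, "M"), (67, "F"), (71, "M"), (85, "F"), (59, "M")]
subgroup_counts = Counter(age_sex_subgroup(a, s) for a, s in toy_patients)
```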

Association rule mining
ARM is used to examine associations between NCDs [17,27,28]. It is a fast method to discover combinations of NCDs that occur more frequently than expected and might provide insights into NCDs and aging mechanisms. Several applications of ARM in the medical domain include examining disease co-occurrences [12,15,16], identifying adverse effects of drugs [29], and detecting risk factors for disease [30–34]. We analyzed the data using the Apriori algorithm and applied the ARM to determine the common multimorbidity patterns for DM that met a minimum requirement of measurement indicators.
The three commonly used measurement ratios were applied. The support (sup) is a measure of how frequently the combination of NCD A and NCD B appears in the dataset. It measures the importance of rule {A, B} and is defined as: sup(AB) = P(AB). A higher sup indicates that the rule is more important, and a minimum threshold is usually set to exclude rules that are not important. The confidence (con) is the conditional probability that a participant who has NCD A also has NCD B, and it is defined as: con(AB) = P(B|A) = P(AB)/P(A). The lift(AB) is the ratio of the observed sup(AB) to that expected if A and B are independent. It is defined as: lift(AB) = P(AB)/(P(A)P(B)) = con(AB)/P(B) [28]. A higher lift indicates a higher chance of co-occurrence of NCD A with NCD B and a more significant association. The lift measures the strength of an association as a rule within ARM and is therefore considered the main outcome in this study. It can be used to identify whether the dependence between A and B in a rule is weak or strong [35]. We applied this method to examine the association in a dataset of people with DM and 74 other NCDs using a classifier based on the lift. When lift(AB) = a > 1, A combined with B occurs a-fold more often than expected under statistical independence, which can be interpreted as a positive relationship between A and B. When lift(AB) < 1, the joint set {A, B} appears less often than expected, and there is a negative relationship between A and B. When lift(AB) = 1, there is no association between A and B. Hence, a higher sup indicates a more important joint set {A, B}, and a higher lift indicates a stronger association of the joint set {A, B}. The sup, con and lift are related to the effect size of associations, as opposed to simple tests of statistical significance [17]. Association rules with more than two items are similar to those with two items.
There were 2^74 possible combinations for the 74 morbidities we included. Setting a higher threshold value would reduce the number of rules, which might result in missing essential rules with low frequencies. Setting a lower threshold would yield so many rules that they become difficult to aggregate and manage [36]. Appropriate sup and lift values help to mine reasonable rules and ensure the robustness of the model performance. Therefore, many rounds of testing and evaluation were carried out before defining the final thresholds. To avoid missing any critical association rule, we set the sup threshold range from 0.001 to 0.01, increasing by 0.005 each time, and no minimum con or lift thresholds were imposed. Since lift(AB) = con(AB)/sup(B), the value of con affects the value of lift. Other studies focus on the case of lift > 1, so a high con is accepted. However, we focus on lift > 1, lift = 1 and lift < 1, so we do not limit the value of con. In our programme the minimum con was set to 0.0000001. In addition, using the objective indicator lift, we developed ARM as a novel classification method to examine associations between 74 NCDs and DM. The method does not rely on preconceived assumptions about whether certain conditions are associated; because no hypotheses were postulated, it minimizes confirmation bias [12] and the lift is thus an objective parameter. The flow chart of the above analysis is shown in Fig. 1.
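The three measures defined above can be computed directly from their definitions. The sketch below is a plain-Python illustration, not the arules implementation used in the study; the function names and the ten toy patients are hypothetical, with each patient represented as a set of diagnosed NCDs.

```python
def support(transactions, items):
    """sup: fraction of patients whose diagnoses include all given items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """con(A -> B) = P(AB) / P(A)."""
    joint = set(antecedent) | set(consequent)
    return support(transactions, joint) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """lift(A -> B) = con(A -> B) / P(B) = P(AB) / (P(A) P(B))."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Invented toy data: ten patients, each a set of NCDs.
toy = [
    {"DM", "hypertension"},
    {"DM", "hypertension"},
    {"DM", "hypertension", "dyslipidemia"},
    {"DM"},
    {"hypertension"},
    {"hypertension"},
    {"gout"},
    {"gout"},
    {"COPD"},
    {"COPD", "hypertension"},
]

sup_hyp_dm = support(toy, {"hypertension", "DM"})       # 3 of 10 patients
con_hyp_dm = confidence(toy, {"hypertension"}, {"DM"})  # 3 of 6 with hypertension
lift_hyp_dm = lift(toy, {"hypertension"}, {"DM"})       # about 1.25
```

In this toy example, hypertension and DM co-occur about 1.25 times as often as expected under independence, which would be read as a positive relationship.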
Tables 2 and 3 show the 10 most common dyads, triads and quartets of combinations of NCDs associated with DM by sex and age, respectively.
The most common NCDs included hypertension, dyslipidemia, CVD, CLD, PVD, CHD, CKD, HD, gout, arrhythmia, anemia, and prostate disease (PD). The proportion of men with triad combinations including DM was generally higher than that of women with the same combinations (P < 0.001). Among quartets of NCD combinations, the 10 most common combinations differed by sex and age (P < 0.001).

Multimorbidity patterns in people with DM
In this part, the minimum sup threshold was set to 0.005 to obtain as many association rules as possible. A threshold of sup = 0.001 was not used, to avoid a large error in lift. The number of items in the association rule was set to 2, and the "consequent" of the association rule was set to "DM". When lift > 1.1, the ARM produced a list of NCDs that were positively related to DM, while lift < 0.9 produced a list of NCDs that were negatively related to DM. The remaining NCDs were weakly related or not related to DM. After analysis, there were a total of 25 NCDs. A total of 7 NCDs were negatively related to DM. The co-occurrence of these NCDs in DM patients is unlikely to be due to randomness. In the subgroups of sex and age, disc degeneration appeared in both the positively related and negatively related groups.
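The lift-based classification described above amounts to a three-way threshold rule. A minimal sketch follows; the cut-offs 1.1 and 0.9 come from the text, while the function name, the labels, and the example lift values are hypothetical.

```python
def classify_by_lift(lift_value, upper=1.1, lower=0.9):
    """Classify an NCD -> DM rule by its lift, using the cut-offs in the text."""
    if lift_value > upper:
        return "positive"
    if lift_value < lower:
        return "negative"
    return "weak or none"


# Invented lift values for three placeholder NCDs, for illustration only.
toy_lifts = {"NCD_A": 1.8, "NCD_B": 1.02, "NCD_C": 0.7}
labels = {name: classify_by_lift(value) for name, value in toy_lifts.items()}
```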
To support the reliability of the conclusions, Chi-square tests were used to assess the statistical significance of the association rules. These covered the 12 positively associated NCDs and the 7 negatively associated NCDs. Odds ratios (OR) and 95% confidence intervals (95% CI) of associations between antecedent NCDs and DM in the association rules are shown in Table 4. For the 12 NCDs in the positive group, all ORs were greater than 1 with P < 0.001, indicating that DM was more likely to be present when the antecedent NCD was present than when it was absent. For the 7 NCDs in the negative group, the results were reversed.
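For a single NCD and DM, the odds ratio and Chi-square test reduce to arithmetic on a 2×2 table. The sketch below assumes a Wald-type (log-scale) 95% CI and a Pearson Chi-square test without continuity correction; the source does not state which variants were used, and the counts are invented.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Wald-type CI for a 2x2 table (assumed method).

    a: NCD present, DM present    b: NCD present, DM absent
    c: NCD absent,  DM present    d: NCD absent,  DM absent
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square statistic (no continuity correction) and P value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the Chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 20, 180)
chi2_value, p_value = chi_square_2x2(30, 70, 20, 180)
```

An OR above 1 with a CI that excludes 1 corresponds to the pattern reported for the positive group; the reverse holds for the negative group.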

Variations in DM multimorbidity patterns by sex and age
To select combinations that were important and closely related, we set sup ≥ 0.01 and lift ≥ 1.5, and the con was unbounded. Among the four age groups of men, 48, 81, 137 and 108 rules were detected, while 16, 53, 136 and 115 rules were found in women. The 10 association rules with the largest lift in each of the 8 subgroups are shown in Fig. 4. The association rules include a total of 9 NCDs. The types and order of the most common NCDs differ considerably between age-sex subgroups. Multimorbidity in patients with DM was more prominent in men and older individuals. Most of the sup values in men were higher than those in women, especially in the group aged 50–59 years. CHD occurred more frequently in men than in women and more frequently in the group aged 70 years and older than in the group aged 50–69 years. In addition, 4 rules included gout in women, but none in men. Most of the rules in the groups younger than 70 years were triads, while most of the rules were quartets in the groups aged 70 years and older. Hypertension, CLD and dyslipidemia appear more frequently in the association rules for the groups younger than 70 years. The DM multimorbidity network was complex in the group aged 70 years and older, in which CKD, CVD, CHD and HD frequently appeared in the association rules.
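The rule selection described above (consequent fixed to DM, sup ≥ 0.01, lift ≥ 1.5, then the 10 rules with the largest lift) can be sketched as follows. Note that this is a brute-force enumeration of antecedents, not the Apriori algorithm from the arules package used in the study; the function name, the cap of three antecedent items, and the toy data are assumptions.

```python
from itertools import combinations


def mine_dm_rules(transactions, min_sup=0.01, min_lift=1.5,
                  max_antecedent=3, top_k=10, target="DM"):
    """Enumerate rules {antecedent} -> target and keep the strongest ones."""
    n = len(transactions)
    n_target = sum(target in t for t in transactions)
    if n_target == 0:
        return []
    items = sorted({i for t in transactions for i in t} - {target})
    rules = []
    for size in range(1, max_antecedent + 1):
        for antecedent in combinations(items, size):
            ante = set(antecedent)
            n_ante = sum(ante <= t for t in transactions)
            if n_ante == 0:
                continue
            n_both = sum(ante <= t and target in t for t in transactions)
            sup = n_both / n
            # lift = P(AB) / (P(A) P(B)), written with counts to limit rounding.
            rule_lift = (n_both * n) / (n_ante * n_target)
            if sup >= min_sup and rule_lift >= min_lift:
                rules.append((antecedent, sup, rule_lift))
    # Largest lift first; ties broken by larger support.
    rules.sort(key=lambda r: (-r[2], -r[1]))
    return rules[:top_k]


# Invented toy data: ten patients, each a set of NCDs.
toy = [
    {"DM", "hypertension", "dyslipidemia"},
    {"DM", "hypertension", "dyslipidemia"},
    {"DM", "hypertension"},
    {"hypertension"},
    {"hypertension"},
    {"hypertension", "gout"},
    {"COPD"},
    {"COPD"},
    {"dyslipidemia", "DM"},
    {"gout"},
]
top_rules = mine_dm_rules(toy)
```

In practice this would be run separately within each age-sex subgroup; exhaustive enumeration is only feasible here because the toy item set is tiny.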

Discussion
This study examined the relationship between 74 NCDs and DM using ARM as a novel classification method. Of the 74 NCDs, 12 were positively associated with DM, and 7 were negatively associated. We also used ARM to explore the DM multimorbidity pattern in age-sex subgroups, with 9 common NCDs included in the results. Among people with DM, men and older people were more vulnerable to multimorbidity, and particular morbidities clustered together more often than expected by chance. CVD, CHD, CKD, CLD, HD, hypertension, dyslipidemia, gout, and PVD were common in the DM multimorbidity network and were directly or indirectly related to DM. Hypertension, CLD and dyslipidemia were more common in people younger than 70 years, while CKD, CVD, CHD and HD were more common in people older than 70 years. Among the positive correlation group, the shared etiologies of most NCDs and DM have been demonstrated in previous studies, such as for CVD [37], CHD [38,39], CKD [40], CLD [41], heart disease [42], dyslipidemia [43,44], prostate disease [45], gout [46], PVD [47], and transient cerebral ischemia [48]. Further explanations of the clinical significance are as follows. A study of 2,400 older people with and without DM confirms that DM is significantly associated with brain infarction [37]. A genome-wide, multi-ancestry study of genetic variation for DM and CHD shows that a genetically mediated increase in DM risk confers a higher risk of CHD [40]. A 10-year follow-up study showed that DM is associated with higher risks of liver cancer and CLD [41]. Epidemiologic and clinical data from the last 2 decades have shown that the prevalence of heart failure in DM is very high and that the prognosis for patients with heart failure is worse in those with DM than in those without it [42]. Data from animal models and humans show that very low levels of high-density lipoprotein cholesterol are often associated with hyperglycaemia and DM. Cholesterol homeostasis is important for adequate beta-cell insulin secretion [45]. Data from Francesco et al. suggest that metabolic alterations and CVD influence aggressive and metastatic prostate cancer [45]. A genome-wide analysis study showed that, after excluding obesity and alcohol consumption behaviour, gout and DM share most of their common genetic factors and that each condition increases the incidence of the other [46]. The {dyslipidemia, DM} and {gout, DM} rules had the greatest lift in men and women, respectively. The relationship between dyslipidemia, gout and DM in men and women needs to be considered. The {cholelithiasis, DM} combination occurred at least 1.2 times more often than expected under statistical independence. This is likely because cholelithiasis and DM share pathological pathways or potential risk factors. However, previous multimorbidity studies have not found this relationship, and the co-occurrence of cholelithiasis and DM needs to be better understood. A study showed that cholelithiasis was directly related to body weight and abdominal adiposity [49]. Obesity is related to DM, suggesting a potential relationship between cholelithiasis and DM. Clarifying the relationship between cholelithiasis and DM is of great significance for patients.
Seven NCDs were negatively related to DM. There have been limited studies on chronic gastritis, malignant tumor, osteoporosis, bronchiectasis and pulmonary heart disease with DM. A review study [50] listed seven studies on DM and disc degeneration, of which four showed that DM was a significant risk factor for disc degeneration, and the remaining three failed to find any association. Another study concluded that DM has a devastating effect on disc degeneration [51]. Our results showed a negative association between disc degeneration and DM, adding to the clinical evidence that is not consistent. The published studies on the co-occurrence of COPD and DM are controversial [52,53]. There may be some potential influencing factors of COPD and DM, resulting in a negative association between COPD and DM. The biological link between COPD and DM is still unclear.
Our other goal was to explore the association rules between 74 NCDs and DM by subgroup analyses. Compared with the existing literature, this study focused on the DM multimorbidity network rather than that of all included NCDs. Our results are more detailed and comprehensive. In the published studies, only a few rules on DM were generated, and most of them were already well-known, such as {hypertension, DM} [7,15,17–19,21,54–56], {dyslipidemia, DM} [7,15,17–19,56], {CHD, DM} [12,19,57], and {CKD, DM} [12]. Our results highlight some important combinations with DM and show differences in the type and order of the most common associations by sex and age, which is consistent with the results of Han et al. [55]. Multimorbidity in patients with DM is more prominent in men and older people. There were significant differences in {gout, DM} and {dyslipidemia, DM} between men and women. Gout was more strongly related to DM in women than in men, meaning there is a higher risk for women. Among men, the most common rule for dyslipidemia appeared in {hypertension, dyslipidemia, DM} with a large sup. Among women, dyslipidemia was more likely to be related to other NCDs than in men. The proportion of people with multimorbidity among those with DM increased with advancing age, but it was lower in those older than 80 years than in those aged 70 to 79 years. For those younger than 70 years, the triad was the most common type of rule. Hypertension, dyslipidemia and CLD are common NCDs and play an important part in multimorbidity in patients with DM, while CKD, CVD, CHD and heart disease frequently co-occur in people with DM older than 70 years. This suggests that targeted screening for additional NCDs in each age group would be more efficient. The difference in age and sex may be explained by survival bias.
Several limitations of our study need to be acknowledged. First, our research data were from hospitalized patients; therefore, the proportions of people with multimorbidities cannot be applied to the whole population. However, this was separate from the main aim of this study, which was to focus on the DM multimorbidity network. In addition, our results based on more severe cases may provide ideas for research into the early prevention of combinations of DM and other NCDs. Second, due to the cross-sectional nature of the data, the results did not demonstrate causal links between NCDs and DM. Finally, the time of DM onset and detailed information on patients' physical strategies, lifestyle factors, socioeconomic status and family history were not included in the model in this analysis due to lack of data availability, and the data set anonymized participants to avoid possible misuse; therefore, some potential confounding factors were not taken into consideration.
Despite the increasing prevalence of multimorbidity in patients with DM, there are no specific recommendations for diagnosis and treatment [58]. The management and prevention of DM with multimorbidity through health interventions should be offered to individuals by primary health care providers. However, there is a lack of evidence of effective interventions in previous studies. Investigating the DM multimorbidity network remains an area that needs to be explored in future research [59]. Our results confirm and expand the findings of previous studies on multimorbidity in patients with DM. Because of the large sample size in this study, our results are generally more reliable than those in previous studies. These results suggest that the DM multimorbidity network could serve as a framework for addressing the care of older adults with DM multimorbidity, and could support policies for the management of DM patients with multimorbidity in primary care and community settings. The results also provide support and a new perspective for future longitudinal or experimental studies to identify potential mechanisms and risk factors for the DM multimorbidity network. This will help healthcare providers improve the effectiveness of DM management. In addition, different strategies should be developed to prevent multimorbidity in people with DM. When developing guidelines for the management of DM patients, age, sex and potential risks of diseases need to be taken into account in recommendations on diagnosis and monitoring.

Conclusion
Our results indicate that the DM multimorbidity network varies by age and sex. This suggests that targeted screening for DM according to age and sex will increase efficiency. The combination {cholelithiasis, DM} gave an unexpected multimorbidity score and represented a complex comorbid condition. Further longitudinal or experimental studies are needed to establish causal relationships between NCDs and DM. A more integrated multidisciplinary approach focusing on improved management and prevention of DM may help prevent other NCDs in the network. The guidelines on the management of patients with DM should be focused on recommendations based on age and sex and potentially revised to consider the co-management of NCDs that cluster around DM.
Figures 2 and 3 show the results of the ARM by lift and sup, respectively. There were 12 NCDs positively related to DM. The proportion of people with most of these NCDs in combination with DM increased with age. The proportions increased more in women than in men. The {PVD, DM}

Fig. 1 The flow chart of the main research steps

Fig. 2 Fig. 3
Fig. 2 Heatmap of lift values between 25 NCDs and DM for 8 age-sex-based subgroups (sup > 0.005). The red grid represents a lift > 1, the redder the color, the greater the lift; the blue grid represents a lift < 1, the bluer the color, the smaller the lift; and the white grid represents a lift close to 1. The y-axis represents the age group. The 1 represents a positive association between NCD and DM. The 2 represents a weak or no association between NCD and DM. The 3 represents a negative association between NCD and DM. The F represents the female group. The M represents the male group. (CVD: cerebrovascular disease; CHD: coronary heart disease; Cho: cholelithiasis; CKD: chronic kidney disease; CLD: chronic liver disease; HD: heart disease; Hyp: hypertension; Dys: dyslipidemia; PD: prostate disease; PVD: peripheral vascular disease; Tci: transient cerebral ischemia; Ane: anemia; Diz: dizziness/vertigo; Ost: osteoarthropathy; SC: senile cataract; Spo: spondylosis; Bro: bronchiectasis; CG: chronic gastritis; COPD: chronic obstructive pulmonary disease; DD: disc degeneration; MT: malignant tumor; OP: osteoporosis; Pul: pulmonary heart disease)

Table 1
Prevalence of DM in different age groups of males and females

Table 2
Prevalence of the 10 most common morbidities combined with DM by sex. DM is omitted in the table. Hyp: hypertension; Dys: dyslipidemia; CVD: cerebrovascular disease; CLD: chronic liver disease; CHD: coronary heart disease; CKD: chronic kidney disease; PVD: peripheral vascular disease; HD: heart disease; Arr: arrhythmia; Cho: cholelithiasis; SC: senile cataract; Ane: anemia

Table 3
Prevalence of the 10 most common morbidities combined with DM by age. DM is omitted in the table. Hyp: hypertension; Dys: dyslipidemia; CVD: cerebrovascular disease; CLD: chronic liver disease; CHD: coronary heart disease; CKD: chronic kidney disease; PVD: peripheral vascular disease; HD: heart disease; Arr: arrhythmia; Cho: cholelithiasis; SC: senile cataract; Ane: anemia

Table 4
OR and 95% CIs of the associations between NCDs and DM. CVD: cerebrovascular disease; CHD: coronary heart disease; CKD: chronic kidney disease; CLD: chronic liver disease; HD: heart disease; PD: prostate disease; PVD: peripheral vascular disease; TCI: transient cerebral ischemia; CG: chronic gastritis; COPD: chronic obstructive pulmonary disease; DD: disc degeneration; MT: malignant tumor; PHD: pulmonary heart disease