Data sources
All data in this retrospective cohort study were obtained from the CHARLS database (http://charls.pku.edu.cn/) [14], a nationally representative survey of Chinese adults aged 45 years or older. The survey aims to assess the social, economic and health circumstances of residents. Respondents are followed up every 2-3 years through face-to-face computer-assisted personal interviews, physical measurements and blood tests. The baseline survey was carried out in 2011, with three follow-up surveys conducted in 2013, 2015 and 2018 [15, 16].
Study eligibility criteria
Because of the high rate of loss to follow-up in the CHARLS population before 2015, this retrospective cohort study used the 2015 CHARLS survey as the baseline and the 2018 survey as the follow-up. Inclusion criteria: participants with information on sleep duration and depression in the 2015 CHARLS survey (n=14,962). Exclusion criteria: participants already diagnosed with CVD before the 2015 survey (Fig. 1). All CHARLS interviewees signed informed consent, and the Biomedical Ethics Review Committee of Peking University approved the data collection for the CHARLS database [14]. Accordingly, the Ethics Review Committee of Guang’anmen Hospital, China Academy of Chinese Medical Sciences, exempted this secondary database analysis from ethical review.
Data collection
Baseline variables and laboratory indicators were collected, including age (years), sex, educational background, marital status, deposit (CN¥), disability, exercise time (h/day), drinking, smoking, sleep time (h/day), depression, nap (min/day), chronic kidney disease (CKD), dyslipidemia, sleep disturbance, the use of hypnotics, diabetes, CVD, triglyceride (TG, mg/dl), high-density lipoprotein (HDL, mg/dl), low-density lipoprotein (LDL, mg/dl), systolic blood pressure (SBP, mmHg), diastolic blood pressure (DBP, mmHg), total cholesterol (TC, mg/dl), blood glucose (GLU, mg/dl), glycosylated hemoglobin (GHB, %).
Sleep duration was assessed by a self-reported question: “During the past month, how many hours of actual sleep did you get at night (average hours for one night)? This may be shorter than the number of hours you spend in bed.” Short, normal and long sleep duration were defined as <6 h, 6-8 h and >8 h, respectively [17]. Nap duration was measured by the question “During the past month, how long did you take a nap after lunch on average?” (0 indicates that the respondent did not nap) [14]. Sleep disturbance was defined by the number of days per week on which participants had trouble falling asleep, frequent nighttime awakenings or early waking [18]: rarely or none of the time (<1 day), some or a little of the time (1-2 days), occasionally or a moderate amount of the time (3-4 days), and most or all of the time (5-7 days). Depression was assessed with the Center for Epidemiological Studies Depression Scale (CES-D), which has been used to measure depression in the population [19]. Each item had four response options, scored as follows: “rarely or none of the time=0”, “some or few times=1”, “occasionally or moderate number of times=2” and “most or all of the time=3”. The total score ranges from 0 to 30, and a score ≥10 was defined as having depression [20]. Hypertension, CKD, dyslipidemia and diabetes were assessed by self-report of a physician's diagnosis: Have you been diagnosed with hypertension, CKD, dyslipidemia or diabetes by a doctor? Participants who answered “yes” were defined as having the corresponding condition [21].
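The exposure definitions above can be sketched in code. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the scale is assumed to have ten items (inferred from the 0-30 score range with items scored 0-3), and any reverse scoring of positively worded items, which the text does not describe, is assumed to have been applied beforehand.

```python
def sleep_category(hours):
    """Classify nightly sleep duration: <6 h short, 6-8 h normal, >8 h long."""
    if hours < 6:
        return "short"
    if hours <= 8:
        return "normal"
    return "long"


def cesd_total(item_scores):
    """Sum the item scores (each coded 0-3) into a total score of 0-30.

    Assumption: ten items, already reverse-scored where needed.
    """
    if len(item_scores) != 10 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected ten item scores, each coded 0-3")
    return sum(item_scores)


def has_depression(item_scores, cutoff=10):
    """Apply the cutoff from the text: a total score >= 10 indicates depression."""
    return cesd_total(item_scores) >= cutoff
```

For example, `sleep_category(5.5)` returns `"short"`, and a respondent scoring 1 on every item reaches a total of 10 and is classified as having depression.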
Outcomes
The outcome variable in the present study was the occurrence of CVD. CVD was assessed by the following questions: “Have you been told by a doctor that you have been diagnosed with a stroke” or “Have you been diagnosed with heart attack, coronary heart disease, angina, congestive heart failure, or other heart problems?” Participants who answered “yes” to either question during the follow-up period were defined as having CVD [22].
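As a rough sketch of how the eligibility criteria and the outcome definition combine, the following assumes one record per participant. All field names (`sleep_hours_2015`, `cesd_2015`, `cvd_2015`, `stroke_2018`, `heart_2018`) are hypothetical and are not CHARLS variable names.

```python
def build_cohort(records):
    """Apply the inclusion and exclusion criteria, then label incident CVD.

    Each record is a dict with hypothetical keys; None marks missing data.
    """
    cohort = []
    for r in records:
        # Inclusion: sleep duration and depression information available in 2015.
        if r["sleep_hours_2015"] is None or r["cesd_2015"] is None:
            continue
        # Exclusion: CVD already diagnosed before the 2015 survey.
        if r["cvd_2015"]:
            continue
        # Outcome: "yes" to either the stroke or the heart-disease question at follow-up.
        incident_cvd = r["stroke_2018"] == "yes" or r["heart_2018"] == "yes"
        cohort.append({"id": r["id"], "cvd": incident_cvd})
    return cohort
```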
Statistical analysis
Normality of measurement data was tested with the Shapiro-Wilk test. Normally distributed data were presented as mean ± standard deviation (Mean ± SD); the independent-samples t-test was used for comparisons between two groups and ANOVA for comparisons among multiple groups. Non-normally distributed data were described as median and interquartile range [M (Q1, Q3)]; the Mann-Whitney U test was used for comparisons between two groups and the Kruskal-Wallis H test for comparisons among multiple groups. Enumeration data were expressed as number of cases and composition ratio [n (%)], and the Chi-square test or Fisher's exact test was used for comparisons between two groups.
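The reporting and test-selection rules above amount to a small decision table, sketched here with the Python standard library only. The function names are hypothetical, the outcome of the Shapiro-Wilk test is passed in as a boolean rather than computed, and the quartile method is an assumption because the text does not specify one.

```python
import statistics


def summarize(values, is_normal):
    """Format a continuous variable as mean ± SD or as median (Q1, Q3)."""
    if is_normal:
        return f"{statistics.mean(values):.2f} ± {statistics.stdev(values):.2f}"
    # Assumption: inclusive quartiles; other quartile definitions give slightly different values.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return f"{median:.2f} ({q1:.2f}, {q3:.2f})"


def choose_test(kind, is_normal=None, n_groups=2):
    """Return the name of the comparison test described in the text."""
    if kind == "categorical":
        return "Chi-square or Fisher's exact test"
    if is_normal:
        return "independent-samples t-test" if n_groups == 2 else "ANOVA"
    return "Mann-Whitney U test" if n_groups == 2 else "Kruskal-Wallis H test"
```

With the three sleep-duration groups, for instance, a non-normally distributed variable would be compared with `choose_test("continuous", is_normal=False, n_groups=3)`, which selects the Kruskal-Wallis H test.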
Univariate negative binomial regression models were used to explore covariates potentially associated with CVD. Multivariate negative binomial regression models were then fitted to assess the associations of sleep duration and of depression with CVD separately. Three models were used in this study. Model 1 was unadjusted. Model 2 adjusted for covariates that were statistically significant in the univariate analysis and were reported in the literature to affect CVD, including age, sex, educational background, marital status, exercise time, chronic kidney disease, hypertension, diabetes, dyslipidemia, the use of hypnotics, disability, nap, drinking and deposit. Model 3 adjusted for age, sex, educational background, marital status, exercise time, chronic kidney disease, hypertension, diabetes, dyslipidemia, the use of hypnotics, disability, nap, drinking, deposit, sleep disturbance, HDL, TC, TG, GLU and GHB. Additionally, multivariate negative binomial regression models were used to evaluate the joint effect of sleep duration and depression on CVD risk in different populations. Relative risks (RRs) with 95% confidence intervals (CIs) were reported. Missing data were handled by multiple imputation. The data were imputed five times, generating five datasets. Across the five datasets, the mean of the five imputed values was taken for measurement data and the mode of the five imputed values for enumeration data, yielding a single imputed dataset for subsequent analysis. The sensitivity analysis comparing the data before and after imputation is shown in Supplemental Table 1. Smoking was excluded from the analysis because too many values were missing. SAS (version 9.4, SAS Institute Inc., Cary, NC, USA) was used for the statistical analysis and R (version 4.0.3, mice package) for the multiple imputation.
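The step that collapses the five completed datasets into one (mean for measurement data, mode for enumeration data) can be illustrated as follows. This is a sketch in Python rather than the R workflow the study used; the function names, the variable names in the usage note and the data layout (a list of datasets, each a list of row dicts in the same row order) are assumptions.

```python
from statistics import mean, mode


def pool_value(values, kind):
    """Combine the five filled-in values of one cell into a single value."""
    if kind == "continuous":
        return mean(values)   # measurement data: mean of the five values
    if kind == "categorical":
        return mode(values)   # enumeration data: most frequent of the five values
    raise ValueError(f"unknown variable kind: {kind}")


def pool_datasets(datasets, kinds):
    """Collapse several completed datasets into one.

    datasets: list of datasets, each a list of row dicts in identical row order.
    kinds: maps each variable name to "continuous" or "categorical".
    """
    pooled = []
    for rows in zip(*datasets):  # the same participant across all datasets
        pooled.append({
            var: pool_value([row[var] for row in rows], kind)
            for var, kind in kinds.items()
        })
    return pooled
```

As a usage example, a participant whose five filled-in TG values are 100, 110, 120, 130 and 140 mg/dl and whose drinking status is filled in as "yes" three times and "no" twice ends up with TG = 120 and drinking = "yes".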
All statistical tests were two-sided, and P<0.05 was regarded as statistically significant.
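To show how a regression coefficient translates into the reported quantities, the sketch below converts a coefficient and its standard error into an RR, a 95% CI and a two-sided P value. It assumes a log link, so that the exponentiated coefficient is the RR, and a Wald-type normal approximation; the text does not state how the intervals were computed, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def relative_risk(beta, se, alpha=0.05):
    """Return (RR, lower, upper, two-sided p) for a log-link coefficient.

    Assumption: Wald-type interval and test based on the standard normal.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    rr = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return rr, lower, upper, p_two_sided
```

Under these assumptions, an association is statistically significant at the 0.05 level exactly when the 95% CI for the RR excludes 1.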