Data source
Data were extracted from the 2000–2006 Medical Expenditure Panel Survey (MEPS), a nationally representative household survey of the US population, primarily designed to obtain national estimates of healthcare use, expenditure, and health insurance coverage. Data were collected through five sets of in-person interviews in each panel at 4–5-month intervals over 2.5 years. These five interviews correspond to five rounds of data per panel, and each panel produced data on more than 15,000 individuals. Respondents reported on health service use related to their health condition, any physical and mental health problems, and loss of work or school days as a result of illness. Information on each condition was recorded verbatim and later coded by professional coders into appropriate International Classification of Diseases, 9th Revision (ICD-9) codes. The overall response rate across panels has generally ranged from 65% to 71%, with individual follow-up response rates of over 90% [10].
This study was exempt from the requirement for subject consent under category 4 (research on existing, publicly available data) by the Harvard School of Public Health Human Subjects Committee (IRB).
Study population
Figure 1 illustrates the exclusion process for the final analytic sample. The longitudinal panel was constructed using the household respondents’ files for each year, later merged with the files on medical conditions and job information. Six constructed MEPS panels were pooled. Panel 5 began its interviews in 2000 and Panel 10 ended its interviews in 2006. The pooled data yielded an initial eligible total (IET) of 95,594 respondents. The baseline was set as Round 2, and the information at Round 1 was used as an indicator of a previous history of depression or other comorbidity.
Individuals were excluded from the sample if they met any of the following six criteria: 1) they did not complete the 2-year survey in each panel due to death, departure from the United States, institutionalization, or military service (n = 2,573; 2.7% of IET); 2) they were not eligible for all five rounds (n = 706); 3) they had a proxy interview (n = 148); 4) they were aged under 18 or over 65 (n = 38,410); 5) they were unemployed at baseline (n = 12,857); or 6) data on key covariates were missing from their information (n = 242). To ensure the temporal relationship between exposure and outcome, and to reduce the possibility that depression would affect the likelihood of injury in the following rounds, subjects with a previous history of depression (n = 1,433) at Round 1 and/or concurrent depression at baseline (Round 2, n = 548) were excluded. Finally, respondents who had reported an injury at Round 1 (n = 3,522) were also excluded, to avoid residual confounding by injury-prone characteristics at baseline. The remaining 35,155 subjects comprised the analytic sample.
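The exclusion counts above can be checked with simple arithmetic. The following is a minimal sketch, assuming the reported counts are applied sequentially and are mutually exclusive (the text does not state this explicitly); the variable and function names are illustrative and not part of the original analysis.

```python
# Counts are copied from the text; names are illustrative assumptions.
INITIAL_ELIGIBLE_TOTAL = 95_594

EXCLUSIONS = [
    ("did not complete the 2-year survey", 2_573),
    ("not eligible for all five rounds", 706),
    ("proxy interview", 148),
    ("aged under 18 or over 65", 38_410),
    ("unemployed at baseline", 12_857),
    ("missing key covariates", 242),
    ("history of depression at Round 1", 1_433),
    ("concurrent depression at Round 2", 548),
    ("injury reported at Round 1", 3_522),
]


def apply_exclusions(total, steps):
    """Return (label, n_excluded, n_remaining) after each sequential step."""
    flow = []
    remaining = total
    for label, n_excluded in steps:
        remaining -= n_excluded
        flow.append((label, n_excluded, remaining))
    return flow


flow = apply_exclusions(INITIAL_ELIGIBLE_TOTAL, EXCLUSIONS)
analytic_sample = flow[-1][2]

for label, n_excluded, remaining in flow:
    print(f"{label:<40} -{n_excluded:>6,}  remaining {remaining:>6,}")
```

Under the sequential assumption, the final remaining count reproduces the analytic sample size reported in the text.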
Measures
The main predictor in this study was injury at baseline, which was determined from medical condition files using responses to the question of whether “the medical condition they experienced during the 4 or 5 months since the previous interview” was the result of an accident or injury. If the injury happened while the person was at work, it was identified as an occupational injury. ICD-9 codes were used to categorize the injured body region and type of injury based on the Barell classification matrix [11]. The injury severity score was calculated using the Abbreviated Injury Scale with ICD-9 codes and the self-perceived overall health impact of the injury. Musculoskeletal disorders encompassed sprains, strains, and dislocations (ICD-9 codes 830–848) and diseases of the musculoskeletal system and connective tissue (codes 710–739). The most recent and severe injury was selected if the respondent reported multiple injury conditions in the preceding 4–5 months. Multiple injury episodes per person, number of healthcare utilizations, and duration of treatment for each injury condition were calculated.
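The musculoskeletal grouping described above amounts to a range check on the three-digit ICD-9 category. A minimal sketch follows, assuming codes are stored as strings such as "845.0" or "724"; the function names are hypothetical, and the Barell matrix and severity scoring are not reproduced here.

```python
def icd9_category(code):
    """Return the 3-digit numeric ICD-9 category, or None for E/V codes."""
    head = code.strip().split(".")[0]
    return int(head) if head.isdigit() else None


def is_musculoskeletal(code):
    """True for sprains, strains, and dislocations (830-848) or
    diseases of the musculoskeletal system and connective tissue (710-739)."""
    category = icd9_category(code)
    if category is None:
        return False
    return 830 <= category <= 848 or 710 <= category <= 739


for example in ("845.0", "724", "311", "E880"):
    print(example, is_musculoskeletal(example))
```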
The primary outcome variable in this study was depression incidence at rounds 3 to 5 of the survey. Depression was identified using the ICD-9 codes 296.2 (major depression, single episode) and 311 (depressive disorder, not elsewhere classified). At each interview round, information about depression was collected regarding healthcare utilization, such as prescribed antidepressants, hospital inpatient services, outpatient services, and emergency department services. Analysis was confined to the first reported depression episode for each respondent across the five rounds, because treatment often continued in the following rounds. Later episodes of depression in the same individual were regarded as a continuing treatment or recurrent episode. Individuals experiencing chronic depression that had occurred before the first round of each panel were excluded. Incident cases of depression were defined as those who had first reported depression at rounds 3, 4, or 5.
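The case definition above can be illustrated with a short sketch. It assumes each respondent's conditions are available as a mapping from round number to a list of ICD-9 code strings; that layout, and all names below, are assumptions for illustration rather than the MEPS file structure.

```python
# ICD-9 codes named in the text: 296.2 (major depression, single episode)
# and 311 (depressive disorder, not elsewhere classified).
DEPRESSION_PREFIXES = ("296.2", "311")


def is_depression(code):
    """True if the code is 296.2x or 311."""
    return code.strip().startswith(DEPRESSION_PREFIXES)


def first_depression_round(codes_by_round):
    """Return the earliest round with a depression code, or None."""
    for round_number in sorted(codes_by_round):
        if any(is_depression(code) for code in codes_by_round[round_number]):
            return round_number
    return None


def classify(codes_by_round):
    """Label a respondent by the timing of the first reported depression."""
    first = first_depression_round(codes_by_round)
    if first is None:
        return "no depression"
    if first <= 2:
        return "excluded"  # prior history (Round 1) or concurrent at baseline (Round 2)
    return "incident"  # first reported at rounds 3, 4, or 5


respondent = {1: ["401"], 2: ["845.0"], 3: [], 4: ["311"], 5: ["311"]}
print(classify(respondent), first_depression_round(respondent))
```

Only the first reported episode determines the label, so the repeated code at round 5 in the example is treated as continuing treatment or recurrence.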
A comorbidity score was calculated based on D’Hoore’s [12] implementation of the Charlson Comorbidity Index. Occupation was defined using a condensed occupation code, based on the 2003 Census Industry and Occupation Coding scheme, which was collapsed into four occupational groups: white-collar (management, professional, sales, office and administration-related occupations), service, farming (farming, fishing, and forestry occupations), and blue-collar (construction, extraction, and maintenance; and production, transportation and material moving). Based on the risk factors for depression, injury, or both as reported in the literature [2, 3, 13], five types of potential confounding covariates measured at rounds 1 or 2 were considered in the analysis: sociodemographic factors (age, sex, race, education, marital status, family income level), job-related factors (occupation, company size, self-employment, job tenure, overtime work, work status), medical factors (comorbidity, activity limitation, self-rated physical and mental health, number of healthcare events per condition), health behaviors (current smoking, alcohol or substance abuse problem, exercise, obesity), and access to healthcare (insurance coverage, regular visits to a particular doctor or health center). Any cognitive function impairment, such as experiencing confusion or memory loss, having problems making decisions, or requiring supervision for their own safety, was also considered (yes vs. no).
Data analysis
The distributions of major demographic characteristics and work-related variables among workers with no injury, occupational injury, and non-occupational injury were compared using a chi-squared test. The incidence rates of depression were calculated for persons with injury and compared with the incidence rates among those without injury. A discrete-time proportional odds model [14] was used to estimate the likelihood of an individual developing depression during rounds 3 through 5 based on the experience of the baseline injury condition.
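Discrete-time models of this kind are commonly fitted by expanding the data into person-period form, with one record per respondent per follow-up round at risk, and then running a logistic regression with round indicators on the expanded records. The sketch below shows only the expansion step and is an illustration under that assumption; the text does not describe how the data were actually arranged, and the field names are hypothetical.

```python
FOLLOW_UP_ROUNDS = (3, 4, 5)


def person_period_rows(person_id, injured, event_round=None):
    """Expand one respondent into one row per round at risk.

    A respondent stays at risk until the round in which depression is
    first reported (event_round) or until the end of follow-up.
    """
    rows = []
    for round_number in FOLLOW_UP_ROUNDS:
        event = int(event_round == round_number)
        rows.append(
            {
                "id": person_id,
                "round": round_number,
                "injury": int(injured),
                "depressed": event,
            }
        )
        if event:
            break  # no further rows after the first reported episode
    return rows


# Hypothetical respondents: one with an event at round 4, one censored.
rows = person_period_rows(1, True, event_round=4) + person_period_rows(2, False)
for row in rows:
    print(row)
```

In this layout the respondent with an event at round 4 contributes two rows and the respondent without an event contributes three.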
In the univariate analysis, the crude association between baseline injury and depression at follow-up was assessed. In the multivariate analysis, the full logistic regression model included injury and all other variables with a p-value < 0.2 in the univariate analysis. Variables that did not reach statistical significance at the p ≤ 0.05 level in the regression analysis were removed, but were subsequently retained if their removal changed the magnitude of the main effect by more than 10%. The final model included age, sex, race, education, occupation, family income level, marital status, healthcare accessibility, current smoking, obesity, exercise, activity limitation, cognitive function impairment, comorbidity, perceived physical and mental health status, job tenure, working hours per week, work status, and time since injury. No statistically significant interactions were found for these variables.
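The retention rule described above can be written as a small decision function. This is a sketch only: it assumes the 10% change is measured on the odds ratio scale relative to the model that includes the covariate, which the text does not specify, and the odds ratios in the example are hypothetical.

```python
def percent_change(or_with, or_without):
    """Percent change in the injury odds ratio when a covariate is dropped."""
    return abs(or_without - or_with) / or_with * 100


def keep_covariate(p_value, or_with, or_without, alpha=0.05, threshold=10.0):
    """Keep a covariate if it is significant, or if dropping it shifts
    the main effect by more than the threshold (in percent)."""
    if p_value <= alpha:
        return True
    return percent_change(or_with, or_without) > threshold


# Hypothetical values: a non-significant covariate whose removal moves
# the injury odds ratio from 1.50 to 1.70 (about 13%) is retained.
print(keep_covariate(0.30, or_with=1.50, or_without=1.70))
print(keep_covariate(0.30, or_with=1.50, or_without=1.55))
```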
To explore the mechanisms that could explain the relationship between injury and depression, six models were tested. Model 1 (base) included the terms age, sex, and time. Model 2 additionally adjusted for race/ethnicity, education, marital status, family income, and healthcare accessibility. Model 3 added work-related factors such as occupational group, job tenure, number of working hours per week, and work status. Model 4 added smoking, alcohol or substance abuse disorder, exercise, and obesity. Model 5 added activity limitation because of a chronic medical condition, cognitive function impairment, and comorbidity. Model 6 added self-rated physical and mental health status. To evaluate the contribution that each set of risk factors made to the association between injury and depression, the odds ratios (ORs) and 95% confidence intervals (CIs) were calculated for each model to determine the excess risk.
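The text does not give the formula used for the excess risk. One common convention, shown here purely as an assumption, expresses the share of the excess risk explained by added covariates as (OR_base − OR_adjusted) / (OR_base − 1) × 100. The sketch also shows the usual Wald interval for an OR derived from a logit coefficient; all numeric inputs are hypothetical.

```python
import math


def odds_ratio_ci(beta, standard_error, z=1.96):
    """Return (OR, lower, upper) for a logit coefficient (Wald 95% CI)."""
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )


def percent_excess_risk_explained(or_base, or_adjusted):
    """Share of the base model's excess risk removed by adjustment.

    Assumed convention, not confirmed by the source:
    (OR_base - OR_adjusted) / (OR_base - 1) * 100.
    """
    if or_base == 1:
        raise ValueError("Base odds ratio of 1 has no excess risk to explain.")
    return (or_base - or_adjusted) / (or_base - 1) * 100


# Hypothetical example: adjustment lowers the OR from 2.0 to 1.5.
print(percent_excess_risk_explained(2.0, 1.5))
print(odds_ratio_ci(beta=0.0, standard_error=0.1))
```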
The initial analysis was carried out separately for men and women. However, the pattern of associations was similar for both sexes and the interaction was not statistically significant, so the results were not reported separately. All analyses were performed using SAS 9.2 (SAS Institute, Cary, NC, USA). A statistically significant association between an exposure and the outcome was declared at a p-value < 0.05.