Study Population
The Health and Retirement Study (HRS) is a national, population-based study that has tracked individuals and households since 1992 [28]. The first cohort (1992) included 12,654 adults born between 1931 and 1941. Adjusting for mortality, response rates have remained above 84% in the seven subsequent biennial waves.
Analytic Sample
Longitudinal data spanning 16 years were examined. Data were collected biennially, with a maximum of 8 repeated observations per respondent. Data are drawn from the RAND (2008) combined data files. Item-missing data were imputed according to RAND's criteria [29]. The analytic sample consists of 2,494 diabetic individuals from the 3 racial/ethnic groups examined (white, black, or Hispanic) who participated in at least 3 survey waves; those who became diabetic during the study are included only in the waves after which they report having diabetes. The majority of these respondents (2,379) were included in the analytic sample at baseline. As this study focuses on older adults, the sample was restricted to respondents aged 65 years or older at the time of a given wave's interview. This restriction focuses the study on the processes by which health declines among an ever-growing segment of the population, chronically ill older adults, and on how these processes intersect with race/ethnicity and socioeconomic status. A minimum age of 65 was selected to reduce age effects introduced by age-eligible programs and transitions such as retirement, Medicare, and Social Security, which could affect health status differently across racial/ethnic or socioeconomic groups. Because respondents are tracked longitudinally, observations are in person-years: the analytic sample includes 2,494 respondents but a total of 19,061 observations, as respondents participated in multiple waves.
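The sample restrictions described above can be illustrated with a minimal sketch. The record layout (keys `id`, `wave`, `age`, `diabetic`) and the function name are hypothetical and do not come from the RAND files. The sketch also assumes that the three-wave minimum is counted over waves that already satisfy the age and diabetes restrictions, which is one possible reading of the text.

```python
from collections import Counter

def build_person_years(rows, min_age=65, min_waves=3):
    """Return the person-year records that satisfy the sample restrictions.

    rows: iterable of dicts with hypothetical keys 'id', 'wave', 'age',
    and 'diabetic' (True once the respondent reports having diabetes).
    """
    # Keep only waves in which the respondent is old enough and reports diabetes.
    eligible = [r for r in rows if r["age"] >= min_age and r["diabetic"]]
    # Count eligible waves per respondent (assumption: the 3-wave minimum
    # applies to eligible waves, not to all waves attended).
    waves_per_person = Counter(r["id"] for r in eligible)
    return [r for r in eligible if waves_per_person[r["id"]] >= min_waves]
```

Each returned record is one person-year, so the length of the result corresponds to the number of observations rather than the number of respondents.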
Outcome measure
Health decline was assessed by changes in self-reported health over time. Health status is modeled as the cumulative odds of reporting an incrementally higher health value. Responses were recoded so that higher values indicate better health: 5 (excellent), 4 (very good), 3 (good), 2 (fair), and 1 (poor).
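As a minimal sketch of this recoding, the function below reverses a five-point item. It assumes the raw item is coded from 1 (excellent) to 5 (poor); the text does not state the original coding, so that direction is an assumption, and the function name is illustrative.

```python
HEALTH_LABELS = {5: "excellent", 4: "very good", 3: "good", 2: "fair", 1: "poor"}

def recode_self_rated_health(raw):
    """Reverse-code a 1-5 item so that higher values mean better health.

    Assumption: the raw value runs from 1 (excellent) to 5 (poor).
    Missing values (None) are passed through unchanged.
    """
    if raw is None:
        return None
    if raw not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected self-rated health value: {raw!r}")
    return 6 - raw
```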
Race/Ethnicity measures
Baseline race/ethnicity categories were assigned by looking at reports of race from all waves of data. Respondents initially identified as White/Caucasian, Black/African American, or Other. When asked whether they were Hispanic or non-Hispanic, respondents were categorized as Hispanic according to the first non-missing value answered. From these responses, three mutually exclusive categories (non-Hispanic white, non-Hispanic black, and Hispanic) were created that remain consistent across waves.
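The assignment rule can be sketched as follows. The input encoding (lists ordered by wave, `None` for missing, the strings "white", "black", and "other", and booleans for Hispanic ethnicity) is hypothetical. Taking the first non-missing race report, and returning `None` for respondents outside the three groups, are assumptions made for the illustration.

```python
def first_non_missing(values):
    """Return the first value that is not None, or None if all are missing."""
    for value in values:
        if value is not None:
            return value
    return None

def classify_race_ethnicity(race_by_wave, hispanic_by_wave):
    """Assign one time-invariant race/ethnicity category per respondent."""
    hispanic = first_non_missing(hispanic_by_wave)
    race = first_non_missing(race_by_wave)  # assumption: first report is used
    if hispanic is True:
        return "Hispanic"
    if hispanic is False and race == "white":
        return "non-Hispanic white"
    if hispanic is False and race == "black":
        return "non-Hispanic black"
    return None  # outside the three groups examined, or ethnicity missing
```

Because the category is derived once from the full response history, it stays constant across waves for each respondent.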
Socioeconomic measures
A baseline measure of education was used. The cross-wave highest degree categorical variable was assigned using the first non-missing value across survey waves. Three distinct categories were generated: less than high school (some high school or less), high school (high school or GED), and some college or more (AA, BA, or graduate-level education). The time-varying measure total household income (in $100s) is the sum of all income in the household. RAND adjusted for slight variations in questions across waves [29]. Household income was measured in smaller increments than household assets because income has a smaller range of values. The time-varying measure total household assets (in $1000s) is the net value of total wealth, including primary and secondary residences and other assets, minus all debt. Debts (including mortgages and other debts) are subtracted from positive assets to equal the final value. Education, household income, and household assets were generally not highly correlated: the highest correlation was 0.5 (between household assets and household income). Models were also estimated with only income (in which income was slightly significant) and with only assets (in which assets were slightly more significant); these and subsequent analyses suggested that this correlation is not driven by a variable omitted from this analysis. Income and asset variables were also tested in logged and quadratic forms (with similar levels of significance). The original, untransformed values were retained to ease interpretation of the coefficients.
Covariates Analyzed
Multivariate models controlled for health and sociodemographic covariates. These control variables included time-varying body mass index (BMI) and whether or not the respondent had private health insurance (in addition to Medicare). Time-varying BMI was calculated as weight divided by the square of height. As older adults often do not maintain their height, height measures were updated by wave when available. When updated height information was not available, height from previous waves was carried forward. Changes in BMI over time suggest individual trends in adhering to practices that are typical of a diabetes regimen, such as engaging in physical activity and eating a healthful diet. The variable private health insurance (at baseline) indicates whether or not the respondent received insurance from his or her current or prior employer (or spouse's employer) in addition to public insurance such as Medicare [29]. If a respondent received private health insurance but did not in a subsequent wave, then the observation was replaced with a negative response. Baseline measures of gender and number of chronic illness comorbidities (in addition to diabetes) were also included. The baseline measure of chronic illness comorbidities (comorbidity) was calculated by summing the maximum number of up to five selected chronic illnesses at baseline: high blood pressure, cancer, lung disease, heart disease, stroke, psychiatric problems, and arthritis. Models also adjusted for baseline working status. Respondents were asked, "Are you currently working for pay?" Missing data were imputed by RAND based on related questions in some waves (e.g., "Are you working now?"). Working status could relate negatively or positively to one's position in society. For example, individuals might have to be physically healthy enough to work. Work could also be indicative of financial well-being (the option to work) or of financial hardship (the necessity to work).
Benefits of work could include physical activity, social engagement, and cognitive exercise, while negative aspects could include stress or physical demands/strain. Finally, models adjusted for time-varying age in years.
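The time-varying BMI construction described in this section, with height carried forward when a wave has no updated measurement, can be sketched as below. Units (kilograms and meters) and the function name are assumptions for the illustration; the text specifies only that BMI is weight divided by the square of height.

```python
def bmi_by_wave(weights_kg, heights_m):
    """Compute BMI for each wave, carrying the last observed height forward.

    weights_kg, heights_m: lists ordered by wave, with None for missing.
    Returns a list of BMI values (None where BMI cannot be computed).
    """
    bmis = []
    last_height = None
    for weight, height in zip(weights_kg, heights_m):
        if height is not None:
            last_height = height  # use the updated height when available
        if weight is None or last_height is None:
            bmis.append(None)  # no weight, or no height observed so far
        else:
            bmis.append(weight / last_height ** 2)
    return bmis
```

A respondent whose height is recorded only in the first wave would therefore have every later BMI computed from that first height until a new measurement appears.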
Analytic Strategy
Multilevel modeling was used to examine the relationships of socioeconomic status and race/ethnicity with long-term health decline. Three multilevel cumulative logit regression models were used [30–32]. The first two models examine the extent to which long-term health outcomes are predicted by race/ethnicity and socioeconomic status, respectively (controlling for health and sociodemographic covariates). The third model incorporates socioeconomic status measures and race/ethnicity, as well as covariates. Together, the models enable the analysis of the relationship between race/ethnicity and health as well as between socioeconomic status and health.
The multilevel cumulative logit regression model, a specific form of multilevel ordered logit analysis, is appropriate because the dependent variable is ordinal, ranked from 1 (poor) to 5 (excellent); it accommodates floor effects, ceiling effects, and skewness better than an OLS regression model does. This model takes the ordering of response categories (from poor to excellent health) into consideration when estimating how predictor variables relate to the probabilities of a given response, has fewer parameters than multinomial logit regression models, and is therefore more parsimonious [33].
According to Heeringa and colleagues, the estimated coefficients from the multilevel cumulative logit regression models provide information about the relationships between the cumulative logits of the response and the predictors [33]. For example, positive values suggest that, relative to the reference category of poor health, the distribution of ordinal responses is shifted upward. The resulting coefficient is an individual-specific proportional odds ratio. The ratio is interpreted as the odds of change from one interval of health to another over the entire study period.
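To make the interpretation of a positive coefficient concrete, the sketch below computes the five category probabilities implied by a cumulative logit model for a given linear predictor. It assumes the common parameterization in which P(Y <= k) is the logistic function of (cutpoint_k minus the linear predictor); the cutpoint values used in any call are hypothetical and are not estimates from this study.

```python
import math

def category_probabilities(eta, cutpoints):
    """Category probabilities under a cumulative (proportional odds) logit.

    eta: linear predictor (fixed effects plus any random effects).
    cutpoints: increasing thresholds; four thresholds give five categories.
    Assumes P(Y <= k) = 1 / (1 + exp(-(cutpoint_k - eta))).
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative = [0.0] + cumulative + [1.0]
    # Each category probability is the difference of adjacent cumulative ones.
    return [cumulative[k + 1] - cumulative[k] for k in range(len(cumulative) - 1)]

def odds_above(eta, cutpoint):
    """Odds of responding above a given threshold, P(Y > k) / P(Y <= k)."""
    return math.exp(eta - cutpoint)
```

Under this parameterization, raising the linear predictor by an amount b multiplies the odds of responding above every threshold by exp(b), which is the proportional odds property, and moves probability mass from the lowest category toward the highest.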
This model allows the slope and intercept parameters of self-reported health status to vary across individuals and over time (as a random slope, random intercept model), so that they become dependent variables in the level-two (or person-level) model, where individual characteristics are included as predictors. The multilevel cumulative logit procedure is appropriate for this analysis because it can analyze changes within and between individuals and groups over time, taking into consideration the dynamic role of behaviors and circumstances over the life course. This model assumes that the random intercept and random slope follow a normal distribution and are independent across respondents. The proportional odds assumption was not violated. All analyses were conducted using Stata. Significance was tested at P < .05.
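One common way to write a random intercept, random slope cumulative logit model is given below as an illustrative sketch. The notation is not taken from the source: Y_{ti} denotes self-rated health for respondent i at wave t, x_{ti} the covariate vector, kappa_k the thresholds, and u_{0i} and u_{1i} the person-level random intercept and slope. Placing the random slope on time t_{ti} is an assumption.

```latex
\begin{align}
\log\frac{\Pr(Y_{ti} > k)}{\Pr(Y_{ti} \le k)}
  &= \mathbf{x}_{ti}^{\top}\boldsymbol{\beta} + u_{0i} + u_{1i}\,t_{ti} - \kappa_k,
  \qquad k = 1,\dots,4, \\
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix}
  &\sim N\!\left(\mathbf{0},\, \boldsymbol{\Sigma}\right),
  \qquad \text{independently across respondents } i .
\end{align}
```

In this form the coefficient vector does not depend on k, which is the proportional odds assumption, and exp(beta_j) is the individual-specific odds ratio for a one-unit increase in the j-th predictor, holding the random effects fixed.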
Model 1: The first model examines racial/ethnic differences in the steepness or rate of health decline. Racial/ethnic variables (white, black, and Hispanic) and all health-related and sociodemographic covariates, excluding socioeconomic measures, were included in the model.
Model 2: The second model examines socioeconomic differences in the steepness of health decline. Specifically, it examines whether those with lower levels of education experienced successively steeper rates of health decline, whether increases in income are associated with steadier rates of decline, and whether increases in wealth are associated with steadier rates of decline in self-rated health. Socioeconomic measures and all health-related and sociodemographic covariates are included. Race/ethnicity is excluded from the model.
Model 3: The final model tests whether any racial/ethnic and socioeconomic differences suggested by the prior models remain when both are adjusted for simultaneously. This model includes all measures and covariates.