Educational inequality in multimorbidity: causality and causal pathways. A mendelian randomisation study in UK Biobank

North, Teri-Louise; Harrison, Sean; Bishop, Deborah C; Wootton, Robyn E; Carter, Alice R; Richardson, Tom G; Payne, Rupert A; Salisbury, Chris; Howe, Laura D

doi:10.1186/s12889-023-16369-1

Published: 28 August 2023

BMC Public Health volume 23, Article number: 1644 (2023)

Abstract

Background

Multimorbidity, typically defined as having two or more long-term health conditions, is associated with reduced wellbeing and life expectancy. Understanding the determinants of multimorbidity, including whether they are causal, may help with the design and prioritisation of prevention interventions. This study seeks to assess the causality of education, BMI, smoking and alcohol as determinants of multimorbidity, and the degree to which BMI, smoking and alcohol mediate differences in multimorbidity by level of education.

Methods

Participants were 181,214 females and 155,677 males, mean ages 56.7 and 57.1 years respectively, from UK Biobank. We used a Mendelian randomization design, an approach that uses genetic variants as instrumental variables to interrogate causality.

Results

The prevalence of multimorbidity was 55.1%. Mendelian randomization suggests that lower education, higher BMI and higher levels of smoking causally increase the risk of multimorbidity. For example, one standard deviation (equivalent to 5.1 years) increase in genetically-predicted years of education decreases the risk of multimorbidity by 9.0% (95% CI: 6.5 to 11.4%). A 5 kg/m² increase in genetically-predicted BMI increases the risk of multimorbidity by 9.2% (95% CI: 8.1 to 10.3%) and a one SD higher lifetime smoking index increases the risk of multimorbidity by 6.8% (95% CI: 3.3 to 10.4%). Evidence for a causal effect of genetically-predicted alcohol consumption on multimorbidity was less strong; an increase of 5 units of alcohol per week increases the risk of multimorbidity by 1.3% (95% CI: 0.2 to 2.5%). The proportions of the association between education and multimorbidity explained by BMI and smoking are 20.4% and 17.6% respectively. Collectively, BMI and smoking account for 31.8% of the educational inequality in multimorbidity.

Conclusions

Education, BMI, smoking and alcohol consumption are intervenable causal risk factors for multimorbidity. Furthermore, BMI and lifetime smoking make a considerable contribution to the generation of educational inequalities in multimorbidity. Public health interventions that improve population-wide levels of these risk factors are likely to reduce multimorbidity and inequalities in its occurrence.

Background

Multimorbidity, defined as patients living with two or more chronic health conditions, is associated with reduced quality of life and life expectancy [1]. The ageing population is driving an increase in the prevalence of multimorbidity, which already affects approximately one in four of the population in the UK and USA [2, 3]. Identifying the main reversible causes of multimorbidity could inform the design of preventative strategies, helping to improve quality of life for patients and reduce the economic impact of multimorbidity.

There are considerable inequalities in multimorbidity. People from more deprived backgrounds are more likely to be multimorbid, and more likely to develop multimorbidity at an earlier age. For example, a study covering one third of the Scottish population found that young and middle-aged adults over 30 years in the most deprived areas had comparable sex-specific rates of multimorbidity to those in the least deprived areas who were 10–15 years older [2]. Other risk factors have been postulated for multimorbidity, including alcohol, smoking and BMI [4]. Given the social patterning of these exposures, however, associations are likely to be highly confounded and establishing causality is challenging. In addition, the association of education and multimorbidity may be mediated by these risk factors [5].

Mendelian Randomisation (MR) uses genetic variants in non-experimental (observational) data to make causal inferences. MR is an instrumental variable (IV) analysis that uses genetic variants robustly associated with an exposure to estimate the causal effect of that exposure on an outcome, in a way that is less prone to confounding and reverse causation bias [6]. For an introduction to MR analysis see [7]. In brief, IV analyses use another variable to proxy the exposure of interest. This ‘instrument’ is chosen because it meets strict statistical criteria, so the IV estimate of the exposure-outcome association is much less likely to be biased. More recently, the MR arena has undergone rapid development [8], with new methods available to assess causality for both mediation (the causal pathways linking an exposure to an outcome) [9] and effect modification (the study of whether one exposure alters the effect of another) [10]. Commonly, the genetic instrument used is a polygenic risk score (PRS) for the exposure, derived by weighting each SNP by its regression coefficient from the discovery genome-wide association study.
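To make the idea of a weighted polygenic risk score and a single-instrument IV estimate concrete, the following is a minimal sketch in Python. It is not the pipeline used in this study: the function names, the toy dosages and the GWAS coefficients are all hypothetical, no covariates are adjusted for, and the outcome is constructed to have a known effect of 2 so that the ratio estimate can be checked by eye.

```python
import statistics


def weighted_prs(dosages, betas):
    """Per-person score: sum over SNPs of allele dosage times GWAS coefficient."""
    return [sum(d * b for d, b in zip(person, betas)) for person in dosages]


def standardise(values):
    """Centre on the mean and scale by the (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def covariance(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def iv_ratio_estimate(instrument, exposure, outcome):
    """Single-instrument IV (ratio) estimate: cov(Z, Y) / cov(Z, X)."""
    return covariance(instrument, outcome) / covariance(instrument, exposure)


# Toy, made-up data: 4 people, 2 SNPs (dosages 0/1/2), hypothetical GWAS betas.
dosages = [[0, 0], [1, 0], [1, 1], [2, 2]]
betas = [0.1, 0.2]
prs = standardise(weighted_prs(dosages, betas))

exposure = [1.0, 1.5, 2.0, 3.5]
outcome = [2.0 * x for x in exposure]  # outcome built with a true effect of 2
estimate = iv_ratio_estimate(prs, exposure, outcome)
```

With one instrument and no covariates, this ratio coincides with the two-stage least squares estimate; real analyses would add covariates and robust standard errors.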

In this paper, our aim is to use MR to evaluate the causal effects of BMI, smoking, alcohol intake and years of education on multimorbidity. Further, we evaluate the degree to which BMI, smoking, and alcohol consumption explain educational inequalities in multimorbidity, and we consider whether the risk factors interact with one another in their effects on multimorbidity. To our knowledge, this is the first study to date to interrogate the causality of observed determinants of multimorbidity, and to study the mechanisms linking education to multimorbidity using an approach that is robust to confounding and reverse causality.

Methods

Data

UK Biobank is a population-based health research resource consisting of approximately 500,000 people, aged between 38 years and 73 years, who were recruited between the years 2006 and 2010 from across the UK [11]. The resource is particularly focused on identifying determinants of human diseases in middle-aged and older individuals. Participants provided a range of information (such as demographics, health status, lifestyle measures, cognitive testing, personality self-report, and physical and mental health measures) via questionnaires and interviews; anthropometric measures, blood pressure readings and samples of blood, urine and saliva were also taken (data available at www.ukbiobank.ac.uk). The study design, participants and quality control (QC) methods have been described in detail previously [12]. UK Biobank received ethical approval from the Research Ethics Committee (REC reference for UK Biobank is 11/NW/0382).

Exposures were all assessed at the baseline research assessment. We followed a published approach [13] for inferring years of education from highest achieved qualification (see Supplement for further detail). Body Mass Index (BMI) in kg/m² was calculated using height and weight measurements. We derived a lifetime smoking index, representing a continuous score of smoking behaviours and incorporating smoking initiation, duration, heaviness, and cessation, using a previously published approach [14, 15]. (This approach was used because lifetime smoking scores incorporate heaviness but are applicable to both smokers and non-smokers.) Following the approach used in a previous Genome-Wide Association Study (GWAS) [16], we derived estimated units of alcohol consumed per week. We used responses to the baseline touchscreen questionnaire on weekly red wine, white wine and champagne, beer and cider, fortified wine, spirit and other consumption to estimate the typical units of alcohol consumed per week. Former drinkers (those who previously drank alcohol but no longer do) were set to missing and excluded from analyses because treating them as non-drinkers would be inappropriate and data on their previous alcohol consumption was unavailable. Similarly, we excluded individuals with very high current alcohol consumption (> 200 units per week). Responders who indicated they were never-drinkers were set to 0 units per week.

Our primary outcome was the standard definition of multimorbidity, the presence of two or more chronic conditions. Three additional multimorbidity measures were used in secondary analyses: the presence of 3+ and 4+ conditions, and the Cambridge multimorbidity score (CMMS) with general-outcome weights [17]. This general CMMS is a continuous measure, with conditions weighted according to the average standardised weights from models of consultations, mortality and emergency admissions. For all measures of multimorbidity, the presence or absence of 35 health conditions was considered as per Payne et al. [17] (see Condition Definitions table, Supplement). Blindness/low vision and learning disability were excluded from the original condition list owing to the lack of appropriate self-reported variables. In contrast to the condition definitions applied by Payne et al. [17], which included temporal restrictions and use of medications, our definitions were simplified to self-reported ‘ever’ having had a condition with the exception of cancer (self-reported doctor diagnosed new cancer estimated to be within the last 5 years, excluding non-melanoma skin cancer), hearing loss, constipation and painful condition (see supplement for full details). The information was obtained via a touchscreen questionnaire which was followed by a nurse-led interview to clarify and categorise conditions correctly. We derived each measure of multimorbidity twice, including and excluding alcohol problems in the definition, because alcohol consumption was included as an exposure or mediator in certain models. (We used the multimorbidity outcomes excluding alcohol for all models that included alcohol as an exposure/mediator.)
We used the CPRD @ Cambridge – code lists (GOLD) Version 1.1 (Cambridge, UK; University of Cambridge, 2018) as a point of reference when assigning variables to condition categories, available here: https://www.phpc.cam.ac.uk/pcu/research/research-groups/crmh/cprd_cam/codelists/v11/ [downloaded May 2020].

Genetic data: Details of the in-house quality control filtering applied to the genetic data are provided in the supplement. Quality Control filtering of the UK Biobank data was conducted by R. Mitchell, G. Hemani, T. Dudding, L. Corbin, S. Harrison and L. Paternoster as described in the published protocol (https://doi.org/10.5523/bris.1ovaau5sxunp2cv8rcy88688v) [18].

Statistical methods

Participants were included in our analysis if they had complete data (outcome, covariates, polygenic risk score and exposure) for at least one exposure, they were of white British ancestry (to avoid confounding by population stratification) and they passed genetic QC criteria (see Supplement: Quality Control of Genetic Data). Related individuals were included in the GWAS (where relatedness was accounted for) but excluded from subsequent regression analyses. For analyses including alcohol consumption, former drinkers were removed from analyses because we were unable to consider the timing of stopping alcohol consumption in relation to the development of multimorbidity. In addition, former drinkers may be violating their genetic trajectory (for example, they may be genetically predisposed to be heavier drinkers). All analyses were conducted using both standard regression models (with no instrumental variable), and using MR. The study design was cross-sectional cohort.

Multivariable regression analyses were used to assess the association between each exposure and each measure of multimorbidity (2+ conditions, 3+ conditions, 4+ conditions and the CMMS). Linear, rather than logistic, regression was used for all regression models so that the estimates were on the same scale as the MR estimates and represented risk differences. All multivariable regressions were run with robust standard errors and adjusted for age, sex, 40 genetic principal components and UKBB assessment centre.
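The reason a linear model of a binary outcome yields a risk difference can be illustrated with a small sketch. The example below is an assumption-laden simplification: it uses a single binary exposure, no covariates and no robust standard errors (all of which differ from the adjusted models with continuous exposures described above), and the data and the helper name `ols_slope` are made up for illustration.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y on x (single predictor plus intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Made-up data: 4 exposed people (3 with the outcome), 4 unexposed (1 with it).
exposed = [1, 1, 1, 1, 0, 0, 0, 0]
outcome = [1, 1, 1, 0, 1, 0, 0, 0]

risk_exposed = 3 / 4
risk_unexposed = 1 / 4
risk_difference = ols_slope(exposed, outcome)
```

Here the fitted slope equals the difference in outcome proportions between the two groups (0.75 minus 0.25), which is why coefficients from such a model can be read on the risk difference scale.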

Mendelian randomization (MR) uses genetic variants known to be robustly related to the exposure of interest as instrumental variables. MR analyses were run via two-stage least squares using ivreg2 [19] in Stata [20] with the “robust” option specified (to enforce robust standard errors). For all MR analyses, we used a ‘split sample’ approach to avoid sample overlap with published GWASs (which can bias estimates [21]) and to implement uniform methodology across exposures. This involved splitting the UK Biobank sample into two halves randomly. Within each half, we ran a GWAS (using BOLT-LMM [22] and the MRC IEU UK Biobank GWAS pipeline https://doi.org/10.5523/bris.pnoat8cxo0u52p6ynfaekeigi) to identify genetic variants related to each of the four exposures, adjusting for age at baseline clinic, sex and 40 genetic principal components (to account for population structure). All SNPs with a p-value less than or equal to 5 × 10^−8 were used to derive a polygenic risk score (PRS) for each exposure in the alternative split, weighted by the regression coefficients from the GWAS. Clumping was performed at an R² threshold of 0.001 within a 10,000 kb window, and proxies were identified using the European sub-sample of the 1000 Genomes as a reference panel [23] and a lower R² limit of 0.8. The PRSs were standardised by subtracting the mean and dividing by the standard deviation. The PRSs defined based on the GWAS from one half of the UKBB sample were applied in MR analyses of the other half of the UKBB sample. MR analyses were run using two-stage least squares [19] and were adjusted for age, sex, 40 genetic principal components and UKBB assessment centre. The beta coefficients and standard errors from MR analyses within each half of the sample were then meta-analysed to give one estimate for beta, a 95% confidence interval, and an I² estimate as an indication of heterogeneity between the estimates in each split [24, 25].
Fixed-effects meta-analyses were performed using the metan command [26] in Stata. Analyses were scaled such that coefficients represented an SD change in education (equivalent to 5.1 years), a 5 kg/m² increase in BMI, a 5 units per week increase in alcohol consumption, and an SD unit increase in lifetime smoking index. (As an example, a 1 SD increase in lifetime smoking is roughly the same as being a current smoker who has smoked 5 cigarettes per day for 12 years, rather than a never smoker.)
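The pooling of the two split-sample estimates can be sketched as a standard inverse-variance fixed-effect meta-analysis with Cochran's Q and I². This is an illustrative Python version, not the Stata metan call used in the study; the function name and the per-split estimates and standard errors below are hypothetical values chosen only to show the mechanics.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect pooling of per-split estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    # Cochran's Q and I^2 (percentage of variation attributed to heterogeneity).
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, ci, i_squared


# Hypothetical risk differences and standard errors from two splits.
pooled, pooled_se, ci, i_squared = fixed_effect_meta([-0.085, -0.095], [0.018, 0.018])
```

With equal standard errors the pooled estimate is simply the average of the two splits, and I² is truncated at zero when Q is smaller than its degrees of freedom.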

Sensitivity analyses [27] to test the assumption of no pleiotropy in MR analysis were run for the main outcome (at least two chronic conditions), using MR-Egger [28], IVW [29], the simple modal estimator [30] and the unweighted median estimator [31].
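As a rough illustration of how some of these estimators behave, the sketch below implements textbook forms of the IVW, MR-Egger and unweighted median estimators from per-SNP summary statistics. It is not the mrrobust implementation used in the study; the function names and the toy SNP effects are assumptions, the SNP-exposure effects are taken to be oriented positive, and the toy SNP-outcome effects are built with a true slope of 0.5 plus a constant pleiotropic effect of 0.01.

```python
import statistics


def ivw_estimate(bx, by, se_by):
    """Inverse-variance weighted regression of by on bx through the origin."""
    w = [1.0 / s ** 2 for s in se_by]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


def mr_egger(bx, by, se_by):
    """Weighted regression of by on bx with an intercept: (intercept, slope)."""
    w = [1.0 / s ** 2 for s in se_by]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, bx)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, by)) / sw
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, bx, by))
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, bx))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simple_median(bx, by):
    """Unweighted median of the per-SNP ratio estimates by / bx."""
    return statistics.median(y / x for x, y in zip(bx, by))


# Toy SNP effects: true slope 0.5 plus constant (directional) pleiotropy 0.01.
bx = [0.1, 0.2, 0.3, 0.4]
by = [0.01 + 0.5 * x for x in bx]
se_by = [0.01] * 4

ivw = ivw_estimate(bx, by, se_by)
egger_intercept, egger_slope = mr_egger(bx, by, se_by)
median_ratio = simple_median(bx, by)
```

In this constructed example the MR-Egger intercept absorbs the constant pleiotropic effect and the slope recovers 0.5, whereas the IVW estimate, which forces the line through the origin, is pulled above 0.5.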

BMI, smoking, and alcohol consumption are all potential consequences of educational attainment and may lie on the causal pathway between education and multimorbidity, explaining part of the effect. They are therefore considered as potential mediators of the education-multimorbidity relationship. Mediation of the association between years of education and multimorbidity was assessed by including each potential mediator (BMI, smoking, and alcohol consumption) in turn as a covariate in a linear regression of multimorbidity on years of education. The joint contribution of the BMI and smoking mediators was assessed by including both variables as covariates. Similarly, in MR analyses, mediation was assessed by including both years of education and (a) each mediator in turn, and (b) both smoking and BMI mediators as exposures in a multivariable MR analysis [32, 33]. The coefficients for years of education from these regressions and multivariable MR models estimate the ‘direct effect’; i.e. the effect of years of education on multimorbidity that operates independently of the mediator(s) being considered. The ‘indirect effect’, i.e. the effect of years of education on multimorbidity that operates through the mediator(s), is estimated by subtracting the direct effect from the total effect (the coefficient for years of education from a regression on multimorbidity not accounting for any mediators). The proportion mediated is calculated as the indirect effect divided by the total effect, multiplied by 100 to express as a percentage. 95% confidence intervals of the indirect effects and proportions mediated were calculated using Stata’s -bootstrap- command and 200 repeats. MR analyses used the same ‘split sample’ approach as the main analysis. For mediation analyses, we restricted analyses to two definitions of multimorbidity – the main outcome variable of two or more chronic conditions, and the CMMS, which, as a continuous variable, offers greatest statistical power.
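The difference method described above reduces to simple arithmetic once the total and direct effects are in hand. The sketch below shows that arithmetic in Python; the function name and the effect sizes are hypothetical, the bootstrap confidence intervals are not reproduced, and the check for "inconsistent mediation" (a direct effect larger in magnitude than the total effect) is one possible reading of that term.

```python
def mediation_summary(total_effect, direct_effect):
    """Difference-method mediation: indirect effect and proportion mediated (%)."""
    indirect_effect = total_effect - direct_effect
    proportion_mediated = 100.0 * indirect_effect / total_effect
    # Flag cases where the direct effect exceeds the total effect in magnitude,
    # which makes the proportion mediated negative and hard to interpret.
    inconsistent = abs(direct_effect) > abs(total_effect)
    return indirect_effect, proportion_mediated, inconsistent


# Hypothetical risk differences: total effect of education, and the effect
# remaining after accounting for a mediator.
indirect, percent_mediated, inconsistent = mediation_summary(-0.10, -0.08)

# A hypothetical case where the direct effect is larger than the total effect.
_, negative_percent, flagged = mediation_summary(-0.10, -0.12)
```

In the first case a fifth of the total effect operates through the mediator; in the second the proportion mediated is negative, which is the situation in which an overall estimate would not be meaningful.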

Additive interaction effects between each pairwise exposure combination were assessed in multivariable linear regressions by including the product term. For MR analyses, we used a previously published approach to assessing interactions [10]; for two exposures, A and B, the instruments used in the multivariable MR model are PRS exposure A, PRS exposure B, PRS exposure A × PRS exposure B, and PRS exposure A × PRS exposure A. The last instrument was included as this has been shown to be necessary in the presence of a causal effect of one exposure on the other [10]. Our assumptions regarding the causal ordering of the risk factors, and hence the instruments used, are provided in the Supplement. In both multivariable regression and MR models, interactive effects were only estimated for the outcomes of multimorbidity status (2+ conditions) and the CMMS.
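The instrument set for an interaction model can be written out explicitly. The following sketch builds the four instruments listed above for each person from two standardised scores; the function name and the example values are hypothetical, and fitting the multivariable MR model itself is not shown.

```python
def interaction_instruments(prs_a, prs_b):
    """Per-person instruments: PRS A, PRS B, PRS A x PRS B, and PRS A squared."""
    return [[a, b, a * b, a * a] for a, b in zip(prs_a, prs_b)]


# Hypothetical standardised scores for three people.
prs_a = [2.0, -1.0, 0.5]
prs_b = [3.0, 0.5, -2.0]
instruments = interaction_instruments(prs_a, prs_b)
```

Which exposure plays the role of A (and so contributes the squared term) depends on the assumed causal ordering of the two risk factors.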

As a sensitivity analysis we re-ran the main observational regressions with multimorbidity status defined as 2+ conditions using logistic regression (as opposed to linear regression). We used gformula [34], which allows for a binary outcome, to estimate the proportion mediated and indirect effects.

Analyses were run using Stata version 16 [20] and R version 3.6.1 [35]. Stata packages used in this analysis include rsource [36], ivreg2 [19] and mrrobust [27]. R packages used include reshape [37], data.table [38], plyr [39], dplyr [40], R.utils [41] and devtools [42].

Testing the assumptions of MR

We performed sensitivity analyses to test the MR assumption of no pleiotropy. We did not examine the association of the PRSs with potential confounders for several reasons. Firstly, when the exposure is education, plausible confounders would be early-life and intergenerational factors, for which data are not available. When the exposure is BMI, smoking or alcohol consumption, the most plausible confounder is education. However, MR studies [43, 44] have shown effects of these risk factors on education, and thus testing for an absence of association of these PRSs with education is not a reasonable test for the assumptions of MR in this instance.

Results

336,891 individuals (67% of original sample) were included in the analysis (after removal of withdrawals, those failing genetic QC/without genetic data, those without phenotype data and related individuals). In the final sample the mean age was 56.9 years (IQR 51–63 years) and 53.8% of participants were female (Table 1). 55.1% of the participants had a history of at least 2 chronic conditions at baseline. 12.6% of individuals had at least 4 chronic conditions. The most common conditions overall were hearing loss (37%), anxiety & other neurotic, stress related & somatoform disorders OR depression (35%) and painful condition (29%) (Supplement page 3). The mean CMMS in the total sample was 0.7 (IQR 0.1–1.1).

Former drinkers (N = 11,461) were removed from analyses involving alcohol. In this subset, 73% had 2+ conditions and 27% had 4+ conditions; the mean CMMS was 1.1 (1 d.p.).

Associations of educational attainment, BMI, smoking, and alcohol consumption with multimorbidity (2+ conditions)

Both multivariable regression and MR suggest that lower years of education, higher BMI, and higher lifetime smoking index are all associated with increased risk of multimorbidity (Fig. 1). In MR analyses, a one SD higher level of education (equivalent to an additional 5.1 years) is associated with a reduction in risk of multimorbidity (2+ conditions) of 9% (risk difference (RD) = -0.090, 95% CI = -0.114, -0.065), a 5 kg/m² increase in BMI is associated with a 9.2% increased risk of multimorbidity (RD = 0.092, 95% CI = 0.081, 0.103), and a one SD higher lifetime smoking index is associated with a 6.8% increased risk of multimorbidity (RD = 0.068, 95% CI = 0.033, 0.104). Although both multivariable regression and MR analyses also suggest that higher alcohol consumption is a risk factor for multimorbidity, the magnitude of the effect sizes was smaller than for the other exposures. In MR analyses, an increase of 5 units of alcohol per week increases the risk of multimorbidity (2+ conditions) by 1.3% (RD = 0.013, 95% CI = 0.002, 0.025). For all exposures, the estimates from MR analyses were more extreme than the estimates from multivariable regression, but the confidence intervals were wider for MR; e.g. the risk difference for multimorbidity for a 1 SD higher smoking index was 0.048 (95% CI 0.046 to 0.050) in multivariable regression, and 0.068 (95% CI 0.033 to 0.104) in MR. The R² and F statistics from the unadjusted linear regression of the exposure on the PRS, in addition to the number of SNPs in the PRS, are presented in Supplementary Table 6 for each split.

Mechanisms explaining educational inequality in multimorbidity

In MR analyses, the proportions of the educational inequality in multimorbidity explained by BMI and smoking when each risk factor was considered separately were 20% and 18% respectively (Fig. 2). When considered together in MR analyses, the two risk factors explained 32% of the educational inequality in multimorbidity. This contrasts with multivariable regression analyses, where the proportions mediated were estimated to be 28% and 25% for BMI and smoking respectively, and 51% for both risk factors combined. Multivariable regression estimated the proportion of the educational inequality in multimorbidity explained by alcohol consumption to be 0.1% (Supplementary Table 3). We did not generate an overall MR estimate for the proportion mediated by alcohol consumption because of inconsistent mediation, i.e. direct effect greater than the total effect, in one of the dataset splits (Supplementary Table 3).

Interactions between risk factors for multimorbidity

Multivariable regression analyses to evaluate the interactive effect of pairwise combinations of the exposures on the risk of having at least 2 chronic conditions (Table 2) suggest that there are interactions between some of the risk factors, namely BMI*smoking, BMI*alcohol, smoking*alcohol, and smoking*education. However, the magnitude of all interaction effects was small. Analogous MR analyses of these interactive effects gave point estimates that were larger in magnitude than the estimates from multivariable regression, with the direction being consistent for 3/6 of the pairwise combinations, but the interactions were imprecisely estimated, with wide confidence intervals that crossed the null for all interaction effects.

Sensitivity analyses

Secondary analyses using alternative definitions of multimorbidity yielded a similar pattern of results for the associations of years of education, BMI, smoking, and alcohol consumption with multimorbidity (Supplementary Tables 1, 2) and for mediation of the educational inequality in the CMMS (Supplementary Table 3). Similar to the main outcome, MR confidence intervals for all interaction terms were wide when analyses were repeated with CMMS as the outcome, and the direction of effect was consistent with multivariable regression for 2/6 pairwise combinations (Supplementary Table 4). In multivariable regression analyses, the interactive effect of smoking and education on the CMMS was in the opposite direction to that for the main outcome.

Sensitivity analyses to test the assumption of no pleiotropy in the main MR analysis (outcome of 2+ conditions) (Supplementary Table 5) yielded estimates that were generally directionally consistent. The MR-Egger constant (intercept) estimates suggest evidence for directional pleiotropy for BMI and smoking.

The sensitivity analyses re-running the main observational regressions using logistic regression and the mediation analysis using gformula [34] are presented in Supplementary Tables 7 and 8. The single exposure logistic regressions (Supplementary Table 7) revealed associations in the same direction as the linear regression analyses. The interaction analyses revealed that, as with the main analysis, any interactions were of very small magnitude. The proportion of the education association with multimorbidity mediated by the other exposures, when calculated by gformula (which allows for a binary outcome), was strikingly similar for all mediators examined to that from the mediation analyses using linear regression (Supplementary Table 8).

Table 1 Descriptive Characteristics of Study Participants

Table 2 Interactions between risk factors for multimorbidity (2+ chronic conditions). Analyses conducted using multivariable regression (MVR) and Mendelian randomization (MR) to estimate additive interactions on the risk difference scale **

Discussion

This study has provided evidence for a causal effect of lower educational attainment, higher BMI and higher level of smoking on multimorbidity status. There was also weak evidence for a causal effect of greater alcohol consumption on risk of multimorbidity, although the magnitude of effects was generally smaller than for the other risk factors. In our analyses, one standard deviation of years of education (equivalent to 5.1 additional years) equates approximately to a 9% decrease in risk of multimorbidity. For education, BMI, smoking, and alcohol consumption, estimated effects on multimorbidity were greater in MR analyses compared with multivariable regression. However, confidence intervals for MR results were wide and, with the exception of the coefficient for education, spanned the point estimate from multivariable regression models.

Our analysis suggests that 20% of educational inequality in multimorbidity is explained by BMI, and 32% is jointly explained by BMI and smoking. This is less than the 51% of educational inequality in multimorbidity explained by BMI and smoking in multivariable regression. However, 48–88% of the total effect of education on multimorbidity remains unaccounted for by these risk factors. We did not include alcohol in conjunction with the other potential mediators because neither multivariable nor MR analyses provided evidence that alcohol consumption mediated the effect of education on multimorbidity. Units consumed per week is also a crude measure of alcohol consumption, which could partially explain the lack of mediation by alcohol use. Looking ahead, we need to consider other explanatory mechanisms, which are likely to be numerous, complex and span multiple social, behavioural and biological domains.

While there may be interactions between various lifestyle and anthropometric exposures on risk of multimorbidity, we could not provide evidence for these within a causal framework possibly due to low power to detect interactive effects. In multivariable analyses, where statistical power is greater than MR, interactions were generally of small magnitude, and were most often in the opposite direction to the main effects of the risk factors (i.e. the cumulative effect of having both risk factors was generally less than would be predicted from their individual effects), suggesting that interactions between the risk factors we studied are not a major contributor to the aetiology of multimorbidity.

A recent study [45] of over 400,000 GP-registered adults in England concluded that over half of health service utilisation is attributable to individuals with multimorbidity. Furthermore, the ageing population is leading to an increase in the prevalence of multimorbidity over time. Identifying the preventable causal determinants of multimorbidity is thus paramount for easing the pressure on health services. Our analysis suggests that population-level interventions to reduce BMI and smoking would likely lead to both a reduction in the occurrence of multimorbidity, and a reduction in educational inequalities in multimorbidity.

A key strength of our study is the use of Mendelian randomization to improve causal inference. In traditional epidemiological study designs, confounding factors and reverse causation can bias the estimated associations between putative risk factors for multimorbidity. Furthermore, when analysing the mediating pathways explaining educational inequalities in multimorbidity, measurement error in the mediator can lead to an underestimate of the contribution of mediating variables [46]. The use of MR overcomes these limitations of previous analyses.

There is a body of work devoted to defining multimorbidity [2, 17, 45]. We explored three definitions of multimorbidity increasing in severity from 2 + to 4 + chronic conditions, in addition to a multimorbidity score, which captures all available information as a continuum with conditions weighted by the average standardised weights from models of consultations, mortality and emergency admissions. Findings were generally consistent across these definitions. Nonetheless, our findings may be driven by the prevalence of conditions feeding into the definition of multimorbidity. Multimorbidity is not a single ‘entity’; for different patients the state of multimorbidity can represent diverse combinations of health conditions. The most commonly reported health conditions in this study were hearing loss (37%), anxiety & other neurotic, stress related & somatoform disorders OR depression (35%) and painful condition (29%). Thus, our findings may represent established causal effects of education, BMI [47], smoking, and alcohol on these conditions (e.g. associations of obesity [48] and smoking behaviour [49] with hearing loss have been previously reported). Nonetheless, as these conditions underlie many cases of multimorbidity, this does not detract from the implications of our findings about the potential public health and clinical impact of interventions to improve population levels of these risk factors. Further work identifying distinct clusters of health conditions to explore whether different ‘types’ of multimorbidity have distinct aetiologies may be of interest.

Our study has several limitations. Firstly, the use of self-reported health conditions may have led to misclassification of multimorbidity status for some people. However, self-reported data (unlike linked primary and secondary care data) were available across the whole sample. Secondly, UK Biobank participants are known to be over-selected from higher socio-economic categories [50], and the use of genetic data in this analysis necessitated restricting to people of white British ethnicity. This may have led to underestimation of the effects of exposures on multimorbidity, such that the effects we demonstrate can be viewed as the minimum likely causal effects in a population more representative of Great Britain as a whole. Thirdly, with the exception of cancer, hearing loss, painful condition and constipation, we defined chronic conditions based on self-report of ever having been diagnosed by a doctor. In contrast, some studies [45] base their definition on long-term “currently active” conditions, making our definition less specific. However, our definition ensures that we are as inclusive as possible with regard to the conditions contributing to multimorbidity. A further disadvantage of using condition history, rather than active conditions, in the definition of multimorbidity is that the multivariable regression analyses could be subject to reverse causation bias. The MR analyses, however, should not be subject to this bias, as the exposures and mediators here are the “lifetime average”. In addition, because the multimorbidity outcome is defined as past or current illness, there is no temporal ordering of the exposure, mediator and outcome in the observational analyses. Again, the MR analysis overcomes this.

Fourthly, we excluded former drinkers from analyses of alcohol because these individuals are known to have worse health outcomes than never drinkers, and analysing them as non-drinkers would be inappropriate. There is no suitable option for how to address the former drinker group; the available data in UK Biobank do not permit detailed analysis of prior drinking patterns or time since stopping alcohol consumption. However, this means that our conclusions about the effects of alcohol on multimorbidity may not extend to former drinkers. In addition, removal of former drinkers reduced the sample size for these analyses and hence the power to detect effects. Although we found weak evidence of an effect of alcohol on multimorbidity, the effect size was smaller than for the other risk factors. This may at least partially reflect the complexity of defining alcohol use. Here we used a continuous measure of units per week, but other measures such as binge drinking may also be relevant for disease outcomes, particularly in the context of educational attainment [51]. Our analysis of current alcohol units consumed per week also does not capture previously heavy but now light drinkers.

An additional study limitation is that our analyses assume linear effects of the risk factors on multimorbidity; this assumption may not hold for all relationships. In particular, there is evidence that BMI has a non-linear relationship with mortality, albeit only in smokers [52]. Although methods to investigate non-linear relationships are available, statistical power would be insufficient to combine these methods with the multivariable MR approach used in this paper. Importantly, although we checked where possible that our analysis met the assumptions [8] of MR, our conclusions rely on the validity of these assumptions. Lastly, although multivariable regression analyses demonstrated some interactions between risk factors, these were not detected in MR analyses. This likely reflects insufficient power to examine interactive effects within a causal framework.

The results of this study suggest that education, BMI, smoking, and, to a lesser degree, alcohol consumption all have causal effects on multimorbidity. Furthermore, BMI and smoking together explain approximately one third of the educational inequality in multimorbidity. In the UK, school attendance is compulsory until age 18, and policies to increase educational attendance would therefore focus on increasing university participation. Such policies may influence multimorbidity risk. However, policies to mitigate the health disadvantage of low education may be more realistic and within reach of public health, which motivated our study of the pathways explaining educational inequality in multimorbidity. Interventions to reduce population levels of BMI and smoking could reduce both the occurrence of multimorbidity and educational inequalities in it.
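
As a rough arithmetic check on the mediation figures, the abstract reports that BMI and smoking individually explain 20.4% and 17.6% of the association between education and multimorbidity, and 31.8% jointly. The Python sketch below shows that the individual proportions sum to more than the joint proportion, which is what would be expected if the two pathways partly overlap. It also includes a generic difference-method helper, (total effect minus direct effect) divided by total effect, which is one common way of defining a proportion mediated. The function names and the inputs to the helper are hypothetical, and this section does not state that the estimates were computed in exactly this way.

```python
from typing import Sequence

# Proportions (percent) of the education-multimorbidity association
# explained, as reported in the abstract of this article.
PROP_BMI = 20.4
PROP_SMOKING = 17.6
PROP_JOINT = 31.8


def mediator_overlap(individual: Sequence[float], joint: float) -> float:
    """Sum of individual proportions minus the joint proportion."""
    return sum(individual) - joint


def proportion_mediated(total_effect: float, direct_effect: float) -> float:
    """Difference method: share of the total effect not acting directly."""
    if total_effect == 0:
        raise ValueError("total effect must be non-zero")
    return (total_effect - direct_effect) / total_effect


overlap = mediator_overlap([PROP_BMI, PROP_SMOKING], PROP_JOINT)
print(f"sum of individual proportions: {PROP_BMI + PROP_SMOKING:.1f}%")
print(f"joint proportion: {PROP_JOINT:.1f}%")
print(f"overlap: {overlap:.1f} percentage points")

# HYPOTHETICAL effect sizes, chosen only to show how the helper is called.
example = proportion_mediated(total_effect=10.0, direct_effect=7.0)
print(f"illustrative proportion mediated: {example:.2f}")
```

The gap between the summed and joint proportions should be read only as an indication that the two mediators are not independent pathways; it is not an additional estimate from the study.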

Data availability

UK Biobank data access procedures are governed by UK Biobank, https://www.ukbiobank.ac.uk/.

The code for the analyses can be found here: https://github.com/laurahowebristol/multimorbidity.

References

1. National Guideline Centre. Multimorbidity: clinical assessment and management. London: National Institute for Health and Care Excellence; 2016.
2. Barnett K, Mercer SW, Norbury M, Watt G, Wyke S, Guthrie B. Epidemiology of multimorbidity and implications for health care, research, and medical education: a cross-sectional study. The Lancet. 2012;380(9836):37–43.
3. US Department of Health and Human Services. Multiple chronic conditions - a strategic framework: optimum health and quality of life for individuals with multiple chronic conditions. Washington, DC; 2010.
4. Katikireddi SV, Skivington K, Leyland AH, Hunt K, Mercer SW. The contribution of risk factors to socioeconomic inequalities in multimorbidity across the lifecourse: a longitudinal analysis of the Twenty-07 cohort. BMC Med. 2017;15(1):152.
5. Mutz J, Roscoe CJ, Lewis CM. Exploring health in the UK Biobank: associations with sociodemographic characteristics, psychosocial factors, lifestyle and environmental exposures. BMC Med. 2021;19(1):240.
6. Lawlor DA, Harbord RM, Sterne JAC, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008;27(8):1133–63.
7. Davies NM, Holmes MV, Davey Smith G. Reading mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ. 2018;362:k601.
8. Zheng J, Baird D, Borges M-C, Bowden J, Hemani G, Haycock P, et al. Recent developments in mendelian randomization studies. Curr Epidemiol Rep. 2017;4(4):330–45.
9. Burgess S, Thompson SG. Multivariable mendelian randomization: the Use of Pleiotropic Genetic Variants to Estimate Causal Effects. Am J Epidemiol. 2015;181(4):251–60.
10. North T-L, Davies NM, Harrison S, Carter AR, Hemani G, Sanderson E, et al. Using Genetic Instruments to Estimate interactions in mendelian randomization studies. Epidemiology. 2019;30(6):e33–e5.
11. Allen NE, Sudlow C, Peakman T, Collins R. UK Biobank Data: come and get it. Sci Transl Med. 2014;6(224):224ed4.
12. Collins R. What makes UK Biobank special? The Lancet. 2012;379(9822):1173–4.
13. Okbay A, Beauchamp JP, Fontana MA, Lee JJ, Pers TH, Rietveld CA, et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature. 2016;533:539.
14. Leffondré K, Abrahamowicz M, Xiao Y, Siemiatycki J. Modelling smoking history using a comprehensive smoking index: application to lung cancer. Stat Med. 2006;25(24):4132–46.
15. Wootton RE, Richmond RC, Stuijfzand BG, Lawn RB, Sallis HM, Taylor GMJ, et al. Evidence for causal effects of lifetime smoking on risk for depression and schizophrenia: a mendelian randomisation study. Psychol Med. 2020;50(14):2435–43.
16. Clarke TK, Adams MJ, Davies G, Howard DM, Hall LS, Padmanabhan S, et al. Genome-wide association study of alcohol consumption and genetic overlap with other health-related traits in UK Biobank (N = 112 117). Mol Psychiatry. 2017;22(10):1376–84.
17. Payne RA, Mendonca SC, Elliott MN, Saunders CL, Edwards DA, Marshall M, et al. Development and validation of the Cambridge Multimorbidity score. Can Med Assoc J. 2020;192(5):E107.
18. Mitchell R, Hemani G, Dudding T, Corbin L, Harrison S, Paternoster L. UK Biobank Genetic Data: MRC-IEU Quality Control, version 2. 2019.
19. Baum CF, Schaffer ME, Stillman S. ivreg2: Stata module for extended instrumental variables/2SLS, GMM and AC/HAC, LIML and k-class regression. 2010. http://ideas.repec.org/c/boc/bocode/s425401.html.
20. StataCorp. Stata Statistical Software: release 16. College Station, TX: StataCorp LLC; 2019.
21. Hartwig FP, Davies NM, Hemani G, Davey Smith G. Two-sample mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int J Epidemiol. 2017;45(6):1717–26.
22. Loh P-R, Tucker G, Bulik-Sullivan BK, Vilhjálmsson BJ, Finucane HK, Salem RM, et al. Efficient bayesian mixed-model analysis increases association power in large cohorts. Nat Genet. 2015;47(3):284–90.
23. McVean GA, Altshuler DM, Durbin RM, Abecasis GR, Bentley DR, Chakravarti A, et al. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491(7422):56–65.
24. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327(7414):557–60.
25. Higgins JP, Thompson SG. Controlling the risk of spurious findings from meta-regression. Stat Med. 2004;23(11):1663–82.
26. Harris RJ, Bradburn MJ, Deeks JJ, Harbord RM, Altman DG, Sterne JAC. Metan: fixed- and random-effects meta-analysis. Stata J. 2008;8(1):3–28.
27. Spiller W, Davies NM, Palmer TM. Software application profile: mrrobust—a tool for performing two-sample summary mendelian randomization analyses. Int J Epidemiol. 2019;48(3):684–90.
28. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–25.
29. Burgess S, Butterworth A, Thompson SG. Mendelian randomization analysis with multiple genetic variants using Summarized Data. Genet Epidemiol. 2013;37(7):658–65.
30. Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46(6):1985–98.
31. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in mendelian randomization with some Invalid Instruments using a weighted median estimator. Genet Epidemiol. 2016;40.
32. Sanderson E, Davey Smith G, Windmeijer F, Bowden J. An examination of multivariable mendelian randomization in the single sample and two-sample summary data settings. bioRxiv. 2018.
33. Carter AR, Sanderson E, Hammerton G, Richmond RC, Davey Smith G, Heron J, et al. Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur J Epidemiol. 2021;36(5):465–78.
34. Daniel R. GFORMULA. Stata module to implement the g-computation formula for estimating causal effects in the presence of time-varying confounding or mediation. Statistical Software Components S457204. Revised 29 September 2021 ed. Boston College Department of Economics; 2010.
35. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2019.
36. Newson R. RSOURCE. Stata Module to run R from inside Stata using an R source file. Revised 09 May 2016 ed. Boston College Department of Economics Statistical Software Components S456847; 2007.
37. Wickham H. Reshaping data with the reshape package. J Stat Softw. 2007;21(12).
38. Dowle M, Srinivasan A. data.table: Extension of data.frame. 2021.
39. Wickham H. The Split-Apply-combine strategy for Data Analysis. J Stat Softw. 2011;40(1):1–29.
40. Wickham H, Francois R, Henry L, Muller K. dplyr: A Grammar of Data Manipulation. R Package Version 1.0.4. 2021.
41. Bengtsson H. R.utils: Various Programming Utilities. R Package Version 2.10.1. 2020.
42. Wickham H, Hester J, Chang W. devtools: Tools to Make Developing R Packages Easier. R Package Version 2.2.0. 2019.
43. Howe LD, Kanayalal R, Harrison S, Beaumont RN, Davies AR, Frayling TM, et al. Effects of body mass index on relationship status, social contact and socio-economic position: mendelian randomization and within-sibling study in UK Biobank. Int J Epidemiol. 2020;49(4):1173–84.
44. Harrison S, Davies AR, Dickson M, Tyrrell J, Green MJ, Katikireddi SV, et al. The causal effects of health conditions and risk factors on social and socioeconomic outcomes: mendelian randomization in UK Biobank. Int J Epidemiol. 2020;49(5):1661–81.
45. Cassell A, Edwards D, Harshfield A, Rhodes K, Brimicombe J, Payne R, et al. The epidemiology of multimorbidity in primary care: a retrospective cohort study. Br J Gen Pract. 2018;68(669):e245–e51.
46. Blakely T, McKenzie S, Carter K. Misclassification of the mediator matters when estimating indirect effects. J Epidemiol Commun Health. 2013;67(5):458.
47. Tyrrell J, Mulugeta A, Wood AR, Zhou A, Beaumont RN, Tuke MA, et al. Using genetics to understand the causal influence of higher BMI on depression. Int J Epidemiol. 2018;48(3):834–48.
48. Li W, Peng Y, Chen D, Lu Z, Tao Y. Association of weight change across adulthood with hearing loss: a retrospective cohort study. Int J Obes. 2022;46(10):1825–32.
49. Garcia Morales EE, Ting J, Gross AL, Betz JF, Jiang K, Du S, et al. Association of cigarette smoking patterns over 30 years with audiometric hearing impairment and Speech-in-noise perception: the atherosclerosis risk in Communities Study. JAMA Otolaryngology–Head & Neck Surgery. 2022;148(3):243–51.
50. Fry A, Littlejohns TJ, Sudlow C, Doherty N, Adamska L, Sprosen T, et al. Comparison of Sociodemographic and Health-Related characteristics of UK Biobank participants with those of the General Population. Am J Epidemiol. 2017;186(9):1026–34.
51. Beard E, Brown J, West R, Kaner E, Meier P, Michie S. Associations between socio-economic factors and alcohol consumption: a population survey of adults in England. PLoS ONE. 2019;14(2):e0209442.
52. Sun Y-Q, Burgess S, Staley JR, Wood AM, Bell S, Kaptoge SK, et al. Body mass index and all cause mortality in HUNT and UK Biobank studies: linear and non-linear mendelian randomisation analyses. BMJ. 2019;364:l1042.

Acknowledgements

This research has been conducted using the UK Biobank Resource under Application Number 19278.

This work was carried out using the computational facilities of the Advanced Computing Research Centre, University of Bristol – http://www.bristol.ac.uk/acrc/.

This study used the MRC IEU UK Biobank GWAS pipeline. Please see: Elsworth, BL, Mitchell, R, Raistrick, CA, Paternoster, L, Hemani, G, Gaunt, TR (2019): MRC IEU UK Biobank GWAS pipeline version 2. https://doi.org/10.5523/bris.pnoat8cxo0u52p6ynfaekeigi.

We would like to thank Dr Gemma Hammerton for her help using gformula.

Funding

The funders had no role in the design of the study or the decision to publish.

LDH and TLN are supported by a Career Development Award from the UK Medical Research Council, to LDH (MR/M020894/1). ARC is supported by the MRC Integrative Epidemiology Unit (MC_UU_00011/6) and the University of Bristol British Heart Foundation Accelerator Award (AA/18/7/34219).

CS is partially supported by NIHR ARC West and is an NIHR Senior Investigator. The views expressed in this article are those of the author(s) and not necessarily those of the NIHR, or the Department of Health and Social Care.

DCB is supported by Wellcome and is a PhD student. REW is funded by the Norwegian South Eastern Regional Health Authority (2020024).

Author information

Authors and Affiliations

MRC Integrative Epidemiology Unit, Population Health Sciences, University of Bristol, Bristol, UK
Teri-Louise North, Sean Harrison, Deborah C Bishop, Robyn E Wootton, Alice R Carter, Tom G Richardson & Laura D Howe
Nic Waals Institute, Lovisenberg Diaconal Hospital, Oslo, Norway
Robyn E Wootton
School of Psychological Science, University of Bristol, Bristol, UK
Robyn E Wootton
Centre for Academic Primary Care, Population Health Sciences, University of Bristol, Bristol, UK
Rupert A Payne & Chris Salisbury
Exeter Collaboration for Academic Primary Care, Department of Health and Community Sciences, University of Exeter, Exeter, UK
Rupert A Payne

Contributions

The study was conceived and designed by TLN, LDH, SH, RAP, and CS. Statistical analysis was carried out by TLN. TLN and LDH wrote the paper. All authors contributed to interpretation of the results and critical revisions of the manuscript.

Corresponding author

Correspondence to Teri-Louise North.

Ethics declarations

Ethics approval and consent to participate

UK Biobank received ethical approval from the Research Ethics Committee (REC reference for UK Biobank is 11/NW/0382).

Consent for publication

Not applicable.

Competing interests

TGR is an employee of GlaxoSmithKline outside of this work. ARC is an employee of Novo Nordisk outside of this work. LDH received a Career Development Award from the Medical Research Council for the submitted work, which also supported TLN; ARC received funding from the University of Bristol Medical Research Council Integrative Epidemiology Unit [MC_UU_00011/1 and MC_UU_00011/6]; NIHR Applied Research Collaboration West provides funding towards CS’s salary; National Institute for Health Research provides funding towards CS’s research expenses; in the past 36 months REW received a grant from the South-Eastern Norway Regional Health Authority [2020024], REW worked in a unit funded by the Medical Research Council [MC_UU_00011/3 and MC_UU_00011/7], REW had a previous postdoc funded by the Wellcome Trust [204895/Z/16/Z], RAP received an institution-paid grant from the Medical Research Council, RAP received an institution-paid grant from the National Institute for Health and Care Research; in the past 36 months REW wrote a report on literature relating smoking and mental health for the public charity ‘Action on Smoking and Health’; in the past 36 months REW received support for attending meetings and/or travel from (1) the Society for Research on Nicotine and Tobacco New Investigator Award, (2) the Gro Harlem Brundtland Visiting Scholarship at the Centre for Fertility and Health, Norwegian Institute of Public Health, (3) an International Convention of Psychological Science travel grant; within the past 36 months RAP has been the Chair of the Society of Academic Primary Care and has been a member (payment to institution) of the MHRA Pharmacovigilance Expert Advisory Group; within the past 36 months RAP has had a personal paid role as Consultant Editor for the journal Prescriber; within the past 36 months ARC received an honorarium from the American Medical Association Memphis Chapter for delivering a Mendelian randomization workshop.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

North, TL., Harrison, S., Bishop, D.C. et al. Educational inequality in multimorbidity: causality and causal pathways. A mendelian randomisation study in UK Biobank. BMC Public Health 23, 1644 (2023). https://doi.org/10.1186/s12889-023-16369-1

Received: 24 February 2023
Accepted: 24 July 2023
Published: 28 August 2023
DOI: https://doi.org/10.1186/s12889-023-16369-1