The contribution from psychological, social, and organizational work factors to risk of disability retirement: a systematic review with meta-analyses

Background Previous studies indicate that psychological, social, and organizational factors at work contribute to health, motivation, absence from work, and functional ability. The objective of the study was to assess the current state of knowledge of the contribution of psychological, social, and organizational factors to disability retirement by a systematic review and meta-analyses. Methods Data sources: A systematic literature search for studies of retirement due to disability in Medline, Embase, and PsycINFO was performed. Reference lists of relevant articles were hand-searched for additional studies. Data extraction: Internal validity was assessed independently by two referees with a detailed checklist for sources of bias. Conclusions were drawn based on studies with acceptable quality. Data synthesis: We calculated combined effect estimates by means of averaged associations (risk ratios) across samples, weighting observed associations by the study’s sample size. Thirty-nine studies of accepted quality were found, 37 of which were from the Nordic countries. Results There was moderate evidence for the role of low control (supported by weighted average RR = 1.40; 95% CI = 1.21-1.61) and moderate evidence for the combination of high demands and low control (although the weighted average was RR = 1.45; 95% CI = 0.96-2.19) as predictors of disability retirement. There were no major systematic differences in findings between the highest rated and the lowest rated studies that passed the criterion for adequate quality. There was limited evidence for downsizing, organizational change, lack of employee development and supplementary training, repetitive work tasks, and effort-reward imbalance as factors that increase the risk of disability pension. Very limited evidence was found for job demands, evening or night work, and low social support from one's superior.
Conclusions Psychological and organizational factors at work contribute to disability retirement with the most robust evidence for the role of work control. We recommend the measurement of specific exposure factors in future studies. Electronic supplementary material The online version of this article (doi:10.1186/s12889-017-4059-4) contains supplementary material, which is available to authorized users.


Background
With the unprecedented global population ageing, extending the working life is becoming increasingly important for sustaining welfare for all citizens [1]. Early exit from working life due to disability incurs large production losses and compensation costs for societies as well as challenges to the quality of life of the individuals affected. The objective of the present systematic review was to assess the state of knowledge of the contribution of psychological, social, and organizational factors at work to retirement due to disability.
Disability is defined as the general inability to perform one's job, either due to (i) a problem in body function or structure (impairment), (ii) difficulties in executing tasks (activity limitation), or (iii) problems experienced in participating in tasks or social relations at the workplace (participation restriction), cf. the International Classification of Functioning, Disability, and Health (ICF; http://www.who.int/classifications/icf/en/) developed by the WHO. While job disability is a condition or state at which one decides that a job cannot be performed, the level of 'work ability' may vary over time. Work ability is a central concept in social security and rehabilitation medicine referring to abilities necessary to perform and hold a job. Competence is a central concept in management and employment referring to knowledge and skills and relevant abilities. Hence, these concepts overlap (Tengland [2]) and both are either defined relative to demands posed by one specific job (e.g., the job held by an individual) or relative to demands of holding a job in general [2].
A large body of studies have shown that psychological and social factors at work may contribute to health and disease. Most of these studies have tested predictions of the demand-control model (Karasek [3]). Originally this "model predicts that mental strain results from the interaction of job demands and decision latitude" [3]. This model has been paramount in advancing research from the ambiguous and circular concept "work stress" to elucidation of the exposure dimensions demands and control that relate to basic research on responding to challenge (e.g., Weiss [4]). An instrument developed to measure these factors, the Job Content Questionnaire (JCQ) has been widely used in research of work and health [5]. The combination of high level of demand and low levels of control seems to contribute to cardiovascular disease (e.g., Kivimäki et al [6]) and several other health problems (e.g., Kraatz et al [7]).
Of many organizational factors, downsizing and the organization of work schedules (shift work and long working hours) have been reported to contribute to health and disease (e.g., heart disease [15][16][17] and musculoskeletal pain disorders [18]). Rapid rates of upsizing may also contribute to health problems [19]. The combination of illness perceptions, illness beliefs, and the appraisal of demands posed by the work tasks influences an individual's appraisal of work ability. Subjective appraisal of work ability may influence attitudes to work. The workplace is an arena where individuals face challenges from work tasks and social interactions. Work also provides opportunities for positive achievement, fulfilment, and friendship. For many people the job is a major source of feedback on attitudes and behaviour. Studies of organizational psychology have revealed psychological factors of significance to work motivation (e.g., Hackman & Oldham [20] and Adams [21]) and global satisfaction with one's job (e.g., Spector [22]). Hypothetically, psychological, social, and organizational factors at work may contribute to early retirement with disability pension by influencing several of the processes leading from a state of good health and work ability to a state of reduced health and disability.
Ilmarinen, Tuomi, and Seitsamo [23] proposed that several dimensions of health resources, competence, values, and factors at work contribute to work ability and modelled this as a house of four floors ("the house model of work ability"). The theoretical basis of the present study was three assumptions. (I) Retirement due to disability is a result of a series of processes, each with multifactorial causation. (II) Biological/medical, psychological, and social factors all contribute to these processes: clinical medical condition, physiological and cognitive function, competence, job demand characteristics, individual appraisal of work ability, physician's assessment of work ability, job motivation, and attitudes to one's job may contribute to the processes leading from high work ability and adequate competence to disability resulting in exit from working life (e.g., de Wind et al. [24]; Volanen et al. [25]). (III) Psychological, social, and organizational factors at work influence several of these factors and processes and hence may contribute to retirement due to disability. Society-level factors influence exit from working life and retirement compensation, but are outside the scope of the present study.
The present systematic review aimed to answer the following research questions: Which psychological task-level work factors contribute to retirement due to disability? Which social interaction factors at work contribute to retirement due to disability? Which organizational work factors contribute to retirement due to disability? We did not limit the review to models, theories, or to specific factors and sought to grade the level of evidence for each factor studied.

Methods
With the aim to examine whether psychological, social, or organizational factors at work contribute to retirement due to disability, we performed systematic literature searches plus an extensive evaluation of the methodological quality of the retrieved articles. Retirement was defined as permanently not performing paid work. In order to include any relevant work factor and allow variations in wording of factors (constructs), searches did not specify work factors. We also performed meta-analyses when applicable.
Methods of inclusion criteria, analyses, and eligibility were specified in an unpublished protocol (developed by the research group to ensure that all procedures were standardized and adhered to throughout the study). Minor modifications of protocols were performed during the study. All modifications were documented and all conclusions were based on the final version of the methods.

Eligibility criteria and search strategy
Disability retirement was defined as permanently not performing paid work due to disability. Psychological factors were defined as variables pertaining to the contents of a job and work tasks. Social factors were defined as interactions with other people, either co-workers, superiors/leaders, or clients, customers, or patients. We defined organizational factors as ways work is organized, e.g., working hours and shift-work systems, downsizing, upsizing, and reorganization (e.g., merging of units).
For inclusion, studies had to meet all the following criteria: 1. Outcome measures: addressed registry-based disability pension awards or self-reported retirement from work due to ill health or disease; 2. Types of exposures: measured any organizational, psychological, and social exposure pertaining to work in subjects that were employed and working; 3. Types of studies: designed as a prospective cohort study, case control study (longitudinal), or intervention study; 4. Types of participants: employees, reported analyses estimating effects of work factors.
The review was limited to publications written in English, German, Danish, Finnish, Norwegian, or Swedish.

Information sources
We systematically searched Medline, Embase, and PsycINFO up to April 23rd, 2015 to identify primary studies that addressed the risk of retirement due to disability in relation to any organizational, psychological, and social exposure pertaining to work. Table A (see Additional file 1: Search strategy) shows the search strategy that was developed and adapted for each database with a combination of free text terms and controlled, hierarchical vocabulary (e.g., Medical Subject Heading terms for Medline). No limits and a search strategy with a high sensitivity were selected. The search terms were constructed to identify articles that addressed the risk of disability pension awards or related outcomes pertaining to retirement, independent of work-related exposures. We tested the specificity and sensitivity of each eligible search term before inclusion in the search string. Pilot searches showed that search profiles with exposure terms (psychosocial, demands, control, etc.) did not result in more relevant sets of studies and often excluded relevant studies.

Screening
Two reviewers retrieved and screened the 19545 abstracts produced by the searches. When in doubt, the study was read in full text. The full-text versions of all potentially relevant articles were independently reviewed for inclusion by two of the authors. In case of disagreement or doubt, the article was subjected to formal assessment of methodological quality.
In addition to database searches, the reference lists of all articles of acceptable quality were inspected ("handsearched"). We found one additional article on factors determining remaining at work [26], but decided that the definition of "remaining at work" did not meet our inclusion criteria of retirement due to disability.
The current review defined psychological, social, and organizational factors at work as exposures that individuals are subjected to during work. Studies of personality traits were not relevant for the present review. It may be argued that job dissatisfaction, low commitment, and low job involvement may be proxies of poor work environments. However, these factors are mediators between exposures and outcomes, not exposures. Therefore, a study which investigated effects of job satisfaction and job enjoyment [27] and a study of organizational commitment and meaning of work [28] were excluded from the systematic review.
Data on health status at baseline was scored as a potential confounder in the quality check list (positive if measured and adjusted or stratified in analyses). Some studies of prognostic factors in specific diseases like insomnia, obesity, rheumatoid arthritis or coronary heart disease have measured work factors as predictors of disability retirement [29][30][31][32][33]. However, many of these studies have treated work factors as covariates only [29][30][31].

Data extraction
We extracted data from each included study using the following variables: study characteristics (authors of the study, date of the study, and study location), exposures investigated (instruments used to measure factors at work), employee groups/types of work and number of subjects studied, outcomes/definition of disability pension and number of cases, effect estimates (the most completely adjusted estimates reported), and confounders controlled for.

Risk of bias: The assessment of validity of findings (study quality)
The present systematic critical review defined the quality of primary studies as internal validity, the extent to which the effects reported in a study are truly caused by the treatment or exposure in the study sample (rather than being due to other biasing effects of extraneous variables). External validity (generalizability) determines which (specific) populations the conclusions apply to.
Systematic reviews have assessed methodological quality of primary studies by several systems. The Grading of Recommendations Assessment, Development and Evaluation Working Group (GRADE) system for the evaluation of treatment trials grades evidence as high (GRADE 4), moderate (3), limited (2), and very limited (1) [34]. In studies of treatments the serious threats to internal validity of conclusions are selection bias and information bias due to inadequate blinding. There is no consensus or gold standard for assessing the quality of observational epidemiological studies (e.g., Sanderson et al. [35]; Shamliyan et al. [36]). Recommendations for reporting or evaluating observational studies (e.g., the STROBE statement) [37] address variables in general terms: "give sources of data and details of methods of assessment (measurement)". The GRADE system categorizes observational studies as limited evidence (GRADE 2) even if conducted with prospective design and no known selection bias because of high risk of bias [36,38]. However, the evidence may be upgraded to moderate (GRADE 3) if several studies show the same result or if a limited number of studies are unequivocal.
Most studies of psychological and social work exposures are based on self-reported data, which present challenges to validity of measurement methods. The individual's reporting is influenced by psychological mechanisms like perception, cognitive appraisal, expectancies, attitudes, etc., which in turn are influenced by personality traits and culture (e.g., Watson et al. [39]; Chen & Spector [40]; Oliver et al. [41]). Recommendations for reporting or evaluating observational studies (Sanderson et al, 2007; the STROBE statement) [35,37] do not address evaluating psychometrics of variables [42].
The present systematic review evaluated primary articles with a detailed check list (see Additional file 1, Table B: Quality assessment check list) which included items for grading quality of subjective-report methods. There were separate check lists for each study design type since some items are design specific (e.g., blinding and randomization in experimental designs). Since work exposure variables may vary considerably over time, single-point measurements may be unreliable estimates of exposure. Hence, an item addressing repeated measurements of exposures was included.
The main arguments for applying a detailed check list for observational studies were to ensure that (i) reviewers actively searched out all information relevant to internal validity in each article, (ii) the two reviewers put equal weight on sources of bias, (iii) there was a standard for grading methods based on different self-report instruments, observations, or registries, and (iv) assessments were fully transparent. In addition to assessing internal validity (recruitment of study population/subjects; methods for exposure measurements; methods for outcome measurements; analysis and data presentation; and inclusion of confounders), we scored external validity (generalizability, the representativeness of the study population), and moderators (other types of exposures at work (e.g., physical, chemical) and leisure-time exposures). The two latter aspects are not taken into account in the present review. Before the scoring of articles took place, a pilot test of the check list was conducted by all reviewers to test the system.
Each study was first assessed independently by two reviewers. After assessing quality, the two referees compared and scored the study. If there was disagreement on checklist item scores, the referees discussed the reason for disagreement and agreed upon the score of the item. All authors participated in the assessment of quality.
The 27 different items of the checklist for internal validity of prospective studies were weighted for their potential significance for methodological quality (0-3 points). Factors of potential serious bias were assessed by more than one check-list item and higher obtainable scores. The grading of subjective-report methods for measuring exposures contained items pertaining to psychometric quality of instruments (explicit documentation of validity and reliability, repeated measurements) and reporting behavior (analysis of data at organizational unit-level), and reporting historical exposures. These are methodological measures that improve quality, but have not been considered necessary for accepting studies in epidemiology journals.
The scores were summed and a total score for internal validity was the basis for the conclusion of quality. To be given a maximum score (100%) a study must exhibit no discernible selection bias, attrition to follow-up lower than 15%, all measurements performed with objective (neutral) methods using interval or ratio scales, include three or more measurements of exposure factors (high reliability), include analyses which control for the confounders age, gender, education, and socioeconomic gradients, and perform comprehensive statistical analyses. After having scored all articles published until 2012, the research group concluded that studies meeting customary criteria for acceptable methods exhibited scores exceeding 50% of maximum. The criterion for accepting methodological quality of a study was set to an internal validity score of 50% or more. This level eliminated studies with "(1) failure to develop and apply appropriate eligibility criteria (inclusion of control population), (2) flawed measurement of both exposure and outcome (3) failure to adequately control confounding, and (4) incomplete follow-up" (GRADE guidelines) [38]. The highest score was 81% [43]. The study group agreed that studies that scored 66% or higher could be defined as high-quality studies. We have not found previous studies differentiating between acceptable and high quality based on a detailed check list of all factors listed above.
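The two cut-offs described above (50% of the maximum score for acceptance, 66% for high quality) can be sketched as a small classification rule. This is only an illustration: the function name `classify_quality` and the label strings are hypothetical, and the individual checklist items and their weights are not reproduced here.

```python
def classify_quality(points: float, max_points: float) -> str:
    """Classify a study by its internal-validity score as a share of the maximum.

    Thresholds follow the text: >= 66% is high quality, >= 50% is acceptable.
    The function name and the returned labels are illustrative assumptions.
    """
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    pct = 100.0 * points / max_points
    if pct >= 66:
        return "high"
    if pct >= 50:
        return "acceptable"
    return "rejected"
```

For example, a study reaching the highest reported score of 81% would be labelled "high", whereas a study scoring 49% would not pass the acceptance criterion.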
Conclusions were based on studies with acceptable quality only. The conclusion "high evidence for an effect" required that randomized control studies of interventions targeted at a specific exposure factor (a change of exposure) showed that this exposure was significant. The conclusion "moderate evidence" required that there was sufficient reason to upgrade evidence from observational studies from the normal level of limited evidence: either (i) two or more observational studies of acceptable quality showed the same effect with no studies showing nonsignificant or opposite effects, or (ii) many observational studies of accepted quality showed an effect and in addition, a significant combined effect estimate in meta-analyses. The conclusion "limited evidence" was made if (i) there was only one study of acceptable quality of the factor in question (no replication) and this study showed a significant effect, or (ii) there were studies showing significant and a small number showing nonsignificant effects, but none showing significant opposite effects. The conclusion "very limited evidence" was drawn if there were several studies with nonsignificant findings and meta-analyses did not produce unequivocal results.

Meta-analyses
Combined effect estimates were calculated by means of averaged associations across samples, weighting each observed association by the study's sample size [44]. Eligible studies for inclusion in the meta-analyses reported categorical exposure variables with the unexposed employees (or employees with the lowest exposure category) as reference category. We synthesized studies reporting both Odds ratios and Hazard ratios and computed means of average associations as approximations of Relative risk ratios. This approximation is valid if the study outcome is rare. The most completely adjusted risk estimates from each study and their corresponding confidence intervals or standard errors were used to compute combined effect estimates. When applicable, we computed additional subgroup analysis of the most comparable studies, i.e., studies that used the same exposure instrument measures, e.g., the Job Content Questionnaire (JCQ).
We computed random-effects models, which estimate the mean of a distribution of true effects. The random-effects model is recommended when there is reason to assume that the true effect varies from one study to the next [44]. The Q statistic was computed to assess the heterogeneity of studies (p < 0.05 rejects the null hypothesis of homogeneity). The I² statistic shows the heterogeneity in percentages. To address the potential problem of publication bias, we computed the fail-safe N statistic, which indicates the number of studies reporting null results that would be required to reduce the overall effect to nonsignificant [45]. All statistics were computed with the Comprehensive Meta-Analysis (version 2) software, Biostat, Englewood, USA [46].
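Two steps implied by this paragraph can be illustrated with a minimal sketch: recovering a standard error on the log scale from a reported 95% confidence interval, and checking how closely an odds ratio approximates a risk ratio when the outcome is rare. The helper names and the baseline risks used below are illustrative assumptions, not values taken from the included studies, and the conversion formula is a commonly used approximation rather than the procedure documented in the review.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_effect_and_se(estimate, ci_low, ci_high):
    """Return (log ratio, standard error) recovered from a ratio and its 95% CI.

    Assumes the interval is symmetric on the log scale.
    """
    log_est = math.log(estimate)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return log_est, se


def odds_ratio_to_risk_ratio(odds_ratio, baseline_risk):
    """Convert an odds ratio to a risk ratio, given the outcome risk
    in the reference (unexposed) group."""
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)


# Hypothetical baseline risks: the odds ratio stays close to the risk ratio
# only when the outcome is rare.
rare = odds_ratio_to_risk_ratio(1.5, 0.02)    # about 1.49
common = odds_ratio_to_risk_ratio(1.5, 0.30)  # about 1.30
```

With a 2% baseline risk the odds ratio of 1.5 overstates the risk ratio only slightly, while with a 30% baseline risk the gap becomes noticeable, which is why the rare-outcome condition matters when odds ratios and hazard ratios are pooled as approximate risk ratios.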
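As a rough illustration of the quantities named here, the sketch below pools log ratios with a random-effects model and reports Cochran's Q and I². It assumes the DerSimonian-Laird estimator of between-study variance, which is a common choice; the text does not state which estimator the software applied, and the function name `random_effects_pool` is hypothetical.

```python
import math


def random_effects_pool(log_effects, std_errors):
    """DerSimonian-Laird random-effects pooling of log ratios (an assumed estimator).

    Returns the pooled ratio with its 95% CI, Cochran's Q, tau^2, and I^2 (percent).
    """
    k = len(log_effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, log_effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I^2 in percent
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.959964 * se_pooled),
        "ci_high": math.exp(pooled + 1.959964 * se_pooled),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

Under homogeneity Q follows a chi-square distribution with k - 1 degrees of freedom, so a p-value for the heterogeneity test can be obtained from that distribution. When the studies agree closely, tau² is estimated as zero and the result coincides with the fixed-effect estimate.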

Identified studies
Of the 19545 abstracts, we identified 39 studies that fulfilled the inclusion criteria and satisfied the criteria for quality [12,32,43,. Figure 1 depicts the identification, screening, eligibility, and inclusion processes. Studies excluded in the initial screening did not fulfill any of the inclusion criteria, or were duplicates (identified by Endnote reference library program, n = 3705). In all, 184 studies were considered as potentially relevant in the initial screening process. Of these 184 studies, 63 were excluded because of cross-sectional design or irrelevant outcome measures. In total, 121 [12, 24, 27-33, 43, 47-157] studies were independently reviewed in full text by two of the authors. Among these studies, 19 studies did not have a relevant exposure measure [33, 84, 90, 97, 105, 115-117, 119-121, 124, 131, 137, 141, 143, 147-149], seven studies did not report relevant outcome measures [87,93,103,106,110,113,129], five studies were not written in English, German, or a Nordic language [114,125,128,136,140], three studies were only reported as congress abstracts [92,139,144], and one study had a cross-sectional design [130].
Table C of Additional file 1 presents characteristics and findings of the studies that did not meet the criterion quality score. Table D of Additional file 1 shows the scores of internal validity for accepted studies. Table E of Additional file 1 depicts the scores of internal validity for studies that were excluded by quality criteria.
Studies reporting several work factors are presented in two or more sections in this article.

Overall appraisal of the work environment
One representative general-population study by Støver and coworkers [81] found that overall poor psychosocial work environment predicted disability retirement, based on the number of "poor psychosocial work exposures" out of 11 questions encompassing variation, feedback, support, influence (control), demands, reorganization, and bullying.
Aspects of job demands were reported to predict subsequent disability retirement in four of the twenty studies. Job demands were positively related to subsequent disability retirement in two [54,79] of the eleven studies that measured demands by validated instruments, one of which pertained to retirement due to mental diagnoses [48]. However, Ropponen et al. [79] reported that low demands were associated with an increased risk of disability awards with musculoskeletal diagnoses and Clausen et al. [76] found that a medium level of demands reduced the risk of disability pension awards compared to low demands.
Specific aspects of job demands were significantly related to subsequent disability retirement in two of the eight studies that were based on single item demand measures [50,74]. Krokstad et al [50] found significant associations between high levels of "concentration and attention" and subsequent disability retirement among women in the general working population, but not among men. Based on the same population and measurements as Krokstad et al, Hagen et al [74] found a significant sex- and age-adjusted increased risk.
Thirteen of the twenty studies were suitable for meta-analysis [50,52,54,55,60,67,71,73-75,80-82]. Figure 2 shows the forest plot and the results from the meta-analysis. The combined relative risk estimate of the thirteen studies did not show an association (RR = 1.12; 95% CI = 0.98-1.28). The Q statistic showed substantial heterogeneity between studies (Table 2). The reason may be that the different studies measured different exposures (e.g., job stress, busy at work, high time pressure, demands in concentration and attention, JCQ-job demands). In addition, some studies measured the outcome by questionnaire whereas other studies reported registry data. The subgroup analysis of the four comparable studies which measured job demands with JCQ and based information on disability pension on registries [54,55,60,71] showed no significant association (RR = 1.14; 95% CI = 0.80-1.62). The heterogeneity between these studies was substantial (p < 0.01).
Stratified analyses for studies with single item (n = 7) and multiple item (n = 6) measurements of demands showed a RR of 1.08 (95% CI: 0.90-1.30) for single item and an RR of 1.16 (95% CI: 0.92-1.47) for multiple item measures.
Stratified analyses for studies with high quality (n = 8) and acceptable quality (n = 5) did not reveal a difference in results of the effects of demands on disability pension (high quality studies RR = 1.06; 95% CI = 0.89-1.27 vs acceptable quality studies RR = 1.26; 95% CI = 0.96-1.65).

Repetitive work tasks (monotonous work)
One representative general-population study by Sterud [80] found that reporting having repetitive work tasks three quarters of the work day or more predicted disability pension. Repetitive work tasks were measured with a single item: "Does your job consist of constantly repeated tasks, meaning that you do the same thing hour after hour?"
Twelve studies used measures from or derived from the JCQ [49, 54, 55, 60, 63, 66-68, 71, 73, 77, 79]. This instrument defines the control dimension as job-decision latitude consisting of two factors: decision authority and skill discretion (required skills for the job). Nine of the twelve studies reported decision latitude [49,54,55,60,66,67,73,77,79], whereas two studies assessed decision authority and skill discretion separately [68,71], and one study assessed decision authority only [63]. Of the remaining twelve studies, one study measured influence at work with a four-item scale from COPSOQ [76], one study assessed work-time control by a validated instrument (self-reported and co-worker-reported) [51], one study assessed job control with a three-item scale [80], one study measured decision-authority-like aspects of control with a non-validated two-item scale [75], and eight studies measured decision-authority-like aspects of job control with single items [50, 52, 56-59, 70, 74].

Job strain
Six studies addressed job strain [47-49, 54, 67, 79] referring to the combination of high level of demands and low level of control (the job-strain model; Karasek [3]) (Table 1). All of these assessed job strain with measures from or derived from the Job Content Questionnaire (JCQ) [5]. Mantyniemi et al [47] and Samuelsson et al [49] used aggregated scores (work unit/job title and Job Exposure Matrix, respectively) to determine strain. The studies by Ahola et al [48], Ropponen et al [79], Samuelsson et al [49] and Canivet et al [54] were based on data from the general working population, whereas the studies by Mantyniemi et al [47] and Laine et al [67] pertained to Finnish public-sector employees. With the exception of the study by Ahola et al [48] that measured disability retirement by a questionnaire, all studies assessed disability retirement by registries.
Job strain was a significant predictor of subsequent disability retirement in four out of six studies [47,48,54,67]. The studies with nonsignificant results for job strain showed an increased risk of disability for the combination of low job demands and low job control (passive jobs) [49,79], hence control may be the decisive factor.
Of the six studies, five were suitable for meta-analysis [48,49,54,67,79]. Figure 4 and Table 2 show that high job strain was borderline significantly associated with increased risk of disability pension (RR = 1.45; 95% CI = 0.96-2.19). The two studies with nonsignificant results for job strain were based on two samples drawn from the same population [48,78], but were both included in this analysis. The Q statistic indicates substantial heterogeneity between studies (p < 0.01).

Effort-reward imbalance (ERI)
High effort-reward imbalance and low rewards were found to predict disability pension due to depression, both in analyses of individual level and work-unit level of ERI scores in one study [78]. The population was Finnish public sector employees. Effort-reward imbalance was measured with a four-item questionnaire, one item measuring effort and three measuring rewards. Individual level scores of ERI also predicted disability retirement due to musculoskeletal disorders.
Fig. 3 Forest plot Job Control

Development and training
One study of the Danish general working population found significant effects of both low employee development and low supplementary training on the risk of registry-based disability-pension awards [63].

Studies of social factors

Social support
Social support refers to assistance, information, feedback, and emotional support. Nine studies addressed aspects of social support [49,53,54,59,60,63,68,71,79]. Data were extracted from the general working population in six of these studies [49,53,54,63,71,79], whereas two studies were based on civil service employees [59] and employees of the city of Helsinki [60], respectively. In all of these studies, disability retirement was assessed by registry data. In the remaining study, the population was waste collectors and municipal workers, and disability retirement was assessed by questionnaire [68]. Seven studies assessed support with measures from the JCQ [49,53,54,63,68,71,79]. Of the remaining two studies, Hinkka et al [59] assessed support with a single item, whereas Lahelma et al [60] used another validated instrument.
Social support was related to subsequent disability retirement in three out of nine studies. Low supervisor support was found to increase the risk of subsequent disability retirement in the studies by Sinokki et al [53] and Canivet et al [54], whereas high social support was found to increase the risk in the study by Samuelsson et al [49]. In all three of these studies, aspects of social support were assessed with measures from the JCQ.

Conflicts
Three studies examined the effect of conflicts on subsequent disability retirement (Table 1). Appelberg et al [12] assessed interpersonal conflicts with a single item, whereas Lund et al [63] and Labriola et al [71] assessed conflicts at work with measures derived from the JCQ. All three studies were based on the general working populations of Finland [12] and Denmark [63,71], and all used registry data to assess disability retirement. Appelberg et al [12] found an increased risk for women, whereas Labriola et al [71] and Lund et al [63] did not find any significant associations between conflicts at work and disability retirement.

Harassment
One study of harassment based on the general working population found that harassment did not significantly predict disability retirement when controlling for distress, gender, age, and work exposure factors [80].

Team climate
Two studies examined the effect of team climate on disability retirement (Table 1). Hinkka et al [59] studied civil-service employees, and assessed team climate based on five items, whereas Ahola et al [48] used data from the general working population, and items from the Healthy Organization Questionnaire. Both studies assessed disability retirement by registries. Hinkka et al found a protective effect of good team climate, whereas no effect was found in the study by Ahola et al.
Three [59,62,64] of six [56,57,59,60,62,64] studies found significant associations between shift work and subsequent disability retirement. One of these studies reported a protective effect of regular shift work compared to day work in the general male working population [62]. Three [52,55,65] out of five studies found significant associations for evening and/or night work.
Of the three studies that examined the effect of hours worked per week [48,60,62], the study by Krause et al [62] compared those who worked 60 h or more per week, 45 to 49 h per week, and 40-44 h per week with those who worked less than 40 h per week, respectively. There was a significantly increased risk associated with working 60 h or more per week [62]. Ahola et al [48] and Lahelma et al [60] found no effects of working more than 40 h per week compared to working less.

Contract type
One study examined the effect of temporary work contracts versus permanent contracts, and reported nonsignificant results [60].

Organizational change
Of the 32 studies, just one study addressed organizational change [69]. Based on office staff aged 35-55 working in 20 civil service departments in London, Virtanen et al found that the employees who were transferred to executive agencies compared to those who remained in the Civil Service exhibited increased risk of disability retirement.

Downsizing
Only one study addressed downsizing. Vahtera et al [43] found that a reduction in personnel of 18% or more was significantly related to subsequent disability retirement in both male and female municipal employees.
High job demands assessed with validated instruments were reported to predict disability in two studies [54], but nine studies reported nonsignificant findings [49,55,60,63,66-68,71]. Low demands were found to predict disability awards in two studies [75,78]. Of the eight studies that measured demands with single items [50,52,58,62,73-75], two studies [50,74] reported that "concentration and attention" predicted disability. The meta-analysis did not show evidence of effects of high demands on subsequent disability retirement. Hence, with the large number of studies it seems merited to conclude that there is very limited evidence that general job demands predict disability retirement, while there is limited evidence that the specific factor "concentration and attention" does contribute to disability.
There was limited evidence for the association between repetitive work tasks (monotonous work) and disability retirement (one study; Sterud [80]). There was limited evidence for a role of effort-reward imbalance and low rewards as predictors of disability pension due to depression (one study; Juvani et al [78]).
Two studies reported that low support from the superior predicted disability pension awards [53,54], while one study [49] reported that high social support predicted disability with mental diagnoses. Five other studies reported no effect of support [59,60,63,68,71]. Theoretically, one may argue that supportive superiors may facilitate retirement if it is in the best interest of the employee, or contribute to adjustments of work tasks (demands) to compensate for lower work ability. Hence, social support may affect retirement decisions through several mediation processes.
Interpersonal conflict was associated with disability pension awards in women [12], while two other studies did not find any association between conflicts and disability [63,71].
Of organizational factors, downsizing [43] and organizational change [69] predicted subsequent disability retirement. The specific factors "employee development" and "supplementary training" have been reported by one study. These factors may be of significance, but the findings need to be confirmed by other studies and are therefore graded as limited evidence.
There is very limited evidence for evening and night work, shift or period work, and working hours >60 per week. While three studies found effects of evening and night work [52,55,65], two did not [32,72]. Two studies reported effects of shift- or period-based work [59,64], but four studies did not [56,57,60,62]. The meta-analysis of the effects of shift work on subsequent disability retirement showed nonsignificant results.
While one study found effects of working hours >60 per week [62], two did not find effects of working hours >40 per week [48,60].
Job dissatisfaction and low levels of meaning at work have been reported to predict higher risk of disability retirement ([11]; Clausen et al. [28]), indicating that some of the effects reported here may be mediated through emotional and cognitive factors like attitudes to work and the workplace.
On a theoretical level, we found that task-level, individual-level factors associated with control of one's work situation were the most consistent significant work factors in processes leading to disability retirement. The level of control depends on the nature of work tasks and how work is organized. Disability is defined as the general inability to perform one's job, and one would expect that the demands posed by the work tasks would be paramount in determining retirement due to disability. Surprisingly, psychological job demands were not a consistent predictor of disability awards. Theoretically, high levels of psychological demands may be associated with jobs held by employees with high levels of education and high job involvement [158]; hence the appraisal of demands may vary between occupations, resulting in inconsistent effects of demands.
Limitations: Methodological considerations pertaining to primary articles

Validity and specificity of exposure variables
The study of psychological and social factors is generally limited to methods based on self-report. Direct observations of working conditions by trained observers are usually not possible as they are very time-consuming. Moreover, the presence of an observer may influence the behavior of those observed.
Many articles presented minimal descriptions of the methods employed to measure psychological and social factors, referring to an instrument, but not specifying which items were used, response scales, or the properties of these.
Of psychological and social factors at work, most studies measured demands, control, and "job-strain". The demands dimension includes several types of demands: Quantitative demands (amount of work, working hours, time pressure) differ from qualitative demands (complexity, standards for quality, problem solving) and emotional demands of dealing with clients, etc. Single-item assessment of demands can only tap into one (narrow) aspect of the many types of demands posed to the employee. Hence, it seems reasonable to recommend restricting conclusions to the specific factor measured rather than making statements pertaining to general job demands. Moreover, findings pertaining to specific factors may direct interventions. The finding that repetitive work tasks may predict disability retirement (Sterud [80]) indicates that rather concrete organizational interventions may be beneficial.
Similarly, Control is a broad dimension that may be defined as the possibility or freedom to choose between alternatives. Control may pertain to control of decisions, breaks, work procedures, working hours (e.g., flexi-time), social interactions with clients or colleagues, etc. Hence, the control dimension incorporates several factors.
Most studies of demands, control, and job strain, were based on the Job Content Questionnaire instrument (JCQ; Karasek et al. [5]), which measures demands by questions pertaining to time pressure, amount of work, and role conflicts. Role conflicts may produce effects on health that differ from those of demands (e.g., Christensen & Knardahl [14,159]). The control dimension ("decision latitude") of the JCQ includes both "skill discretion" (variety of work and opportunity to use skills) and "decision authority" (control over decisions that influence work) which may affect health differentially [160]. High levels of skill discretion may imply more interesting work tasks and more responsibility (which may be related to demands). Furthermore, one [42] of two studies that reported skill discretion separately [42,45] found a significant effect on disability [71]. Neither of these nor a third study from the same research group [37] found significant effects of low decision authority. Control of work time seems to be an important aspect of control [51].
The exposure assessment of studies on shift work was mostly based on dichotomous classification of data ("shift work" compared to "day work"). There is considerable variation in shift-schedule characteristics (e.g., the number of night shifts a year, the speed of shift rotation, regular vs. irregular shift systems, etc.) which may result in misclassification.
Therefore, although some of the most robust findings of the present systematic review pertain to broad dimensions each consisting of several factors, it is possible that applying instruments that measure specific factors would uncover stronger associations with subsequent disability. Some aspects of control may be more important with regard to disability pensioning than others, but this has not been sufficiently examined yet. Moreover, there is a need for studies of more specific exposure factors to allow practical application in interventions or prevention of disability retirement.
Exposure factors may be correlated. Shift-work schedules are most common in occupations in the manufacturing industry and in health-care and nursing where employees are sometimes also exposed to mechanical exposures (like lifting and pushing/pulling objects/patients), chemical exposures, and noise. Hence, many shift-work jobs present a combination of exposures, making it difficult to assess the contribution of the shift work schedule per se. On the other hand, some organizations with continuous operations carry out most work-intensive procedures during daytime. Hence, some night shift jobs may present low job demands and working with a small group of coworkers. Therefore, drawing general conclusions about effects of shift work based on data on shift schedules with no information on the type of work tasks or psychological and social factors is problematic. Conclusions probably only apply to the specific working population investigated and should not be generalized. This was a reason for not including external validity in the assessment of bias in this review. There is a need for studies of effects of interactions of shift characteristics and exposures during work.
Validity and specificity of the outcome variable retirement due to disability
The decision to award disability benefit may be based on both assessments of function and on availability of a suitable job. Criteria may vary in emphasis on medical diagnosis or tests of function. Even the support status of the claimant's partner may be taken into account. To complicate things further, criteria for disability benefits may change over time as a consequence of political decisions. Some countries have introduced temporary disability pension in order to stimulate efforts for rehabilitation/return to work. Moreover, some insurance funds use the label disability compensation for payments of lost wages due to being unable to work due to injury or disease (e.g., [161]), i.e., synonymously with sickness-absence compensation.
Countries vary in systems and criteria for assigning benefits and compensation to individuals who are no longer able to work. The Nordic countries all provide disability pension based on prolonged sickness absence and failure of retraining (http://www.nordsoc.org/en/) [162], while the Employment and Support Allowance (ESA) of the United Kingdom (UK) and the Social Security Disability Insurance of the United States (US) are based on medical assessments and general function tests [163,164].
Of the present 39 studies with adequate quality rating, only two investigated non-Nordic populations. This may be a consequence of the differences in public pension systems. The normative age for receiving full pension is higher in the Nordic countries resulting in higher prevalence of disability retirees. Moreover, the public pension systems provide public registries available for scientific study. The age to receive full age pension in Norway is 67 years, 65 in Denmark and Sweden, and 68 in Finland. The state pension age in the UK and Austria is 60 years for women and 65 for men. The standard retirement age is 60 in Italy and until recently, 62 in France. Hence, several factors motivate research on factors determining disability retirement in the Nordic countries. A potential problem with this geographical bias of studies may be that the external validity of conclusions may be limited.

Influence of work ability on the perception and reporting of work exposures
Since work ability is a function of the ability of the individual and demands posed by the job, one should expect that individuals with lower work ability may perceive work tasks as more demanding. Hence, reporting high levels of job demands may be a consequence of lower work ability, in which case one should expect high job demands to predict disability as a precursor (mediator) rather than as a pathogenic exposure factor. Therefore, it seems surprising that job demands were not a consistent predictor of disability retirement across studies. Possibly, moderate or high job demands may be associated with interesting and motivating job tasks (and/or higher socioeconomic status); hence high job involvement may buffer effects of high demands in some jobs. Indeed, Blekesaune and Solem [75] reported that "People working in stressful jobs delay nondisability retirement compared with those in less stressful occupations." An alternative explanation may be that employees with poor health may already have reduced their workload at the time of measuring their work exposures. It should be noted that Samuelsson et al. [48] found that higher levels of job demands were a risk factor for disability pension due to mental diagnoses; hence there may be specific associations between high demands and psychological health problems.
The length of a study follow-up period may influence results. Theoretically, if the follow-up period is short, individuals receiving disability retirement may primarily be employees with somewhat reduced work ability at baseline when the exposure measurements were performed.
The present review did not find a systematic tendency of significant results among the studies with the shortest follow-up period. For the control dimension, studies that reported significant effects (p < 0.05; n = 18 studies) had a mean follow-up time of 7.6 years (median = 7 years), while studies reporting nonsignificant results (p > 0.05; n = 6 studies) had a mean follow-up time of 7.8 years (median = 7 years). Studies reporting significant effects of the demand dimension (p < 0.05; n = 4 studies) had a mean follow-up time of 7 years (median = 6.5 years), while studies reporting nonsignificant results (p > 0.05; n = 16 studies) had a mean follow-up time of 7.9 years (median = 7.5 years).

Limitations and strengths of the current review
A major strength of this review was the comprehensive systematic search. Pilot searches showed that it was not possible to cover all psychological, social, and organizational exposures by search terms. Therefore, articles were searched based on outcome search terms and all articles investigating/containing organizational, psychological, and social exposures (including moderating factors) were reviewed in full-text. We are therefore reasonably confident that this systematic literature review includes all relevant studies within the time span.
Several studies were identified that measured psychological and social factors, but only entered these data as confounders or moderators in their final analyses. These studies did not allow any conclusions of the contribution of psychological and social factors and consequently failed to reach an acceptable-quality score in this review.

Assessment of methodological quality
A second strength of the present review was the specific evaluation of subjective report methods for assessing bias. Most measurements of psychological and social work exposures are based on self-reported data. Many studies only report internal consistency (Cronbach's alpha) and some measure exposures with scaled-down versions of commonly used instruments or with single items with unknown psychometric properties.
There is no universal agreement on criteria for "flawed measurement" of psychological and social factors at work. While many epidemiological studies have measured work exposures with single questions or scales with unknown psychometric properties, experts in psychometrics generally call for the use of multi-question scales that are extensively tested for several aspects of validity, reliability, and item bias in order to conclude that a method is adequate [165]. The present systematic review evaluated bias by a detailed checklist of sources of bias that might originate from subjective-report methods: psychometric quality of instruments (explicit documentation of validity and reliability), analysis of data at organizational unit-level, assessment of traits associated with reporting bias, design with repeated measurements of exposures, and reporting historical exposures. These are methodological issues that influence validity of data, but have not traditionally been considered necessary for publishing epidemiological studies. For the present review, we decided that meeting basic criteria of selection bias and adequate exposure and outcome measurement constituted adequate quality. We found that this corresponded to a quality score of 50%. However, a consequence of including the above-mentioned factors, which raises the standards for "perfect methods", was a broader range of scores (the highest score was 81%).
All reviewers noted that some issues were difficult to assess due to inadequate reporting of procedures and methods in the primary articles. The checklist was found to be effective in making sure all essential issues were checked in all included primary articles.
There were no major systematic differences in findings between the highest rated and the lowest rated studies that passed the criterion for adequate quality (50%), e.g., the mean internal validity score of the six studies that reported no effects of control was 61%, whereas the mean score of the 18 studies that reported significant effects was 63%. Regarding demands, the mean internal validity score of the 17 studies that reported no increased risk was 62%, whereas the mean score was 63% for the three studies that reported an increased risk.
A third strength of the present review was that the decision to accept methodological quality of a primary study was based on evaluation of its internal validity. External validity determines whom conclusions pertain to, and should not direct conclusions of methodological quality.
The decision on level of evidence was based on GRADE guidelines supplemented with meta-analyses. The most completely adjusted effect estimates reported in each individual primary study were entered into the meta-analyses, since conclusions of primary articles generally are based on adjusted estimates. It is possible that some studies adjust for factors that are mediators rather than confounders or moderators, e.g., health status. Correcting for mediators may lead to underestimation of the effects of the work-related factors. However, it is often not possible to determine if a variable is a mediator with only one or two measurement points. In our opinion, it was not reasonable to compute the combined effect estimates based on the crude effects, given that the outcome is highly related to age.

External validity
The aim of the present review was elucidating predictors of permanent disability retirement, thus we did not include studies of employees already on sickness absence or temporary disability. Several countries do not have public systems of disability compensation and several countries do not have registries allowing scientific study of disability benefits. Therefore, one may question the general external validity of the current primary studies. One has to read each study to determine to whom the conclusions apply.
The findings of a study of one type of work or one occupation may not be generalized to other occupations, but may still be of great value. Even findings from a random representative sample in one country may not be generalized to another country with different laws, culture, or compensation systems. The large increase in unemployment in many European countries since 2008 may contribute to differences between countries.

Publication bias
The selective publication of studies based on the magnitude and direction of their findings constitutes a threat to the validity of meta-analysis [166]. The most consistent finding in the present study was the association between low control and excess risk of subsequent disability retirement. The fail-safe N statistic showed that 623 studies reporting null effects would be needed to attenuate the findings in the present study to nonsignificance. Hence, the combined effects of control were robust.
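As a rough illustration of how a fail-safe N can be obtained, the sketch below uses Rosenthal's formulation, in which the number of additional null-result studies is derived from the sum of the per-study z scores. The text does not state which fail-safe N variant was used, so the choice of Rosenthal's formula, the one-tailed critical value of 1.645, the function name, and the example z scores are all assumptions for illustration and do not reproduce the value reported above.

```python
def rosenthal_fail_safe_n(z_scores, z_crit=1.645):
    """Rosenthal's fail-safe N (assumed variant, not confirmed by the text).

    Returns the number of unpublished studies averaging z = 0 that would be
    needed to bring the combined (Stouffer) z down to z_crit.
    """
    k = len(z_scores)
    total_z = sum(z_scores)
    # Combined z with n extra null studies is total_z / sqrt(k + n);
    # solving total_z / sqrt(k + n) = z_crit for n gives the expression below.
    n = (total_z ** 2) / (z_crit ** 2) - k
    return max(0.0, n)


if __name__ == "__main__":
    # Hypothetical per-study z scores, not taken from the reviewed studies.
    example_z = [2.1, 3.4, 1.2, 2.8, 0.9]
    print(rosenthal_fail_safe_n(example_z))
```

A large fail-safe N relative to the number of included studies is usually read as an indication that the pooled result is unlikely to be explained by unpublished null findings alone.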

Comparison of findings with previous systematic reviews
We have only found one systematic critical review of psychosocial factors at work as predictors of disability benefits, authored by Dragano and Schneider [167]. With search terms "early retirement", "premature retirement", "work disability", "early pensioning", "disability pension", and "disability retirement" their search seems to be somewhat wider than the present review since "work disability" may denote temporary disability. They excluded studies of shift work, their only specified quality criterion was prospective study design, and they did not perform meta-analyses. They concluded that 20 studies met their criteria for inclusion and quality. Six studies in their review did not meet the quality criteria of the present review [95,98,99,101,104,111]. Two of these studies reported significant effects of control, demands [99] and job strain [104]. One study [98] reported no effects of control, demands, or opportunities for development. The present review included 25 studies [32, 47-49, 52, 54, 55, 57, 59-61, 64-66, 69, 72-74, 76-82] that were not identified by Dragano and Schneider or that were published after July 31st, 2010, the time frame of their search. They reported "Important single factors were low control, monotonous work", "job strain, effort-reward imbalance", "lack of social support", "problems related to the organization of work and to leadership behaviors".

Conclusions
The present systematic review showed that psychological and organizational factors at work contribute to early retirement due to disability. There was moderate evidence (i.e., several observational studies of high quality with coherent results) for the role of low control of work situation and for the combination of high quantitative demands and low control (job strain). There was limited evidence for downsizing, organizational change, work demanding attention and concentration, lack of employee development and supplementary training, repetitive work tasks, low rewards, and effort-reward imbalance as predictors of disability, but these findings need replication. There was very limited evidence for general job demands, evening or night work, and low social support from one's superior as predictors of disability retirement.
It seems justified to recommend that managers and leaders intensify their efforts to increase employees' control of their work situation (decision authority, autonomy), in particular for employees with high levels of job strain. During periods of downsizing and organizational change, managers should pay attention to processes that may facilitate disability. Employee development and supplementary training may be particularly important measures to maintain competence/work ability in a working life with rapidly changing technology and demands.

Research needs
Most of the reviewed studies applied measurement instruments that measure broad dimensions combining factors that may have different effects on health-related outcomes. Most studies of job demands, control, and the combination of demands and control (job strain), have applied the JCQ [5]. This instrument includes role conflict in job demands and both decision authority and skill discretion under the control dimension. Therefore, studies conducted with the JCQ may underestimate effects if only one of these factors contributes to disability retirement. There is a need for studies based on measurements of specific work exposures. Furthermore, knowledge of risks (or protective factors) must be specific in order to direct design of interventions or prevention. It seems merited to recommend measurement of specific exposure factors in future studies of disability.
Almost all studies found were conducted with Nordic populations. In order to take cultural, political, and economic aspects into account, there is a need for studies of employees from non-Nordic countries. The challenges posed by the combination of current high rates of unemployment in young Europeans and the demographic shift to aging populations raise the need for knowledge of trajectories of competence/work ability and exit from working life, and the influence of work factors on these trajectories.

Additional file
Additional file 1:

Funding
The present systematic review was supported by the Nordic Council of Ministers grant # 11119. Salaries for the authors were contributed by their respective institutions.

Availability of data and materials
Criteria for inclusion and exclusion of studies are presented in the manuscript. The search strategy (which when applied results in the initial dataset of articles) is available in Additional file 1. The processes of including and excluding articles according to criteria (defined in the introduction and methods section of the manuscript) and the numbers of articles excluded for the various criteria are depicted in Fig. 1 Flow chart for selection of studies. References for all primary articles of potential relevance are included in the manuscript. All primary articles evaluated are presented in Table 1 of the manuscript (primary articles of adequate quality) and Additional file 1 Table C (studies that did not meet our quality criteria). The check list for the evaluation of quality is presented in Additional file 1 Table B. Check-list scores for individual articles are presented in Additional file 1 Tables D and E.

Authors' contributions
SK took the initiative to the study, developed the first draft of the quality check list, and coordinated the study. HAJ handled primary study articles, made the tables, and conducted the meta-analyses. SK, HAJ, TS, MH, RR, JS, VB contributed in the processes of defining criteria for inclusion and exclusion of studies, reviewing and assessing the primary studies, discussing findings, drawing conclusions, as well as the completion of the manuscript. All authors read and approved the final manuscript.