Aspects of functioning and environmental factors in medical work capacity evaluations of persons with chronic widespread pain and low back pain can be represented by a combination of applicable ICF Core Sets

Background
Medical work capacity evaluations play a key role in social security schemes because they usually form the basis for eligibility decisions regarding disability benefits. However, the evaluations are often poorly standardized and lack transparency as decisions on work capacity are based on a claimant’s disease rather than on his or her functional capacity. A comprehensive and consistent illustration of a claimant’s lived experience in relation to functioning, applying the International Classification of Functioning, Disability and Health (ICF) and the ICF Core Sets (ICF-CS), potentially enhances transparency and standardization of work capacity evaluations. In our study we wanted to establish whether and how the relevant content of work capacity evaluations can be captured by ICF-CS, using disability claimants with chronic widespread pain (CWP) and low back pain (LBP) as examples.

Methods
Mixed methods study, involving a qualitative and quantitative content analysis of medical reports. The ICF was used for data coding. The coded categories were ranked according to the percentage of reports in which they were addressed. Relevance thresholds at 25% and 50% were applied. To determine the extent to which the categories above the thresholds are represented by applicable ICF-CS or combinations thereof, measures of the ICF-CS’ degree of coverage (i.e. content validity) and efficiency (i.e. practicability) were defined.

Results
Focusing on the 25% threshold and combining the Brief ICF-CS for CWP, LBP and depression for CWP reports, the coverage ratio reached 49% and the efficiency ratio 70%. Combining the Brief ICF-CS for LBP, CWP and obesity for LBP reports led to a coverage of 47% and an efficiency of 78%.

Conclusions
The relevant content of work capacity evaluations involving CWP and LBP can be represented by a combination of applicable ICF-CS. A suitable standard for documenting such evaluations could consist of the Brief ICF-CS for CWP, LBP, and depression or obesity, augmented by additional ICF categories relevant for this particular context. In addition, the unique individual experiences of claimants have to be considered in order to assess work capacity comprehensively.


Background
Even though the process of disability evaluation varies between countries, medical work capacity evaluations usually play a crucial role in deciding on a claimant's eligibility for benefits provided by national disability insurance schemes. Because of the key role they play, such evaluations ought to be transparent and comprehensible for all persons involved [1][2][3][4]. To enhance transparency and comprehensibility, the claimant's lived experience in relation to his or her functioning as well as with regard to influencing contextual factors should be assessed comprehensively [2,5]. Moreover, the evaluations' comparability in terms of both interrater reliability between medical experts and content validity is considered as an important quality criterion [6][7][8]. Finally, standardization is seen as one means to ensure comparability in disability assessments [9,10].
Medical standards usually refer to features which are considered as relevant to a target group in general and less so to individuals' unique experiences [11,12]. As a basis for comprehensive disability evaluations, however, a suitable standard should also allow the description of relevant experiences unique to the individual, thus complementing the whole process of evaluation [12].
In reality, decisions on work capacity often lack transparency and comprehensibility [10,13,14,15]. Also, disability assessments are often insufficiently standardized [5,16,17], which negatively affects their content validity and interrater reliability [8,9,17]. In the Swiss national disability insurance scheme, for example, there is no generally accepted tool to guide the structure and content of disability evaluations [3]. Furthermore, decisions on work capacity for certain disorders are partly based on blanket rulings by the Swiss Federal Court [3]. Somatoform pain disorders, for instance, generally do not lead to incapacity for work. Because they are considered to be caused by psychosocial factors, Swiss social security law does not recognize them as a sufficient reason for a disability pension, unless they are accompanied by a psychiatric co-morbidity such as a depressive disorder [18]. By contrast, pain disorders caused by structural impairments (e.g. by a severe intervertebral disc disorder) normally entitle a person to receive disability benefits. However, diagnoses or impairments are only loosely connected with functional limitations at work [19][20][21]. Moreover, the World Health Organization defines impairment as a loss or abnormality of a psychological, physiological, or anatomical structure or function, and disability as a restriction or lack of ability to perform an activity in a manner considered to be normal for a human being [22]. Based on these definitions, focusing only on impairments is not sufficient to make a proper statement about a claimant's functional capacity at work.
Because pain is a subjective sensation, its impact on a claimant's functional capacity is difficult to objectify. Claimants with somatoform pain disorders could have the same or even a lower functional capacity than persons with a disorder related to a structural impairment. Nevertheless, according to Swiss jurisprudence their work capacity is usually rated higher. With respect to this controversy between the medical and the legal view, it seems crucial to apply a disability-oriented approach and to comprehensively assess the aspects which might influence a claimant's functioning and health in order to ensure transparent disability evaluations for persons with chronic pain.
Several attempts have been undertaken to enhance transparency and standardization in disability evaluations [23]. The Guides to the Evaluation of Permanent Impairment of the American Medical Association (AMA) are used for disability and impairment assessment and as a standard for workers' compensation evaluations in the United States and many English-speaking countries [24]. Furthermore, a number of standardized procedures for work capacity assessments have been developed like, for example, the Functional Capacity Evaluation (FCE) [25][26][27].
FCE, however, is not appropriate for multidisciplinary assessments as it is not geared towards a comprehensive evaluation of the claimant's functioning. It focuses on physical functional limitations and not on mental functioning [25], and it does not address environmental factors, an important component to ensure transparency in disability evaluations [5,28]. The AMA Guides have been questioned regarding their applicability in disability assessments of claimants with chronic pain [1], because they follow a diagnosis-based and impairment-oriented rather than a disability-oriented approach [29].
As part of the shift in recent years from impairment-oriented to disability-oriented assessments in European social security institutions, it has been suggested that the comprehensive conceptual framework and standardized taxonomy of the International Classification of Functioning, Disability and Health (ICF) [30] could improve the disability determination process [16,31,32,33]. Since the ICF offers a scientific basis for describing results and determinants of functioning, disability and health which also considers contextual factors [30], standardization and transparency in disability evaluations might be enhanced if the taxonomy were used as a blueprint.
While the ICF framework was generally well-received, the actual application of the taxonomy has been hampered by the sheer number of categories to be assessed, i.e. 362 on the second level and up to 1,424 when applying the more detailed third and fourth levels. Consequently, ICF Core Sets (henceforth ICF-CS) have been developed in order to simplify the use of the taxonomy in clinical settings.
ICF-CS preserve the model of the ICF in a useable mode, and they come in two flavors: (1) brief ICF-CS include a minimum number of categories describing the most relevant aspects related to functioning in persons with a specific health condition or in a specific setting [34]; (2) comprehensive ICF-CS include all categories of the respective brief ICF-CS but also additional ones so as to facilitate multidisciplinary assessments in the clinical context [35].
Because they are costly and the time of medical experts is limited, medical work capacity evaluations should not only be transparent but also efficient and practical [36]. ICF-CS allow a person's lived experience to be described in a comprehensive and systematic way [35], and might be applied as practical standards regarding what should be documented in disability assessments. So far there have been only a few attempts to examine the applicability of ICF-CS in disability evaluations [16,37]. To ascertain their potential it is, therefore, vital to provide further empirical evidence.
Currently ICF-CS exist for about 30 health conditions [38]. The ICF-CS for chronic widespread pain (CWP) [39] and low back pain (LBP) [40] were published in 2004 and subsequently validated in the clinical context [41][42][43]. Due to the high prevalence of disability claims and large social costs based on CWP and LBP [44][45][46][47], we chose them as our index conditions. Both conditions are also often diagnosed concurrently [48].
Moreover, CWP has been found to be related to depression [49] and chronic LBP to obesity [50]. Such comorbidities are routinely addressed in disability assessments of claimants with chronic pain. We, therefore, also included in our analysis the ICF-CS for depression [51] and obesity [52].

Objective
The objective of the study was to establish whether or not and how the relevant content of medical work capacity evaluations can be captured by ICF-CS, using medical reports from disability claimants with the index conditions CWP and LBP as examples.

Specific aims
(1) We wanted to examine to what extent the relevant aspects of functioning and environmental factors in medical reports of claimants with CWP and LBP are represented by applicable ICF-CS. (2) We wanted to determine by which ICF-CS, or combinations thereof, these aspects are best represented.

Study design
A mixed methods study [53] was conducted, involving a qualitative and quantitative content analysis [54,55] of medical reports. The ICF was used for data coding.

Ethics
The study was approved by the Ethics Commission of Basel, Switzerland, project number 134/08, and was performed in accordance with the Declaration of Helsinki.

Sample
The reports analyzed were derived from an elicitation of all medical reports received by the major Swiss health and accident insurers between February 1 and April 31, 2008, as part of a study on the quality of medical work capacity evaluations in Switzerland [3]. Insurance employees selected and anonymized all reports containing a diagnosis of CWP and/or LBP based on the International Classification of Diseases (ICD-10) (see Table 1). The diagnoses were checked by two health professionals. To ensure comparability, only reports in German submitted to the Swiss national disability insurance scheme were selected. Reports in French and Italian as well as from accident, health and liability insurances were excluded.
From this basic sample a subsample was randomly drawn. The determination of the final sample size was based on two criteria: (1) heterogeneity, i.e. the relevant medical disciplines of pain-assessment and the index conditions (CWP, LBP) were to be included proportionally; and (2) saturation, i.e. the collected information was considered to be sufficient when no new second-level ICF category emerged in five successive reports analyzed [56][57][58]. In order to satisfy the heterogeneity requirement, i.e. a proportional inclusion of the medical disciplines and the index conditions, a minimum size of the subsample was determined.
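The saturation criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each report is represented as the set of second-level ICF categories coded in it, and the function name `saturation_point` and the `window` parameter are hypothetical, not taken from the study.

```python
from typing import Iterable, Optional, Set


def saturation_point(reports: Iterable[Set[str]], window: int = 5) -> Optional[int]:
    """Return the number of reports coded when saturation is reached.

    Saturation (as assumed here) means that `window` successive reports
    contributed no second-level ICF category that had not been seen before.
    Returns None if the criterion is never met.
    """
    seen: Set[str] = set()
    run = 0  # consecutive reports that added nothing new
    for index, categories in enumerate(reports, start=1):
        has_new = bool(categories - seen)
        seen |= categories
        run = 0 if has_new else run + 1
        if run >= window:
            return index
    return None


# Toy input (not study data): the last five reports add no new category.
toy_reports = [{"b280"}, {"d450"}, {"b280"}, {"b280"}, {"d450"}, {"b280"}, {"b280"}]
print(saturation_point(toy_reports))
```

In this sketch the check is run over the reports in coding order, which mirrors checking the saturation level after each additional report.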

Analysis plan
For the data analysis the sample was divided into two sub-groups: (1) reports with CWP diagnoses, and (2) reports with LBP diagnoses. Reports including both diagnoses entered the data analysis twice, once with the pure CWP and once with the pure LBP reports.
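Because reports with both diagnoses enter the analysis twice, the two sub-groups overlap rather than partition the sample. A minimal sketch of this split, assuming each report is a dictionary with a hypothetical `diagnoses` field holding the labels "CWP" and/or "LBP":

```python
def split_into_subgroups(reports):
    """Return (cwp_reports, lbp_reports); reports with both diagnoses appear in both."""
    cwp_reports = [r for r in reports if "CWP" in r["diagnoses"]]
    lbp_reports = [r for r in reports if "LBP" in r["diagnoses"]]
    return cwp_reports, lbp_reports


# Toy input (not study data).
toy_sample = [
    {"id": 1, "diagnoses": {"CWP"}},
    {"id": 2, "diagnoses": {"LBP"}},
    {"id": 3, "diagnoses": {"CWP", "LBP"}},  # counted in both sub-groups
]
cwp_group, lbp_group = split_into_subgroups(toy_sample)
print(len(cwp_group), len(lbp_group))
```

The sub-group sizes therefore sum to more than the sample size whenever at least one report carries both diagnoses.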
To examine the extent to which the relevant aspects of functioning and environmental factors in medical reports of claimants with CWP and LBP are represented by applicable ICF-CS, we first did a content analysis of the reports, using the ICF for data coding. We then ranked the coded categories for both sub-groups according to their relevance, i.e. their relative frequency across reports, setting thresholds at 25% and 50%. Next, we examined whether the relevant ICF categories in CWP reports, i.e. the ones above the thresholds, are represented by the ICF-CS for the index condition (CWP) and major co-morbidities (LBP, depression). For LBP reports, we did the same analysis with the ICF-CS for the index condition (LBP) and major co-morbidities (CWP, obesity). By calculating and comparing values for their coverage (i.e. their content validity) and efficiency (i.e. their potential practicability), we determined to what extent the relevant aspects in the reports are represented by the ICF-CS for the index condition, the co-morbidities, and a combination thereof, and which ICF-CS or combination thereof best represents these aspects.

Content analysis
Our raw data consisted of reports on disability claimants. They comprised one or more medical disciplines and included information on: (a) socio-medical history, (b) medical examination, and (c) work capacity evaluation. This content was coded to the ICF by applying established linking rules [59,60].
The reports were dissected into text passages, each representing a self-contained unit of meaning (e.g. "the claimant suffers from pain while walking"). The various concepts underlying a unit of meaning were determined (e.g. pain, walking) and coded to the most precise ICF category (e.g. b280 Sensation of pain, d450 Walking) by two health professionals trained in the ICF. A concept could be linked to more than one ICF category. Each instance of a category code being assigned to a concept was referred to as a coding. Concepts not appropriately codeable to ICF categories were flagged as either personal factors (e.g. individual attitudes and beliefs), not covered (e.g. degree of disability), not definable (e.g. demanding activities), or health condition (e.g. diabetes). The two coders assessed whether the categories represented limitations (e.g. "the claimant suffers from back pain") or, if they were environmental factors, whether they were barriers (e.g. "the surgery made the pain worse"), facilitators (e.g. "the surgery was helpful"), no problem (e.g. "the surgery had no effect"), or facts (e.g. "the surgery was performed recently") for the claimant. Finally, the coders had to agree on the chosen codes. Any disagreement was resolved in consultation with a third person experienced in the linking method.

Reliability and saturation
The interrater agreement was calculated using Cohen's kappa coefficient [61]. The saturation level was checked after each additional report analyzed.
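For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The following is a minimal sketch, assuming that each concept received exactly one second-level code from each coder; how codings were actually paired in the study, and how the bootstrap confidence interval was computed, is not described here, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the marginal label frequencies of each rater.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy codings (not study data): the coders disagree on the last concept.
coder_1 = ["b280", "b280", "d450", "d450"]
coder_2 = ["b280", "b280", "d450", "b280"]
print(cohens_kappa(coder_1, coder_2))
```

In the toy example the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5.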

Relevance ranking
Referring to the absolute frequency for determining relevance was deemed potentially misleading because different writing styles of medical experts could have led to varying degrees of content repetition. Therefore, we operationalized the relevance of a coded category as its relative frequency across reports, i.e. the percentage of reports in which it appeared as a limitation, barrier or facilitator for the claimant. In order to ensure comparability with the ICF-CS, which refer to aspects that are problematic or supportive for the patients, we did not include the ICF categories assessed as no problem or facts in the ranking. Moreover, since the concepts not appropriately codeable with the ICF were not further specified in this study, they were not included in the ranking. Thus, the final ranking involved only second-level ICF categories coded as a limitation, barrier or facilitator. For the ensuing data analysis we defined two thresholds of minimum relevance, the more lenient one at 25% or more of the reports, the more stringent one at 50% or more.
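The ranking rule can be illustrated with a short Python sketch. It rests on assumptions that are not part of the study: each report is a list of `(icf_code, qualifier)` pairs, more detailed codes are aggregated to the second level by truncating them to the component letter plus three digits, and the function names are hypothetical. Special cases, such as categories kept at the third level, are not modelled.

```python
import re
from collections import Counter

# Only these assessments count towards relevance; "no problem" and "fact" do not.
COUNTED = {"limitation", "barrier", "facilitator"}
SECOND_LEVEL = re.compile(r"^[bsde]\d{3}")  # e.g. b2801 -> b280


def second_level(code):
    """Truncate an ICF code to its second level; None for less detailed codes."""
    match = SECOND_LEVEL.match(code)
    return match.group(0) if match else None


def relevance_ranking(reports):
    """Rank categories by the percentage of reports in which they are counted."""
    counts = Counter()
    for codings in reports:
        categories = set()  # a category counts once per report, however often repeated
        for code, qualifier in codings:
            category = second_level(code)
            if category and qualifier in COUNTED:
                categories.add(category)
        counts.update(categories)
    n = len(reports)
    ranking = [(category, 100 * count / n) for category, count in counts.items()]
    return sorted(ranking, key=lambda item: (-item[1], item[0]))


def above_threshold(ranking, threshold):
    """Categories addressed in at least `threshold` percent of the reports."""
    return {category for category, pct in ranking if pct >= threshold}


# Toy input (not study data).
toy_reports = [
    [("b280", "limitation"), ("b2801", "limitation"), ("d450", "limitation")],
    [("b280", "limitation"), ("e310", "fact")],
    [("b280", "limitation"), ("e310", "facilitator")],
    [("d450", "no problem")],
]
toy_ranking = relevance_ranking(toy_reports)
print(toy_ranking)
print(above_threshold(toy_ranking, 50))
```

Counting each category at most once per report is what makes the measure robust against repetition within a single expert's report.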

Coverage and efficiency ratios
We used two criteria to examine the extent to which the relevant content of medical reports involving CWP and LBP is represented by ICF-CS. (1) The coverage ratio, i.e. the ability of ICF-CS to capture the relevant aspects of the context in which they are applied (namely the index conditions CWP and LBP and the assessment of work capacity as part of disability evaluations). It was calculated as the number of ICF-CS categories above the threshold of 25% (or 50%) divided by the total number of ICF categories above the threshold. (2) The efficiency ratio, i.e. the ability of ICF-CS to be manageable and to contain only as many categories as necessary. It was calculated as the number of ICF-CS categories above the threshold divided by the total number of categories in the ICF-CS. A definition of efficiency which is similar to ours was applied in a recent study where it was defined as the ability of a measurement instrument to be manageable and to contain as few items as possible that measure variables outside a domain set of ICF categories used in that study [62]. ICF-CS should ideally show a high coverage ratio and be efficient at the same time.
Figure 1 further illustrates the operationalization of the coverage and efficiency ratios.
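As a minimal sketch of the two ratios (not a reproduction of Figure 1), both can be written as set operations: the numerator is the overlap between the ICF-CS and the categories above the threshold, and only the denominator differs. The function names and the toy category sets below are assumptions for illustration; they are not study data.

```python
def coverage_ratio(core_set, relevant):
    """Share of the relevant (above-threshold) categories that the Core Set contains."""
    return len(core_set & relevant) / len(relevant)


def efficiency_ratio(core_set, relevant):
    """Share of the Core Set's categories that are relevant (above threshold)."""
    return len(core_set & relevant) / len(core_set)


def combine(*core_sets):
    """Union of several Core Sets, so overlapping categories are counted once."""
    return set().union(*core_sets)


# Toy sets (arbitrary codes, not the real Core Sets).
relevant = {"b280", "d450", "e310", "b152"}
index_set = {"b280", "d450", "b455"}
comorbidity_set = {"b152", "b280", "b130", "d240"}
combined = combine(index_set, comorbidity_set)

print(coverage_ratio(index_set, relevant), efficiency_ratio(index_set, relevant))
print(coverage_ratio(combined, relevant), efficiency_ratio(combined, relevant))
```

In this toy example, combining the sets raises coverage from 0.50 to 0.75 while efficiency falls from about 0.67 to 0.50, which is the kind of trade-off the two ratios are meant to expose.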

Sample characteristics
In order to satisfy the heterogeneity requirement, the required minimum sample size had been determined to be 72 medical reports, representing about one third of the basic sample of 209 reports. The saturation criterion was already reached after coding 30 reports. The number and type of disciplines in the reports are displayed in Table 2.
[...] involved a diagnosis related to "Obesity and other hyperalimentation". The overall interrater agreement (Cohen's kappa) at the second ICF-level was 0.80 (0.79-0.81; 95% bootstrap confidence interval [63]).

Relevance ranking
76 ICF categories passed the 25% and 37 the 50% threshold and were identified as relevant for CWP reports. Table 3 shows whether the categories are included in the ICF-CS for CWP, LBP and depression.

Coverage and efficiency ratios
Focusing on the more inclusive 25% threshold, the relevant aspects of functioning and environmental factors in CWP reports are represented with a coverage of 29% [54%] and an efficiency of 92% [61%] by the Brief [Comprehensive] ICF-CS for CWP (see Table 4).
When combining the ICF-CS for CWP, LBP and depression, the coverage ratio of the [...]
Note: k = total number of categories in the respective ICF Core Set; † = ICF categories that were ignored in the ranking because the Brief and Comprehensive ICF Core Sets for CWP contain them on the more specific third level; X = included in the particular ICF Core Set (CWP, LBP or depression); * = in the particular ICF Core Set the stated category is included at the next lower (third) or next higher (second) level.

Relevance ranking
74 ICF categories passed the 25% and 33 the 50% threshold and were identified as relevant for LBP reports.

Discussion
We found that the relevant content of medical work capacity evaluations involving CWP and LBP can be captured to a considerable, albeit not perfect, extent by a combination of applicable ICF-CS. The relevant aspects of functioning and environmental factors in the reports were either represented by the ICF-CS for the index conditions (CWP, LBP) or for major co-morbidities (depression, obesity). In both groups of reports and for both relevance thresholds, a combination of the ICF-CS analyzed showed substantially higher coverage ratios than the condition-specific ICF-CS, i.e. they represented the relevant aspects of medical work capacity evaluations involving CWP and LBP to a higher extent. There is, however, a trade-off. Due to the increased number of categories when combining the ICF-CS, the efficiency ratios decreased considerably compared to the condition-specific ICF-CS in most cases.
An interesting finding with regard to the medical disciplines involved in the medical reports was that, in fact, psychiatry appeared in both groups of reports as the most frequent discipline. This clearly indicates the relevance of psychiatric assessments for multidisciplinary medical work capacity evaluations of persons with CWP and LBP and is also in line with the finding that a considerable percentage of our medical reports included a co-morbid disorder from the ICD-10 chapter "Mood [affective] disorders".
Overall, our results are in line with previous research in the field which found that the Comprehensive ICF-CS for CWP and LBP have a potential for structuring work capacity assessments [37].
Our findings are also in agreement with the recently developed ICF Core Sets for vocational rehabilitation [64] regarding the importance of highlighting the components activities, participation and environmental factors in the context of work and work capacity.
Finally, with regard to the generic core set for disability evaluation in social security [32] we feel that its lack of environmental factors may be a potential limitation if one aims for a comprehensive and transparent documentation of a claimant's work capacity. While the authors argue that environmental aspects are implicitly covered by the participation items, we found in our analysis of medical reports prepared in the context of disability evaluations that a number of environmental factors (e.g. e310 Immediate family; e165 Assets) are explicitly and frequently reported as barriers or facilitators for the claimants (see Tables 3 and 5).
Note: m = total number of ranked categories above the respective threshold; k = total number of categories in the respective ICF Core Set; † = categories aggregated on the second level (except categories only available on the third level in the Comprehensive ICF Core Set); * = adjusted for overlap between the categories of the three ICF Core Sets.

Study limitations
Our study has some limitations. Our sample only included medical reports in German of the Swiss national disability insurance scheme with an ICD-10-diagnosis for CWP and/or LBP. The results may therefore not be generalizable to other health conditions, nor to other insurance schemes or other countries with different disability evaluation procedures. Future research should involve validation studies which look into the generalizability of our findings.
Another limitation was the significant amount of content not appropriately addressed in the current ICF taxonomy. This refers mainly to some specific aspects of functioning related to work capacity (e.g. demanding activities) and to personal factors, which may influence work capacity [65] and could, when explicitly addressed, contribute to more transparent disability evaluations [66]. Because of this limitation, we may have missed factors that are critical to the process of work capacity evaluation; these should be taken into account in future research.
Finally, one could argue that context-specific ICF-CS relevant to the field of work capacity evaluation, like the ones for vocational rehabilitation or the generic core set for disability evaluation in social security, could have been included in our analysis as well. However, as our sample included medical reports with the index conditions CWP and LBP, we decided to focus on condition-specific ICF-CS rather than on context-specific or generic ones. Further research could determine the extent to which these ICF-CS represent the content of medical reports of disability claimants.

Practical implications
Combining ICF-CS (e.g. CWP with LBP and depression, or LBP with CWP and obesity) is a more effective approach for work capacity evaluations involving CWP and LBP than using solely condition-specific ICF-CS. Taken together, the ICF-CS show a potential for guiding comprehensive multidisciplinary assessments. In particular, they could ensure transparency in disability evaluations as well as standardize them in terms of what should be documented. However, efficiency and practicability become problematic when simply combining ICF-CS due to the high number of categories to be assessed. To ensure high coverage and efficiency, a suitable standard for medical work capacity evaluations involving CWP and LBP could include: (1) all categories of the Brief ICF-CS for the index conditions and major co-morbidities, because Brief ICF-CS are considered as a minimum standard or data set to be reported in different settings so as to enhance comparability [35]; (2) those categories of the Comprehensive ICF-CS identified as relevant for the present context; (3) those categories not included in the ICF-CS but identified as relevant for the present context (e.g. b435 Immunological system functions for CWP reports; e165 Assets for LBP reports).
Note: k = total number of categories in the respective ICF Core Set; X = included in the particular ICF Core Set (LBP, CWP or obesity); * = in the particular ICF Core Set the stated category is included at the next lower (third) or next higher (second) level.
Note: m = total number of ranked categories above the respective threshold; k = total number of categories in the respective ICF Core Set(s); † = categories aggregated on the second level; * = adjusted for overlap between the categories of the three ICF Core Sets.
Our relevance rankings display the categories which should be included in the standard. To ensure comprehensive evaluations, we recommend focusing on categories above the 25% threshold. Before such a standard is applied, however, the categories would have to be validated by experts in the field of work capacity evaluation.
Furthermore, the proposed ICF categories are the basis for a transparent documentation of those aspects of functioning which are relevant for a claimant's work capacity and should be seen as a complement to the claimant's diagnosis without necessarily having a direct implication on the work capacity decision itself. Whereas the categories can be used as a guideline for the evaluations in terms of what aspects should be documented, they do not address the issue of how these aspects should be assessed. This latter problem could be approached by assigning existing validated rating instruments to the suggested ICF categories.
Last but not least, it is important to emphasize that aspects of functioning which refer to the unique individual experience of a claimant, but are not necessarily addressed by the abovementioned ICF categories, should additionally be considered as a complementary source of information to provide a comprehensive picture of the claimant.

Conclusions
The relevant content of medical work capacity evaluations involving CWP and LBP can be represented to a considerable extent by a combination of the ICF-CS for the index conditions and major co-morbidities. A suitable approach for a standardized documentation of the evaluations and for enhancing their transparency could consist of the Brief ICF-CS, augmented by additional ICF categories relevant for this particular context. Aspects not appropriately addressed in the current ICF taxonomy, such as personal factors, should be specified and eventually incorporated in such a standard as well. In addition, the unique individual experiences of claimants have to be taken into account in order to assess work capacity comprehensively.