Length of sick leave – Why not ask the sick-listed? Sick-listed individuals predict their length of sick leave more accurately than professionals
© Fleten et al; licensee BioMed Central Ltd. 2004
Received: 24 May 2004
Accepted: 12 October 2004
Published: 12 October 2004
Knowledge of factors that accurately predict long-lasting sick leave is sparse, but information on medical condition is believed to be necessary to identify persons at risk. Based on the current practice of identifying sick-listed individuals at risk of long-lasting sick leave, the objectives of this study were to examine the diagnostic accuracy of the length of sick leave predicted in the Norwegian National Insurance Offices, and to compare these predictions with the self-predictions of the sick-listed.
Based on medical certificates, two National Insurance medical consultants and two National Insurance officers predicted, at day 14, the length of sick leave in 993 consecutive cases of sick leave, resulting from musculoskeletal or mental disorders, in this 1-year follow-up study. Two months later they reassessed 322 cases based on extended medical certificates. Self-predictions were obtained in 152 sick-listed subjects when their sick leave passed 14 days. Diagnostic accuracy of the predictions was analysed by ROC area, sensitivity, specificity and likelihood ratio; positive predictive value was included in the analyses of predictive validity.
The sick-listed identified sick leave lasting 12 weeks or longer with an ROC area of 80.9% (95% CI 73.7–86.8%), while the corresponding ROC areas for medical consultants and officers were 55.6% (95% CI 45.6–65.6%) and 56.0% (95% CI 46.6–65.4%), respectively. The predictions of sick-listed males were significantly better than those of female subjects, and older subjects predicted somewhat better than younger subjects. Neither formal medical competence nor additional medical information noticeably improved the diagnostic accuracy based on medical certificates.
This study demonstrates that the accuracy of a prognosis based on medical documentation in sickness absence forms is lower than that of one based on direct communication with the sick-listed themselves.
The increasing rate of sick leave experienced in most Western countries challenges insurance companies, employers, and public authorities to identify measures to reduce burdens at the individual, workplace and societal levels.
To reduce the expenses of sick leave and the risk of expulsion from work, the Norwegian government introduced legislation in 1993 that anticipated early and more vigorous interventions of the Norwegian National Insurance Scheme. The Norwegian Public Report no. 27, 2000, underscored the importance of early intervention by the National Insurance Offices (NIOs). A major challenge for the NIOs is to identify newly sick-listed individuals at risk of prolonged sick leave, and who are therefore potential candidates for rehabilitating interventions.
The selection process is currently based on information in medical sickness certificates, supplemented by access to the register of previous sickness benefits. A medical sickness certificate (Sickness Certificate 1; SC1) is required if sick leave exceeds 3 days, and after 8 weeks an extended medical certificate is mandatory (Sickness Certificate 2; SC2). In addition to diagnosis and certified period, the majority of SC1s contain information on the occupation and employee, whereas information on chronic disease, previous sick leave episodes, prognosis and comments is more scattered. SC2s include updated medical information on work ability, planned diagnostics and treatments, and on the prognosis. The value of this information as a guideline for selective intervention has, however, never been established, either as an indicator of potential prolonged absence, or as an indicator of the need for occupational or vocational rehabilitation.
Based on the current practice of identifying sick-listed individuals at risk of long-lasting sick leave, the objectives of this study were to examine the diagnostic accuracy of predictions within the NIOs, and to compare these predictions with the self-predictions of the sick-listed.
A total of 495 randomly selected persons received a questionnaire on the expected length of their ongoing sick leave period. The answer categories were: less than 4 weeks, 4 to 7 weeks, 8 to 11 weeks, 12 to 15 weeks, 16 to 25 weeks, 26 to 51 weeks, and at least 1 year. Some 152 persons (30.7%), called the responder group, returned the questionnaire with this question filled in.
Based on SC1s available after 14 days of sick leave, two NIO officers without formal medical competence, but experienced in working with sick-listed persons, and two experienced physicians working part time as insurance medical officers (NIO medical consultants), assessed the expected length in each of the 993 ongoing sick leave cases. In 496 randomly chosen cases, the NIO assessors had additional access to information on sick leave periods during the previous 3 years. Of potentially 1986 assessments in each profession, the officers and medical consultants had 18 and 25 missing assessments, respectively.
SC2s became available in 322 of the 459 cases where sick leave exceeded 8 weeks, and the NIO assessors reassessed these cases.
Reproducibility of assessments by medical consultants was analysed in 20 cases that were reassessed by the two NIO medical consultants and also assessed by another eight of their colleagues.
Observed length of sick leaves
The reference standard lengths of individual sick leaves within 1 year were collected from the National Sickness Benefit Register. Sick leaves interrupted by only 1–2 days without sickness benefits, typically on weekends, were registered as a single period. The observed length of sick leave thus comprised the total period of continuous full-time or part-time absence due to sickness within 1 year.
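The rule for joining interrupted spells can be sketched as follows. This is only an illustration under stated assumptions, not the register's actual procedure: the function names, the tuple representation of a spell, and the choice to count the short benefit-free gap as part of the merged period are all hypothetical.

```python
from datetime import date


def merge_spells(spells, max_gap_days=2):
    """Merge sick-leave spells separated by at most max_gap_days benefit-free days.

    Each spell is a (start, end) pair of dates, both inclusive.
    """
    merged = []
    for start, end in sorted(spells):
        if merged:
            prev_start, prev_end = merged[-1]
            # Number of benefit-free days strictly between the two spells.
            gap = (start - prev_end).days - 1
            if gap <= max_gap_days:
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


def spell_length_days(spell):
    """Length of a spell in calendar days, counting both end points."""
    start, end = spell
    return (end - start).days + 1


# Hypothetical example: a weekend gap is bridged, a 12-day gap is not.
example_spells = [
    (date(2000, 1, 3), date(2000, 1, 14)),
    (date(2000, 1, 17), date(2000, 1, 28)),
    (date(2000, 2, 10), date(2000, 2, 15)),
]
merged_spells = merge_spells(example_spells)
```

In this example the first two spells are joined into one period running from 3 to 28 January, while the February spell stays separate.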
The diagnostic accuracy of predicted lengths was compared on the basis of sensitivity, specificity, likelihood ratio and the area under the receiver operating characteristics curves (ROC area) [6, 7]. The non-parametric standard error and 95% CI for the ROC area were calculated in SPSS-11. The ROC curve represents plots of the true-positive rate (sensitivity) against the false-positive rate (1 – specificity) at the average of two consecutive categories of the assessments (>= 0 weeks, >= 4 weeks, >= 8 weeks etc.). The ROC curves of the mean assessment by NIO officers and medical consultants also include intermediate points representing half categories.
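As a rough illustration of these measures, the sketch below computes sensitivity, specificity, the positive likelihood ratio and a trapezoidal ROC area from ordinal predictions. It is not the SPSS procedure used in the study. The data are invented, the function names are hypothetical, and the lower bound of each answer category is assumed to serve as the cut-off.

```python
def confusion_at_cutoff(predicted, observed_long, cutoff):
    """Count TP, FP, FN, TN when 'predicted >= cutoff' is the positive test."""
    tp = fp = fn = tn = 0
    for pred, is_long in zip(predicted, observed_long):
        test_positive = pred >= cutoff
        if test_positive and is_long:
            tp += 1
        elif test_positive:
            fp += 1
        elif is_long:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """Return sensitivity, specificity and false-positive rate."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    return sensitivity, specificity, false_positive_rate


def roc_area(predicted, observed_long, cutoffs):
    """Trapezoidal area under the ROC curve traced by the ordinal cut-offs."""
    points = {(0.0, 0.0), (1.0, 1.0)}
    for cutoff in cutoffs:
        sens, _, fpr = rates(*confusion_at_cutoff(predicted, observed_long, cutoff))
        points.add((fpr, sens))
    ordered = sorted(points)
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(ordered, ordered[1:])
    )


# Assumed cut-offs: lower bounds (in weeks) of the seven answer categories.
CATEGORY_CUTOFFS = [0, 4, 8, 12, 16, 26, 52]

# Invented data: predicted category (weeks) and whether the observed
# sick leave lasted at least 12 weeks.
predicted = [0, 0, 4, 4, 8, 8, 12, 16, 26, 4]
observed_long = [False, False, False, False, True, False, True, True, True, True]

counts = confusion_at_cutoff(predicted, observed_long, cutoff=8)
sens, spec, fpr = rates(*counts)
positive_lr = sens / fpr  # likelihood ratio of a positive test
area = roc_area(predicted, observed_long, CATEGORY_CUTOFFS)
```

With tied ordinal categories, the trapezoidal area equals the proportion of (long, short) pairs in which the long case received the higher prediction, counting ties as one half.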
The predictive validity is presented as sensitivity, specificity, positive predictive value (PPV) and likelihood ratio at different thresholds (cut-offs) in predicted length. Reliability of predicted length was analysed as agreement between assessors, expressed as the kappa value [9, 10].
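The tables report PPV adjusted to a fixed prevalence. One standard way to obtain such a value is Bayes' theorem applied to sensitivity and specificity, sketched below. Whether the authors used exactly this calculation is an assumption. The function names are hypothetical, and the sensitivity and specificity values are invented for illustration; only the prevalence of 33.4% comes from the table heading.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV implied by sensitivity and specificity at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def ppv_from_likelihood_ratio(likelihood_ratio, prevalence):
    """Same quantity via pre-test odds times the positive likelihood ratio."""
    pre_test_odds = prevalence / (1 - prevalence)
    post_test_odds = likelihood_ratio * pre_test_odds
    return post_test_odds / (1 + post_test_odds)


# Invented test characteristics; prevalence taken from the table heading.
example_sensitivity = 0.8
example_specificity = 0.8
example_prevalence = 0.334

adjusted_ppv = ppv_at_prevalence(
    example_sensitivity, example_specificity, example_prevalence
)
adjusted_ppv_via_lr = ppv_from_likelihood_ratio(
    example_sensitivity / (1 - example_specificity), example_prevalence
)
```

Both routes give the same result, which is why the likelihood ratio can be carried from one prevalence to another while the raw PPV cannot.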
The Regional Ethical Committee approved the protocol, and the Norwegian Data Inspectorate licensed the necessary register of sick-listed subjects.
The mean observed continuous sickness absence was 100.8 days (median 48 days). Sick leaves in females lasted a mean of 105.1 days, compared to 94.6 days in men (medians 55 and 43 days, respectively). The mean length among persons with musculoskeletal disorders was 90.2 days in 335 males and 108.6 days in 489 females. The mean length among persons with mental disorders was 120.6 days in 56 males and 90.0 days in 113 females.
The mean length of the sick leave in the responder group was 107.4 days (95% confidence interval, CI, 88.7–126.1 days), compared to 92.4 days in the 343 non-responders. Stratified analysis revealed longer mean sick leaves among responders 40 years and younger, of 109.3 days (95% CI 81.4–134.5 days), compared to the 79.3 days (95% CI 65.6–93.1 days) in non-responders. Stratification on gender or musculoskeletal or mental disorders did not reveal any significant differences in the length of sick leave between responders and non-responders.
Categorical distribution of observed and predicted length of sick leave. Observed and predicted length of sick leaves in seven categories for all participants (n = 993) compared to the responder group (n = 152). The assessments of National Insurance medical consultants and officers are grouped according to proportions of persons predicted in each category.
[Table values not preserved. Columns: for all participants, observed length (%), assessed by medical consultants (%), assessed by officers (%); for the responder group, observed length (%), assessed by medical consultants (%), assessed by officers (%), assessed by sick-listed (%). Rows: length of sick leave categories, from < 4 weeks to >= 52 weeks.]
Receiver operating characteristics of prediction
ROC area of identifying sick leaves lasting at least 12 weeks. The ability to identify sick leaves lasting at least 12 weeks in the responder group (n = 152) and in all participants (N = 993), presented as ROC area, calculated from length of sick leave predicted by sick-listed, and mean length predicted by National Insurance medical consultants and officers. The range of the individual National Insurance ROC areas is presented for all participants.
[Table values not preserved. Column headings as extracted: self-assessed, responders (n = 152); responders (n = 149); all participants (n = 972); responders (n = 150); all participants (n = 975). Row labels include strata by years of age.]
Sensitivity, specificity, predictive value and likelihood ratio
Table 3. Predictive validity – identifying sick leaves lasting at least 12 weeks. Predictive validity of identifying sick leaves that lasted at least 12 weeks, using 8 weeks as the cut-off in length as predicted by the sick-listed, medical consultants and officers. The prediction based on the Sickness Certificate 2 (SC2) used a cut-off in predicted length of at least 12 weeks. Sensitivity, specificity, PPV, and likelihood ratio data for NIO assessors are presented as means with 95% CI.
[Table values not preserved. Columns: sensitivity (95% CI), specificity (95% CI), likelihood ratio (95% CI), PPV (95% CI), PPV adjusted to prevalence 33.4% (95% CI). Rows as extracted: medical consultants, responder group; medical consultants, all participants; officers, responder group; officers, all participants; medical consultants, SC2.]
Table 4. Predictive validity – identifying sick leaves lasting at least 26 weeks. Predictive validity of the ability to identify sick leaves lasting at least 26 weeks, using 8, 12 or 26 weeks as cut-offs in length as predicted by the sick-listed, medical consultants or officers. Sensitivity, specificity, PPV and likelihood ratio data for NIO assessors are presented as means for length predicted on Sickness Certificates 1 and Sickness Certificates 2 (SC2).
[Table values not preserved. Columns: sensitivity (95% CI), specificity (95% CI), likelihood ratio (95% CI), PPV (95% CI), PPV adjusted to prevalence 17.9% (95% CI). Rows as extracted: sick-listed, consultants and officers, each at cut-offs >= 8, >= 12 and >= 26 weeks; followed by consultants and officers at cut-offs >= 12 and >= 26 weeks.]
A predicted duration of at least 8 weeks was the preferable cut-off for identifying sick leaves lasting at least 12 weeks (Table 3). Raising the cut-off to a predicted length of at least 12 weeks reduced the sensitivity in the full data set to 0.17 in medical consultants and 0.25 in officers. The corresponding improvement in PPV was modest, reaching 0.54 in medical consultants and 0.45 in officers. Using a predicted length of at least 4 weeks would have markedly reduced the specificity (Figure 2).
The sensitivity of identifying sick leaves lasting at least 26 weeks was generally low when medical consultants and officers predicted on the basis of SC1s (Table 4). The sensitivity was improved somewhat by introducing SC2 information, but the effects on likelihood ratio and on prevalence-adjusted PPV were minor.
According to the results, the effects of the different predictive strategies can be illustrated by considering a program designed to intervene, at 14 days of sick leave, in all cases where the subject is expected to be sick-listed for at least 12 weeks. Out of every 1000 sick-listed persons, 333 will be sick-listed for at least 12 weeks according to the prevalence in this study. A random selection of 333 persons will include 111 true positives, while 333 persons selected by officers will include 133 of the 333 persons who will be sick-listed for at least 12 weeks. Having officers evaluate 1000 sick-listed individuals thus increases the number of true positives by 22 in a selection of 333 sick-listed persons. The alternative strategy of asking the sick-listed themselves will include 210 true positives in a selection of 333 persons.
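The arithmetic in this illustration can be retraced with a few lines of code. The counts are those quoted in the paragraph above. The implied PPVs are simply back-calculated from those counts and are not taken from the tables, and the names used are hypothetical.

```python
# Size of the selection and true positives per strategy, as quoted in the text.
SELECTED = 333
TRUE_POSITIVES = {"random": 111, "officers": 133, "sick-listed": 210}


def implied_ppv(true_positives, selected=SELECTED):
    """Share of the selected persons who turn out to be true positives."""
    return true_positives / selected


def gain_over_random(strategy):
    """Extra true positives compared with selecting 333 persons at random."""
    return TRUE_POSITIVES[strategy] - TRUE_POSITIVES["random"]


for strategy, tp in TRUE_POSITIVES.items():
    print(strategy, round(implied_ppv(tp), 2), gain_over_random(strategy))
```

Selection by officers adds 22 true positives to the 111 expected by chance, whereas asking the sick-listed adds 99.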
Reliability and reproducibility of the predicted length
Agreement between medical consultants in their initial prediction of sick leaves lasting at least 12 weeks was fair, with a kappa of 0.31 (95% CI 0.20 to 0.43). The corresponding kappa value between officers was 0.05 (95% CI -0.05 to 0.14).
In the prediction of sick leaves lasting at least 12 weeks based on the SC2, agreement was moderate between medical consultants (kappa = 0.42, 95% CI 0.29–0.54) and fair between officers (kappa = 0.26, 95% CI 0.10–0.42). The corresponding agreements in the prediction of sick leaves lasting at least 26 weeks were moderate between medical consultants (kappa = 0.55, 95% CI 0.40–0.70) and fair between insurance officers (kappa = 0.31, 95% CI 0.17–0.47).
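For readers unfamiliar with kappa, the sketch below computes Cohen's kappa for two assessors making a yes/no prediction and attaches a verbal label. The 2 x 2 counts are invented and the function names are hypothetical. The label bands are assumed to follow the scale commonly attributed to Altman (poor, fair, moderate, good, very good), which matches the wording "fair" and "moderate" used here.

```python
def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two raters and a binary rating."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    # Agreement expected if the two raters were independent.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Verbal label for kappa (assumed bands, see lead-in)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Invented counts: how two assessors classified 100 cases as
# "at least 12 weeks" (yes) or "shorter" (no).
example_kappa = cohen_kappa(both_yes=20, a_yes_b_no=10, a_no_b_yes=5, both_no=65)
example_label = agreement_label(example_kappa)
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, so two assessors who both predict "shorter" for most cases can agree often and still have a low kappa.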
The differences in diagnostic accuracy between the two participating medical consultants and their eight colleagues in the reproducibility group were not significant.
The results of the present study question any practical value of using information in medical sickness certificates in predicting the length of sick leave, as is the current practice in Norwegian NIOs. Instead, the sick-listed themselves predicted their length of sick leaves far more accurately, but this information is not routinely sought.
The officers in the present study were selected from experienced officers who had shown an interest in the field of sick leave. This might introduce a bias of overestimating the officers' general ability to predict the length of sick leaves. The performances of the two medical consultants were representative of eight of their colleagues who participated in the reproducibility part of the study. We therefore consider the diagnostic accuracy of the assessors to be representative of their professional groups, or at least not underestimated due to bias. Although the diagnostic accuracy varied within each group, the main conclusion of better predictive ability among the sick-listed, was challenged neither by comparing with the mean length predicted by assessors, nor by comparing with the best-performing NIO assessor.
The distributions of gender and diagnosis among the 993 persons included in the study were comparable with those in the National Sickness Benefits Register. The findings of longer sick leaves in women with musculoskeletal disorders, and longer sick leaves in men with mental disorders, are consistent with the Register and other studies [11–13].
The low responder rate among the sick-listed introduced a possible selection bias, although we could not identify any selection bias in gender, age, diagnosis or occupation. If there was a selection towards more predictable sick leaves, this should have been reflected in the assessments of officers and medical consultants. The general trend of lower diagnostic accuracy of NIO assessors in the responder group indicates that if any selection bias contributes to the results, it is an underestimate of the self-predictive ability.
Why did the sick-listed make better predictions?
If the lengths of sick leaves were predominantly related to loss of function caused by sickness, in line with the legislation, we would expect that the medical consultants' professional competence would favour them in predictions of the lengths of sick leaves. The differences we observed between medical consultants and officers in mean ROC area were, however, minor. Furthermore, we could not demonstrate any significant differences in diagnostic accuracy between medical consultants and officers when aggregate information on disease, treatment, function related to work, and prognoses was available in the SC2. The improvement in ROC area with this aggregated information was minor, with the area just reaching 70%, which is considered borderline useful for some purposes. The result is in line with Bjørndal's findings of low prognostic impact of the SC2, and is supported by findings of a low predictive power of symptoms and signs in neck and shoulder disorders. The better prediction of the length of sick leave by the sick-listed themselves is supported by studies that have identified different non-disease determinants of sick leave, such as job satisfaction, attitudes towards pain, irreplaceability and psychosocial work environment [20–22]. Studies identifying that at least the initial sickness certification is predominantly patient controlled [23, 24] indicate the competence of the sick-listed. Self-rated health seems to be an independent predictor of return to work, disability pension and early retirement. Our findings can be interpreted as indicating that the subjective perception of sickness and work ability is more predictive of the length of sick leave than the apparently more objective description in medical terms.
The differences in predictive ability were especially significant in persons with mental and neck disorders, while the NIO assessors performed as well as the sick-listed in the more clear-cut injuries with more standardised treatment and prognosis. Mental disorders, with high prevalence in the population, and an increasing cause of absence, are of special interest. This increasing prevalence of sick leaves indicates the presence of factors separate from the diagnosis criteria. It seems that the more clear-cut the disease and the recommended treatment, the smaller the gain in predictive ability achieved by asking the sick-listed, and vice versa. The modest gain in predictive ability caused by introducing more medical information by the inclusion of the SC2 supports this interpretation. A more complete description of symptoms and treatment does not necessarily give better prognostic information when this includes little knowledge of the consequences related to occupation, and the effects of treatment are undocumented or, at best, marginal.
Diagnostic accuracy – practical implication
The Norwegian NIO is obliged by legislation to perform early intervention on the sick-listed in an effort to reduce the length of sick leave and the risk of expulsions from work. Limited resources and the large number of sick-listed individuals make selection desirable before any intervention is initiated. An alternative to selection on the basis of medical certificates is to communicate directly with the sick-listed themselves. This selection for intervention by NIOs might be seen as screening. The aim is to reach – at an acceptable cost – as many as possible of those that might profit from intervention. The potential individual gain by intervention will be greater when longer lasting sick leaves can be anticipated, and greater the sooner individual intervention programs are established.
The marginal predictive ability and modest agreement between NIO assessors question the use of resources in selection based on information from medical certificates. The predictions of medical consultants tend to be better than those of officers, but not to an extent that makes it more meaningful to use medical consultants in the selection process, rather than officers.
With limited resources for intervention, it might be more cost effective to identify those whose sick listing will last longer than 26 weeks instead of 12 weeks. Based on self-reporting, eight out of ten would be true positives, and one fourth of the individuals would be reached. To reach the same number of true positives at 14 days of sick leave, the ratio of true positives would be reversed from eight out of ten, to two or three out of ten, if the selection were based on medical certificates.
In the search for tests predicting long-lasting sick leaves, such as The Örebro Musculoskeletal Pain Questionnaire, the present study indicates that the results of any such tests should be compared with the results of crude self-estimated length.
Sick-listed individuals predicted their length of sick leave far more accurately than did NIO medical consultants and officers based on information from sickness certificates and the history of past sick leaves. The predictions of sick-listed males were better than those of females, and older persons predicted better than younger persons. The availability of more information, as through the SC2, had only a minor effect on the predictive ability of the medical consultants and officers. Neither reliability nor validity of their predictions was satisfactory.
This study demonstrates the need to re-consider the diagnostic usefulness of documentation on sickness absences, and supports a change in strategy from collecting more medical information to more direct communication with the sick-listed themselves, for effective and early interventions to prevent long sick leaves and expulsions from work.
The authors want to thank the Norwegian Ministry of Health and Social Affairs for funding the study from July 1997 to December 1999, channelled through the National Insurance Administration (project no. 13345).
This study could not have been performed without the support and contribution of the county and local National Insurance Offices in Troms.
- Ministry of Labour and Government Administration; St.meld.39 (1991-1992). Attføring og arbeid for yrkeshemmede. Sykepenger og uførepensjon. (Attføringsmeldingen). [White Paper of Vocational Rehabilitation]. 1991, Oslo.
- Ministry of Health and Social affairs; Norwegian Public Report no 27 2000 [Sickness absence and disability pensioning]. 2000, Oslo.
- Berg JE, Tellnes G, Noreik K, Melsom H: Sykmelding II-ordningen. Fra prosjektet Evaluering av oppfolging av langtidssykmeldte. [The sick leave notification II system. From the project Evaluation of follow-up of long-term sick leave patients]. Tidsskr Nor Laegeforen. 1990, 110: 1393-1397.
- Fleten N, Johnsen R, Ostrem BS: Reliability of sickness certificates in detecting potential sick leave reduction by modifying working conditions: a clinical epidemiology study. BMC Public Health. 2004, 4: 8. 10.1186/1471-2458-4-8.
- Tellnes G, Brage S, Haland EM, Brodholt A: Hvilke symptomer og plager forer til sykmelding? ICPC-koding av pasientenes egne vurderinger i allmennpraksis. [What symptoms and complaints result in sick-listing? ICPC-coding of patients' own opinion in general practice]. Tidsskr Nor Laegeforen. 1992, 112: 1985-1988.
- Hanley JA, McNeil BJ: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982, 143: 29-36.
- Swets JA: Measuring the accuracy of diagnostic systems. Science. 1988, 240: 1285-1293.
- Moons KG, Stijnen T, Michel BC, Buller HR, Van Es GA, Grobbee DE, Habbema JD: Application of treatment thresholds to diagnostic-test evaluation: an alternative to the comparison of areas under receiver operating characteristic curves. Med Decis Making. 1997, 17: 447-454.
- Altman GD: Practical statistics for medical research. 1991, Chapman and Hall, 396-409.
- Fleiss JL: The Measurement of Interrater Agreement. Statistical Methods for Rates and Proportions. 1981, New York, John Wiley, 13: 212-236.
- Brage S, Nygard JF, Tellnes G: The gender gap in musculoskeletal-related long-term sickness absence in Norway. Scand J Soc Med. 1998, 26: 34-43.
- Hensing G, Brage S, Nygard JF, Sandanger I, Tellnes G: Sickness absence with psychiatric disorders--an increased risk for marginalisation among men?. Soc Psychiatry Psychiatr Epidemiol. 2000, 35: 335-340. 10.1007/s001270050247.
- Sandanger I, Nygard JF, Brage S, Tellnes G: Relation between health problems and sickness absence: gender and age differences--a comparison of low-back pain, psychiatric disorders, and injuries. Scand J Public Health. 2000, 28: 244-252. 10.1080/14034940050500474.
- Fleten N, Johnsen R, Ostrem BS: Sykmeldte tror tiltak på arbeidsplassen kan redusere sykefravær [Sick-listed patients think job adjustments might reduce sick-leaves]. Tidsskr Nor Laegeforen. 1999, 119: 3730-3734.
- Bjorndal A: Oppfølging av langtidssykemeldte. En undersøkelse av en kohort fra Moss kommune. [Follow-up of persons on long-term sick-leave. A cohort study in the city of Moss]. Tidsskr Nor Laegeforen. 1994, 114: 2857-2862.
- Viikari-Juntura E, Takala E, Riihimaki H, Martikainen R, Jappinen P: Predictive validity of symptoms and signs in the neck and shoulders. J Clin Epidemiol. 2000, 53: 800-808. 10.1016/S0895-4356(00)00197-9.
- van-der-Giezen AM, Bouter LM, Nijhuis FJ: Prediction of return-to-work of low back pain patients sicklisted for 3-4 months. Pain. 2000, 87: 285-294. 10.1016/S0304-3959(00)00292-X.
- Lofvander M: Attitudes towards pain and return to work in young immigrants on long-term sick leave. Scand J Prim Health Care. 1999, 17: 164-169. 10.1080/028134399750002584.
- Aronsson G, Gustafsson K, Dallner M: Sick but yet at work. An empirical study of sickness presenteeism. J Epidemiol Community Health. 2000, 54: 502-509. 10.1136/jech.54.7.502.
- Kivimaki M, Elovainio M, Vahtera J: Workplace bullying and sickness absence in hospital staff. Occup Environ Med. 2000, 57: 656-660. 10.1136/oem.57.10.656.
- Niedhammer I, Bugel I, Goldberg M, Leclerc A, Gueguen A: Psychosocial factors at work and sickness absence in the Gazel cohort: a prospective study. Occup Environ Med. 1998, 55: 735-741.
- Voss M, Floderus B, Diderichsen F: Physical, psychosocial, and organisational factors relative to sickness absence: a study based on Sweden Post. Occup Environ Med. 2001, 58: 178-184. 10.1136/oem.58.3.178.
- Englund L, Tibblin G, Svardsudd K: Variations in sick-listing practice among male and female physicians of different specialities based on case vignettes. Scand J Prim Health Care. 2000, 18: 48-52. 10.1080/02813430050202569.
- Larsen BA, Forde OH, Tellnes G: Legens kontrollfunksjon ved sykmelding. [Physician's role in certification for sick leave]. Tidsskr Nor Laegeforen. 1994, 114: 1442-1444.
- Mansson NO, Rastam L: Self-rated health as a predictor of disability pension and death--a prospective study of middle-aged men. Scand J Public Health. 2001, 29: 151-158. 10.1080/14034940152393426.
- Mein G, Martikainen P, Stansfeld SA, Brunner EJ, Fuhrer R, Marmot MG: Predictors of early retirement in British civil servants. Age Ageing. 2000, 29: 529-536. 10.1093/ageing/29.6.529.
- Norwegian National Insurance Administration; Planning and Research Department; Basisrapport 2000 [Basis report 2000]. 2001, Oslo.
- Linton SJ, Boersma K: Early identification of patients at risk of developing a persistent back problem: The predictive validity of the Orebro Musculoskeletal Pain Questionnaire. Clinical Journal of Pain. 2003, 19: 80-86. 10.1097/00002508-200303000-00002.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2458/4/46/prepub
This article is published under license to BioMed Central Ltd. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.