A systematic review of studies measuring health-related quality of life of general injury populations

Background It is important to obtain greater insight into health-related quality of life (HRQL) of injury patients in order to document people's pathways to recovery and to quantify the impact of injury on population health over time. We performed a systematic review of studies measuring HRQL in general injury populations with a generic health state measure to summarize existing knowledge. Methods Injury studies (1995-2009) were identified, with the main inclusion criteria being the use of a generic health status measure and not being restricted to one specific type of injury. Articles were collated by study design, HRQL instrument used, timing of assessment(s), predictive variables and ability to detect change over time. Results Forty-one studies met the inclusion criteria, using 24 different generic HRQL and functional status measures (the most used were the SF-36, FIM, GOS and EQ-5D). The majority of the studies used a longitudinal design, but with different lengths and timings of follow-up (mostly 6, 12, and 24 months). Different generic health measures were able to discriminate between the health status of subgroups and picked up changes in health status between discharge and 12-month follow-up. Most studies reported high prevalences of health problems within the first year after injury. The twelve studies that reported HRQL utility scores showed considerable but incomplete recovery in the first year after discharge. Conclusion This systematic review demonstrates large variation in the HRQL instruments, study populations, and assessment time points used in studies measuring HRQL of general injury populations. This variability impedes comparison of HRQL summary scores between studies and prevented formal meta-analyses aiming to quantify, and improve the precision of estimates of, the impact of injury on population health over time.


Background
Worldwide, injuries are recognized as a major concern in public health, being the predominant cause of death in people aged 1-45 years, and an important cause of disabilities [1,2]. The number of survivors of severe injuries has grown rapidly due to substantial improvements in trauma care. This has resulted in a shift of focus from mortality towards disability of injury patients. Disability (i.e. reduced levels of functioning resulting from diseases or injuries [3]) is increasingly seen as important both as a component of a population's health and for the field of injury prevention and trauma care [4].
Disability is a complex construct and can be measured using functional instruments or generic or disease-specific HRQL measures, where disability represents the gap between measured and perfect HRQL. To enable straightforward comparisons with other disease groups and with general population norms, it is necessary to measure the consequences of injuries using generic health status measures (for instance the SF-36 or the EQ-5D). Some HRQL instruments generate a summary score (utility) that can contribute to a composite health outcome measure [1]. It has become common practice to quantify the impact of diseases and injuries on population health with the help of composite health outcome measures, such as quality-adjusted life years (QALYs) and disability-adjusted life years (DALYs) [4,5]. Sound epidemiological data on the incidence, severity and duration of the functional consequences of injuries are needed to make valid estimates of the years lived with disability due to injuries in the population. Data on all dimensions of functioning relevant to injuries are needed to describe the pattern of recovery or residual disability of injury patients over time. With the help of these data, the impact of injury on population health over time can be quantified. Measuring the impact of injury is particularly challenging due to the large variation in injury types and severity. The European Consumer Safety Association has published guidelines for the conduct of follow-up studies measuring injury-related disability, based on a narrative literature search of papers from 1995-2005 [1]. They concluded that in the injury field there is a lack of consensus on preferred HRQL instruments and study designs [1]. However, this review only included 14 studies that measured HRQL in general injury populations.
Derrett et al. conducted a more recent systematic literature search of injury-specific and generic studies measuring outcome after injury, but restricted this to studies using the EQ-5D outcome measure. They called for further comprehensive population-level research exploring outcomes after injury, and particularly for studies focusing on 'all injury' [6]. It is clear that there is a need to obtain greater insight into patterns of HRQL in comprehensive injury populations in order to document people's pathways to recovery and to quantify the impact of injury on population health over time [1,6,7]. Given the appearance of additional studies after 2005, and the variety of generic measures in this field of research, the current systematic review was conducted to describe the up-to-date state of knowledge in this field and to contribute to further consensus development on preferred methodologies within the injury research field.
This review focused on the measurement of HRQL with a generic instrument among general injury populations. The following key questions were addressed: a) which generic instruments were used?, b) how were these instruments administered?, c) at which time points was HRQL assessed?, and d) did the instrument measure changes over time and predictors for HRQL? Furthermore, in anticipation of substantial heterogeneity preventing formal meta-analysis we aimed to produce a narrative summary of study outcomes to improve insight into general recovery patterns and residual disability.

Data sources and search strategy
We conducted a literature search aiming to identify empirical studies on injury-related disability. Searches for eligible studies were conducted in PubMed (Medline), Web of Science, Embase, and PsycINFO. All peer-reviewed articles published in the period January 1995 to 2009 were included in the searches. An electronic search strategy was developed in collaboration with a librarian with extensive experience in systematic reviews. Search terms used were: 'wounds and injuries', 'health status indicators', 'disability evaluation', 'functional outcome', 'health status measure', and 'cohort studies' (details in Additional file 1). Keywords were matched to database-specific indexing terms. In addition to database searches, reference lists of review studies and articles included in the review were screened for titles that included key terms. More detailed information on the review can be found in the report compiled for the INTEGRIS (Integration of European Injury Statistics) project [8].

Selection criteria
The inclusion criteria were studies using HRQL instruments in injury patients irrespective of the underlying injury, published in English or German in a peer-reviewed journal in the period 1995-2009. We focused on 'all injury' studies and therefore excluded injury-specific studies (for instance those limited to brain injuries or hip fractures). Studies concerning people other than the injury victim were excluded, e.g. studies of the impact of witnessing trauma. We included studies that reflected the definition of injuries used by the World Health Organization (WHO) as 'relatively sudden discernible effects due to body tissue damage from energy exchanges or ingestion of toxic substances but not due to medical adverse events, and obtained from health care settings' [9]. We included only longitudinal studies, in line with the EuroSafe guidance [1].

Data extraction
Relevant papers were selected by screening the titles (first step), abstracts (second step) and entire articles (third step) retrieved through the database searches. During each step the title, abstract or entire article was screened to ensure that it met the selection criteria listed above. This screening was conducted independently by two researchers (SP and EB). As a quality assurance step, two experts in this field (RL and JL) checked a sample of the abstracts (n = 50) against the inclusion criteria. Full articles were critically appraised by two reviewers (EB and SP), using data extraction forms developed for this study in a Microsoft Access database.
Data were tabulated from studies that used an HRQL instrument and that reported a utility score or summary score, to give greater insight into the recovery patterns and changes over time captured by the different instruments. Utility scores are based on preferences or values related to health states and are derived from approaches used in decision theory and economics. Utility scores represent the total HRQL status of a person as a single number on a scale running from 0 (or, for some instruments, below 0) to 1, where 0 indicates death or the maximum amount of disability and 1 indicates optimal health status. The EQ-5D and the SF-6D are examples of HRQL assessment instruments that produce such utility scores. Some instruments, for instance the WHODAS II [10], report a disability score in which 1 represents the maximum amount of disability. Although there are some differences between the concepts, utility and disability scores will be referred to as summary scores in the remainder of this paper.
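The relationship between the two kinds of summary score can be illustrated with a minimal sketch. It assumes the simple complement implied by treating disability as the gap between measured and perfect health; this is a simplification for illustration, and the function names are hypothetical rather than taken from any instrument's scoring manual.

```python
def disability_from_utility(utility: float) -> float:
    """Disability as the gap between a utility score and optimal health (1.0).

    Assumption: a plain complement. Utilities below 0 (states valued as
    worse than death) would give values above 1 and are passed through.
    """
    if utility > 1.0:
        raise ValueError("utility cannot exceed 1 (optimal health)")
    return 1.0 - utility


def utility_from_disability(disability: float) -> float:
    """Inverse mapping for a disability score on a 0-1 scale (1 = maximum)."""
    if not 0.0 <= disability <= 1.0:
        raise ValueError("disability score must lie between 0 and 1")
    return 1.0 - disability


# Hypothetical example: a utility of 0.75 corresponds to a disability of 0.25.
print(disability_from_utility(0.75))
print(utility_from_disability(0.25))
```

Expressing both score types on one scale in this way is only a reading aid; it does not make scores from different instruments interchangeable.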

Literature search
The database search identified 6291 titles of potentially relevant articles. In the first round (scanning the titles) 6031 articles were excluded. The main reasons for exclusion were studies which did not concern injury or were restricted to specific injuries. Of the remaining 260 articles, 165 were excluded after scanning the abstracts, mainly because the paper did not include self-reported HRQL measures. This resulted in scanning 95 full texts of which 54 did not meet our inclusion criteria, leading to inclusion of 41 articles. In this last round the main reason for excluding a full-text article was not using a generic health status measure or not describing the general injured population in sufficient detail (Additional file 2).
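The counts reported in this paragraph can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid using the numbers stated above; the function name and stage labels are hypothetical.

```python
def screening_flow(identified, excluded_per_stage):
    """Return (stage, records remaining) after each screening stage."""
    remaining = []
    current = identified
    for stage, excluded in excluded_per_stage:
        if excluded > current:
            raise ValueError(f"cannot exclude {excluded} of {current} records")
        current -= excluded
        remaining.append((stage, current))
    return remaining


# Counts taken from the text: 6291 titles identified, then exclusions
# after screening titles, abstracts and full texts.
flow = screening_flow(
    6291,
    [("titles", 6031), ("abstracts", 165), ("full texts", 54)],
)
for stage, left in flow:
    print(f"after screening {stage}: {left} remaining")
# after screening titles: 260 remaining
# after screening abstracts: 95 remaining
# after screening full texts: 41 remaining
```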
Studies of disability restricted to the most severely injured patients are increasingly conducted. These studies used different severity scales and cut-off points to define 'major trauma patients'. Ten studies clearly described their inclusion criteria using the most widely used definition of major trauma patients, namely an Injury Severity Score (ISS) > 15. Threshold scores from the Abbreviated Injury Scale (AIS) or the Glasgow Coma Score (GCS), or admission to a 'trauma center' for longer than 24 hours, were also used as inclusion criteria.
Particularly for studies of low to moderate severity injury populations, e.g. emergency department (ED) attendees, there were difficulties in acquiring acceptable response and retention rates. Higher rates were more often reported in studies where outcome measures were administered by clinicians.

Measurement of health-related quality of life and study design
Twenty-four different instruments were used to assess HRQL or functional status. Of the available generic instruments, the SF-36 (n = 15), (Wee)FIM (n = 10), GOS (n = 7) and the EQ-5D (n = 5) have most often been applied among injury patients (see Figure 1). Half of the studies used more than one instrument to measure HRQL (two instruments: n = 10; more than two instruments: n = 10). None of these studies used an injury-specific measure in addition to a generic measure. Of the nine studies among children, only three used a children's instrument [21,39,40]. All three studies also included an 'all-ages' instrument.
Twenty-six studies used a longitudinal design with multiple assessments over time. HRQL was assessed most frequently at discharge, six months, one year, and two years following injury (Figure 2). Five papers assessed pre-injury health status retrospectively, i.e. after the patients had experienced the shock of sustaining an injury [14,15,33,41]. Variation was also apparent in the mode of administration: studies used self-completed questionnaires (n = 14), face-to-face interviews (n = 13) and telephone interviews (n = 14), of which 4 were telephone proxy interviews with parents.

Predictors for HRQL
High prevalences of health problems within the first year after injury were a common finding of the studies. Studies often included a large variety of associated variables which affected disability scores. Predictive variables frequently reported included injury severity, type of injury, gender, mental health status and comorbidity. The generic instruments showed similar differences between subgroups. The SF-36 and the EQ-5D were reported to be able to discriminate between the health status of injured patients and non-injured persons and between patients with different types of injuries (e.g. [13,33,42,43]). In the majority of studies, hospitalization, injury type and/or mechanism, and injury severity were predictive of long-term disability.

Changes over time
All HRQL instruments demonstrated improvements in health over time within the first 3 to 6 months after the injury. Most studies reported improvement in HRQL between discharge and one year after injury, and studies of 'severe' injuries also found improvements one to two years following the injury (Table 1). Twelve studies reported HRQL utility scores. Table 3 shows that the injury populations differed considerably. All studies reported measurable recovery in the first year after injury. Among the severely injured patients there was some evidence of further improvement in the second year. Figure 3 provides an overview of the studies that reported HRQL summary scores over time in the 12 months following discharge among patients aged 15 or older. Overall, HRQL improved in the first year after discharge, although the large variation in HRQL instruments, study populations and time points at which HRQL was assessed impedes comparison of HRQL summary scores between studies.

Discussion
This systematic review aimed to provide greater insight into the measurement of functional outcome and recovery patterns of general injury populations in studies using a generic health state measure. There was considerable methodological variation between studies, including different settings, mixture of participants, instruments, and follow-up periods and timings of assessment. Among available generic instruments, the SF-36, FIM, GOS and EQ-5D have been most frequently used. Studies of functional outcome of the general injury population are still uncommon and generally not comparable, preventing an in-depth understanding of the HRQL experiences of injured persons. Evidence from our review lends support to the need for guidelines for the conduct of follow-up studies measuring injury-related disability.
Longitudinal studies with multiple time points measuring outcomes, and incorporating a retrospective assessment of the pre-injury situation, are needed to produce valid estimates of injury-related morbidity and disability. Studies with this design provide insight into the course of recovery over time and quantify the longer-term functional consequences of injuries. There is still a lack of consensus on preferred HRQL instruments and study designs, given the wide variety of approaches used by the articles included in this review. This variability prevents the meta-analyses necessary to refine quantification of the impact of injury on population health over time.
Almost no papers described an evaluation of the instruments used against widely accepted criteria within the field: data quality, reliability, validity and responsiveness (e.g. the COSMIN checklist [44]). Preferably, instruments should only be widely applied within the injury field if there is acceptable evidence for these measurement criteria in the population of interest. Empirical head-to-head comparisons of different HRQL measures are needed to obtain more insight into the strengths and limitations of the multi-attribute utility instruments (MAUI) used to estimate utility losses in injury populations. Such head-to-head comparisons have so far been lacking. Several of the authors have recently published a paper comparing the Health Utilities Index (HUI) mark 2 and 3 and the EQ-5D in 'all injury patients' [45]. However, to maximise the utility of the sparse data available, there is a need for studies which develop translational or bridging metrics for the different instruments, which would then allow data to be combined in meta-analyses.
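What a bridging metric might look like can be sketched in a few lines. The example below fits a simple linear mapping between paired summary scores from two instruments by ordinary least squares. This is only one possible approach, chosen here for brevity, and not a method used in the reviewed studies; published mapping work may use more elaborate models. The paired scores are invented for illustration and come from no study; "instrument A" and "instrument B" are placeholders.

```python
def fit_linear_bridge(source_scores, target_scores):
    """Fit target = intercept + slope * source by ordinary least squares."""
    n = len(source_scores)
    if n != len(target_scores) or n < 2:
        raise ValueError("need at least two paired scores of equal length")
    mean_x = sum(source_scores) / n
    mean_y = sum(target_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in source_scores)
    if sxx == 0:
        raise ValueError("source scores must not all be identical")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(source_scores, target_scores)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_bridge(score, slope, intercept):
    """Map a score from instrument A onto the scale of instrument B."""
    return intercept + slope * score


# Invented paired scores for the same (hypothetical) patients:
# instrument A on the left, instrument B on the right.
instrument_a = [0.2, 0.5, 0.8]
instrument_b = [0.26, 0.50, 0.74]

slope, intercept = fit_linear_bridge(instrument_a, instrument_b)
print(round(slope, 3), round(intercept, 3))
print(round(apply_bridge(0.6, slope, intercept), 3))
```

A mapping of this kind would itself need validation in the injury population of interest before being used to pool studies.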
The EuroSafe Group has developed guidelines which advise the use of a combination of the EQ-5D and the HUI, with assessments at 1, 2, 4 and 12 months after injury [1]. Few studies, including those published after 2007, satisfy the guidelines. Only five studies used the EQ-5D, and none was found that used the HUI. The guidance recommended the use of two instruments to measure functional outcome, but half of the studies used only one instrument. With the exception of the twelve-month measurement, few of the recommended time points were assessed. However, most of the published studies were designed before publication of the guidelines in 2007.
Different HRQL instruments assess different dimensions of health, which makes comparisons of study outcomes difficult. Polinder et al. showed that, in a general injury population, the HUI and EQ-5D yielded significantly different utilities for similar health states [45]. Clinicians and researchers should be aware of these undesirable differences between the HRQL instruments. It is remarkable that in the 41 papers reviewed 24 different HRQL instruments were used. The use of so many different HRQL instruments might indicate that none of the instruments incorporates all the attributes that one would like, or that there is uncertainty about which instrument is best to use. Decisions regarding which HRQL measure to use will be influenced by a range of factors. For example, researchers may choose to include measures where normative population values are available or where the HRQL instrument is available in their national language. Our review shows that the choice of an instrument is also country-specific (e.g. the EQ-5D is very often used in the Netherlands and the SF-36 in the US). Other factors such as user fees and instrument length will also be influential. Researchers may also choose instruments based on considerations that are specific to their study, which may make generalizability difficult. For example, they may choose an injury-specific instrument with greater responsiveness to change for their particular study question rather than a generic instrument.
The importance of using the same generic health instruments in multiple studies needs to be raised across the injury research community. In our view, the ideal measure to quantify the burden of injuries should include all dimensions relevant to the burden of injury, produce a utility or summary score on a 0-1 scale, be responsive to changes over time and not be injury-specific, to enable comparisons with other diseases. We think that the HRQL instruments proposed by the EuroSafe group (EQ-5D and HUI) include the majority of the relevant dimensions for measurement of the burden of injury and that the instruments are suitable for 'all injury' populations and all but the youngest age ranges [45,46].
The EuroSafe recommendations were based on an assessment of whether all relevant health domains for injury patients are included when measuring the functional consequences of injury. As a first criterion, all body functions, activities and participation domains of the International Classification of Functioning (ICF) that are relevant for a substantial proportion of injury patients were defined: cognition, emotion, pain, problem solving, ambulation, use of hand/arm/fingers, self care, household activities, interpersonal interactions (including sexual activities), school and/or work, and recreation. In fact, none of the generic measures studied covers all the relevant domains. However, a combination of a measure focusing on the functional capacities of the patient on the one hand (such as the HUI) and a measure including social participation on the other hand (such as the EQ-5D) provides the best compromise. To assess functional capacities, the Functional Capacity Index (FCI) [47], the only available injury-specific instrument, could in principle be used, but validation studies of this measure have been few and inconclusive [19,48]. For this reason, the few studies using the FCI conducted so far were not included in our review.
Until improved evidence-based recommendations become available, the EuroSafe guidance should be adopted across the injury field to facilitate comparisons between studies and to provide greater insight into functional outcome and recovery patterns after injury. Of course, depending on the type of injuries included in future studies, researchers may continue to use different assessment periods and follow-up schedules. However, if researchers adhere to the guidelines as closely as possible, the opportunity for improved understanding of injury outcomes will be enhanced.
It is clear that, given the current state of knowledge, it is difficult to summarize the functional outcome of injuries amongst the general injury population, due to the wide variety of study designs, instruments used, and timing of outcome assessments. Nevertheless, this review has provided an improved insight into functional outcomes and recovery patterns of injury patients. A high prevalence of health problems during and after the first year following injury was a common finding of the studies. Among 'all injury' groups recovery occurred predominantly during the first year following injury, whereas some 'more severely' injured patients also recovered by a smaller amount during the second year. Two years post-injury both groups, on average, still showed large deficits from full recovery, whether measured against population norms or as differences from pre-injury health status.
Several authors have recently called for further comprehensive population-level research exploring outcomes after injury, particularly for the non-selective 'all injury' group [6,7]. Our systematic review is one of the few studies that has considered the measurement of HRQL in general injury populations. Four of the reviewed articles were not included in our review, because they were too injury-specific (e.g. vertebral fractures [51]), or because they were cross-sectional in design and unsuited for measurement of outcomes [52-54]. Our review included 28 new studies which assessed injury-related recovery or disability. Furthermore, a systematic review of HRQL following major trauma among children reported similar findings, with a large variety of HRQL measures used (n = 14) [55].
With regard to future research in the injury field, there is clearly a need for further empirical studies of injury outcomes which follow the general guidelines from the 2007 EuroSafe report [1]. Such studies should include a (retrospective) measure of pre-injury HRQL using instruments that produce utility scores. Decisions regarding which HRQL measure to use are often influenced by a range of factors. Therefore, whilst researchers may use their own instruments, they should also include the best validated generic instruments to ensure comparability of results across studies. The EQ-5D, combined with the HUI, is particularly recommended as an appropriate measure in 'all injury' studies because it is suitable, easy to use and free to use. When used, both utility scores and standard deviations should be reported. Longitudinal studies with multiple measurement points to study recovery patterns of injury patients should be a priority, as many existing studies have had only one follow-up assessment. Since 'major trauma' patients often show further improvements after 12 months, studies focusing on such injuries should measure outcomes up to two years after injury.
Assessment of the impact of injury requires comparison with pre-injury HRQL or, in the absence of such information, with age- and gender-specific population norms. Whilst population summary scores are often used, new evidence suggests that such scores are significantly lower than the pre-injury summary scores of injury patients [27]. This implies that using population scores as a baseline results in an underestimation of the impact of the injury. Pre-injury scores were collected in the UK Burden of Injury Study and an Australian study, though the validity of these 'pre'-injury data has been questioned as they are collected after the injury and may be prone to recall bias [27,33,41]. Comparison with population norms is also prone to bias, as injured people are unlikely to be a random sample of the general population and adjustment may not be possible for unmeasured confounders.
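The effect of the choice of baseline can be made concrete with a small sketch. It estimates the utility loss over follow-up as the area between a baseline and the observed utility scores, assuming utility changes linearly between assessments (trapezoidal rule). All numbers are hypothetical and chosen only to show the direction of the effect; the function name is an assumption of this illustration.

```python
def utility_loss_over_time(baseline, follow_up):
    """Area between a baseline utility and observed utilities.

    follow_up: list of (years_since_injury, utility), in time order.
    Assumes linear change between assessments (trapezoidal rule).
    """
    loss = 0.0
    for (t0, u0), (t1, u1) in zip(follow_up, follow_up[1:]):
        if t1 <= t0:
            raise ValueError("assessment times must be strictly increasing")
        loss += (t1 - t0) * ((baseline - u0) + (baseline - u1)) / 2.0
    return loss


# Hypothetical utilities at discharge, six months and one year.
observed = [(0.0, 0.5), (0.5, 0.7), (1.0, 0.8)]

pre_injury_baseline = 0.90   # hypothetical retrospective pre-injury score
population_norm = 0.85       # hypothetical, lower than the pre-injury score

loss_vs_pre_injury = utility_loss_over_time(pre_injury_baseline, observed)
loss_vs_norm = utility_loss_over_time(population_norm, observed)
print(round(loss_vs_pre_injury, 3), round(loss_vs_norm, 3))
```

With a population norm below the pre-injury score, the estimated loss is smaller, which is the underestimation described above.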

Conclusions
In conclusion, this review shows that there is considerable variation in study design between studies measuring HRQL of general injury populations. It is also clear that recently developed guidelines are not yet being followed. Adherence to such guidelines would facilitate comparability across studies which would produce improved estimates of injury disability and recovery patterns. There is also a need for the development of bridging tables which would allow direct comparison of the results of studies using different instruments. Such tables would be a helpful step in supporting formal meta-analyses of the results of studies using different instruments.
There are still major gaps in our understanding of the impact of injury on personal and population health. Consistently collected empirical data across countries would support the production of more valid burden of injury calculations, cost-effectiveness analyses of injury prevention programs and trauma care, and support continuous quality improvement of care.

Additional material
Additional file 1: Search strategy PubMed.
Additional file 2: Flow diagram of the reviewing process.