Functional ability in a population: normative survey data and reliability for the ICF based Norwegian Function Assessment Scale

Background: The increasing focus on functional ability assessments in relation to sickness absence necessitates the measurement of population functional levels. This study assessed the reliability of the Norwegian Function Assessment Scale (NFAS) and presents normative population data.
Methods: All inhabitants in seven birth cohorts in Ullensaker municipality in 2004 were approached by means of a postal questionnaire. The NFAS was included as part of The Ullensaker Study 2004. The instrument comprises 39 items derived from the activities/participation component in the International Classification of Functioning, Disability and Health (ICF). Based on the results of principal component analysis, these items comprise seven domains. Non-parametric tests for independent samples were used to compare subgroups. Internal consistency was assessed by Cronbach's alpha. Two-week test-retest reliability was assessed by total proportions of agreement, weighted kappa, and intraclass correlation coefficient (ICC).
Results: The response rate was 54% (1620 persons) and 75.4% (101 persons) for the retest. Items had low levels of missing data. Test-retest reliability was acceptable with high proportions of absolute agreement; kappa and ICC values ranged from 0.38 to 0.83 and 0.79 to 0.88, respectively. No difficulty on all 39 functional activities was reported by 33.1% of respondents. Females, older persons and persons with lower levels of education reported more functional problems than their respective counterparts (p < 0.05). The age gradient was most evident for three of the physical domains. For females aged 24–56 and males aged 44–76, a clear education gradient was present for three of the physical domains and one mental domain after adjusting for age and gender.
Conclusion: This study presents population based normative data on functional ability, as measured by the NFAS. These data will serve as basis for the development of national population norms and are necessary for score interpretation. Data quality and test-retest reliability of the NFAS were acceptable.


Background
Longitudinal trends in sickness absence and disability pension rates in several European countries, including Norway, show that increasing proportions of the population have levels of work ability that are too low to meet work demands [1]. To meet this challenge, European social security schemes increasingly emphasize the individual's resources and functional abilities rather than health deficits and restrictions. The Norwegian Insurance Scheme has introduced functional ability assessments in sickness certification forms [2]. In this context, the new International Classification of Functioning, Disability and Health (ICF) has received attention through its consistent conceptual framework for defining functional ability [3].
It is commonly found that the level of functioning tends to be poorer with increasing age and in lower social classes [4]. Eurostat surveys conducted within the European Union have estimated the prevalence rates for moderate and severe disability in the working-age population to be 10.0% and 4.5% respectively [4]. These figures were based on self-assessed restrictions in daily activities (moderate or severe), and stricter definitions of functional limitations would give lower prevalence figures.
National and international population surveys have frequently used well-established health status instruments such as the Nottingham Health Profile [5] and the Short Form 36-item (SF-36) Health Survey [6] to assess function. Such questionnaires often have multiple aims and include several scales to measure function and quality of life. A broad array of scales might be relevant for clinical and epidemiological investigations, but is less useful in social security. To reintegrate employees in working life, there is a need for discriminating instruments that can aid medical assessors, case managers, and labour experts in deciding who should receive which types of benefits and support. Instruments based on self-report have been developed in the UK, the Netherlands, and Finland [7,8]. Self-reports of health conditions, abilities, and skills are also important approaches in the expanding research field on the relationships between health and work productivity [9]. In the WHO Health and Work Performance Questionnaire, functional status is reported, although on a more general level [10].
The Norwegian Function Assessment Scale (NFAS) is a self-report instrument that was developed by an expert group in social insurance in 2000. It was developed to assess the need for rehabilitation and for adjustment of work demands among sick-listed persons, as well as the rights to social security benefits [11]. The ICF was selected as a basis to facilitate multidisciplinary work and understanding, and the use of generally accepted definitions of concepts. All categories from the activities/participation component in the ICF were considered, and categories not relevant for the assessment of work-related functional abilities were removed. After this process, 39 categories remained, which were rephrased into questions with four response alternatives. Four response alternatives were used in preference to the five within the ICF, because fewer alternatives make the scale easier to use in assessment procedures.
The first version of the NFAS was tested for construct and convergent/divergent validity against the SF-36 and the Dartmouth COOP Functional Health Assessment Charts/WONCA (COOP/WONCA), and for utility in a random sample of 386 persons sick-listed for six weeks in eight different geographical areas [11]. Based on a principal component analysis of these data, the 39 categories were regrouped into seven functional domains. Individual assessment of these domains facilitated the design of work place adjustments, and strengthened the communication between the sick-listed person and the case manager in the National Insurance Administration. Recently, the NFAS has been utilized in a study of 89 disability pensioners, to predict belief in return-to-work [12].
The final version of the NFAS had good construct validity [11], but its reliability has not yet been thoroughly evaluated. The level of functional ability has yet to be assessed in the general population. This will provide important normative data necessary for score interpretation. Validity of a four- and a five-point scale version of the NFAS in a population will be reported elsewhere. The purpose of this study was to obtain normative data on the NFAS as part of The Ullensaker Study 2004, and to examine the test-retest reliability of the scale.

Study setting and sample
Ullensaker is a rural community which had 23,700 inhabitants in 2004. There are no major differences between the population of Ullensaker and the population of Norway with respect to demographic characteristics [13]. In 2004, postal questionnaires, which included the NFAS along with questions relating to musculoskeletal pain, were sent to all inhabitants in Ullensaker municipality in the birth cohorts 1918-20, 1928-30, 1938-40, 1948-50, 1958-60, 1968-70 and 1978-80. A randomized half of these inhabitants received the four-point version of the NFAS and were included in this study. Reminders were sent at eight weeks. Information on the residential locations was given by the Population Register.
The Regional Committee for Medical Research Ethics and The Norwegian Data Inspectorate approved the study.

Test-retest reliability
To assess test-retest reliability, the first 30 persons returning a questionnaire within each of the five youngest birth cohorts were asked to complete the NFAS again at two weeks. The two oldest birth cohorts were not included because these persons are outside the normal working age in Norway of 16 to 67 years. Individuals reporting no difficulty on all NFAS items were not invited to the retest since possible changes could only be in one direction.

The Norwegian Function Assessment Scale (NFAS)
The NFAS [11] was included in The Ullensaker Study 2004 to obtain self-reported levels of ICF based functional ability. The 39 items are relevant for assessing physical and mental functioning in working life, some relating to activities of daily living. The NFAS starts with the question "Have you had difficulty doing the following activities during the last week?" and respondents self-report 39 activities using a four-point scale from 1-4: no difficulty, some difficulty, much difficulty, could not do it. A low score indicates good functional ability.
Based on the results of principal component analysis from the previous study with sick-listed persons [11], the items comprise seven domains: Walking/standing (7 items), Holding/picking up things (8 items), Lifting/carrying (6 items), Sitting (3 items), Managing (7 items), Cooperation/communication (6 items), Senses (2 items). These domains have evidence for validity in sick-listed persons [11]. The main application of the NFAS is likely to be social insurance. Hence it was decided to keep the domains from the earlier study with sick-listed persons [11]. It should, however, be anticipated that principal component analysis based on data from the general population in Ullensaker will yield somewhat different results. Domain scores are calculated by adding the item scores and dividing by the number of items completed. The NFAS total scores are calculated by adding all 39 item scores and dividing by the number of items completed. Thus, missing values were ignored. Demographic data about the education level was included in the questionnaire with the response categories of lower secondary school, upper secondary school (technical), upper secondary school (preparatory), university 1-4 years, university >4 years. Education level was then categorized into three groups: ≤ 9 years, 10 to 12 years and ≥13 years.
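The scoring rule described above (sum of item scores divided by the number of items completed) can be sketched in Python. This is only an illustration, not the authors' scoring software: the function name `mean_score`, the use of `None` to mark an unanswered item, and the example responses are assumptions.

```python
from typing import Optional, Sequence


def mean_score(item_scores: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of the completed items (1 = no difficulty ... 4 = could not do it).

    Unanswered items (None) are ignored, so the sum is divided by the
    number of items completed rather than the number of items asked.
    """
    completed = [s for s in item_scores if s is not None]
    if not completed:
        return None  # nothing answered: the score is undefined
    for s in completed:
        if s not in (1, 2, 3, 4):
            raise ValueError(f"item score out of range: {s}")
    return sum(completed) / len(completed)


# Hypothetical answers to the three Sitting items, one left blank.
sitting = [1, None, 2]
print(mean_score(sitting))  # 1.5
```

The same function applies to a domain (pass that domain's items) and to the total score (pass all 39 items). Returning `None` when no item is answered is a design choice of this sketch; the paper does not say how fully blank domains were handled.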

Statistical analyses
Internal consistency was assessed by Cronbach's alpha. Test-retest reliability was assessed by calculating total proportions of agreement, weighted kappa [14], and intraclass correlation coefficient (ICC) (two-way mixed model with the measure of absolute agreement). Since data are categorical, non-parametric tests for independent samples were used to compare subgroups.
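For readers unfamiliar with the internal consistency measure named above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). The function name `cronbach_alpha` and the example data are hypothetical, and the sketch assumes complete cases only; the paper does not describe which software or missing-data handling was used for this statistic.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[int]]) -> float:
    """Cronbach's alpha for respondents (rows) by items (columns).

    Assumes every respondent answered every item (complete cases).
    """
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != n_items for r in rows):
        raise ValueError("all respondents must have the same number of items")
    # Sample variance of each item across respondents.
    item_vars: List[float] = [
        variance([r[i] for r in rows]) for i in range(n_items)
    ]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item domain answered by three respondents.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))  # 0.667
```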

Results
Of the 3000 questionnaires posted, 1620 (54.0%) were returned. Compared to respondents, non-respondents were more likely to be male (p < 0.001) and young or very old (Table 1). Of the respondents, 18.5%, 47.5% and 34.1% reported ≤ 9 years, 10 to 12 years and ≥13 years of education respectively.
The mean level of missing data for the 39 NFAS items was 3.3% and 78.5% had no missing data. For the great majority of items, missing values ranged from 1.9% to 4.6%. Holding and turning a steering wheel (5.3%), driving a car (6.1%), working in groups (9.0%), and guiding others in their activities (9.3%) had higher missing values. There was a significant increase of missing values with age (p < 0.001).
Item responses were skewed towards no difficulty; range 63.5% to 96.8%. The percentage of respondents reporting no difficulty for all 39 items was 33.1%. The items going up and down stairs, engaging in your leisure activities, pushing and pulling with your arms, cleaning your house, staying alert and being able to concentrate, managing everyday stress and strains, managing to take criticism, managing to control your anger and aggression, and remembering things represent functional activities in which more than 20% of the population reported difficulties.
Cronbach's alpha ranged from 0.67 (Sitting) to 0.91 (Walking/standing) for domains and was 0.95 for the total scores. Five of seven domains exceeded the 0.70 reliability standard for use in groups [15], the remaining two just failing to meet this criterion.

Test-retest
Retest questionnaires were returned by 101 of the 134 (75.4%) individuals sent a second questionnaire. Most persons in the youngest cohort reported no difficulty for all questions, resulting in fewer candidates in this cohort (n = 17). The respondents were significantly older (p < 0.05) than the non-respondents, but were otherwise comparable. There were no score differences between test and retest except for four items: writing, which showed a deterioration in function (p < 0.01), and managing to take criticism, managing to control your aggression and anger, and remembering things, which showed an improvement in function (p = 0.01). The proportion scoring exactly the same on both occasions (total proportions of agreement) was high, ranging from 0.68 to 0.97.
Weighted kappa values ranged from 0.38 (fair agreement) to 0.83 (almost perfect agreement) [16] (Table 2). The weighted kappa values for single items showed large variability, but the values for six of the seven domains were above 0.61, indicating good agreement. ICC values ranged from 0.79 (substantial) to 0.88 (almost perfect) [16] (Table 2).
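The two item-level agreement statistics reported above can be illustrated with a short Python sketch. It is not the authors' analysis code: the function names and the example answers are hypothetical, and linear disagreement weights are an assumption, since the text does not state which weighting scheme was applied.

```python
from collections import Counter
from typing import Sequence

CATEGORIES = (1, 2, 3, 4)  # the four NFAS response alternatives


def proportion_agreement(test: Sequence[int], retest: Sequence[int]) -> float:
    """Share of respondents giving exactly the same answer both times."""
    if len(test) == 0 or len(test) != len(retest):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(a == b for a, b in zip(test, retest)) / len(test)


def weighted_kappa(test: Sequence[int], retest: Sequence[int],
                   categories: Sequence[int] = CATEGORIES) -> float:
    """Weighted kappa with linear disagreement weights (an assumption)."""
    n = len(test)
    if n == 0 or n != len(retest):
        raise ValueError("need two non-empty sequences of equal length")
    index = {c: i for i, c in enumerate(categories)}
    span = len(categories) - 1
    observed = Counter(zip(test, retest))  # joint counts of (test, retest)
    row, col = Counter(test), Counter(retest)  # marginal counts
    obs_dis = 0.0  # observed weighted disagreement
    exp_dis = 0.0  # weighted disagreement expected by chance
    for a in categories:
        for b in categories:
            w = abs(index[a] - index[b]) / span
            obs_dis += w * observed[(a, b)] / n
            exp_dis += w * (row[a] / n) * (col[b] / n)
    if exp_dis == 0:
        raise ValueError("kappa is undefined when all answers are identical")
    return 1 - obs_dis / exp_dis


# Hypothetical answers of four respondents to one item, two weeks apart.
first, second = [1, 2, 1, 2], [1, 2, 2, 1]
print(proportion_agreement(first, second))  # 0.5
```

With heavily skewed items, where most respondents answer "no difficulty" on both occasions, the proportion of agreement can be high while kappa stays modest, because chance agreement is also high.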

Gender
Item and domain scores ranged from 1.04 to 1.42 and from 1.05 to 1.25 respectively (Table 3). Males reported significantly better functional ability than females on 33 items. With the exception of the Cooperation/communication domain, domain and total scores were significantly better for males than females.

Age
Domain and total scores for males and females within different age groups are given in Table 4. With the exception of females in the age group 54-56, the total scores increased gradually with age (p < 0.001). [...] domain the association with functional ability was significant among females aged 24-46, but not among males.

Discussion
The Norwegian Function Assessment Scale (NFAS) was developed by an expert group to ensure that the instrument has content validity as a measure of functional ability relevant to the working population. With just 39 items, the NFAS is suitable for inclusion in population surveys with minimum respondent burden and takes an estimated ten minutes to complete. The instrument seems to be acceptable to the general population in Norway, even though the response rate was relatively modest in some age cohorts. The response rate represents a potential study limitation as we do not know the possible effect imposed by the non-respondents. Compared with national population data [13], the study sample included fewer persons in the youngest and the oldest cohort. Since these two groups are at the opposite ends of the functional ability continuum, the effects on scores might to some extent be cancelled out. Further, more females than males returned the questionnaire, which might have led to poorer scores than if all had responded. On the other hand, this effect may have been lessened by the higher percentage of persons with education at university level in the sample compared to the distribution of educational level in the whole population [13].
Levels of missing data were within acceptable limits. However, a few items had a high percentage of missing values, which is probably because there was no "not applicable" option. When a participant considered a functional activity irrelevant, he or she would probably have left this item unanswered. Some items could have been irrelevant for the two oldest cohorts since many of these participants have retired from work or do not drive a car. Including a [...] The NFAS was originally developed for persons of working age. The small number of participants in the two oldest age cohorts, and the poorer data quality among these respondents due to more missing values and irrelevant items, imply that caution should be exercised when using these normative data on groups outside the working age. Otherwise, the data quality was acceptable.

Reliability
The level of Cronbach's alpha was acceptable with two of the domains only just failing to meet the criterion of 0.70 for use in groups of people [15]. The participants received the test and the retest questionnaires about two weeks apart. In this way the recall bias might be minimal, but there may have been a real change in health related function. Functional health status is also likely to show some day-to-day variation. For the most part, mean changes were fairly evenly distributed between improvements and deteriorations.
The total proportions of agreement in this test-retest were high compared to a study examining test-retest reliability of COOP/WONCA [17]. Compared to a further test-retest study using the COOP/WONCA charts [18], the weighted kappa values were slightly lower. The ICC values for domains indicated substantial to almost perfect agreement, and all met the reliability standard of 0.70 for use in groups [15]. Compared with other studies using the SF-36 [19,20], ICC values were similar. Overall the test-retest reliability is acceptable.

Normative data
As expected, the data were highly skewed, indicating that a large proportion of the population did not experience difficulties with functional activities. One in three respondents reported no difficulty on all items, indicating excellent functional ability, and the remaining two thirds reported a range of minor to major difficulties with different functional activities. The population seems to have most problems with remembering and least problems with their senses. The Walking/standing and Managing domains have the highest scores, whereas Senses and Sitting have the lowest. The items watching television and listening to the radio had very low scores, indicating that very few respondents reported difficulties with these activities. However, problems with these senses are important aspects in relation to work.
Men reported higher functional ability than women on most items. The findings of previous studies differ somewhat, which may, at least partly, be due to the use of different instruments and the aspects of health that they measure. Of the studies looking at functional health status using the SF-36, six had a similar conclusion [21-26], whereas one study did not [27]. According to one study using the COOP/WONCA charts, males reported better functional ability than females on the first four of the six charts [28]. The report by Grammenos [4] did not show systematically significant differences between the percentages of men and women of working age in the European Union reporting disability.
The significant age gradient in physical domains and the absence of a gradient in mental domains found in this study are consistent with previous research [21,23,24,26-28]. Grammenos [4] also found a strong non-linear age gradient in the reported disability prevalence rates in the European Union. In our study, females aged 44-46 reported more difficulties on mental domains than younger or older females, the exception being the oldest cohort. For males, a peak at the age group 54-56 was found for the Managing domain only. These findings are supported by the results from a study by Hensing et al [29] showing that the cumulative incidence of sickness absence for a psychiatric diagnosis was highest among those aged 45-59. The association between age and functional ability seems to be more complex in mental domains than in physical domains.
In this study, the length of education was significantly related to functional ability level, with better levels among the persons with the highest levels of education. This finding is supported by previous studies [22-25]. In the European Union report [4], education was inversely associated with disability in all countries. Further, positive correlations between income and health, and a presence of collinearity between education, income and socio-economic status were reported. After adjusting for gender and age, we only found associations between educational level and reported functional ability for some subgroups in our study, whereas Sullivan et al [23] reported significant gradients after adjusting for age. The relations between age, gender, education, and income are often difficult to disentangle. In older generations of women, well-being is more likely to be influenced by their husbands' education and income. The lack of association between functional ability and education in younger men is likely explained by young men's generally high functional levels. We propose that a normative population data set must take age, gender, and education into account.

Comparisons with sick-listed persons
Comparing this population study data with data from the sample of 386 Norwegians sick-listed for six weeks [11], the population sample scores are lower than those for the sick-listed persons. [...] 1.99). These four functional activities seem to involve much more difficulty for the sample of sick-listed persons than for the normal population. In the sample of 386 sick-listed persons [11] no significant differences between males and females nor any age gradient were found, as opposed to the normal population where females and older persons report more difficulties with functional activities than males and younger persons.

Conclusion
This study presents population scores on the NFAS by gender, age and length of education. Data quality, internal consistency and test-retest reliability were acceptable. The main findings were that females, older persons and persons with lower levels of education reported more functional problems than males, younger persons and persons with higher levels of education. A large proportion of the respondents reported no difficulty for most items and very few answered that they could not do an activity. The domains in which the respondents reported most problems with functional activities were Walking/standing, Lifting/carrying and Managing. These data will serve as basis for the development of national population norms.