Methods and representativeness of a European survey in children and adolescents: the KIDSCREEN study

Background The objective of the present study was to compare three different sampling and questionnaire administration methods used in the international KIDSCREEN study in terms of participation, response rates, and external validity. Methods Children and adolescents aged 8–18 years were surveyed in 13 European countries using either telephone sampling and mail administration, random sampling of school listings followed by classroom or mail administration, or multistage random sampling of communities and households with self-administration of the survey materials at home. Cooperation, completion, and response rates were compared across countries and survey methods. Data on non-respondents was collected in 8 countries. The population fraction (PF, respondents in each sex-age, or educational level category, divided by the population in the same category from Eurostat census data) and population fraction ratio (PFR, ratio of PF) and their corresponding 95% confidence intervals were used to analyze differences by country between the KIDSCREEN samples and a reference Eurostat population. Results Response rates by country ranged from 18.9% to 91.2%. Response rates were highest in the school-based surveys (69.0%–91.2%). Sample proportions by age and gender were similar to the reference Eurostat population in most countries, although boys and adolescents were slightly underrepresented (PFR <1). Parents in lower educational categories were less likely to participate (PFR <1 in 5 countries). Parents in higher educational categories were overrepresented when the school and household sampling strategies were used (PFR = 1.78–2.97). Conclusion School-based sampling achieved the highest overall response rates but also produced slightly more biased samples than the other methods. The results suggest that the samples were sufficiently representative to provide reference population values for the KIDSCREEN instrument.


Background
Health-related quality of life (HRQoL) measurement can provide useful data for health services research. The measurement of HRQoL is currently more advanced in adult than in child populations, though a substantial number of generic and disease-specific questionnaires now exist to measure HRQoL in younger respondents [1].
Reference values can facilitate the interpretation of scores on an instrument by providing a point of comparison for individual or group responses obtained in clinical or epidemiological studies [2]. This is particularly helpful if reference values are available by age group, sex, and other relevant characteristics. Population norms can also be used to develop norm-based, standardized scores. In this case, scores are standardized so that a score of 50, for example, represents the mean score for the general population in a given country or region, and 10 points is equivalent to 1 standard deviation [3]. Reference values and norm-based, standardized scores can be used to determine the extent to which an individual or group differs from the standard for their age, sex, etc.
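The norm-based standardization described above can be illustrated with a minimal sketch. The function name `norm_based_score` and the reference mean and standard deviation used in the example are hypothetical and are not KIDSCREEN norms; the sketch only assumes the convention stated in the text (mean of 50, with 10 points per standard deviation).

```python
def norm_based_score(raw: float, ref_mean: float, ref_sd: float) -> float:
    """Rescale a raw score so the reference population has mean 50 and SD 10."""
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return 50.0 + 10.0 * (raw - ref_mean) / ref_sd


# Hypothetical reference values for illustration only (not KIDSCREEN norms).
REF_MEAN = 70.0
REF_SD = 15.0

at_mean = norm_based_score(70.0, REF_MEAN, REF_SD)       # equals the reference mean
one_sd_above = norm_based_score(85.0, REF_MEAN, REF_SD)  # one SD above the mean
print(at_mean, one_sd_above)
```

Under these assumed reference values, a respondent scoring one standard deviation above the population mean obtains a standardized score 10 points above 50.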
General population norms are available for some generic HRQoL questionnaires for adults [4], but they are not usually developed simultaneously in different countries. Likewise, reference values are not available for most of the instruments designed for use in children and adolescents. When obtaining reference values, the sampling techniques used should ensure representativeness in terms of age, gender, socioeconomic status, and any other variables considered important [5]. Methods which systematically exclude certain segments of the population should be avoided, and the external validity of the results should be evaluated [6] by comparison with an external source of information such as census data. Previous studies have indicated that the social and individual characteristics of eligible subjects, as well as sampling and fieldwork procedures, can influence the decision to cooperate in a survey [7,8], and nonresponse has been found to be higher among individuals with lower educational levels and in low-income groups [9,10].
The KIDSCREEN is a generic questionnaire designed to measure HRQoL in children and adolescents aged 8 to 18. The instrument was developed and pilot tested in seven European countries [11], and is intended primarily for use in multinational clinical trials and health interview surveys [12,13]. Three different versions of the instrument have been developed: the core version contains 52 items spread over 10 dimensions [11], the second has 27 items in 5 dimensions, and a 10-item version provides a single index score [12]. Versions for both children and parents or guardians are available for all variants of the questionnaire. The psychometric properties of the 52-item version were tested after administration of the instrument to respondents in 13 European countries [13]. A further aim of the KIDSCREEN project was to obtain reference values or norms which would facilitate interpretation of scores on the instrument.
The objective of the present analysis was to compare the results obtained using different sampling and questionnaire administration methods in the international KIDSCREEN project in terms of participation, response rates, and completion rates. A further objective was to assess the representativeness of the samples obtained in the thirteen participating countries by examining their external validity.

Subjects
The target population for the KIDSCREEN study was children and adolescents aged 8-18. A sample size of 1,800 children and adolescents per country was considered necessary to detect a minimally important difference of half a standard deviation (SD) in HRQoL scores within each age stratum between children with and without special health care needs or a chronic condition. A response rate of approximately 70% was expected, so the initial sample size was set at 2,400 children and adolescents per country. The sample was drawn to take into account distribution of the target population by age, sex, and region.

Sampling and questionnaire administration strategies
Three approaches to sample selection and administration were used: 1) telephone sampling followed by a mail survey (AT, CH, DE, ES, FR, and NL); 2) school sampling and survey administration during class-time (EL, HU, IE, and SE), or school sampling followed by a mail survey (PL); and 3) multistage random sampling of communities and households followed by questionnaire self-administration at home (CZ). In the UK, a combination of telephone and school sampling was used. Fieldwork was carried out between April and November 2003 except in IE, where data were collected in 2005.
All procedures were carried out following the data protection requirements of the European Parliament (Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data). Each country was required to respect their national ethical and legal requirements for this type of survey and to obtain signed informed consent from participants.

Telephone sampling
A stratified sampling strategy was used to include children proportionally by age, sex, and geographical areas within countries, based on census data. Telephone sampling was performed centrally from Germany, and was carried out using Computer Assisted Telephone Interviewing (CATI) with random-digit dialing (RDD). The RDD procedure randomly generated suffix numbers for each geographical area's prefix number until the desired quota was achieved. Interviewers who had received study-specific training called the randomly generated numbers to identify households with children or adolescents aged 8-18 years. When such a household was identified, the interviewer asked one of the parents if they would be willing to participate together with their child. If the parent agreed to participate, the questionnaire and other study materials were mailed to the address provided, with a stamped, addressed envelope for return of the completed questionnaire. A telephone hotline was used to provide further information about the survey. Two reminders were sent in cases of non-response (after two and five weeks). Due to initially poor response using RDD in NL and UK, additional phone numbers were obtained from a large data services institute in NL (DIDOC), and school sampling was added as an alternative option in the UK.

School sampling
In the case of school sampling, sample selection was also designed to take into account the distribution of children by age, sex, and geographical or administrative regions within countries. Sampling frames were public (state) school (EL, IE, PL) or classroom (HU) listings, and schools were randomly selected within each region. A convenience sample of schools was used in SE. Consent to participate was obtained from parents before study questionnaires were administered. Children whose parents provided informed consent to participate completed their questionnaires in school. Questionnaires were collected during a 2nd visit to schools after 3-7 days.

Multistage probability sampling
In CZ, communities were randomly selected from all regions of the country. Households within each selected community were then randomly selected using a local telephone directory. Trained interviewers telephoned to identify families with eligible children, provided standard information on the survey, and delivered the questionnaires to households who agreed to participate. They returned after 2-5 days to collect the completed questionnaires.
In countries which used CATI sampling, as well as PL and CZ, a non-response questionnaire was administered to those who declined participation to collect basic sociodemographic information.

Study variables
Study materials sent to respondents included the KIDSCREEN-52 questionnaire, several additional instruments which would be used to test the validity of the KIDSCREEN, and a series of questions on sociodemographic and health-related characteristics. Sociodemographic and health-related data were collected from both children and parents, and both children and parents completed the relevant versions of the KIDSCREEN-52 questionnaire.
Variables used to evaluate sample representativeness were the child's sex and age (based on data provided by the children), and the educational level of the child's mother and father (provided by parents). The highest educational qualification of both parents was collected and coded according to the International Standard Classification of Education (ISCED) categories [14] as follows: Low, at most lower secondary (ISCED 0-2); Medium, upper secondary (ISCED 3-4), and; High, tertiary (ISCED 5-6). Household location was collected in five categories (big city or suburbs, town or small city, country village, farm, or house in the country).
Where possible, data was collected from parents who refused to participate to be able to compare characteristics of refusals and participants. Variables collected included: parent-reported child health status, parent's self-perceived overall health, marital status, highest educational level achieved (except for PL, where it was replaced by the mother's highest educational level), and household location.

Analysis
Participation in the study was assessed by analyzing cooperation, completion, and response rates. Cooperation was defined as the willingness of parents to participate with their children in the study and was computed as the number of parents who agreed to participate divided by the number of eligible households contacted. The result was then multiplied by 100. The completion rate was computed as the number of questionnaires completed by children and adolescents divided by the number of families who agreed to participate, and multiplied by 100. Only valid addresses were used as the denominator for the response rate in the mail survey, since it was considered that people who could not receive mail would not have the opportunity to respond. Finally, response rate was computed as the product of the cooperation rate and the completion rate divided by 100.
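The three participation measures defined above can be written out as a minimal sketch. The function names and the household counts in the example are hypothetical and chosen only to make the arithmetic easy to follow; they are not KIDSCREEN figures.

```python
def cooperation_rate(parents_agreed: int, eligible_contacted: int) -> float:
    """Parents who agreed to participate per 100 eligible households contacted."""
    return 100.0 * parents_agreed / eligible_contacted


def completion_rate(questionnaires_completed: int, families_agreed: int) -> float:
    """Completed child/adolescent questionnaires per 100 families who agreed.

    For mail surveys the text restricts the denominator to valid addresses;
    the caller is assumed to pass that restricted count.
    """
    return 100.0 * questionnaires_completed / families_agreed


def response_rate(cooperation: float, completion: float) -> float:
    """Product of the cooperation and completion rates, divided by 100."""
    return cooperation * completion / 100.0


# Hypothetical counts for illustration only.
eligible = 4000
agreed = 2400
completed = 1800

coop = cooperation_rate(agreed, eligible)   # 60.0
comp = completion_rate(completed, agreed)   # 75.0
resp = response_rate(coop, comp)            # 45.0
print(coop, comp, resp)
```

Because the response rate is the product of the other two rates, a low value on either measure limits the overall response rate, even when the other is high.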
To assess representativeness, each national sample was compared with the corresponding reference population in the Eurostat census database [15]. Samples were compared with the reference data in terms of age (of the population aged 8-18 years), sex, and the highest educational level (according to ISCED categories) of women and men with at least one child aged 8-18 in the household. Eurostat provides comparable data from the statistical offices of the participating European countries.
Participants and refusals were compared in terms of child and parent overall health status, parents' marital status, parents' educational level, and household location in selected countries. SE was not included in the analysis of representativeness because the sample was not intended to be representative.
Differences between the sample's observed and expected (based on Eurostat data) proportions by age and sex were tested using a binomial test. A chi-squared goodness of fit test was used to analyze whether the sample distribution in terms of level of education differed significantly from the distribution in the reference population. P values below 0.01 were considered statistically significant.
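The two tests described above can be sketched using only the Python standard library. This is an illustration under stated assumptions, not the software or exact procedure used in the study: the two-sided binomial p-value is computed by summing the probabilities of all outcomes no more likely than the observed one (one common definition), and the chi-squared p-value uses the closed form that holds only for 2 degrees of freedom, which matches three educational categories. All counts and reference proportions in the example are hypothetical.

```python
import math

ALPHA = 0.01  # significance threshold stated in the text


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability mass, computed on the log scale for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def binomial_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: total probability of outcomes that are
    no more likely than the observed count k (requires 0 < p < 1)."""
    threshold = binom_pmf(k, n, p) * (1.0 + 1e-7)  # tolerance for ties
    total = 0.0
    for i in range(n + 1):
        prob = binom_pmf(i, n, p)
        if prob <= threshold:
            total += prob
    return min(1.0, total)


def chi2_gof_three_categories(observed, expected_props):
    """Chi-squared goodness of fit for exactly three categories (2 df).

    With 2 degrees of freedom the upper-tail probability is exp(-x / 2).
    """
    if len(observed) != 3 or len(expected_props) != 3:
        raise ValueError("this sketch handles exactly three categories")
    n = sum(observed)
    stat = sum((o - n * q) ** 2 / (n * q) for o, q in zip(observed, expected_props))
    return stat, math.exp(-stat / 2.0)


# Hypothetical data: 560 girls among 1,000 respondents, 50% girls expected.
sex_p = binomial_two_sided_p(560, 1000, 0.5)

# Hypothetical parental education counts (low, medium, high) compared with
# hypothetical reference proportions.
edu_stat, edu_p = chi2_gof_three_categories([200, 500, 300], [0.25, 0.50, 0.25])

print(sex_p < ALPHA, edu_stat, edu_p < ALPHA)
```

With large national samples, even small departures from the reference proportions fall below the 0.01 threshold, which is why the population fraction ratios are more informative about the size of any imbalance.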
Population fractions (PF) by age and sex were computed for each country. The PF was defined as the number of children responding in each age-sex group divided by the number of children with the same characteristics according to the Eurostat data. Population fraction ratios (PFR, e.g. ratio of boys' to girls' PF) and their 95% confidence intervals (95% CI) were also estimated for each country (children's PF and PFR were not calculated for SE and EL because only adolescent samples were recruited in these countries). Population fractions and PFRs were also computed for mothers and fathers according to their educational level, using the mid-level as the reference category. A PFR greater than 1 indicates an "excess" of boys in the sample, while an excess of girls is indicated by a PFR lower than 1. Similarly, a PFR greater than 1 indicates an "excess" of parents in the high (or low) educational level compared to the intermediate educational level. IE was not included in this analysis because data were not available on parents' educational level.

Results
Table 1 shows the socio-demographic characteristics of the samples included in the participating countries. Data from a total of 22,827 respondents were eligible for analysis. Mean age (SD) in the child and adolescent samples was 9.7 (1.1) and 14.4 (1.7) years, respectively. In the child sample, 51.3% of respondents were female, compared with 53.8% in the adolescent sample.
Table 2 shows the cooperation, completion and response rates for the KIDSCREEN sample by country. The mean cooperation rate using telephone sampling was 56.5% (range, 42.8%-76.5%) compared to 89% (range, 69%-97.2%) for school-based sampling (table 1). Completion rates were highest when questionnaires were administered in schools (92-100%), and varied substantially in the mail surveys. The lowest mail survey completion rates were in the UK (44.1%) and FR (45.3%); the highest rates were in NL (97.8%) and DE (77.6%) (table 2).
Response rates were highest in countries that used school sampling and administration during class-time (ranging from 69.0% in the UK to 91.2% in SE), and lowest in countries which used CATI sampling and mailed surveys (between 18.9% in the UK and 68% in NL). Countries that used other methods achieved intermediate response rates (59.6% in PL and 71.5% in CZ). Within both types of sampling and administration strategies, participation and response rates tended to be higher in countries with greater proportions of parents in higher educational categories according to Eurostat data.
Table 3 shows the population fractions and population fraction ratios by country according to age and sex. The sex and age distributions of the KIDSCREEN national samples were quite similar to those of their Eurostat reference populations. Nevertheless, due to the large sample sizes in each country, most of the differences observed were statistically significant (table 2).

Participants compared with reference census population data
The population fraction varied depending on the number of 8-18 year olds in the country, and also on the sample size. In France, the PF was 0.13-0.14/1,000 children, and 0.11-0.13/1,000 adolescents, but rose to 2.50-3.21/1,000 in Hungarian children, and 2.34-2.95/1,000 in Swiss adolescents (table 2).
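The population fraction and population fraction ratio defined in the Analysis section can be sketched as follows. The text does not state how the 95% confidence intervals were obtained, so the log-scale interval for a ratio of two proportions used here is an assumption made for illustration, as are the function names and all counts in the example.

```python
import math


def population_fraction(respondents: int, population: int, per: int = 1000) -> float:
    """Respondents in a category per `per` persons in the reference population."""
    return per * respondents / population


def pfr_with_ci(resp_a: int, pop_a: int, resp_b: int, pop_b: int, z: float = 1.96):
    """Ratio of two population fractions (category a over reference category b).

    ASSUMPTION: the interval is the usual log-scale Wald interval for a ratio
    of two proportions; the source does not specify the method actually used.
    """
    ratio = (resp_a / pop_a) / (resp_b / pop_b)
    se_log = math.sqrt(1.0 / resp_a - 1.0 / pop_a + 1.0 / resp_b - 1.0 / pop_b)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper


# Hypothetical counts: 450 boys responding out of 1,000,000 in the reference
# population, and 500 girls responding out of 950,000.
pf_boys = population_fraction(450, 1_000_000)
pf_girls = population_fraction(500, 950_000)
pfr, ci_low, ci_high = pfr_with_ci(450, 1_000_000, 500, 950_000)

print(round(pf_boys, 3), round(pf_girls, 3), round(pfr, 3),
      round(ci_low, 3), round(ci_high, 3))
```

In this hypothetical case the boys-to-girls PFR is below 1 and its interval excludes 1, which would be read as boys being underrepresented relative to girls.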
In the KIDSCREEN sample, girls tended to be over-represented compared to the Eurostat reference population in almost all countries. This was also true across all of the sampling methods. The PF was lower in boys than in girls in almost all countries, except IE, where the male-female PFR was 1.74. In general, in school-based samples females were more likely to participate than males (PFR = 0.64, EL; 0.78-0.62, HU; 0.83-0.77, PL).
The PF was lower in adolescents than in children, except in AT, CH, and DE. Adolescents were less likely to participate than children in the school-based samples (PFR = 0.81 in girls and 0.64 in boys, HU; 0.87 in girls and 0.57 in boys, IE), as well as in ES (0.81 in boys) and the UK (0.69 in girls and 0.60 in boys) (table 2).
External validity was also studied by comparing parents' declared level of education with the Eurostat reference sample. In the majority of countries, parents with a low educational level were less likely to complete the survey materials than parents with an intermediate educational level (e.g. PFR = 0.37 for mothers with a low educational level compared to those with an intermediate level in AT) (table 4). The PFR for parents with a high educational level varied between 0.86 (AT) and 2.20 (FR) in countries which used phone sampling and mail administration, and was over 1.5 in countries which used school sampling.
Table 5 provides data on the characteristics of children and parents in households that refused to participate compared to households that actually participated. Parents who participated were more likely to declare good self-perceived health for both their children and themselves. They were also more likely to be married and have a higher educational level, and they were less likely to live in large cities (except in CH). No statistically significant differences were found between refusals and participants in PL, the only country which used school-based sampling and which also collected data from refusals.

Discussion
It is generally agreed that reference values for HRQoL instruments should be based on representative samples. However, there has been relatively little analysis of aspects such as sampling and administration methods, cooperation and response rates, and external validity.
The fact that the present study was carried out in several different European countries using different sampling and administration methods made such an analysis even more important in this case. This sort of analysis also provides information about sampling and questionnaire administration strategies which may be useful for future studies to obtain reference values.
Cooperation and response rates varied considerably by country and by the sampling and administration methods used. As was expected, school-based surveys produced higher cooperation rates, which might be explained by a greater degree of confidence among parents due to the school's involvement from the outset (unsolicited telephone contact vs. an individualized letter from the school) [7]. A further advantage of school-based sampling is that once parental consent to participate was obtained, completion and response rates were almost 100%. On the other hand, one of the main advantages of the centralized CATI method is that it is a relatively efficient way of obtaining broad geographical coverage, which might be more difficult to achieve using school sampling. This needs to be balanced against lower participation and response rates, as well as the fact that phone coverage is limited in some European countries. Whereas the percentage of households with phones was over 90% in most of the countries included in the present survey [16], it was under 50% in CZ, HU and PL [17], leading to an obvious source of sample bias. For this reason, among others, the survey was carried out using school or household sampling in EL, HU, IE, PL, SE, and CZ.
Using the telephone sampling method, we also observed considerable variation in the cooperation, completion, and response rates across countries, with reasonably good rates in NL and German-speaking countries, but lower rates in ES, FR, and the UK. Similar patterns in response rates have been found in earlier Europe-wide studies in adults, although earlier studies have also noted wide variations in response rates within regions and countries [18].
Differences in participation and response rates between countries using the same sampling and administration procedures could be due to several factors. In particular, it was observed that countries with better-educated populations tended to produce higher response rates, though cultural factors could also play a role. This will have implications regarding expected non-response rates for sample size estimates in different European countries.
Other reasons for the variations observed between countries might include differences in the reminder procedures used, or differences in study materials, even though considerable effort was made to standardize study materials and follow-up procedures. Similar strategies were also used across countries to increase response rates, such as contacting participants before sending the questionnaire, use of personalized questionnaires and letters, inclusion of stamped addressed envelopes for return of completed questionnaires, and a second telephone contact to nonrespondents [19].
Despite differences in sampling and administration methods between countries, the final samples obtained were generally very similar to the reference populations. As is frequently observed in surveys of this type, and as indicated by the population fraction ratios, there was a slight tendency for a higher response rate from females compared to males and from children compared to adolescents. This was true across all sampling methods and countries. Likewise, the mothers' educational level in the KIDSCREEN sample was similar to that of the reference sample in most countries, though those in the lowest educational category tended to be underrepresented. Although differences between the KIDSCREEN sample and the Eurostat reference populations were small, they could affect scores on the KIDSCREEN instrument, and therefore the validity of the KIDSCREEN reference values, as HRQoL scores have been shown to differ by age, sex, and level of education [20]. To test the possible effect of these differences on KIDSCREEN global and dimension scores, additional analyses using re-sampling methods such as bootstrapping were performed. These indicated no appreciable effect on KIDSCREEN scores (data not shown).
Finally, there were some differences in socio-demographic characteristics between participants and refusals, with non-responders in general having lower levels of education and reporting slightly poorer health. Future studies may need to take this into account by over-sampling these groups to ensure adequate representation in the final sample.
The study had some limitations. In particular, the results cannot be generalized beyond the countries studied, and there may have been some confounding between the countries and the different sampling approaches. Whereas the former limits the study's external validity, the latter might affect the internal validity of the conclusions. Future studies could focus on applying and comparing different strategies within the same country. The results could also be due to psychological and social factors that were not included in the present analysis [21], although the literature indicates that the main factors associated with non-response were included, i.e. age, sex and socioeconomic or educational level [8,9]. Finally, although we made considerable efforts to ensure that standardized procedures were used for follow-up, there may have been some differences between countries that we were unable to control. Any such differences could affect the study conclusions, particularly as aspects such as reminder letters and follow-up phone calls have been shown to have a considerable impact on response rates [18,22].

Conclusion
We found that school-based surveys provided the highest cooperation and response rates, although they did not necessarily provide the most representative samples, at least in terms of age and sex. This approach also requires sophisticated sampling strategies in order to avoid biases, and gaining access to schools may not always be easy.
Centralized phone sampling and mail administration led to slightly lower cooperation rates, although this was variable between countries. This approach also led to a lower rate of participation amongst those in lower educational categories, although the distribution by age and sex was more representative than that achieved with school sampling. Finally, multistage random sampling of communities and households achieved a well-balanced age-sex, and urban-rural survey, but at a higher cost than other methods. The most appropriate method to use will likely depend on researchers' available resources, telephone coverage in target countries, and the need for representativeness on given variables. This study has also indicated some areas in which oversampling and additional follow-up strategies may be required. Finally, the results suggest that the KIDSCREEN survey has achieved a sufficient degree of representativeness to provide reference population values, which will improve the interpretability of the KIDSCREEN questionnaire.
Note to Table 5: 'Participants' refers to the parent-child pair, i.e. when a parent agreed to participate and survey materials were completed by a parent; 'Refusals' were the parents who were not willing to participate, but answered a short interview on non-response.