We report similarly high response rates in all the school-based surveys - irrespective of age of the adolescents, year of the study or whether the survey was carried out in urban Oslo or rural Hedmark. In the follow-up, the response rate was markedly higher in Oslo when school-based and postal-based data collection were combined in comparison with the pure postal-based data collection in Hedmark. The rate of participants who gave their consent to DNA provision was as high as the rate of those who consented to linkage of data with other health registers and previous surveys. Significant predictors of lost to follow-up and failure to provide DNA samples were: male gender; non-western ethnicity; postal survey compared with school-based; lower educational plans than university/higher education; low education and income of father; low perceived economy in the family; unmarried as compared with married parents; poor self-reported health; externalized symptoms and smoking, with some differences in subgroups of ethnicity and gender. Regarding the association between selected exposures and outcomes, the main finding was that the association measures (PRs) were quite similar in the groups of participants and all invitees. In subgroups of non-western boys and girls, however, we found some differences, though the pattern was inconsistent, the number in the analysis was small and the confidence intervals of the estimated associations were large.
When conducting epidemiological studies, we aim to select samples in which all groups are represented in the same proportions as in the general population. Still, almost every study is hampered by a number of invitees who decline to participate. Any sign of selective attendance, in which certain exposed groups are grossly under- or overrepresented, may distort the conclusions. Epidemiological studies with a low level of participation are particularly vulnerable to self-selection bias, which threatens internal validity. In the three present cross-sectional, school-based studies, the response rate was approximately 90%. In addition to the advantage of the classroom setting, several other factors may have contributed to the high response rates. Active parental consent was not needed; in other studies, requiring it has reduced the response rate to 19%-60% [7, 42]. We used no invasive methods and the data collection took little of the participants' time [43, 44]: two hours for the 15/16 year olds and one hour for the 18/19 year olds. Face-to-face recruitment instead of a less personal form of contact between the study recruiter and potential participants may also increase the participation rate . Among adults, monetary incentives may increase the response rate, but the effect on differential study participation is mixed . Greater monetary incentives may have a greater impact on the participation of minority and low-education individuals than on that of non-minority individuals with a higher education, though in contrast, potential responders with a high income or education may have a greater demand to be compensated for their time . In the present study of 18/19 year olds, an incentive was given by letting participants join a lottery consisting of three sums of NOK 15,000 (i.e. USD 2,470/EUR 1,740), but it is not known whether incentives may bias studies of adolescents or had any impact on the response rate in this study.
Predictors of lost to follow-up
About 10% did not agree to the linkage to other health surveys or registries, including their own baseline, thereby contributing to "lost to follow-up". In the consent form, the question of agreeing to a linkage to their own baseline was written in the same sentence as the linkage to registers. We cannot rule out that this combination was the reason why so many adolescents refused the linkage.
Most of the predictors of lost to follow-up found in the present genetic epidemiological study have previously been reported in surveys of adolescents; the studies cited here are those that most often support our findings [10, 21]. In addition, we have found the following predictors of lost to follow-up, which as far as the authors are aware, have not previously been reported: postal survey compared with school-based; lower educational plans than university/higher education; and low perceived economy in the family. Most studies report that living in an urban area, as compared to a rural one, predicts non-response [10, 21], which is in contrast to the findings of the present study, in which we detected no differences between Oslo and Hedmark in the response rate among 15/16-year-old 10th graders. It could be that school-based studies are less sensitive to location than other settings due to oral information about the purpose of the study  and a possible team feeling.
In the follow-up studies, the response rate was 65% in Oslo when combining the school-based and postal portion and only 43% in Hedmark, which is a concern regarding internal validity. However, in a respiratory health survey in Norway of 15-70-year-olds, early responders were compared with late responders after a first and second reminder and telephone follow-up with respect to prevalence estimates and association measures . The response rates increased from 42.7% to 79.9%, but there were only marginal differences in the exposure-disease relationship and prevalence estimates when initial responders were compared with all responders. This is in accordance with the present study of adolescents, in which we found no marked differences in association measures (PRs) between responders and all invitees, when restricted to western girls and boys. In non-western girls and boys, however, the prevalence ratios for the associations of smoking and physical activity with selected outcomes differed, with no discernible pattern. This could be due in part to the low number of participants in these groups or to information bias, i.e. a linguistic or cultural problem in understanding the meaning of any of the questions from the questionnaire [47, 48].
We may also draw support from a Dutch study on pre-adolescents by de Winter et al.  regarding western boys and girls in the present study, as well as a warning that prevalence estimates of mental health problems may increase with increasing participation rates. They utilized information from community registers, parents, teachers and classmates in order to investigate a possible bias in association measures and prevalence estimates. Responders were compared with late responders and non-responders, demonstrating that extra efforts to increase the sample size from 66% to 76% prevented an underestimation of the prevalence of psychopathology. Nonetheless, even with differences between non-responders and responders on several individual characteristics, no significant differences were found pertaining to associations between these characteristics and psychopathology . In the present study, mental health was associated with lost to follow-up in the western subgroup, and we also detected that after reminders, late responders reported more mental health problems than early responders . For that reason, we may have underreported the occurrence of mental health problems in the follow-up.
In a two-year follow-up of 15-18-year-old psychiatric outpatients, it was possible to reach all but four of the 101 patients using a comprehensive tracking system . Axis I and II disorders at the two-year follow-up were significantly associated with follow-up contact difficulties, while baseline psychopathology and sociodemographic variables were not. Thus, relying on baseline characteristics of adolescents may underestimate the extent of psychopathology at follow-up. Based on the above-mentioned studies, there might be a higher rate of mental health problems among those lost to follow-up compared to participants in our study. Even though baseline information showed a higher overall prevalence of externalized symptoms in those lost to follow-up than in participants, and overall no difference regarding mental distress, we cannot rule out an underestimation of psychopathology at follow-up in the present study.
According to Hartge , poor response rates may be of little concern if the willingness to participate is essentially unrelated to exposure. Even if willingness differs with exposure, bias will still not result unless the tendency is stronger (or weaker) in different levels of outcome (i.e. in individuals with disease vs. no disease). According to Kleinbaum et al. , even if willingness to participate is unrelated to exposure overall, this willingness may be more strongly (or more weakly) associated with baseline exposure at one level of outcome than at another, meaning that selection bias may occur. In the present study, we have investigated whether there are differences in association measures (PRs) between baseline exposures (smoking, physical activity) and baseline health outcomes (self-reported health, mental distress, externalized symptoms) among participants and all invitees. The use of baseline outcome variables must be regarded as a proxy evaluation of selection bias in associations between baseline exposures and outcomes at follow-up. In our study, willingness to participate was associated with one of the two exposure variables, namely baseline smoking, but not with baseline physical activity (Table 4). For the total material and in Norwegian/western participants, we detected no selection bias in the association measures (PRs) when utilizing baseline smoking and physical activity as the exposures and selected baseline outcomes (mental distress, externalized symptoms, self-reported health), thereby indicating that the associations between willingness to participate and smoking (and physical activity) are similar by level of outcomes . However, in subgroups of non-western youths (especially in boys) the association between willingness to participate and exposure (particularly with smoking) differed by level of outcome (mental distress, externalized symptoms, self-reported health).
So in accordance with Kleinbaum , the estimated PRs are biased in these subgroups of participants due to selection, which is indicated by the differing PR estimates between participating and all invited non-western boys and girls (Table 6). However, the number of participants in these groups was low and the confidence intervals were wide.
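The condition for selection bias described above can be illustrated with a small numerical sketch. All counts and participation probabilities below are hypothetical and chosen only for illustration; they are not taken from the present study, and the function names are our own. The sketch compares the PR among all invitees with the PR expected among participants in two scenarios: one in which participation depends on exposure only, and one in which the link between exposure and participation differs by level of outcome.

```python
def prevalence_ratio(cells):
    """Prevalence ratio from a dict mapping (exposed, outcome) -> count."""
    p_exposed = cells[(1, 1)] / (cells[(1, 1)] + cells[(1, 0)])
    p_unexposed = cells[(0, 1)] / (cells[(0, 1)] + cells[(0, 0)])
    return p_exposed / p_unexposed


def participants(cells, participation):
    """Expected counts among participants, given the probability of
    participating in each (exposed, outcome) cell."""
    return {key: cells[key] * participation[key] for key in cells}


# Hypothetical invitees, keyed by (exposed, outcome); not study data.
invited = {(1, 1): 300, (1, 0): 700, (0, 1): 600, (0, 0): 2400}

# Scenario A: participation differs by exposure only
# (same probability at both outcome levels within each exposure group).
by_exposure = {(1, 1): 0.5, (1, 0): 0.5, (0, 1): 0.8, (0, 0): 0.8}

# Scenario B: among the exposed, participation also differs by outcome,
# so the exposure-participation link is not the same at each outcome level.
differential = {(1, 1): 0.3, (1, 0): 0.6, (0, 1): 0.8, (0, 0): 0.8}

print(f"All invitees:       PR = {prevalence_ratio(invited):.2f}")
print(f"Scenario A (no bias): PR = "
      f"{prevalence_ratio(participants(invited, by_exposure)):.2f}")
print(f"Scenario B (biased):  PR = "
      f"{prevalence_ratio(participants(invited, differential)):.2f}")
```

In this hypothetical example, the PR among participants in scenario A equals the PR among all invitees (1.50), whereas in scenario B it falls to about 0.88. Comparing PRs between participants and all invitees, as done with baseline data in the present study, is a check of this kind.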
Unfortunately, it is not possible to conclude that the associations between baseline exposures (smoking or physical activity) and selected outcomes measured at follow-up (self-reported health, mental distress, externalized symptoms) are free from selection bias, but if we lean on analyses of the present baseline outcome data and reports from previous studies [4, 6], we may be able to say that there is probably no major selection bias. Regarding subgroups of non-western immigrants, it is more likely that the association measure is biased.
In the present study, it is not possible to directly assess whether the task of providing DNA has affected the rate of lost to follow-up among 18/19 year olds. However, in the school-based study of 13th graders in which DNA was provided, the response rate was high at 90%, which is similar to the school-based surveys of 10th graders in which DNA sampling was not included. Because of this, it is unlikely that a particular fear of providing DNA played a role in the response rate for the school-based portion of the study. In a qualitative study in the UK  of 23-67-year-old participants from an epidemiological health study which collected DNA, it was reported that most of the panel had a positive attitude to medical research and that genetic research in particular was seen as being especially rich in the potential for medical advancement. Other reasons for participating in this genetic health study were: a desire to do good; the possibility of a health gain in the form of a health check; confidence in the research process and its governance and a perception of low risk . The study revealed that the participants had these positive attitudes although most of them misunderstood the aim of the genetic epidemiological study, which was explained in information leaflets. It is not known which factors were in operation for adolescents, and this should be further explored.
To the best of our knowledge, the present study is the first of its kind to investigate predictors of failure to provide DNA. We hypothesize from the present data that there are similar personal reasons behind a willingness to provide DNA and a willingness to agree to linkage of data to registers and health surveys, which should also be further explored. The proposed hypothesis is based on detection of a similarly high response rate in the school-based studies that did or did not collect DNA, with a consent rate as high for providing DNA as for linking data to registers and health surveys.