Open Access Research Article Validity of Instruments to Assess Students' Travel and Pedestrian Safety

Background: Safe Routes to School (SRTS) programs are designed to make walking and bicycling to school safe and accessible for children. Despite their growing popularity, few validated measures exist for assessing important outcomes such as type of student transport or pedestrian safety behaviors. This research validated the SRTS school travel survey and a pedestrian safety behavior checklist. Methods: Fourth grade students completed a brief written survey on how they got to school that day with set responses. Test-retest reliability was obtained 3-4 hours apart. Convergent validity of the SRTS travel survey was assessed by comparison to parents' report. For the measure of pedestrian safety behavior, 10 research assistants observed 29 students at a school intersection for completion of 8 selected pedestrian safety behaviors. Reliability was determined in two ways: correlations between the research assistants' ratings to that of the Principal Investigator (PI) and intraclass correlations (ICC) across research assistant ratings. Results: The SRTS travel survey had high test-retest reliability (κ = 0.97, n = 96, p < 0.001) and convergent validity (κ = 0.87, n = 81, p < 0.001). The pedestrian safety behavior checklist had moderate reliability across research assistants' ratings (ICC = 0.48) and moderate correlation with the PI (r = 0.55, p =< 0.01). When two raters simultaneously used the instrument, the ICC increased to 0.65. Overall percent agreement (91%), sensitivity (85%) and specificity (83%) were acceptable. Conclusions: These validated instruments can be used to assess SRTS programs. The pedestrian safety behavior checklist may benefit from further formative work. Background Improving youth physical activity is an important public health goal for addressing the childhood obesity epidemic. In the US, physical activity declined across age groups among a nationally representative sample [1] and longitudinally among a large US cohort [2]. Since adolescent physical activity was associated with adult physical activity [3], engaging youth in lifelong physical activity behaviors should promote health throughout the lifespan. Active commuting to school, i.e. walking or bicycling to and from school, shows promise for improving youth physical activity. have linked active commuting to school with higher levels


Background
Improving youth physical activity is an important public health goal for addressing the childhood obesity epidemic. In the US, physical activity declined across age groups among a nationally representative sample [1] and longitudinally among a large US cohort [2]. Since adolescent physical activity was associated with adult physical activity [3], engaging youth in lifelong physical activity behaviors should promote health throughout the lifespan.
Active commuting to school, i.e. walking or bicycling to and from school, shows promise for improving youth physical activity. Reports from Australia [4], Germany [5], Scotland [6], Canada [7], England [8] and the US [9,10] have linked active commuting to school with higher levels of physical activity [11][12][13]. Active commuting to school is an integral part of the US Safe Routes to School (SRTS) Program [14], which promotes walking and bicycling to school safely by primary and middle school students. The concept for SRTS programs originated in Denmark in the 1970s and spread internationally [14]. The first US programs started in New York City and Florida in 1997. SRTS programs consist of up to four elements: education, encouragement, enforcement, and engineering to promote safe walking and bicycling to school. Since most US youth attend organized school, SRTS could have large public health impact [15]. Increasing walking and bicycling to school were objectives of US Healthy People 2010 [16]. Evaluations of walk to school programs in Scotland [17] and the US [18][19][20][21] have reported success in improving active commuting to school among students from a range of socioeconomic levels and settings. SRTS programs are widely available and gaining in popularity internationally [22,23]. The US SRTS Program received substantial federal funding -$612 million from [2005][2006][2007][2008][2009]; however, few validated measures exist for assessing outcomes of Safe Routes to Schools (SRTS) Programs [24], such as school travel and pedestrian safety. Previous school travel surveys were too lengthy for focused evaluations of active commuting to school [25], were developed among a rural sample only [26], had unclear validity for children who walked or biked to school [27], or have not been formally validated [23]. We therefore sought to validate a SRTS travel survey in an urban population (Objective 1), with the goal of providing an easy to use school travel survey for general program audits, but also rigorously validated and appropriate for researchers studying SRTS programs.
While most studies on SRTS have focused on students' mode of transport to school, an equally important outcome is pedestrian safety behavior. Increasing the numbers of children walking and bicycling to school also increases children's exposure to pedestrian hazards. SRTS programs emphasize safety through its four program elements. Since linking reductions in child pedestrian injuries to behavioral interventions would require very large sample sizes, assessing children's pedestrian safety behavior provides a direct assessment of what was learned. Videotaped observations of children's pedestrian safety behaviors has been described [28], but may not be feasible for some investigators due to cost. Previous investigators have described observational checklists for pedestrian safety behaviors assessed in the field among children in Israel [29], Scotland [30] and the US [31], although they only reported inter-rater reliability without comparison to an expert. We sought to fill this gap by assessing the reliability of a pedestrian safety behavior checklist designed for field use (Objective 2), which would be useful for SRTS program researchers.

Population and Sample
Participants were students from low-income (90-99% of students qualified for the US Federal Free or Reduced Price Lunch Program) elementary schools in the Houston Independent School District of Houston, Texas, the fourth largest US city. For Objective 1 (school travel survey validation), students were recruited from 4 th grade classrooms of two schools that predominantly enrolled either African American (85%) or Hispanic (74%) students. Eligibility was restricted to 4 th grade students since this study was conducted in preparation for a randomized controlled trial evaluating a walking school bus intervention among 4 th grade students in the Houston Independent School District. For Objective 2 (pedestrian safety behavior checklist validation), students from one elemen-tary school (kindergarten through 5 th grade) that predominantly enrolled African American students (90%) were observed. Low-income ethnic minority participants were recruited because they are generally underrepresented in childhood obesity and injury prevention research, yet are substantially impacted by both obesity and unintentional injuries [16].

SRTS travel survey (Objective 1)
We adapted the publicly available SRTS travel survey from the US National Center for Safe Routes to School website [23]. Instead of asking students to raise their hands to indicate how they traveled to school (which could have a strong social influence bias on responses), we administered a written survey [Additional file 1]. The survey obtained each child's name, parents' contact information, and asked one question, "How did you get to school today?" The students could chose among seven potential responses, "rode school bus, came by carpool, came by car, rode metro bus, walked with an adult, walked without an adult, or biked". Students were instructed to mark the one answer that best showed how they got to school. Parents were contacted on the same day by study staff and asked, "How did [child's name] get to school today?" Parents also answered questions on the child's date of birth, gender, and race/ethnicity.

Pedestrian safety behavior checklist (Objective 2)
The principal investigator (PI) developed a pedestrian crosswalk behavior observation checklist based on previous observation elements that had good reliability [28][29][30][31]. Specifically, we observed and scored (yes/no) the following pedestrian safety elements: crossed at a corner or crosswalk, crossed with an adult or safety patrol, stopped at the curb, looked left-right-left, kept looking while crossing, walked and did not run across the street, and followed the traffic signal (if present). We added an item to assess if the participant was a part of a walking school bus, operationally defined in this study as a group of children wearing bright reflective vests and led by an adult [Additional file 2].

SRTS travel survey (Objective 1)
We recruited 4 th grade students, aged 9-11 years, to complete in their classrooms the one-question written survey, available in English and Spanish. Students could "opt-out" of the study if asked by their parents, who received an informational letter describing the study, or if they did not wish to participate in the study themselves. Study staff administered the survey by asking the students, "how did you get to school today?" in English or Spanish (as appropriate) and directing them to indicate their answer on the survey. Students also provided their full names and telephone numbers, to facilitate contact with their parents, who were asked the same question in English or Spanish along with brief demographic questions. Test-retest reliability was determined on the same day by repeating the student survey 3-4 hours later in the same classroom. Convergent validity was determined in comparison to parents' report for that day, similar to previous school travel validation studies that compared student report to parent report [26,27].

Pedestrian safety behavior checklist (Objective 2)
The PI trained 10 research assistants in a 1.5 hour session to complete the pedestrian safety behavior checklist. The first 1/2 hour of training was spent discussing pedestrian safety and the checklist, while the final hour was spent in the field observing and scoring pedestrians with the PI, who also gave feedback on correct scoring. After training, the PI and research assistants unobtrusively observed child pedestrians walking toward the study school in the morning prior to classes. Children on bicycles, skateboards, scooters, or riding in strollers were excluded, since this checklist was designed to assess pedestrian safety only. Ten research assistants and the PI observed a convenience sample of 29 students chosen by the PI (about 1/2 were female) at a major school intersection. For groups of children crossing the street, the PI chose only one student for the research assistants to observe, consistent with a previous study [30]. The PI and research assistants did not interact with the children (to avoid influencing their behaviors), thereby eliminating the possibility of collecting socio-demographic information on individual children. The PI served as the comparison because he was a board-certified pediatrician with advanced public health training in injury prevention, physical activity, and SRTS research. He also developed the pedestrian safety behavior checklist, trained the research assistants in the use of the checklist, and observed the study participants simultaneously with the research assistants. While an objective measure would have been ideal, no such standard exists for child pedestrian safety observations. This study was approved by the Institutional Review Board of Baylor College of Medicine and the Research Department of the Houston Independent School District.

SRTS travel survey (Objective 1)
Frequencies and percentages were used to describe participant characteristics. To assess the reliability of the children's rating of the SRTS travel survey, the percentage of agreement and the kappa test statistic for agreement assessed the reliability between the children's initial SRTS ratings and the child ratings given 3-4 hours later. Similarly, the percentage of agreement and the kappa statistic for agreement assessed the convergent validity between the children's initial SRTS survey responses and their parent's responses, which served as the comparison.

Pedestrian safety behavior checklist (Objective 2)
To assess systematic differences between rater values and the PI, an analysis of variance with a priori contrasts on the pairwise comparisons was used. Spearman correlation was used to determine the reliability between the raters's and the PI's total scores. The mean of each rater and the PI's score was correlated with the difference between each rater and the PI's score to evaluate whether errors from the rater was associated with the mean difference (a Bland-Altman plot) [32]. Relative reliability, the extent to which the rater's scores were the same as the PI, was assessed using the two-way mixed intra-class correlation (ICC). Generalizability theory identified the contribution to variability from different sources (the behavior, the rater) and provided a generalizability coefficient, a form of the intra-class correlation, which incorporated the different sources of error [33].
To determine the probability that the rater reported that the behavior was performed, when the individual actually performed the behavior (based on comparison to the PI), the sensitivity was computed. Conversely, the specificity was computed to determine the probability of the rater reporting that the behavior was performed when the individual did not perform the behavior. All analyses were performed using SAS 9.1.3 (SAS Institute Inc., Cary, North Carolina, USA).

SRTS travel survey (Objective 1)
Ninety-nine children (aged 9-11 years) out of a total of 125 participated in the validation of the SRTS travel survey for a participation rate of 79%. Demographic information provided by parents was obtained on 82 children ( Table 1). The majority of participants were Hispanic (65.9%) and non-Hispanic black (32.9%). Of the 99 children, 96 (97%) provided test-retest reliability data (Table  2), which was high (kappa = 0.97, p < 0.001) with 97.9% total agreement between test administrations. Convergent validity of children's responses to parents' responses (Table 3) was moderate (kappa = 0.52, p < 0.001, n = 82). Most disagreements were due to a discrepancy between car versus carpool responses. When car and carpool were combined into one category and excluding the one child reported metro bus case (due to the very low frequency), agreement was high (kappa = 0.87, p < 0.001). Forty-six parents (57%) used the Spanish version of the survey and 43 children (45%) used the Spanish version of the survey at least once.

Pedestrian safety behavior checklist (Objective 2)
At the item-level, overall percent agreement (91%), sensitivity (85%), and specificity (83%) comparing the raters to the PI were acceptable (Table 4). Item #6 (kept looking) had the lowest percent agreement, sensitivity and specificity of all of the items, and was therefore dropped from the remaining analyses. Although the average pedestrian behavior scores from two raters were significantly different from the PI, the average difference was only around 0.3 units ( Table 5). The average correlation for the raters compared to the PI was moderate (r = 0.55, p < 0.01) and reliability across raters was also moderate (ICC = 0.48, Gtheory coefficient = 0.50). If two raters used the checklist simultaneously, the ICC increased to 0.65 and the Gcoefficient increased to 0.66. There was an association between the difference between the rater and PI and the mean of the rater and PI (r = -0.53), where larger differences were associated with fewer behaviors. From the Bland-Altman plot (Figure 1), there was no distinct pattern to indicate that changes in scoring variability were related to differences between the raters and PI's score. The percent of variance of the checklist items (Table 6) was accounted for mostly by the items themselves (78.1%), which suggest that the items were distinct and distinguished among the pedestrian behaviors. The child x item interaction accounted for 11.5% of the variance and suggests that children were observed using different patterns of behaviors. The rater x item interaction accounted for only 0.4% of the variance and the child x rater interaction accounted for only 0.2% of the variance. Residual error accounted for only 9.5% of the variance in the checklist items.

Discussion
The written SRTS travel survey was simple to administer and had high reliability and validity among elementary schoolchildren in the Houston-metro area. Child reported active commuting to school was 21%, a figure higher than previous national surveys, which reported prevalences of 13 to 14% [34,35], likely due to differences in participant demographics and the wording of the questions used to assess active commuting to school. While other surveys have been developed or validated among middle-/high-income and rural or suburban samples of  children [20,23,26,27], this report is the first to validate a school travel survey in English and Spanish among a low income sample consisting of mainly Hispanic and non-Hispanic black children. This survey is suitable for use among both English and Spanish speaking urban elementary students, 9 years of age and older. Building upon the school travel survey used from the Marin County evaluation [20] and the survey from the National Center for SRTS [23], which asked students to raise their hand in response to their school travel mode, this written SRTS travel survey minimized potential peer influences on responses. This study also builds upon the study that lacked children who walked or biked to school in the validity sample [27]. In contrast to the CLASS instrument, which was more comprehensive and assessed 30 types of physical activities including active commuting to school [25], this survey focused solely on active commuting thereby providing a targeted, efficient, and valid, method to assess SRTS programs. The pedestrian safety behavior checklist showed high overall agreement at the item level between raters compared to the PI (91%) and acceptable sensitivity (85%) and specificity (83%) in field testing. This high level of agreement is similar to a previous study that used videotapes of children crossing the road and coded for specific pedestrian safety behavior [28]. The checklist had moderate   correlation compared to the PI and moderate reliability across raters. Checklist item #6, "kept looking," proved to be the most difficult to assess in the field and is reflected by having the lowest percent agreement (68%). This finding was not surprising, since item #6 required raters to observe whether or not the child was watching for traffic while crossing the street, which can be difficult due to raters having different vantage points during the observation. Thus, although this behavior is important, we recommend eliminating it from the checklist to improve reliability. Since reliability increased substantially from an ICC = 0.48 with one rater to an ICC = 0.65 with two raters, we recommend that two raters simultaneously use the pedestrian safety behavior checklist to maximize reliability. Reliability for this checklist was lower than a previous 4-item instrument (0.90) that assessed whether participants walked in the sidewalk/shoulder, stopped before entering the street, looked left-right-left before crossing, and kept looking for traffic while crossing [31]. This difference may be due to the greater number of items on our checklist (8-items), which potentially increased the difficulty of the instrument for the raters. Generalizability theory analysis revealed that the majority of the variance (78.1%) of the checklist responses was due to the items themselves, which is desirable and indicates that the checklist successfully distinguished among pedestrian behaviors. Overall, the checklist appears to be an acceptable instrument to assess pedestrian safety behavior in the field without the need for video recording equipment. This instrument builds on previously described observational checklists for pedestrian safety behaviors that reported high reliability [29][30][31] by also providing comparison to the PI, the developer of the checklist. This study has several limitations. First, the SRTS travel survey assessed only school travel on one day. However, it can be administered on several different days to assess school travel patterns over longer periods of time. The travel survey also forced students to choose only one method of travel to school that best described their mode of transport. While this made the survey easier for the students to complete, some information may be lost for children who used multiple modes of transport for substantial parts of their commute (e.g. walk to the bus stop and then ride the bus). The travel survey did not assess travel from school to home, which may differ from the journey from home to school. Given the high convergent validity and reliability of the SRTS travel survey for assessing the journey from home to school (i.e. how did you get to school today?), it seems likely that children would also give valid and reliable results for the journey from school to home (e.g. how will you/did you get home today?)-this requires confirmation. Finally, convergent validity was determined compared to parent report, which is subjective and may be prone to error. However, no valid method for objectively assessing individual students' mode of travel to school exists and comparison to parent report has been previously used by others for validating school travel surveys [26,27]. Significant difference between rater and gold standard at p < 0.05 (*) and p < 0.01(**). Pearson correlation between rater and gold standard (corr1) and between difference and mean (corr2). Overall difference F(10,250) = 2.49, p = 0.007.
Limitations of the pedestrian safety behavior checklist include the moderate correlations and the lack of studies relating these behaviors to injury outcomes. However, these pedestrian behaviors were drawn from previous pedestrian injury prevention studies and were generally considered fundamental injury prevention behaviors for children to learn [28][29][30][31]. Further formative work may increase the checklist's reliability, such as improving training via standardized videotaped examples and focusing on problematic items (item #6 "kept looking" and item #4 "stopped at the curb"). If percent agreement, sensitivity, and specificity do not improve despite this formative work, those items will be removed. Another limitation was the use of the PI, a subjective rater, as a comparison. An objective criterion standard would have been ideal; however, no such standard exists for observations of pedestrians behaviors. While others have previously used videotaped observations, these were beyond the scope of this study, but may be useful for future validity studies.

Conclusions
The written SRTS travel survey is a valid and reliable instrument for assessing school travel outcomes among English and Spanish speaking 9-11 year old elementary school students. It should prove valuable to both researchers and program evaluators seeking an efficient and rigorous method for assessing travel outcomes associated with SRTS programs. The pedestrian safety behavior checklist provides a reasonable method for assessing children's pedestrian behavior in the field, and should be used simultaneously by at least two raters to maximize reliability. Future studies should seek to improve the checklist's reliability and examine the relationship between the pedestrian safety behaviors and childhood pedestrian injuries.   the United States Department of Agriculture (USDA/ARS) Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, Texas, and was also funded with federal funds from the USDA/ARS under Cooperative Agreement No. 58-6250-6001. The funding agencies above had no role in the design, collection, analysis, interpretation of data, writing of the manuscript, or decision to submit the manuscript for publication. The contents of this publication do not necessarily reflect the views or policies of the funding agencies or Baylor College of Medicine, nor does mention of trade names, commercial products, or organizations imply endorsement from the funding agencies or Baylor College of Medicine.