In their model, Glass et al. [8, 9] conceptualize a social contact network as people of appropriate ages placed into multiple groups (more than one group per person) of various types, such as households, school classes, or clubs (see Additional file 1, Figure S12 for an illustration). Within each group, people are connected to each other by primary links (arranged as graphs ranging from random to regular to scale-free) to yield cliques or seating (as in a classroom), so that an individual does not necessarily have contact with all others in the group. Because people belong to multiple groups, the social contact network exhibits the overlapping quality of a structured community [15], with both clustering within a group (cliques) and the "small-world characteristic" of needing only a small number of contacts along links to reach any particular individual within the entire community [16]. The network is generated from statistics (averages and distributions) for the groups to which people belong, the size of those groups, their internal structure such as seating, the number of primary links that a person may have within each group, and the frequency of contact along these links. The frequency of contact is related to the amount of time that a person spends within a given group and may vary from group to group and person to person.
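As a rough illustration of how such a group-based network can be assembled, the following Python sketch places people into overlapping groups and draws primary links within each group. It is a toy under stated assumptions, not the generator of Glass et al. [8, 9]: the function names, the example groups, the use of uniform random links only, and the use of hours per day as a stand-in for frequency of contact are all hypothetical.

```python
import random


def group_links(members, k, rng):
    """Return undirected primary links inside one group.

    Each member nominates up to k other members at random. Because links are
    undirected, a member can end up with more than k links.
    """
    links = set()
    for person in members:
        others = [m for m in members if m != person]
        for other in rng.sample(others, min(k, len(others))):
            links.add(frozenset((person, other)))
    return links


def build_network(groups, seed=0):
    """Combine the links of all groups into one network.

    groups maps a group name to (members, links per person, hours per day).
    The result maps each link to the total hours per day spent along it.
    """
    rng = random.Random(seed)
    network = {}
    for members, k, hours in groups.values():
        for link in group_links(members, k, rng):
            network[link] = network.get(link, 0.0) + hours
    return network


# Hypothetical example: "ana" belongs to two groups, which is what makes
# the groups overlap and ties the network together.
groups = {
    "household": (["ana", "ben", "cy"], 2, 4.0),
    "class": (["ana", "dee", "eli", "fay", "gus"], 2, 1.0),
}
network = build_network(groups)
```

In this toy, a person who belongs to several groups is the bridge between them, which is the source of the short paths described above.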
Three recent studies consider the contact process within a social network to evaluate the spread of infectious diseases such as influenza [14, 17, 18]. Using surveys and contact diaries, they showed a direct connection between observations of age-specific social behavior and the observed age-specific risk of infection. Such studies support the use of self-reported social contacts as a way to evaluate the transmission of infection, as do other studies that characterize more general social networks [19, 20]. Of particular interest, Edmunds et al. [17] differentiated contacts into 4 levels: level 1 for physical contact without conversation, level 2 for conversation, level 3 for conversation with physical contact, and level 4 for kissing and other intimate behavior. These levels relate to the potential that an infection like influenza will be passed during a contact.
We combined Edmunds et al.'s [17] concept of contact-level with Glass et al.'s [8, 9] conceptualization of a social contact network to obtain a group-centric method of characterization that embodies both the quantity (time) and the quality (level) of contact for the potential transmission of influenza. We initially evaluated two self-report methods: surveys of individuals and contact diaries. Surveys of individuals produce data based on a person's reflection on and estimation of what they do on typical days, weeks, or longer periods. If non-specific prompting is used, forgetting can be counteracted at least in part [21]. Contact diaries in principle collect actual contact data from individuals over a specific period of time as the contact occurs. In practice, we found in preliminary trials that students filled in their contact diaries very unevenly and often from memory well after the contact (through reflection and estimation) rather than at the time of the contact. Therefore, we chose to use individual surveys administered to entire school classes at once: students were guided to fill out the survey individually (without comparison) but in class, step by step, with instruction and examples given as required. This gave access to students in a structured educational setting with an authority figure (the teacher) present, where questions could be asked and answered with non-specific prompting. Our survey questionnaire and its administration within a classroom setting are provided in Additional file 1.
In the survey, students recorded what they do on typical school days, weekends, weeks, months, or years, as relevant for a particular group or "public activity" (see definitions in Table 1). The distinction between groups and public activities is critical for our characterization. Groups are defined as routine sets of people within which a person contacts others and were categorized as households, extended family, before-school classes or care, school classes, lunch periods and recess, after-school care, clubs, sports, work, church, friends, and neighborhood. Public activities were recorded separately and defined as activities that are usually done either alone or as part of a group venturing out into the community, where many unplanned or random interactions can occur. These public activities were categorized as passing periods within school, car rides, school and city bus rides, mall, errands, movies, concerts, sport event participation or attendance, dances, parties, and eating out.
Students first filled in their age, grade, and gender and then filled in a set of tables describing the groups (listed above) to which they belonged and the characteristics of each group. Group characteristics (defined in Table 1) included: group name, time in group per day or week, group size, age range of the entire group, number of primary links inside the group (people within 3 ft for a significant amount of time, that is, more than several minutes), initials of primary links, relationship with primary links (family, friend, acquaintance, authority), contact-level or range of contact-levels with primary links (1: close, within 3 ft; 2: close and talking; 3: close, talking, and touching; 4: kissing), and ages or age range of primary links. For example, a student may be in a school class group with 30 other students but may only be within 3 ft of and interact with the 4 people seated around them; the size of the group would then be 30 and the number of primary links would be 4. While not used in subsequent analysis, filling in the initials and relationship of their primary links prompted students to focus and carefully think through their groups and contacts throughout the day. Such non-specific prompting was found in preliminary trials to be important for increasing survey accuracy, as has also been found more generally by others [21]. Students were also asked to record additional time spent with household members on the weekend and the number of hours they spent alone (doing homework, watching TV, reading, etc.) on a daily basis. The hours a student spent sleeping were not recorded and were thus excluded from attribution. In the work environment, random contacts (defined as "customers") were also recorded.
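One way to hold a single row of the group tables for analysis is sketched below in Python. The class name, field names, and consistency checks are hypothetical and are not taken from the questionnaire in Additional file 1. The example row reuses the school class case from the text (group size 30, 4 primary links); the hours, contact-levels, and ages in it are illustrative values only.

```python
from dataclasses import dataclass


@dataclass
class GroupRecord:
    """One group reported by one student (hypothetical layout)."""
    name: str
    category: str            # e.g. "household", "school class", "friends"
    hours_per_day: float     # time spent in the group
    size: int                # number of other people in the group
    primary_links: int       # people within 3 ft for more than several minutes
    contact_levels: tuple    # (lowest, highest) level with primary links, 1 to 4
    link_age_range: tuple    # (youngest, oldest) age of primary links

    def __post_init__(self):
        # A student cannot have more primary links than there are other
        # people in the group.
        if not 0 <= self.primary_links <= self.size:
            raise ValueError("primary_links must be between 0 and size")
        low, high = self.contact_levels
        if not 1 <= low <= high <= 4:
            raise ValueError("contact levels must lie between 1 and 4")


# Illustrative row: only size=30 and primary_links=4 come from the text.
english = GroupRecord(
    name="English",
    category="school class",
    hours_per_day=1.0,
    size=30,
    primary_links=4,
    contact_levels=(1, 2),
    link_age_range=(15, 16),
)
```

Checks of this kind can flag questionnaires that are internally inconsistent before any statistics are calculated.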
After completing the groups section of the survey questionnaire, students filled in a set of tables to characterize the public activities (listed above) in which they participated. Characteristics of public activities (defined in Table 1) were: the number of times per day, week, or month that an individual participates in the public activity; how much time the public activity takes; which previously defined groups the individual does the public activity with; the number of primary links within these groups with whom the individual has contact during the activity; the range of level of these contacts; whether the public activity is already included in the time previously recorded for the groups; and the number, range of level of contact, and age of others contacted at random while doing the activity. Data on the amount of time associated with random contacts were not taken. Random contacts were defined as recognizable but "in passing" (less than a few minutes, with other students in hallways and people in malls, concerts, dances, etc., or customers in the work environment). Surveys often required more than one class period to complete.
In-class surveys were conducted during late fall 2006 and early winter 2007, after receiving study approval from the Albuquerque Public School Review and Clearance Committee. Albuquerque High School and one of its feeder middle schools and one of its feeder elementary schools were chosen because they were both ethnically and socio-economically diverse. A convenience sample was chosen of 2 classes each of freshmen (9th grade, ages 14–15), sophomores (10th grade, ages 15–16), juniors (11th grade, ages 16–17), and seniors (12th grade, ages 17–18), as well as 2 middle school classes (7th grade, ages 12–13) and 2 elementary school classes (5th grade, ages 10–12), for a total of 12 classes and 249 people. Younger children were not considered because of the difficulty of completing the survey questionnaire. Of the 249 survey questionnaires administered, 141 (57%) were legible and fully filled in and were used for subsequent analysis, 82 from female and 59 from male students.
Data from surveys were sorted by grade (pooling 9–10th grades and 11–12th grades) and by group/public activity type. Statistics (average, standard deviation (SD), coefficient of variation (CV = SD/Average), minimum, and maximum) were then calculated for each group/public activity for the following measures (defined in Table 1): number per person, size, time (given per day for groups and per participation for public activities), number of primary links, contact-hours (number of primary links multiplied by time), contact-level, contact-level-hours (contact-level multiplied by contact-hours), and contact-level-hours per-person-per-day. Per-person values were calculated using the average number of the particular group/public activity for students in the particular grade (or grade range). For example, students in 11–12th grade had on average 1.52 different groups of friends, each with an average of 16.18 contact-level-hours per day, thus yielding 24.59 contact-level-hours per-person-per-day. Per-day values were formed for an average day that incorporated both weekdays and weekends (so that some weekday time, such as in school classes, was distributed to the weekend, and extra weekend time spent with household members was likewise distributed to weekdays).
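The measures above reduce to a few products and summary statistics, sketched here in Python. The function names are hypothetical. Two choices are assumptions rather than details given in the text: the sample standard deviation is used for SD, and the average day is formed by weighting 5 weekdays and 2 weekend days over a 7-day week.

```python
from statistics import mean, stdev


def summarize(values):
    """Average, SD, CV, minimum, and maximum of one measure."""
    avg = mean(values)
    sd = stdev(values)  # assumption: sample SD (n - 1 in the denominator)
    return {"average": avg, "sd": sd, "cv": sd / avg,
            "min": min(values), "max": max(values)}


def contact_level_hours(primary_links, hours, level):
    """Contact-level multiplied by contact-hours (primary links x time)."""
    contact_hours = primary_links * hours
    return level * contact_hours


def per_person_per_day(groups_per_person, level_hours_per_group):
    """Scale a per-group value by the average number of such groups."""
    return groups_per_person * level_hours_per_group


def average_day(weekday_hours, weekend_hours):
    """Assumed averaging: 5 weekdays and 2 weekend days per week."""
    return (5 * weekday_hours + 2 * weekend_hours) / 7


# Worked example from the text: 11-12th grade groups of friends.
friends = per_person_per_day(1.52, 16.18)
print(round(friends, 2))  # 24.59
```

With the assumed weighting, 7 hours of school classes on each weekday and none on weekends become `average_day(7.0, 0.0)`, that is, 5 hours on an average day.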
Contact-level-hours per-person-per-day combines both the quality and the quantity of contact and is our best estimate of potential influenza transmission within a group or public activity. At its base, however, it is formed by a simple product of the level of the primary link and the time for the link within the group. Level 2 is thus presumed to carry twice the transmission potential of level 1, level 3 three times, and so on. Although this captures the correct trend, it is only a simple linear approximation.
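The linear assumption can be made explicit with a short Python sketch that sums level multiplied by time over a person's primary links in one group. The function name and the example values are hypothetical, and every link is assumed to last for the full time spent in the group.

```python
def group_contact_level_hours(link_levels, hours_in_group):
    """Sum of (contact-level x time) over the primary links in one group.

    link_levels holds one contact-level (1 to 4) per primary link. The level
    is used directly as the weight, which is the linear approximation.
    """
    for level in link_levels:
        if level not in (1, 2, 3, 4):
            raise ValueError("contact-level must be 1, 2, 3, or 4")
    return sum(level * hours_in_group for level in link_levels)


# One link held for 3 hours: level 2 counts exactly twice as much as level 1.
level_1 = group_contact_level_hours([1], 3.0)
level_2 = group_contact_level_hours([2], 3.0)
print(level_2 / level_1)  # 2.0
```

Because the level enters only as a multiplier, doubling either the level or the time has the same effect on the result.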
Data for individual students were also analyzed to consider the total number of groups to which they belonged and, for their groups/public activities, the total contact-hours per day, the average contact-level, and the total contact-level-hours per day. Across all individuals, we also calculated the fraction of primary links and random contacts that were with each of 5 age classes: preschoolers (0–5 years), elementary school children (6–10 years), middle and high school teenagers (11–19 years), adults (20–64 years), and seniors (65 years and older). Finally, the average number of random contacts and their level in each of the public activities and the work environment were calculated.
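The age-class fractions can be tallied as in the Python sketch below. The names and the example ages are hypothetical. Two simplifications are assumptions: each contact is given a single age (the survey also allowed age ranges), and ages of 65 and above are all treated as seniors so that every age falls into exactly one class.

```python
from collections import Counter

# (label, lowest age, highest age); None means no upper bound.
AGE_CLASSES = [
    ("preschoolers", 0, 5),
    ("elementary school children", 6, 10),
    ("teenagers", 11, 19),
    ("adults", 20, 64),
    ("seniors", 65, None),
]


def age_class(age):
    """Return the label of the age class that contains the given age."""
    for label, low, high in AGE_CLASSES:
        if age >= low and (high is None or age <= high):
            return label
    raise ValueError(f"age {age} does not fall in any class")


def age_class_fractions(contact_ages):
    """Fraction of contacts in each age class (classes with none get 0.0)."""
    counts = Counter(age_class(age) for age in contact_ages)
    total = len(contact_ages)
    return {label: counts[label] / total for label, _, _ in AGE_CLASSES}


# Hypothetical ages of one student's primary links across all groups.
fractions = age_class_fractions([3, 8, 15, 15, 40, 70, 16, 12, 30, 17])
print(fractions["teenagers"])  # 0.5
```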