Integrative study of pandemic A/H1N1 influenza infections: design and methods of the CoPanFlu-France cohort

Background The risk of influenza infection depends on biological characteristics, individual or collective behaviors and the environmental context. The Cohorts for Pandemic Influenza (CoPanFlu) France study was set up in 2009 after the identification of the novel swine-origin A/H1N1 pandemic influenza virus. This cohort of 601 households (1450 subjects), representative of the general population, aims to use an integrative approach to study the risk and characteristics of influenza infection by combining questionnaire data on the sociodemographic, medical and behavioral characteristics of subjects and their indoor environment with biological samples and environmental databases.
Methods/Design Households were included between December 2009 and July 2010. The design of this study relies on systematic follow-up visits between influenza seasons and additional visits during influenza seasons, when an influenza-like illness is detected in a household via an active surveillance system. During systematic visits, a nurse collects individual and environmental data on questionnaires and obtains blood samples from all members of the household. When an influenza-like illness is detected, a nurse visits the household three times during the following 12 days, and collects questionnaire data on exposure and symptoms, as well as biological samples (including nasal swabs), from all subjects in the household. The end of the follow-up period is expected in fall 2012.
Discussion The large amount of data collected throughout the follow-up will permit a multidisciplinary study of influenza infections. Additional data is being collected and analyzed in this ongoing cohort. The longitudinal analysis of these households will permit integrative analyses of complex phenomena such as individual, collective and environmental risk factors of infection, routes of transmission, or determinants of the immune response to infection or vaccination.


Background
The first human cases of influenza caused by a novel swine-origin A/H1N1 pandemic influenza virus variant (H1N1pdm) were reported in Mexico and the United States in April 2009 [1]. Given the rapid spread of this virus and the likelihood of its pandemic extent, confirmed by the World Health Organization (WHO) on June 11, 2009 [2], the Cohorts for Pandemic Influenza (CoPanFlu) international consortium was initiated to study individual and collective determinants of H1N1pdm influenza infection across countries by setting up prospective cohorts of households, followed for 2 years in 6 countries or regions of the world: metropolitan France, Mali [3], Bolivia, Laos, Reunion Island [4] and Djibouti. The CoPanFlu-France cohort, set up in the general population of metropolitan France, is part of the CoPanFlu international consortium, and its protocol served as a blueprint for the other international cohorts.
Several studies have already reported risk factors for seasonal or pandemic influenza infection in households. These studies focused on individual characteristics of index patients and their household contacts [5][6][7][8][9][10] or on hygiene measures as predictors of secondary household infections [11,12]. In addition to household studies, risk factors of seasonal influenza infections have been studied in relation to characteristics of social contacts [13] or baseline serological status of the host [14]. However, to our knowledge, no attempt has been made to study the risk of influenza infection as a complex combination of biological characteristics (including immunity), individual or collective behaviors and environmental context. This integrative approach, in which epidemiological data is comprehensively collected and analyzed, is currently being developed for non-communicable diseases and relies on methods derived from Genome-Wide Association Studies (GWAS) [15,16]. To achieve our objectives, we developed a multidisciplinary approach, with an original design involving data collection on subjects and their environment as well as biological samples.

Methods/Design
Sampling
This cohort was designed to assess the relative risk of infection by the H1N1pdm virus. We first intended to include 1000 households (about 2100 subjects), which would have allowed the detection of covariates associated with a relative risk ≥ 1.4 with 80% power and a 5% significance level, assuming a cumulative incidence of 10% and an intra-household correlation of 0.3.
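The effect of the assumed intra-household correlation on the planned sample can be illustrated with the standard design effect for clustered samples, 1 + (m - 1) × ICC, where m is the mean household size. The sketch below is illustrative only: the text does not state which formula was used for the sample size calculation, the mean household size of 2.1 is simply derived from the planned 2100 subjects in 1000 households, and the function names are hypothetical.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_subjects: int, mean_cluster_size: float, icc: float) -> float:
    """Number of independent subjects carrying the same information."""
    return n_subjects / design_effect(mean_cluster_size, icc)


# Planned figures taken from the text; the mean household size is derived from them.
planned_households = 1000
planned_subjects = 2100
mean_household_size = planned_subjects / planned_households  # 2.1
icc = 0.3  # intra-household correlation assumed in the text

deff = design_effect(mean_household_size, icc)
n_eff = effective_sample_size(planned_subjects, mean_household_size, icc)
print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.0f}")
```

Under these assumptions, the 2100 planned subjects carry roughly the information of 1580 independent subjects, which is why the intra-household correlation has to enter the power calculation.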
Households were sampled using a random telephone procedure (Mitofsky-Waksberg design [17]) in a stratified geographical sampling scheme, aimed at including a sample of subjects as close as possible to the French general population [18,19]. Forty addresses were drawn from the national directory. These addresses defined the centers of 40 areas inside which subjects were eligible. The limits of each area were defined as the smallest circle including 130,000 household addresses in the public directory. The size of these areas varied (5 to 5000 km²) according to population density (see Figure 1). In each area, two lists of households were drawn: (i) a "landline" list of 25 households, chosen as those with the phone number immediately following a landline number drawn in this area (since landline numbers are geographically allocated, this method made it possible to reach households who chose not to be listed in the national directory); and (ii) a "mobile phone" list of 7 households, drawn directly from the national directory in order to reach households without a fixed phone.
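The two geometric and numeric steps of this procedure can be sketched as follows. This is a minimal illustration, not the study's actual implementation: it assumes that directory addresses are available as (latitude, longitude) pairs, uses the haversine great-circle distance, and treats phone numbers as plain digit strings. All function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def area_radius_km(center, addresses, n_households):
    """Radius of the smallest circle centred on `center` containing `n_households` addresses."""
    distances = sorted(haversine_km(center[0], center[1], lat, lon) for lat, lon in addresses)
    if len(distances) < n_households:
        raise ValueError("not enough addresses to define the area")
    # The circle must reach the n-th closest address.
    return distances[n_households - 1]


def next_landline_number(number: str) -> str:
    """Phone number immediately following `number`, keeping any leading zeros."""
    digits = number.replace(" ", "")
    return str(int(digits) + 1).zfill(len(digits))


# Toy example with 3 addresses instead of the 130,000 used in the study.
toy_addresses = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
print(area_radius_km((0.0, 0.0), toy_addresses, n_households=2))
print(next_landline_number("01 45 67 89 99"))
```

Because the radius is set by the 130,000th closest address, dense urban areas yield small circles and rural areas large ones, consistent with the 5 to 5000 km² range reported above.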
Addresses were iteratively drawn from the 40 lists of 130,000 households each. These households were phoned to present the study and, if they met the eligibility criteria, were sent a written description of the study. A household was considered "pre-included" when a referent member returned a completed form confirming agreement. Using this method, 1,280 households were pre-included, i.e. agreed to be visited by a nurse for an inclusion visit involving all household members. We anticipated that 20% of pre-included households would finally decline to participate in the cohort, which would have left about 1,024 households, close to the initial target of 1000.

Eligibility criteria
A household was defined as a person or group of people occupying the same domicile. All households were eligible to participate in the cohort, provided at least one member was over 18 years of age and French-speaking. A household member was defined as a person living at least half his/her time in the household. All household members were eligible, regardless of age. The inclusion of a household required the participation of all members: the refusal of one or more member(s) prevented the inclusion of other members.

Participants
Five thousand one hundred and two households were contacted by phone in order to achieve our targeted number of 1280 pre-included households (see Figure 2). The rate of contacted eligible households who agreed to be pre-included varied from 17% to 34% across the 40 areas. The main reasons for non-participation were lack of time and the anticipated difficulty of collecting blood samples from children.
Six hundred and seven households were visited by a nurse for an inclusion visit, among which six finally did not agree to participate (refusal of at least one member after receiving more detailed information on the study). Data was collected on the 601 remaining households (1450 subjects). According to population census data [20], these households had sociodemographic characteristics close to the general population (see Additional file 1: Figure S1 and Additional file 1: Tables S1-S6 for details).
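The recruitment figures reported above can be summarized with simple arithmetic. The sketch below uses only counts stated in the text; note that the first ratio is computed over all contacted households, not over eligible ones, so it is not directly comparable to the 17% to 34% range given per area. The helper name `share` is hypothetical.

```python
def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Counts reported in the text.
contacted = 5102      # households contacted by phone
pre_included = 1280   # households returning the agreement form
visited = 607         # households receiving an inclusion visit
included = 601        # households finally included
subjects = 1450       # subjects in included households

print(f"pre-included among contacted: {share(pre_included, contacted):.1f}%")
print(f"included among pre-included:  {share(included, pre_included):.1f}%")
print(f"included among visited:       {share(included, visited):.1f}%")
print(f"mean subjects per household:  {subjects / included:.2f}")
```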

Data collection
The main objective of this study was to identify individual and collective determinants of H1N1pdm infection; we therefore tried to collect comprehensive data about subjects and their environment, in addition to biological samples. Several household visits are carried out by nurses for this purpose (see Figure 3 for details).

Inclusion visits
During the inclusion visit, nurses collected detailed data from all subjects regarding medical history, vaccination and preventive measures against influenza, smoking habits, socioeconomic status, risk perception and beliefs, frequency and characteristics of meetings with other people, and housing (personal room, house or apartment). As the households' addresses were geocoded, we were able to obtain additional information from public databases regarding the immediate surrounding environment of households. An overview of data collected from questionnaires at entry in the cohort is shown in Figure 4. Blood samples were collected and centralized for serological analyses. For subjects over 10 years of age, a heparinized tube was also collected to study cellular immunity, as well as a blood sample dedicated to transcript analyses.

Influenza-like illness (ILI) visits
During the influenza season (as defined by the French surveillance network [21]), we use an active surveillance system in order to detect ILIs: all households are called weekly by an interactive voice response system (IVRS) and asked whether any subject has symptoms of ILI (fever ≥ 37.8°C associated with cough or sore throat, as defined by the CDC [22]). A free phone number is given to subjects so that they can report symptoms spontaneously between two weekly calls. When an ILI is reported, symptoms are validated by the study team and three "ILI visits" are then organized: nurses visit the household within 48 h after the onset of symptoms, then 3-6 days and 8-12 days after the onset. During these visits, a detailed questionnaire collects data about the circumstances of possible exposure to influenza viruses and the chronology of symptoms (if any) in all subjects. Nasal swabs are collected from all subjects. A stool sample and a throat swab are also collected from subjects with ILI, as well as a blood sample from those over 10 years of age. Moreover, a self-swab procedure is sent to the households in advance so that virological samples can be collected when a visit by a nurse within the first 48 h is not possible. Nasal swabs are used to identify various respiratory viruses by PCR and by biochips allowing multiple diagnostic tests. This series of three visits can occur several times in the same household during an influenza season. There were 23 ILI alerts during the 2009-2010 season (as households were still being included) and 143 during the 2010-2011 season, all of which triggered up to three ILI visits.
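The ILI case definition and the timing of the three ILI visits can be expressed compactly. The following is a minimal sketch based only on the thresholds and windows stated above; it is not the study's surveillance software. The class and function names are hypothetical, and the "within 48 h" window is approximated at day resolution as onset day to onset day + 2.

```python
from dataclasses import dataclass
from datetime import date, timedelta

FEVER_THRESHOLD_C = 37.8  # threshold from the case definition quoted in the text


@dataclass
class SymptomReport:
    temperature_c: float
    cough: bool
    sore_throat: bool


def is_ili(report: SymptomReport) -> bool:
    """Fever >= 37.8 degrees C associated with cough or sore throat."""
    has_fever = report.temperature_c >= FEVER_THRESHOLD_C
    return has_fever and (report.cough or report.sore_throat)


def ili_visit_windows(onset: date) -> list[tuple[date, date]]:
    """(earliest, latest) dates for the three ILI visits, counted from symptom onset."""
    return [
        (onset, onset + timedelta(days=2)),                       # within 48 h (approximation)
        (onset + timedelta(days=3), onset + timedelta(days=6)),   # 3-6 days after onset
        (onset + timedelta(days=8), onset + timedelta(days=12)),  # 8-12 days after onset
    ]


report = SymptomReport(temperature_c=38.4, cough=True, sore_throat=False)
if is_ili(report):
    for start, end in ili_visit_windows(date(2011, 1, 10)):
        print(start.isoformat(), "to", end.isoformat())
```

In the study, reported symptoms are additionally validated by the study team before visits are scheduled; that manual step is not modeled here.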

Vaccination visits
In order to update serological information, a blood sample was collected from subjects who had an influenza vaccination, between 2 and 4 weeks following this vaccination. There was one vaccination visit following the inclusion visits; 29 vaccination visits were conducted following the first wave of follow-up visits and 69 following the second wave.

Timeline
The cohort was initially designed to include households before the 2009 pandemic season and to follow subjects during the two subsequent influenza seasons. We

Ethical considerations
The protocol of the CoPanFlu-France study was approved by the research ethics committee "Comité de Protection des Personnes Ile-de-France 1" on September 8, 2009. Information was previously given by investigators to participants indirectly through written descriptions of the study and training of the nurses, and directly by e-mail and telephone for any question. Written informed consent was obtained for all subjects.

Expected results
Many analyses have been recently completed or are currently being carried out. Based on inclusion data, we used a data-driven approach to identify factors associated with a high anti-H1N1pdm serological titer. We are conducting several analyses to identify risk factors associated with influenza infections (based on serological data for the first pandemic season, then on both serological and virological data for the following seasons). Nasal swabs are being analyzed to identify various respiratory viruses and the characteristics of infected subjects. Blood samples collected during ILI visits are used to study innate immunity against influenza and the related transcriptome. Determinants of vaccination against influenza have also been identified, and other studies are being conducted in the field of social science and risk perception. Several other analyses are expected soon from different collaboration partners in various biomedical fields.

Strengths of the study
The main strength of this cohort is the large amount of available and expected data and the different biological samples to be collected, which will make it possible to carry out many studies in various biomedical fields. To our knowledge, this project is the first attempt to study so thoroughly the determinants of infections by respiratory viruses in a large sample of households randomly selected in the community. This approach is likely to provide new insights from the interaction of sparse data usually studied separately, especially with the help of data-driven methods such as those already under development in the field of non-communicable diseases [15,16]. Additional data is being collected and analyzed in this ongoing cohort, whose longitudinal analysis will permit integrative analyses of complex phenomena such as individual, collective and environmental risk factors of infection, routes of transmission, or determinants of the immune response to infection or vaccination.
Figure 4 Main data collected on questionnaires at entry in the cohort, in addition to blood samples.

Limitations
We designed the CoPanFlu-France study to enable inference to the French general population, yet we cannot exclude a selection bias induced by the proportion of contacted households who refused to participate. However, a comparison between CoPanFlu subjects and population census data [20] suggests that this bias was controlled (see supplementary material part 2). Ideally, this project would have been set up a few months earlier, in order to include households before the 2009 pandemic season and to follow up subjects during it. Due to organizational constraints, the inclusion process was delayed and data regarding ILIs were collected retrospectively, sometimes up to 6 months after the epidemic. This timeline of inclusion may therefore have induced recall or reporting biases for the 2009 season, and we were not able to collect enough pre-pandemic blood samples and nasal swabs during this first H1N1pdm season to investigate laboratory-confirmed infections. Another consequence of this delayed inclusion process is that we decided to stop inclusions when only 601 of the 1000 expected households had been included. This limitation is the main reason why we decided to postpone the end of the study until 2012 instead of 2011 as initially expected, in order to collect data during an additional influenza season.

Additional file
Additional file 1: Design and methods of the CoPanFlu-France cohort: representativeness of the population sample.
Competing interests
Pr Carrat reported not having shares or paid employment with pharmaceutical companies; received honoraria from Novartis, GlaxoSmithKline and Boiron and received travel support to attend scientific meetings from Novartis. Other authors have no competing interest to declare.