Study design
CONNECT (CONtact and Network Estimation to Control Transmission) is a population-based survey of epidemiologically relevant social contacts and mixing patterns conducted in the province of Quebec, Canada. The first phase of CONNECT was conducted in 2018–2019 (February 2018 to March 2019), one year before the COVID-19 pandemic. Four additional phases of CONNECT were undertaken to document changes in social contacts during the COVID-19 pandemic period (CONNECT2: April 21st-May 25th 2020 and CONNECT3,4,5: July 3rd 2020-February 26th 2021) (Additional file 1: Table S1). All CONNECT phases were conducted with the same methodology.
Recruitment of participants
The target population of CONNECT consisted of all non-institutionalized Quebecers, without any age restriction. For example, elderly people living in retirement homes, who generally have personal phone lines, were eligible, but those living in long-term care homes (nursing homes, Quebec CHSLD) were not. We used random digit dialling to recruit participants. The randomly generated sample of landline and mobile phone numbers was provided by ASDE, a Canadian firm specializing in survey sampling [4]. After explaining the study, verifying the eligibility of the household, and documenting the age and sex of all household members, we randomly selected one person per household to participate in the study, using a probability sample stratified by age. This recruitment procedure was repeated for every new phase of CONNECT (i.e., new participants were recruited for every CONNECT phase).
Data collection
We collected data using a self-administered web-based questionnaire. A secure individualized web link to the questionnaire and information about the study were sent by email to each selected participant who consented to participate. Parents of children younger than 12 years were asked to complete the questionnaire on behalf of their child, whereas teenagers aged 12–17 years completed their own questionnaire, with parental consent.
The same questionnaire was used for all CONNECT phases. The first section of the questionnaire documented key socio-demographic characteristics. The second section was a social contact diary, based on instruments previously used in Polymod and other similar studies [5,6,7] (an example of the diary is provided in Additional file 1: Figure S1). Briefly, participants were assigned two random days of the week (one weekday and one weekend day) on which to record every distinct person they had contact with between 5 am and 5 am the following morning. A contact was defined as either physical (handshake, hug, kiss) or nonphysical (two-way conversation in the physical presence of the person, at a distance of 2 m or less, irrespective of masking). Participants provided the characteristics of each contact person (age, sex, ethnicity, and relationship to themselves (e.g., household member, friend, colleague)) as well as the characteristics of the contacts with this person: location where the contact(s) occurred (home, work, daycare/school, public transport, leisure, other location), duration, usual frequency of contact with that person, and whether the contact was physical or not. Participants reporting more than 20 professional contacts per day were asked not to report all their professional contacts in the diary. Instead, they were asked general questions about these professional contacts: the age groups of the majority of contact persons, the average duration of contacts, and whether physical contacts were generally involved. Additional questions about teleworking were included from CONNECT2 onwards.
All CONNECT phases were approved by the ethics committee of the CHU de Québec research center (project 2016–2172), and we commissioned the market research company Advanis for recruitment and data collection. All participants gave their consent to participate in the study during the recruitment phone call. For minors, informed consent was obtained from a parent and/or legal guardian.
Analyses
We weighted the participants of the CONNECT 1–5 surveys by age, sex, region (Greater Montreal and other Quebec regions), and household composition (households without 0–17-year-olds; households with 0–5-year-olds; if none, households with 6–14-year-olds; if none, households with 15–17-year-olds), using Quebec data from the 2016 Canadian census (Additional file 1: Table S2), and we verified that participants were representative of the Quebec population for key socio-demographic characteristics. To obtain the mean daily number of social contacts over a full week, we weighted the number of contacts reported on the weekday by 5/7 and the number reported on the weekend day by 2/7. We classified the type of employment of workers using the 2016 National Occupational Classification (NOC) [8].
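The weekday/weekend weighting can be illustrated with a minimal sketch. The function name and the example counts are hypothetical, and survey weights are left out for simplicity.

```python
def weekly_mean_daily_contacts(weekday_contacts, weekend_contacts):
    """Mean daily contacts over a 7-day week.

    The single weekday diary stands for 5 of the 7 days and the single
    weekend-day diary stands for the remaining 2.
    """
    return (5 * weekday_contacts + 2 * weekend_contacts) / 7


# Hypothetical participant: 10 contacts on the weekday, 3 on the weekend day.
example = weekly_mean_daily_contacts(10, 3)  # (50 + 6) / 7
```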
We estimated the number of social contacts per person and per day, for all locations combined and for 6 different locations: home, work, school, public transport, leisure, and other locations. Several steps were necessary. First, for a contact person met in more than one location during a single day, the location of the contact was assigned in the following hierarchical order, according to risk of transmission: home, work, school, public transport, leisure, and other locations [9]. For example, if a parent reported contacts with their child at home, on public transport, and during a leisure activity, we only considered the home contact, to avoid counting contacts with the same person multiple times. Second, for workers reporting more than 20 professional contacts per working day, we added their reported number of professional contacts to the work location for their working day(s). Similar to other studies which allowed a maximal number of contacts per day [5, 6, 10], we truncated professional contacts at a maximum of 40 per day to eliminate extreme values and contacts at low risk of transmission of infectious diseases. Third, we identified all workers in schools through their NOC code and job descriptions and attributed their professional contacts to the school location. We did so to describe social contacts in schools not only between students, but also between students and their teachers, educators, and other school workers. Unless otherwise specified, we estimated the mean number of contacts in the different locations using a population-based denominator. With this method, all individuals were included in the denominator of each location, whether or not they reported contacts for that location. The sum of contacts in the different locations gives the total number of contacts.
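The location steps can be sketched as follows. This is an illustration only, not the study code: the data layout and function names are hypothetical, the cap of 40 is assumed to apply to the aggregated professional count alone, the reassignment of school workers' contacts is omitted, and the mean is unweighted.

```python
# Hierarchy stated in the text, highest priority first.
LOCATION_PRIORITY = ["home", "work", "school", "public transport", "leisure", "other"]
MAX_PROFESSIONAL_CONTACTS = 40  # truncation threshold from the text


def assign_location(locations):
    """Pick the single highest-priority location for one contact person on one day."""
    return min(locations, key=LOCATION_PRIORITY.index)


def contacts_by_location(diary, extra_professional=0):
    """Count contacts per location for one participant-day.

    diary: dict mapping a contact-person id to the set of locations where
        that person was met during the day (hypothetical layout).
    extra_professional: professional contacts reported in aggregate by
        workers with more than 20 such contacts per day.
    """
    counts = {location: 0 for location in LOCATION_PRIORITY}
    for locations in diary.values():
        counts[assign_location(locations)] += 1  # each person counted once
    # Assumption: the cap applies to the aggregated professional count only.
    counts["work"] += min(extra_professional, MAX_PROFESSIONAL_CONTACTS)
    return counts


def population_mean(per_participant_counts, location):
    """Unweighted mean over all participants, including those with zero contacts."""
    return sum(c[location] for c in per_participant_counts) / len(per_participant_counts)


# Hypothetical parent who met their child in three places and one colleague at work.
parent = contacts_by_location(
    {"child": {"home", "public transport", "leisure"}, "colleague": {"work"}},
    extra_professional=55,
)
nobody = contacts_by_location({})  # participant with no contacts that day
```

Here the child counts once, at home, and a participant with no contacts still enters the denominator of every location.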
Using data available from CONNECT1-5, we defined different periods to reflect the Quebec COVID-19 epidemiology, the related physical distancing measures, and expected seasonality in social contacts (Additional file 1: Figure S2). We used data collected from February 1st 2018 to March 17th 2019 as our pre-COVID period. We used data collected from April 21st to May 25th 2020 to represent the first wave, data collected from July 3rd to August 31st 2020 to represent the summer, and data collected from September 1st 2020 to February 26th 2021 to represent the second wave. We further stratified the second wave to represent periods of expected seasonality in social contacts: September with the return to school and work, fall with gathering restrictions, the Christmas holidays with school and work vacations and closure of non-essential businesses, and January and February 2021 with the gradual return to work and school after Christmas vacations and school/non-essential business closures, and the introduction of a curfew. We used a Canadian stringency index, adapted from the Oxford COVID-19 Government Response Tracker (OxCGRT) [11], to quantify the intensity of public health measures in Quebec over time [12]. This index is obtained by averaging the intensity scores of 12 policy indicators (e.g., school closures, workplace closures, gathering restrictions, stay-at-home requirements); higher values indicate stricter measures. We estimated the mean stringency index for each of the 8 periods described previously by averaging the daily values of the index.
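The two averaging steps behind the stringency index can be sketched as below. The function names are hypothetical and all numeric values are made up for illustration; they are not values of the Canadian index.

```python
from datetime import date, timedelta


def stringency_index(indicator_scores):
    """Index for one day: the average of the policy indicator scores."""
    return sum(indicator_scores) / len(indicator_scores)


def mean_index_for_period(daily_index, start, end):
    """Mean of the daily index values from start to end, both inclusive."""
    n_days = (end - start).days + 1
    values = [daily_index[start + timedelta(days=k)] for k in range(n_days)]
    return sum(values) / n_days


# Made-up daily values for three consecutive days.
example_daily = {
    date(2020, 4, 21): 80.0,
    date(2020, 4, 22): 82.0,
    date(2020, 4, 23): 84.0,
}
example_mean = mean_index_for_period(example_daily, date(2020, 4, 21), date(2020, 4, 23))
```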
We compared the mean number of social contacts over time (total or by location) using weighted generalized linear models with a Poisson distribution and an identity link. Generalized estimating equations with robust variance [13] were used to account for the correlation between the two days of diary data collection and overdispersion. A categorical period effect was included in the model and is presented as the absolute difference in the mean number of contacts compared to the previous period. We also compared the mean number of social contacts according to different key socio-demographic characteristics using the same model with a period-by-covariate interaction and adjusting for age (in 8 categories). In this model, period and characteristic effects were tested using contrasts: each period was compared to the previous period within each level of the covariate, and the global effect of the characteristic was tested within each period. We also examined the association between the mean number of social contacts and the stringency index (in 5 categories), irrespective of periods, using a model similar to the one comparing periods.
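One way to write the period model in symbols is given below. The notation is ours and not taken from the study: Y_{id} is the number of contacts reported by participant i on diary day d, and p(i) is the period in which participant i was surveyed.

```latex
% Identity link: the linear predictor is the mean itself.
\mathbb{E}\left[ Y_{id} \right] = \mu_{p(i)}, \qquad d \in \{1, 2\}, \quad p(i) \in \{1, \dots, 8\}

% Reported period effect: absolute difference from the previous period.
\Delta_p = \mu_p - \mu_{p-1}, \qquad p = 2, \dots, 8
```

Because the link is the identity, each \Delta_p is read directly as a difference in the mean number of contacts per day rather than as a ratio of means, which is what a log link would give.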
Finally, we estimated mixing matrices. The entries of a mixing matrix represent the mean number of social contacts per person per day according to the age of the respondent (column) and the age of their contacts (row). Mixing matrices were estimated separately for the 8 periods described previously and for 3 categories of contact locations: all locations, home (contacts with household members and visitors), and any location outside the home. The matrices were obtained by maximizing a constrained log likelihood of the number of reported contacts per day among CONNECT participants, weighted by age, sex, household composition, and region. The number of contacts was assumed to follow a negative binomial distribution. The likelihood constraint ensured that the total number of contacts between individuals of age i and age j is the same whether it is estimated from entry (i,j) or entry (j,i) of the total mixing matrix including contacts in all locations (i.e., reciprocity of the mixing matrix).
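The reciprocity condition can be checked numerically, as in the sketch below. It only verifies the condition on a given matrix and does not reproduce the constrained maximum likelihood estimation. It assumes that the total number of contacts is the per-person mean multiplied by the population size of the respondents' age group. The function name and the two-group example are hypothetical.

```python
def reciprocity_gap(matrix, population):
    """Largest absolute imbalance in total contacts between any two age groups.

    matrix[i][j]: mean daily contacts of a respondent in age group j (column)
        with people in age group i (row).
    population[j]: number of people in age group j.
    A reciprocal matrix returns 0.
    """
    n_groups = len(population)
    gap = 0.0
    for i in range(n_groups):
        for j in range(n_groups):
            total_from_j = matrix[i][j] * population[j]  # contacts of j's with i's
            total_from_i = matrix[j][i] * population[i]  # contacts of i's with j's
            gap = max(gap, abs(total_from_j - total_from_i))
    return gap


# Made-up example with two age groups of 100 and 200 people.
population = [100, 200]
reciprocal = [[3.0, 1.0], [2.0, 4.0]]      # 1.0 * 200 == 2.0 * 100
not_reciprocal = [[3.0, 1.0], [1.0, 4.0]]  # 1.0 * 200 != 1.0 * 100
```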
All statistical analyses were performed with SAS version 9.4. Maximization of the log likelihood for the mixing matrices was performed using a nonlinear programming algorithm (nlminb2 function from the ROI package in R).