Contact tracing process
Guidelines for the public health management of contacts of COVID-19 cases are available elsewhere. Briefly, contact tracing was conducted at a number of contact tracing centres across Ireland. Following the identification of a confirmed COVID-19 case (the index case), a series of phone calls was triggered:
The first call informed the index case that they had tested positive and gave them advice on self-isolation. They were asked about their contacts in the 48 hours prior to symptom onset, or in the 24 hours prior to their test if they were asymptomatic, up until the point they self-isolated.
The second call collected information about close contacts (see definition below), including name, address, phone number, and the circumstances and date of last contact with the case.
The third call to these contacts informed them of their close contact with a confirmed case and advised them on further action, including referral for testing and restriction of movements.
During the contact tracing process, these data were entered into a Customer Relational Management (CRM) database called CovidCare Tracker, which was developed by the HSE Office of the Chief Information Officer in collaboration with the HSE Contact Management Programme. Routine contact tracing was conducted by trained personnel rather than by public health doctors. However, if the individual conducting the contact tracing considered that a case required the attention of a more highly trained individual or a public health doctor, the case was escalated to a status requiring specialist attention, according to the schematic shown in Supplementary Material Figure S1. Similarly, if a case was in a location or situation in which multiple transmission events may have occurred, that situation was flagged as ‘complex’, requiring further detailed investigation by public health doctors. Not all data from cases dealt with as ‘complex’ were entered into the CRM database (Supplementary Material Figure S1). Contacts of healthcare workers (HCWs) that occurred within the workplace were dealt with by the hospital occupational health department (or infection prevention and control team) and were not entered into the national CRM database; in other instances, HCW workplace contacts were investigated by public health doctors. However, family and social contacts of HCWs were added to the CRM for contact tracing.
Definition of close contact
Based on European guidance, any of the following were defined as close contacts: any individual who has had > 15 minutes face-to-face (<2 m) contact with a case (irrespective of whether face coverings were worn or not), in any setting; household contacts (defined as living or sleeping in the same home), individuals in shared accommodation sharing kitchen or bathroom facilities, or sexual partners; healthcare workers, including laboratory workers, who had not worn appropriate PPE or had a breach in PPE during exposure to the case (defined as either direct contact with the case (as defined above), their body fluids or their laboratory specimen, or being present in the same room when an aerosol generating procedure was undertaken on the case); or passengers on an aircraft sitting within two seats (in any direction) of the case, travel companions or persons providing care, and crew members serving in the section of the aircraft where the index case was seated. For those contacts who shared a closed space (including an office or school setting) with a case for >2 hours, a risk assessment was undertaken taking into consideration the size of the room, ventilation and the distance from the case.
A casual contact was defined separately. However, casual contacts were not included in our study, since they were not generally recorded during the contact tracing of routine cases.
The CRM data, based on data entries from each call centre, were collected by the Health Service Executive (HSE) under the Medical Officer of Health legislation. These data were then collected by the Central Statistics Office (CSO) in compliance with the Statistics Act 1993, pseudonymised, and stored in a centralised database (the CSO C19 Data Research Hub). The CSO C19 Data Research Hub is a secure data repository from which personally identifiable data cannot be exported. These data were accessed through the CSO data hub by the first author for the purpose of this analysis. Access was granted under Section 20(b) of the Statistics Act 1993, for the purpose of using data collected during the pandemic to aid in the national response. The study was approved by the National Research Ethics Committee (20-NREC-COV-099). The requirement for informed consent was waived and a consent declaration was provided following review by the Health Research Declaration Committee (20-025-AF1/COV), since the data were used for a purpose other than that for which they were initially collected and since it was not possible to retrospectively obtain informed consent. All methods were carried out in accordance with relevant guidelines and regulations.
Initial data cleaning and processing
Data were available in two datasets. The first (Case data) listed the pseudonymised reference ID of the case, the location of the case (to county level), and the date of birth (DOB) rolled back to the first day of the month. In addition, an anonymised reference ID was constructed from, and replaced, the DOB and surname of each case, and a “current status” column was generated which indicated the status of contact tracing up to the date of the most recent data extract. The second dataset (Contact data) consisted of the anonymised reference ID of the case and the following details of each of their reported contacts: DOB (rolled back to the first day of the month); an anonymised reference ID constructed from, and replacing, the DOB and surname of the contact; the type of contact (close, casual, complex, exceptional, other); and the method by which the contact was identified (manually or via the contact tracing mobile phone application, COVID Tracker).
The impact of each data cleaning step on the number of records available for analysis is detailed in Supplementary Material Table S1. Briefly, Case data were initially filtered to include only those records where the current status entry indicated that contact tracing had been completed. In addition, duplicate cases were removed by selecting the most recent data entry where multiple entries existed for the same case. Contact data were initially filtered to include only close contacts. Contacts identified by COVID Tracker were then excluded, as these were not linked to a specific case in the contact tracing database. Contacts with no recorded primary case were also removed from the dataset. Finally, we restricted the analysis to contacts identified after May 1st, due to concerns over the variability in the quality of data collection processes prior to this point.
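The cleaning steps above can be sketched as two small filters. This is an illustrative Python outline only, not the authors' code (the analysis itself was run in R). The field names (`case_id`, `current_status`, `entry_date`, `contact_type`, `identified_by`, `contact_date`), the status value `"completed"` and the year of the cut-off date are all assumptions made for the example.

```python
from datetime import date

def clean_cases(case_rows):
    """Keep completed cases; where a case has several entries, keep the latest."""
    latest = {}
    for row in case_rows:
        if row["current_status"] != "completed":  # hypothetical status label
            continue
        kept = latest.get(row["case_id"])
        if kept is None or row["entry_date"] > kept["entry_date"]:
            latest[row["case_id"]] = row
    return list(latest.values())

def clean_contacts(contact_rows, cutoff):
    """Keep close, manually identified contacts that are linked to a case
    and were identified strictly after the cut-off date."""
    return [
        row for row in contact_rows
        if row["contact_type"] == "close"
        and row["identified_by"] != "app"   # hypothetical code for COVID Tracker
        and row["case_id"] is not None      # contact must have a primary case
        and row["contact_date"] > cutoff
    ]

# Assumed cut-off: the text gives "May 1st" without a year; 2020 is assumed here.
CUTOFF = date(2020, 5, 1)
```

Applying the case filter before de-duplication mirrors the order described in the text; reversing the order could keep a more recent but incomplete entry instead.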
Case and Contact data were then joined to create an overall dataset at the level of the contact (that is, with each line representing a contact, with a column indicating the reference ID of the primary case). Ages of the cases and contacts were categorised into age groups corresponding to school-age children, college-age adults, young adults, middle-aged adults and retired adults: 0-17; 18-24; 25-39; 40-64; and 65 years of age and older. Location of the case was recategorised as “Dublin” or “Rest of Country”.
Next, data were collapsed to the case level for analysis (that is, with each line representing a primary case), summarising the overall number of contacts reported by the case (Dataset 1). A second dataset (Dataset 2) was created which summarised the average number of contacts by the age category of the case and the age category of the contact. Histograms of the number of contacts per case showed only a small number of cases with more than 50 close contacts (Supplementary Material Figure S2a and b); these records were assumed to be erroneous and were removed from each of the datasets.
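As a minimal sketch of the recoding described above, the following Python functions map an age in completed years to the five age groups and a county to the two regions. The function names, the `"65+"` label and the assumption that the county is stored as a plain string are illustrative choices, not details taken from the source.

```python
def age_group(age_years):
    """Map age in completed years to the five study age groups."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years <= 17:
        return "0-17"
    if age_years <= 24:
        return "18-24"
    if age_years <= 39:
        return "25-39"
    if age_years <= 64:
        return "40-64"
    return "65+"

def region(county):
    """Collapse county-level location into the two regions used in the analysis."""
    return "Dublin" if county.strip().lower() == "dublin" else "Rest of Country"
```

Because dates of birth were rolled back to the first day of the month, any age derived from them may differ from the true age by up to a month near group boundaries.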
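Collapsing the contact-level data to one row per case, and dropping cases above the 50-contact threshold, might look like the sketch below. The `case_id` field name is an assumption, and the sketch covers Dataset 1 only.

```python
from collections import Counter

def contacts_per_case(contact_rows, max_contacts=50):
    """Count close contacts per primary case, then drop cases reporting more
    than `max_contacts`, which the text treats as erroneous records."""
    counts = Counter(row["case_id"] for row in contact_rows)
    return {case_id: n for case_id, n in counts.items() if n <= max_contacts}
```

A case with exactly 50 contacts is retained here, since the text excludes only cases with more than 50.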
The number of contacts per case was summarised using the mean, standard deviation, 2.5th, 25th, 50th, 75th and 97.5th percentiles in the overall dataset, as well as broken down by age of the case and age of the contact.
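The summary statistics listed above can be computed as in the following sketch. The percentile function uses linear interpolation between order statistics, which matches the default method of R's `quantile` function; whether the authors used that default is an assumption.

```python
import math
from statistics import mean, stdev

def percentile(sorted_values, p):
    """Percentile p (0-100) by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * p / 100
    lower, upper = math.floor(pos), math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac

def summarise(counts):
    """Mean, sample standard deviation and selected percentiles of contact counts."""
    ordered = sorted(counts)
    summary = {"mean": mean(ordered), "sd": stdev(ordered)}
    for p in (2.5, 25, 50, 75, 97.5):
        summary[f"p{p}"] = percentile(ordered, p)
    return summary
```

The same function can be applied to each subgroup (for example, each age group of the case) to obtain the stratified summaries.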
For each day, the mean number of contacts reported per case was calculated. Next, 7-day rolling averages of the daily mean contacts were calculated for each day. These figures were calculated for the overall dataset, stratified according to the age cohort of the case and stratified according to both the age cohort of the case and the contact.
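The two-stage calculation (daily mean, then 7-day rolling average) is sketched below. The text does not state how the 7-day window was aligned; a trailing window, covering the current day and the six preceding days, is assumed here. The sketch also assumes one value per consecutive calendar day.

```python
def daily_means(counts_by_day):
    """Mean contacts per case for each day.
    `counts_by_day` maps a date to the list of per-case contact counts that day."""
    return {day: sum(c) / len(c) for day, c in sorted(counts_by_day.items())}

def rolling_mean(values, window=7):
    """Trailing rolling mean; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out
```

A centred window would shift each smoothed value three days earlier without changing its magnitude.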
The timings of key government interventions were extracted from national press releases and mapped onto the temporal pattern of contact numbers per case. The start of each defined period of government restrictions was as follows: Stay at Home (27th March); Initial easing (5th May); Phase one easing (18th May); Phase two easing (8th June); Phase three easing (29th June); Kildare, Laois and Offaly restrictions (7th August); Dublin level 3 (18th September); Donegal level 3 (25th September); National level 3 (6th October); Border counties level 4 (15th October); National level 5 (21st October to 1st December).
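To show how these start dates can be turned into a categorical period variable, the sketch below looks up the most recent period start on or before a given date. The year (2020) is an assumption, as the text gives only day and month. Treating 1st December as the last day covered, and returning `None` outside the listed periods, are also assumptions.

```python
from bisect import bisect_right
from datetime import date

YEAR = 2020  # assumed; the source lists day and month only

PERIODS = [
    (date(YEAR, 3, 27), "Stay at Home"),
    (date(YEAR, 5, 5), "Initial easing"),
    (date(YEAR, 5, 18), "Phase one easing"),
    (date(YEAR, 6, 8), "Phase two easing"),
    (date(YEAR, 6, 29), "Phase three easing"),
    (date(YEAR, 8, 7), "Kildare, Laois and Offaly restrictions"),
    (date(YEAR, 9, 18), "Dublin level 3"),
    (date(YEAR, 9, 25), "Donegal level 3"),
    (date(YEAR, 10, 6), "National level 3"),
    (date(YEAR, 10, 15), "Border counties level 4"),
    (date(YEAR, 10, 21), "National level 5"),
]
STARTS = [start for start, _ in PERIODS]
LAST_DAY = date(YEAR, 12, 1)  # assumed inclusive end of National level 5

def restriction_period(day):
    """Return the label of the period in force on `day`, or None if the
    date falls outside the listed periods."""
    if day > LAST_DAY:
        return None
    i = bisect_right(STARTS, day) - 1
    return PERIODS[i][1] if i >= 0 else None
```

Each date is assigned to exactly one period, including during regional restrictions, which is consistent with the text coding each time period as a single categorical level.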
Model 1 – modelling number of contacts per case
Model 1 investigated the factors associated with the number of close contacts reported per case over time. Preliminary exploration of the data demonstrated that the number of contacts per case was overdispersed. Therefore, we chose to model the number of contacts per case using a negative binomial rather than a Poisson regression model, with number of contacts per case as the dependent variable. Each time period of the pandemic was coded according to the government intervention level and modelled as a fixed categorical variable. Age of the case was offered to the model as both a categorical and a continuous variable. When modelled as a continuous variable, age was modelled using a cubic regression spline. Only one form of this variable, the form resulting in the best model fit as determined by the Akaike Information Criterion (AIC), was retained. Region (Dublin, Rest of Country) and gender were offered to the model as categorical variables.
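A quick way to see why overdispersion matters is to compare the sample variance with the sample mean: under a Poisson model the two are equal, whereas a negative binomial model allows the variance to grow as mu + mu^2/theta. The sketch below is a rough diagnostic only, not the model fitted in the study. The moment-based estimate of theta and the function names are illustrative assumptions.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion
    relative to a Poisson distribution."""
    return variance(counts) / mean(counts)

def moment_theta(counts):
    """Method-of-moments estimate of the negative binomial size parameter,
    from variance = mu + mu**2 / theta. Returns None if not overdispersed."""
    m, v = mean(counts), variance(counts)
    if v <= m:
        return None
    return m * m / (v - m)
```

For example, the counts `[0, 1, 5]` have mean 2 and sample variance 7, giving a dispersion index of 3.5 and a moment estimate of theta of 0.8.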
Models 2 and 3 – comparing temporal breakpoints in cases and contacts
Models 2 and 3 were constructed to compare temporal breakpoints (that is, time points at which the trajectory of the dependent variable appeared to change) in the number of contacts per case (Model 2) and the overall number of cases (Model 3). Two additional negative binomial models were constructed in which day of the year (YDAY) was modelled as a piecewise linear variable. In Model 2, the dependent variable was the number of contacts per case, whereas the dependent variable in Model 3 was the number of cases (nationally) recorded on that day. For each model, the optimal positions of the breakpoints were automatically selected using the ‘segmented’ package in R. Whilst this approach optimises the position of the breakpoints, the number of breakpoints must be specified. For our models, the number of breakpoints was selected by running separate models with the number of breakpoints varying from 2 to 20 and selecting the number with the lowest AIC. The time intervals between breakpoints at which the number of contacts began to increase or decrease and breakpoints at which the number of cases began to increase or decrease were then compared.
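The two ideas in this paragraph, a piecewise linear term and selection of the number of breakpoints by AIC, can be illustrated with the sketch below. It does not reproduce the ‘segmented’ package. The linear predictor uses the common "change in slope at each breakpoint" form, and the log-likelihoods and parameter counts in the usage example are made-up numbers for illustration only.

```python
def piecewise_linear(x, intercept, slope, breakpoints, slope_changes):
    """Linear predictor whose slope changes by slope_changes[k] once x
    passes breakpoints[k]."""
    value = intercept + slope * x
    for psi, delta in zip(breakpoints, slope_changes):
        value += delta * max(0.0, x - psi)
    return value

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def best_by_aic(candidates):
    """`candidates` maps number of breakpoints to (log_likelihood, n_params);
    returns the number of breakpoints with the lowest AIC."""
    return min(candidates, key=lambda k: aic(*candidates[k]))

# Hypothetical fits for 2, 3 and 4 breakpoints (illustrative values only).
example_fits = {2: (-1050.0, 7), 3: (-1040.0, 9), 4: (-1039.5, 11)}
```

In a model with a log link, the expected count would be the exponential of this linear predictor, so each segment corresponds to a constant rate of exponential growth or decline.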
Assessment of model fit
Model fit was assessed by comparing observed versus predicted contact counts across deciles of predicted values, across each month of the pandemic, and across each age cohort of the cases.
All data manipulation and analyses were conducted in R version 3.3.1, using the “dplyr”, “lubridate”, “mgcv”, “segmented” and “zoo” packages; plots were generated using “ggplot2”.
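A decile-based comparison of this kind can be sketched as follows: records are ranked by predicted value, split into ten groups of near-equal size, and the mean predicted and mean observed counts are compared within each group. The function name and the equal-count grouping rule are assumptions; the source does not describe the exact procedure.

```python
def calibration_by_decile(observed, predicted):
    """Return a list of (mean predicted, mean observed) pairs, one per decile
    of the predicted values, ordered from lowest to highest decile."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    groups = [[] for _ in range(10)]
    for i, pair in enumerate(pairs):
        groups[i * 10 // n].append(pair)
    summary = []
    for group in groups:
        if group:  # groups can be empty when there are fewer than 10 records
            mean_pred = sum(p for p, _ in group) / len(group)
            mean_obs = sum(o for _, o in group) / len(group)
            summary.append((mean_pred, mean_obs))
    return summary
```

Close agreement between the two means in every decile suggests the model is well calibrated across the range of predicted values; the same comparison can be repeated by month or by age cohort.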