Epidemiology of Epstein-Barr virus infection and infectious mononucleosis in the United Kingdom

Background Epstein-Barr Virus (EBV) is a ubiquitous gamma-herpesvirus with which ~ 95% of the healthy population is infected. EBV infection has been implicated in a range of haematological malignancies and autoimmune diseases. Delayed primary EBV infection increases the risk of subsequent complications. Contemporaneous seroepidemiological data is needed to establish best approaches for successful vaccination strategies in the future. Methods We conducted a sero-epidemiological survey using serum samples from 2325 individuals between 0 and 25 years old to assess prevalence of detectable anti-EBV antibodies. Second, we conducted a retrospective review of Hospital Episode Statistics to examine changes in Infectious Mononucleosis (IM) incidence over time. We then conducted a large case-control study of 6306 prevalent IM cases and 1,009,971 unmatched controls extracted from an East London GP database to determine exposures associated with IM. Results 1982/2325 individuals (85.3%) were EBV seropositive. EBV seropositivity increased more rapidly in females than males during adolescence (age 10–15). Between 2002 and 2013, the incidence of IM (derived from hospital admissions data) increased. Exposures associated with an increased risk of IM were lower BMI, White ethnicity, and not smoking. Conclusions We report that overall EBV seroprevalence in the UK appears to have increased, and that a sharp increase in EBV seropositivity is seen in adolescent females, but not males. The incidence of IM requiring hospitalisation is increasing. Exposures associated with prevalent IM in a diverse population include white ethnicity, lower BMI, and never-smoking, and these exposures interact with each other. Lastly, we provide pilot evidence suggesting that antibody responses to vaccine and commonly encountered pathogens do not appear to be diminished among EBV-seronegative individuals. Our findings could help to inform vaccine study designs in efforts to prevent IM and late complications of EBV infection, such as Multiple Sclerosis.


Background
Epstein-Barr Virus (EBV) is a ubiquitous gammaherpesvirus with which~95% of healthy adults are infected [1]. EBV is the commonest causative agent of Infectious Mononucleosis (IM), a triad of pharyngitis, lymphadenopathy and fever in the context of acute EBV infection [2]. EBV infection is also implicated in a range of haematological malignancies (Burkitt's lymphoma, gastric lymphoma, and Non-Hodgkin lymphoma), and has been strongly associated with autoimmune diseases such as Multiple Sclerosis (MS).
EBV seroprevalence increases with age, and tends to be higher among females, non-Caucasian ethnic groups, and people living in socio-economically deprived households [3,4]. Later age at EBV infection confers both a higher risk of IM and a higher risk of severe features [2]. Overall EBV seroprevalence rates among children and adolescents appear to be decreasing in some populations, implying that the population at risk of complicated primary EBV infection may be increasing [4][5][6].
Various strands of evidence suggest a pathogenic role for EBV infection in MS: EBV seronegativity is exceptionally rare in people with MS, EBV seropositivity is higher in people with MS, higher anti-EBV antibody titres are associated with increased risk of MS, and prior IM increases the risk of subsequent MS [7][8][9]. Understanding the epidemiology of EBV infection during childhood and adolescence is essential [10], for both maximising the efficacy and safety of vaccine studies [11], and in order to inform power calculations for future potential interventional studies [11].
Previous vaccine studies have demonstrated efficacy against symptomatic IM but not EBV seroconversion [12]; however multivalent vaccines are currently being developed with the aim of preventing seroconversion [13]. Alongside this, there remains an urgent need to better understand the current epidemiology of EBV transmission, and potential associations with both early and late seroconversion.
We therefore set out to update and establish current EBV seroprevalence across the UK, establish the potential effect of any change in the pattern of UK EBV seroprevalence on prevalence of hospital admissions with Infectious Mononucleosis (IM), and examine predictors of late EBV infection (IM) in a large primary care cohort. Finally, we used a small cohort of truly EBV-negative samples from adults to explore whether EBV seronegativity is associated with a reduced immune response to other vaccine-preventable and/or ubiquitous infections.

Samples
Serum samples used to investigate EBV seroprevalence across the UK were requested from Public Health England Seroepidemiology Unit (SEU). The SEU houses a serum repository curated from residual samples from NHS laboratories in England. Anonymised samples retain data pertaining to age, sex, date and laboratory of collection. Two thousand five hundred samples spread across age groups and English geographical area (London, South West, West Midlands, North West, North East) collected in 2016 were requested; 2366 samples were available for analysis.
A second cohort of 21 adult EBV-seronegative and 42 age and gender matched EBV-seropositive samples from a Netherlands cohort were used to compare the immunological properties associated with serostatus. EBV seronegativity was defined using an in-house VCA-IgG and EBNA1-IgG ELISA, followed by confirmation with an IgG immunoblot against VCA, EA, and EBNA [14].
All serum samples were stored at -80C until analysis, and freeze thaw cycles kept to a minimum. No samples had > 3 freeze thaw cycles prior to analysis.

Enzyme-linked immunosorbent assay (ELISA)
The anti-Epstein Barr virus (EBV-VCA) IgG Human ELISA kit (Abcam, cat no. ab108730) was used for qualitative determination of IgG antibodies against EBV viral capsid antigen (VCA) in human serum. IgG VCA antibodies peak two to 4 weeks post EBV infection, and persist for life, serving as a marker for recent or historic EBV infection. Serum samples were diluted 1:100 immediately prior to use and assayed according to the manufacturer's protocol. Positive, negative and cut-off controls were run in duplicate; samples were run in singlicate. Samples falling within the +/− 10% cut-off range were repeated in duplicate; samples remaining in the +/− 10% cut-off range were not included in the final analysis.
Quantitative measurement of IgG directed against Rubella and Varicella zoster virus (VZV) and semiquantitative measurement of IgG against Bordetella pertussis were performed in duplicate using ELISA kits (cat no. KA0223, KA1456, and KA2090 respectively; Abnova Corporation, Taipei City, Taiwan) according to manufacturer's instructions. Assay validity was confirmed using control samples provided by the manufacturer in all cases.

Hospital episode statistics (HES)
Hospital admissions where infectious mononucleosis (IM) was recorded in healthcare notes as either a primary or secondary diagnostic code during admission were examined. Hospital Episode Statistics (HES) incorporate every episode of hospital day-case or overnight inpatient care in National Health Service (NHS) hospitals [15]. Record-linkage was performed in order to construct absolute numbers of recorded hospital attendances with IM for each year 2002-2013. Infectious mononucleosis was defined as ICD-10 code B27.9 and ICD-9 code 075. Absolute numbers of cases of IM per age band were converted to rates per 100,000 personyears for each 5-year age band using mid-year population estimates from Office for National Statistics (ONS) as the denominator.

Primary care data analysis
The East London Primary Care dataset consists of deidentified primary care (General Practitioner, GP) healthcare records from all GP practices using the EMIS electronic healthcare records system (https://www. emishealth.com/) across four clinical commissioning groups (CCGs) in east London -Hackney, Newham, Tower Hamlets and Waltham Forest.
To determine environmental and ethnic exposures associated with IM risk, we conducted a large casecontrol study using East London GP database records. Demographic details for all participants within the database were extracted, including self-declared ethnicity according to ONS codes at GP registration. Demographic details were defined as previously described [16]. Age was defined at age data extraction (1st February 2018), and deprivation according to IMD raw scores for each lower layer super output area (LSOA).

Ethical approval
Public Health England Seroepidemiology Unit (SEU) sample analysis was in line with SEU ethical approvals (reference 05/Q0505/45); analysis of the EBV seronegative cohort was covered by internal ethical approval from VU University, Amsterdam. Primary and secondary care data analysis was performed on fully anonymised unlinked records and separate ethical approvals were not required.

Statistical analysis
Determinants of EBV serostatus and IM were examined using univariable and multivariable logistic regression. The strength of association between a variable and EBV serostatus was determined using the likelihood ratio of the full model vs that containing only confounders. Differential proportions of seropositivity to other infections was compared between EBV-positive and EBV-negative cohorts using Fisher's exact test. Unless otherwise stated, results are presented as estimate followed by 95% confidence interval (CI). All analyses were conducted using R version 3.6.1 in RStudio.  Fig. 1). Each year increase in age corresponded to a 12% increase in the odds of seropositivity (OR = 1.12, 95%CI 1.10-1.14, p < 0.0001). Region was associated with EBV seroprevalence in a univariable model, with evidence of lower seropositivity in non-London regions (p = 0.0059), however this effect dissipated on correction for age and sex (data not shown).

Temporal trends in the incidence of infectious mononucleosis 2002-2013
Between 2002 and 2013, the incidence of hospital admissions (per 100,000 patient years) associated with a coded diagnosis of IM steadily rose for males and females (supplementary table 2). Overall, the incidence of hospital admission associated with IM was higher in males than females, however for individuals age 10-15, incidence was higher in females than males. There was no clear shift in sex ratio or the age distribution of incident IM cases, with those aged 15-19 years accounting for the majority of diagnoses in both time periods studies (Fig. 2).

Exposures associated with infectious mononucleosis
Data were available for 6306 IM cases and 1,009,971 unmatched controls. Demographics of included participants are shown in Table 2. We fitted multivariable logistic regression models using prevalent IM status (i.e. any historical episode of IM) as the outcome, controlling for sex and age at the time of data extraction. Adding BMI, ethnicity, smoking, and IMD decile to the model all improved the model fit, suggesting that these exposures explain at least some of the variation in IM prevalence.
The effects associated with these factors persisted in a combined multivariable model incorporating all factors. Factors with an apparent protective effect with respect  Fig. 3 and Table 3). We then looked for first-order interactions between all significant predictors of IM status, after controlling for age and gender. We considered an interaction significant if including the interaction term improved model fit at an adjusted p threshold of 0.05/6 = 0.00833. We observed significant pairwise interactions between ethnicity and IMD, ethnicity and weight, ethnicity and smoking, and weight and IMD (supplementary table3). Repeating this analysis using weight as a binary measure (grouping "normal" and "underweight" individuals, and grouping "overweight" and "obese" individuals) gave similar results (data not shown). Strikingly, we noted opposite effects of smoking and body weight depending on ethnic group: the 'protective' effect of being overweight was reversed in Asian individuals, and the 'protective' effect of smoking was only observed in White individuals.

EBV seronegativity is not associated with diminished serologic response to vaccine antigens
In order to test the hypothesis that EBV-seropositivity is biologically necessary to prime and maintain B cell memory responses to vaccine antigens, we assayed IgG against recall antigens. Sera were confirmed to be EBVseronegative using the methods described above; matched control sera were EBV seropositive. We found no evidence of differences in Rubella, Pertussis, or Varicella serostatus between EBV seronegative and EBV seropositive individuals (Supplementary table 4).

Discussion
In this study we demonstrate the temporal trends in EBV seroprevalence during childhood and adolescence among healthy UK volunteers, and explore socio-demographic associations with IM (a marker of late infection). The seroprevalence estimates in all age bands in our study are higher than in other previous similar studies; additionally, we show that EBV seroprevalence differs between genders in adolescence (10-14.9 years), with male sex conferring a slightly lower risk of EBV seropositivity at this age. This corresponds to a gender imbalance among hospital admissions associated with a code indicating infectious mononucleosis in this age band. Supporting this finding is previous work demonstrating that the prevalence of EBV DNA in healthy tonsils increased with age for males (peak 77.4% for > 35 year-olds), but reached a peak at 79.3% for females from 15 to 24, and remained at that plateau [17]. The implication of these findings is a gender imbalance in EBV seroconversion hazard rates by age, as infectious mononucleosis rates are a function of the interaction between the population at risk (EBV naïve) and EBVpositive population pool [10].
An important finding of our work is the observation that EBV-seronegative individuals do not appear to have lower levels of circulating antibody to common vaccine antigens and pathogens. A major concern regarding EBV vaccines is disruption of the hostpathogen interaction with unintended deleterious consequences for host and population immunity. EBV achieves long-term immune evasion by transforming naive B cells into a resting memory B cell phenotype and activating latency programmes. Given this tropism, and the widespread prevalence of EBV, there is an evolutionary argument that humans have coevolved with EBV due to a selective advantage offered by the virus. Specifically, it is possible that EBV infection promotes herd and individual immunity to a broad range of pathogens by expanding and maintaining the size of the memory B cell pool. Thus, there is a theoretical concern that vaccination against EBV may have unintended consequences for B cell memory at an individual and population level. Our pilot data is reassuring, and will need to be replicated on a larger scale in order to allay this theoretical concern. We show that, as would be expected, the seroprevalence of EBV increases with age from 69.9% in the 1-4.9 age bracket to 96.0% among 20-25 year olds. Our results are consistent with previous studies on the determinants of EBV seropositivity, and fit with previous models of EBV infection and seroconversion [10]. Several studies in other populations have demonstrated a decline in age-adjusted seroprevalence of EBV over the past 15 years [4][5][6]. In contrast, our study demonstrated consistently higher EBV seroprevalence among all age brackets compared to a comparable older study [18] this may reflect a genuine increase in seroprevalence in the UK, differences in population sampling, or different sensitivity and specificity of the assays used (our study used VCA whereas previous studies have used EBNA). Reassuringly, our estimates closely mirror those from a recent random sampling of UK Biobank participants: among those aged 40-69 years, EBV seroprevalence was 94.7% 1 .
We show that exposures associated with increased IM risk (i.e. late infection) include White ethnicity, normal/ low body weight, lower deprivation, and never smoking, in keeping with international data [3,4]. We found additional evidence of interaction between smoking, body weight, and ethnicity in determining IM risk. We cannot rule out that the exposures we identify as being associated with IM are in fact all proxies for lower levels of household crowding or other early life determinants. Our study used prevalent cases, which introduces the problems of reverse causation and confounding. The finding that effects differ between ethnicities is intriguing. These traits may be proxies for different exposures between different ethnicities (e.g. theoretically, being overweight could be a marker of high wealth in one group and deprivation in another). Alternatively, this may reflect a genuine biological effect whereby BMI and smoking interact with genetic background to influence IM risk.
The use of healthcare databases to study IM is challengingcoded diagnoses of IM are usually not linked to serological evidence, and certainly include non-EBV related infections. Others have used healthcare databases to examine IM incidence [10,19], highlighting the role of transmission within families [19]. The observed similarities in gender variation in IM and seroconversion during teenage years between HES and serum data is compelling, as is the similarity in admission incidence by age between the UK and Denmark [19]. It must be noted that within the HES dataset the incidence of all diseases appears to be increasing due to improved coding; we have mitigated this to some degree by focussing on recent decades. However, the increase in incidence within the HES dataset is striking, with almost a 50% increase over 10 years.
Understanding the temporal trends and host determinants of EBV serostatus are essential for rational vaccine design and deployment. In principle, vaccination against EBV could help to reduce rates of EBV-associated malignancies and EBV-associated autoimmune diseases. Although there is still no effective vaccine available, several are in development, and phase 2 results from a recombinant gp350 vaccine suggested an effect on IM rates without affecting overall seroprevalence [12]. The population-level efficacy of such a vaccine depends crucially on the age at which it is administered, with earlier vaccines predicted to offer greater overall reductions in rates of IM and other EBV-related sequlae [20]. Our empirical data showing high seroprevalence even in the youngest age bracket support the modelling data, suggesting that, in terms of maximising efficacy, EBV vaccines should be delivered in the first few years of life.
In conclusion, our data suggest that, in the UK, EBV seroconversion is taking place earlier in life, that overall EBV seroprevalence remains very high (> 95% of 21-25 year olds), IM incidence is increasing, and there are several environmental exposures associated with IM risk (ethnicity, deprivation, smoking, and BMI). Furthermore, we provide preliminary evidence that EBV seronegative persons do not have lower levels of circulating antibodies to common pathogens and vaccine antigens, which mitigates some concerns around early life vaccination.
Additional file 1: Table S1. demographics of EBV-seropositive and EBVseronegative individuals included in the seroprevalence study. Table S2. raw incidence of HES-coded IM (cases per 100,000 person-years) among males and females in three time windows. Table S3. likelihood ratio p values comparing models of IM risk without pairwise interaction and with first-order interaction between the named exposures. P values below the Bonferroni-adjusted threshold of 0.008 are highlighted in bold. Table S4. Serostatus proportions for vaccine preventable and commonly acquired illnesses between EBV seropositive and EBV seronegative individuals.

Funding
We are grateful to Barts Charity for supporting this work. Barts Charity support the PNU, where this work took place. The funder had no role in the design of the study, data collection, analysis or interpretation, or in writing the manuscript.
Availability of data and materials SEU samples and HES data are available to appropriately qualified investigators on application to the relevant bodies in line with their ethical approvals. Permission to use SEU samples can be obtained on application to the SUE steering committee. De-identified East London Primary Care records are available on application by appropriately qualified investigators via the Clinical Effectiveness Group at QMUL. If further details are required, these can be obtained via reasonable request to the corresponding author who will direct enquires to the relevant data custodians and/or steering groups.
Ethics approval and consent to participate UK Seroepidemiology Unit (SEU) sample analysis was covered by SEU ethical approvals which include consent for use in research (reference 05/Q0505/ 45); analysis of the EBV seronegative cohort was covered by internal ethical approval from VU University, Amsterdam. Primary and secondary care data