Age biases in a large HIV and sexual behaviour-related internet survey among MSM
© Marcus et al.; licensee BioMed Central Ltd. 2013
Received: 22 November 2012
Accepted: 5 September 2013
Published: 10 September 2013
Behavioural data from MSM are usually collected in non-representative convenience samples, increasingly on the internet. Epidemiological data from such samples might be useful for comparisons between countries, but are subject to unknown participation biases.
Self-reported HIV diagnoses from participants of the European MSM Internet Survey (EMIS) living in the Czech Republic, Germany, the Netherlands, Portugal, Sweden and the United Kingdom were compared with surveillance data, for both the overall diagnosed prevalence and for new diagnoses made in 2009. Country level prevalence and new diagnoses rates per 100 MSM were calculated based on an assumed MSM population size of 3% of the adult male population. Survey-surveillance discrepancies (SSD) for survey participation, diagnosed HIV prevalence and new HIV diagnoses were determined as ratios of proportions. Results are calculated and presented by 5-year age groups for MSM aged 15–64.
Surveillance derived estimates of diagnosed HIV prevalence among MSM aged 15–64 ranged from 0.63% in the Czech Republic to 4.93% in the Netherlands. New HIV diagnoses rates ranged between 0.10 per 100 MSM in the Czech Republic and 0.48 per 100 in the Netherlands. Self-reported rates from EMIS were consistently higher, with prevalence ranging from 2.68% in the Czech Republic to 12.72% in the Netherlands, and new HIV diagnoses rates from 0.36 per 100 in Sweden to 1.44 per 100 in the Netherlands. Across age groups, the survey-surveillance discrepancies (SSD) for new HIV diagnoses were between 1.93 in the UK and 5.95 in the Czech Republic, and for diagnosed prevalence between 1.80 in Germany and 4.26 in the Czech Republic.
Internet samples of MSM were skewed towards younger age groups when compared to an age distribution of the general adult male population. Survey-surveillance discrepancies (SSD) for EMIS participation were inverse u-shaped across the age range. The two HIV-related SSD were u- or j-shaped with higher values for the very young and for older MSM. The highest discrepancies between survey and surveillance data regarding HIV-prevalence were observed in the oldest age group in Sweden and the youngest age group in Portugal.
Internet samples are biased towards a lower median age because younger men are over-represented on MSM dating websites and therefore may be more likely to be recruited into surveys. Men diagnosed with HIV were over-represented in the internet survey, and increasingly so in the older age groups. A similar effect was observed in the age groups younger than 25 years. Self-reported peak prevalence and peak HIV diagnoses rates are often shifted to higher age groups in internet samples compared to surveillance data. Adjustment for age-effects on online accessibility should be considered when linking data from internet surveys with surveillance data.
Most HIV surveillance systems in Europe provide reasonably good data on number of new diagnoses among men having sex with men (MSM) [1, 2]. Data on HIV prevalence are less readily available and less comparable due to different estimation methods and sampling biases. Comparable data on HIV prevalence and incidence among MSM across countries or across different surveys are important to assess population effects of prevention efforts, develop prevention policies and target interventions.
Data on sexual risk behaviours among MSM are increasingly collected by online surveys. Data on recency of HIV testing and self-reported HIV diagnosis are also often collected in these surveys. In 2010, the European MSM Internet Survey (EMIS) demonstrated the feasibility and utility of collecting data from MSM from 38 European countries with the same questionnaire – simultaneously available online in 25 languages – and using the same recruitment methods. However, estimation and comparability of MSM HIV prevalence between countries (and between consecutive surveys) are limited by unknown sizes of MSM populations, differences in household internet access across countries and time [Marcus U, Hickson F, Weatherburn P, Schmidt AJ, et al.: Estimating the size of the MSM populations for 38 European countries by calculating the survey-surveillance discrepancies (SSD) between self-reported new HIV diagnoses from the European MSM Internet Survey (EMIS) and surveillance-reported HIV diagnoses among MSM in 2009 (as yet unpublished observations)], and possibly other unknown selection effects. For example, in EMIS, although a broad range of websites were used for recruitment, participation rates varied substantially even between countries with similar household internet access. Substantial differences between national samples were also observed regarding median age, even between countries with very similar socio-cultural, political and economic background and similar histories and starting points of the HIV epidemic among MSM, such as Germany and the Netherlands, two neighbouring countries in the centre of Europe.
Since the internet sites most commonly used for survey recruitment are dating and ‘cruising’ sites, it can be expected that samples recruited this way over-represent more sexually active MSM. Men using these sites are not representative of all MSM (and the sites are likely not used equally across the age range); in addition, an unknown participation bias will be in operation. For both these reasons the age distribution of samples recruited on these sites may differ from the actual age distribution of MSM. Because partner numbers and sexual activity decline with age, older men in particular may be expected to be under-represented.
In previous analyses we have looked into discrepancies between self-reported EMIS data and surveillance data on the prevalence and incidence of diagnosed HIV among MSM by comparing them at country level [Marcus U, et al.: Estimating the size of the MSM populations for 38 European countries by calculating the survey-surveillance discrepancies (SSD) between self-reported new HIV diagnoses from the European MSM Internet Survey (EMIS) and surveillance-reported HIV diagnoses among MSM in 2009 (as yet unpublished observations)]. In cross-country comparisons household access to the internet was a major determinant of participation rates and survey-surveillance discrepancies (SSD). This study seeks to measure the differences in rates of age-specific survey participation, new HIV diagnoses, and HIV prevalence between MSM who participate in internet surveys and the general MSM population. By adjusting for these differences, cross-country comparisons in findings from internet surveys can be made with greater validity.
Among the 38 countries with sample sizes larger than 100 respondents in EMIS we selected the following countries (in alphabetical order): the Czech Republic (Central East Europe); Germany (Central West Europe); the Netherlands (West Europe); Portugal (South West Europe); Sweden (North West Europe); the United Kingdom (West Europe). Countries were selected to represent a variety of European sub-regions and varying EMIS participation rates. Further requirements were a sufficient size of the EMIS sample, and availability of relatively reliable HIV surveillance data regarding MSM.
We set the lower and upper age limits of both the EMIS sample and the surveillance data to be 15 years and 65 years. For the Czech Republic, as the only country from an eastern European sub-region, we also analysed the data for the narrower age range of 15 to 49 years, because the HIV epidemic among MSM in the eastern parts of Europe started about 10–15 years later than in the western parts, leading to a different age distribution of HIV infections in the MSM population.
Data on new HIV diagnoses in 2009 were taken from national infectious disease surveillance systems. Cases with unknown risk factors for HIV acquisition were proportionately redistributed based on known cases.
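The proportionate redistribution described above can be sketched as follows. This is a minimal illustration only: the function name `redistribute_unknown`, the risk-group labels and all counts are hypothetical and are not taken from any national surveillance system.

```python
def redistribute_unknown(known_cases, unknown_cases):
    """Spread cases with unknown risk factor across the risk groups
    in proportion to the cases whose risk factor is known."""
    total_known = sum(known_cases.values())
    if total_known <= 0:
        raise ValueError("need at least one case with a known risk factor")
    return {
        group: count + unknown_cases * count / total_known
        for group, count in known_cases.items()
    }


# Hypothetical counts for one country and year (illustration only).
known = {"MSM": 60, "heterosexual": 30, "IDU": 10}
adjusted = redistribute_unknown(known, unknown_cases=20)
print(adjusted)
```

With these made-up counts, 60% of the 20 unassigned cases are attributed to MSM, so the total number of cases is preserved while each group keeps its share among known cases.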
For surveillance measures of HIV prevalence we used diagnosed infections only in order to compare with self-reported prevalence, which is also a diagnosed prevalence. Estimates of the number of MSM with diagnosed HIV were based on a Multi-parameter Evidence Synthesis (MPES) approach for the Netherlands, on a back-calculation model for Germany and the United Kingdom [9, 10], and on the proportion of MSM among people infected with HIV and in clinical care for Sweden (InfCareHIV database), the Czech Republic and Portugal (similar databases on people infected with HIV in clinical care; personal communications).
The total size of the adult male population aged 15 to 64 years (or 15 to 49 years for the Czech Republic) was taken from national population statistics [11–16]. The relative size of the MSM population was estimated to be 3% in all age groups except males aged less than 20. This estimate is consistent with the upper limit of the confidence interval of men reporting male sexual partners in the previous 12 months in repeated telephone surveys in representative samples of the general population in Germany conducted by the German Federal Agency of Health Promotion (BZgA) and with published results from a large British national probability survey conducted in 2000. In Portugal, an unpublished population based study from 2007 also found that 3% of adult men reported sex with men in the previous year [H. Barros, personal communication], while a Czech study from 2008 found a proportion of 1.7% reporting repeated sexual contacts with other men. For males below the age of 20 the proportion of MSM was estimated based on the proportion of EMIS respondents in the national samples reporting their first sexual experience with another man before the age of 20. For these men the proportion was estimated to be 2.1% (70% of 3%), since sex with a male partner before the age of 20 was reported by 70% of the EMIS respondents in the six countries.
Due to the lower social acceptance of homosexuality in the eastern parts of Europe, we also considered a proportion of 2% MSM in the adult male population for the Czech Republic. For the Czech Republic we therefore present calculated values covering a range from an MSM population of 2% of the adult male population aged 15–49 years up to 3% of the adult male population aged 15–64 years.
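As a rough sketch of the population-size assumption described above, the estimated number of MSM per age group can be computed from male population counts. The function name, the age-group labels and the population counts below are hypothetical; only the 3% share and the 70% factor for males under 20 come from the text. Applying the 70% factor in a 2% scenario as well is an assumption of this sketch.

```python
def estimate_msm_population(males_by_age_group, msm_share=0.03,
                            under20_factor=0.70, under20_groups=("15-19",)):
    """Estimate the number of MSM per age group.

    msm_share      -- assumed proportion of MSM among adult males (3% in the text)
    under20_factor -- reduction for males under 20 (70% of the adult share)
    """
    estimates = {}
    for group, males in males_by_age_group.items():
        share = msm_share
        if group in under20_groups:
            share = msm_share * under20_factor  # 2.1% when msm_share is 3%
        estimates[group] = males * share
    return estimates


# Hypothetical male population counts (illustration only).
males = {"15-19": 100_000, "20-24": 200_000, "25-29": 150_000}
msm_3pct = estimate_msm_population(males)
msm_2pct = estimate_msm_population(males, msm_share=0.02)
print(msm_3pct)
print(msm_2pct)
```

Passing a different `msm_share` gives a simple way to repeat the calculation under the lower 2% assumption.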
EMIS was a large scale pan-European internet survey conducted in 2010. The methodology has been described in more detail elsewhere. In brief, a network of five primary and 77 secondary partners working in MSM sexual health across academia, public health and community organizations in 38 European countries developed a collaborative English language survey. The survey was translated into 24 other languages, and promoted through gay dating websites and through gay community organizations. EMIS was approved by the Research Ethics Committee of the University of Portsmouth, United Kingdom (REC application number 08/09:21).
Among other questions, survey participants were asked about the year of birth, their age when they first had sex with a man, the result of their last HIV test, and the recency of that test if it was negative, or the year of first diagnosis if it was positive.
Assuming a stable proportion of MSM in the adult male population once a homosexual debut has occurred, the age distribution of the EMIS samples was compared to the age distribution of the general population, taking into account a reduced proportion of MSM below the age of 20.
Self-reported prevalence rates (per hundred EMIS respondents, regardless of having been tested for HIV) and new diagnoses rates in 2009 per 100 EMIS respondents were compared with the prevalence and incidence of diagnosed infection calculated from surveillance data and population estimates. The denominators for the rate of newly diagnosed HIV in 2009 were the total national EMIS sample for self-reports and the estimated total MSM population for surveillance data.
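The two kinds of rates compared here differ only in their denominators, which the following sketch makes explicit. The helper name `rate_per_100` and all counts are hypothetical and chosen only for illustration.

```python
def rate_per_100(cases, denominator):
    """Return cases per 100 members of the denominator population."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * cases / denominator


# Survey side: self-reported 2009 diagnoses over ALL respondents in the
# national sample, regardless of whether they were ever tested.
survey_rate = rate_per_100(cases=30, denominator=2_000)

# Surveillance side: reported 2009 diagnoses among MSM over the
# ESTIMATED total MSM population.
surveillance_rate = rate_per_100(cases=250, denominator=100_000)

print(survey_rate, surveillance_rate)
```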
Comparisons were made by calculating a ratio of the proportions with EMIS data in the numerator and population/surveillance data in the denominator. This ratio of proportions we call the Survey-Surveillance Discrepancy (SSD) [5, 21]. We calculated the SSD for EMIS participation (proportional distribution of EMIS participants by age groups/proportional age distribution of the total male population), the SSD for prevalence (self-reported prevalence rate in EMIS by age group/estimated diagnosed prevalence in the MSM population), and the SSD for new HIV diagnoses (self-reported HIV diagnoses in 2009 per 100 EMIS respondents/reported new HIV cases per 100 MSM based on surveillance data). Particularly for age-group data on new HIV diagnoses in 2009 the numbers can get quite small and the precision of the SSD calculation is thus affected by chance effects. Therefore for comparisons of countries we used the best fitting second-order polynomial trendlines fitted to the data curves. Despite the problem of low numbers we looked at new diagnoses data, because these data are more readily available from national surveillance systems than prevalence data/estimates.
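A minimal sketch of the SSD ratio and of a second-order polynomial trendline is given below. The function names are hypothetical, the age-specific SSD values are invented for illustration, and the plain least-squares fit is an assumption about how such a trendline could be obtained rather than a description of the software actually used. The Netherlands prevalence figures are the rounded values quoted in the text, so the ratio may differ slightly from published SSD values.

```python
def ssd(survey_value, reference_value):
    """Survey-Surveillance Discrepancy: survey proportion / reference proportion."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return survey_value / reference_value


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    # Power sums for the normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for the unknowns in the order (c, b, a).
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


# SSD for prevalence, using the rounded Netherlands percentages from the text.
ssd_prevalence_nl = ssd(12.72, 4.93)

# Hypothetical SSD values for ten 5-year age groups (index 0 = 15-19).
age_index = list(range(10))
ssd_by_age = [4.0, 2.8, 2.1, 1.8, 1.7, 1.8, 2.0, 2.5, 3.1, 4.0]
a, b, c = fit_quadratic(age_index, ssd_by_age)
print(round(ssd_prevalence_nl, 2), a > 0)
```

A positive leading coefficient `a` corresponds to a u-shaped curve across age groups, a negative one to an inverse u-shape. Using the age-group index instead of age in years keeps the normal equations well conditioned.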
In Table 1 we present, for each of the six countries analysed:
The number of men aged 15 to 64;
The estimated size of the MSM population;
The rate of newly diagnosed HIV per 100 MSM based on surveillance reports;
The prevalence of diagnosed HIV infections per hundred MSM based on surveillance reports;
The national EMIS sample sizes;
Self-reported HIV diagnoses in 2009 per 100 EMIS respondents;
Self-reported diagnosed HIV infections per hundred EMIS respondents;
The EMIS participation rates per 10,000 adults (15–64 years);
The response rates to individualized instant messages sent to men on the two most productive recruitment websites inviting members to participate in EMIS;
The median age of the national EMIS samples;
The proportion of households with broadband internet access in 2009.
Table 1. Population data, surveillance data and EMIS derived data to characterize and compare the six national MSM samples.
Columns: Adult male population (15–64 years); Estimated MSM population (15–64); Proportion of households with broadband internet access in 2009 (%); HIV diagnoses in MSM per 100 MSM in 2009; Estimated proportion of MSM diagnosed with HIV by end of 2009 (%); EMIS sample size; HIV diagnoses in 2009 per 100 EMIS respondents; Proportion of EMIS respondents diagnosed with HIV by end of 2009 (%); EMIS participation rate per 10,000 adults; Response rates to individualized invitation instant messages (%); Median age of EMIS participants (years).
Row labels: Czech Republic (1); Czech Republic (2).
Diagnosed HIV prevalence among MSM aged 15–64 by the end of 2009 was estimated to range from 0.63% (3% MSM, 15–64) or 1.34% (2% MSM, 15–49) in the Czech Republic to 4.93% in the Netherlands. The rate of new HIV diagnoses in 2009 ranged between 0.10 per 100 MSM (3%, 15–64) in the Czech Republic and 0.48 per 100 in the Netherlands. By contrast, self-reported prevalence in EMIS respondents ranged between 2.68% in the Czech Republic and 12.72% in the Netherlands. The rate of respondents reporting being newly diagnosed with HIV in 2009 varied between 0.36 per 100 in Sweden and 1.44 per 100 in the Netherlands (Table 1; see also Additional file 1).
Table: Overall SSD for prevalence and for new HIV diagnoses.
Columns: SSD for diagnosed HIV prevalence; SSD for newly diagnosed HIV.
Row labels: Czech Republic (1); Czech Republic (2).
The proportional age distributions of large internet samples of MSM recruited mainly via gay social media (dating websites) were skewed towards younger age groups – except for males younger than 20 years – when compared to the age distribution of adult males. The proportion of men accessible on internet dating sites declines once they get older than 45 years.
Participants of the internet survey had a higher risk for lifetime and recent HIV diagnosis than a hypothetical random sample of MSM, provided the proportion of MSM among adult males in the countries is closer to 3% than to 1.0% or 1.5%. Particularly among men older than 45 years it was observed that those who are still accessible and participate in an HIV-related survey are increasingly biased towards higher lifetime and recent risks for HIV. This biased age representation in internet samples results in a peak of self-reported HIV prevalence and new HIV diagnoses in an older age group than measured by surveillance data. From the six countries analysed here, this happened in four countries when we considered HIV prevalence, and all six countries when we considered new HIV diagnoses.
In most age groups and most countries – except the relatively overrepresented young age groups in the Czech Republic and Portugal – both HIV-related SSDs were higher than 1.0, and in four of the six countries the SSD for new HIV diagnoses was slightly higher than the SSD for (diagnosed) prevalence, suggesting one or both of the following:
HIV diagnosis, including a recent HIV diagnosis, is a strong motivation to participate in an HIV-related survey;
MSM accessible on internet dating sites have higher odds of being diagnosed with HIV than a random MSM sample.
A fourfold higher risk for an STI diagnosis among MSM participating in a community based internet study compared with MSM from a population based probability sample was also reported from a UK study.
A variety of different factors could be responsible for reduced accessibility of younger (15–19) and older (50+) MSM on internet dating websites. While we have supporting data on sexual debut to explain the paucity of younger men, we have no data that could explain reduced accessibility of older age groups. More research will be required to elucidate the reasons for this. We speculate that lower internet literacy, less sexual partner change, and increasing proportions living in settled relationships may be among these factors.
Previous research in Germany has shown that, regarding the geographical distribution of newly diagnosed cases of HIV, large internet samples can be representative of MSM populations. The discrepancy between the skewed age distribution of internet samples and the good representation of new HIV diagnoses suggests that internet samples may adequately represent the sexually active MSM population most at risk for HIV.
It would be interesting to repeat a survey like EMIS and to look at changes particularly in countries with increasing household access to Internet. Also, it would have been interesting to analyse SSDs for more countries in eastern parts of Europe. However, for many smaller countries the EMIS samples were too small for this kind of analysis (numbers of men with HIV in age groups become too small), and particularly for the larger countries (Russia, Ukraine, Poland, Romania) surveillance data with regard to the size of the MSM transmission group are unreliable. One way to circumvent the issue of smaller sample sizes would be to use cumulated diagnoses numbers over several years for SSD calculations.
While the above interpretation of SSDs explains well the differences observed between western and eastern Europe and between a less densely populated country like Sweden and densely populated countries like Germany and England, it would not explain the SSD differences between the Netherlands and countries in the other western parts of Europe. The higher SSDs in the Netherlands, together with the lower participation rates, may indicate real differences in selection biases of survey participants. A possible reason for a different self-selection bias could be the high frequency of national MSM Internet surveys in the Netherlands (yearly), and the launch of a national survey for MSM shortly before the launch of the European survey (EMIS) in 2010 which may have resulted in survey fatigue effects in the target population.
Our paper on the limitations of data from internet convenience samples itself has some limitations, not least because we are using data from an internet convenience sample, which we know are not representative. There are also representation limitations to our other sources of data, censuses and HIV surveillance, most crucially that the prevalence estimates from surveillance data from the six countries are based on different methods. While the estimates for Germany, the Netherlands and UK on the one side and for Czech Republic, Portugal and Sweden on the other side may be largely comparable, there may be some differences between these two groups of countries. In the datasets for MSM in clinical care, men with an early HIV diagnosis not meeting the thresholds for starting antiretroviral treatment may be slightly under-represented. This may disproportionally affect younger age groups. However, since SSD values particularly for the younger age groups are already quite low in these countries, it seems unlikely that diagnosed HIV prevalence is substantially underestimated. Last but not least, our knowledge of the proportion of the male population who are homosexually active is weak, especially across the age range.
Increasing underrepresentation of older men on internet dating sites was associated with an increasing bias towards men with diagnosed HIV. Through this phenomenon self-reported peak prevalence and new HIV diagnoses rates may be shifted to higher age groups in internet samples than in the population.
For many comparisons between national EMIS samples it may be unnecessary to take biased age representations into account, because doing so is unlikely to change the ranking between the countries. However, when survey data are linked to surveillance data, used to interpret surveillance data, or used to make statements on the MSM population, it may be advisable to take survey-surveillance discrepancies into account.
If reasonably reliable prevalence estimates or data on new HIV diagnoses by age groups are available, a survey surveillance discrepancy factor for internet based or other convenience samples can be calculated for adjusting survey data for the total population, if necessary.
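One possible way to apply such a factor is sketched below as a simple post-stratification by age group, where each age group is weighted by the inverse of its participation SSD (population share divided by sample share). This is an assumption made for illustration, not a procedure specified in the paper; the function name, age groups, shares and values are all hypothetical.

```python
def age_adjusted_mean(sample_share, population_share, value_by_group):
    """Re-weight age-specific survey values to a reference age distribution.

    The weight of each age group is 1 / participation SSD, i.e.
    population share divided by sample share.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for group, share in sample_share.items():
        if share <= 0:
            raise ValueError(f"no survey respondents in age group {group!r}")
        weight = population_share[group] / share
        weighted_sum += share * weight * value_by_group[group]
        total_weight += share * weight
    return weighted_sum / total_weight


# Hypothetical two-group example: younger men are over-represented online.
sample_share = {"under 30": 0.6, "30 and over": 0.4}
population_share = {"under 30": 0.5, "30 and over": 0.5}
value_by_group = {"under 30": 10.0, "30 and over": 20.0}  # e.g. % reporting some outcome

crude = sum(sample_share[g] * value_by_group[g] for g in sample_share)
adjusted_value = age_adjusted_mean(sample_share, population_share, value_by_group)
print(crude, adjusted_value)
```

In this made-up example the crude survey value understates the outcome because the older group, which has the higher value, is under-represented in the sample. Such a weighting corrects only for age composition and not for the selection effects within age groups discussed above.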
Associated Researchers: Rigmor A. Berg (Norwegian Knowledge Centre for the Health Services, Oslo); Michele Breveglieri (Regione del Veneto, Verona); Laia Ferrer (CEEISCat, Barcelona); Percy Fernández-Davila (CEEISCat, Barcelona); Cinta Folch (CEEISCat, Barcelona); Martina Furegato (Regione del Veneto, Verona); Ford Hickson (Sigma Research, London); Harm J. Hospers (University College Maastricht); Ulrich Marcus (Robert Koch Institute, Berlin); David Reid (Sigma Research, London); Axel J. Schmidt (Robert Koch Institute, Berlin); Todd Sekuler (Robert Koch Institute, Berlin); Peter Weatherburn (Sigma Research, London).
National collaborating partners of the EMIS Network: Aids-Hilfe Wien (Austria); Facultés Universitaires Saint-Louis, Institute of Tropical Medicine, Ex Aequo, Sensoa, Arc-en-ciel (Belgium); Vstrecha (Belarus); National Centre of Infectious and Parasitic Diseases, Queer Bulgaria Foundation (Bulgaria); Charles University, Institute of Sexology (Czech Republic); Research Unit in Behaviour & Social Issues (Cyprus); University of Zagreb, Faculty of Humanities and Social Sciences (Croatia); Statens Serum Institut, Department of Epidemiology, stopaids (Denmark); National Institute for Health Development (Estonia); University of Tampere, Department of Nursing Science, Finnish AIDS council (Finland); Institut de veille sanitaire (InVS), AIDeS, Act UP Paris, Sida Info Service, Le kiosque, The Warning (France); Berlin Social Science Research Center (WZB), Deutsche AIDS-Hilfe (DAH), Federal Centre for Health Education, Cologne (BZgA) (Germany); Positive Voice (Greece); Hungarian Civil Liberties Union, Háttér (Hungary); Gay Men’s Health Service, Health Services Executive (Ireland); University of Bologna, Italian Lesbian and Gay Association (Arcigay), Instituto Superiore di Sanità (National AIDS Unit) (Italy); The Infectiology Center of Latvia, Mozaika (Latvia); Center for Communicable Diseases and AIDS (Lithuania); GenderDoc-M (Moldova); schorer (Netherlands); Norwegian Knowledge Centre for the Health Services, The Norwegian Institute of Public Health (Norway); National AIDS Centre, Lamda Warszawa (Poland); GAT Portugal, University of Porto, Medical School, Inst. of Hygiene and Tropical Med. (Portugal); PSI Romania (Romania); PSI Russia (Russia); Safe Pulse of Youth (Serbia); OZ Odyseus (Slovakia); National Institute of Public Health, SKUC-Magnus, Legebitra, DIH (Slovenia); National Centre of Epidemiology, stopsida, Ministerio de Sanidad, Política Social e Igualdad (Spain); Malmö University, Health and Society, RFSL, National Board of Health and Welfare (Sweden); Institut universitaire de médecine sociale et preventive, Aids-Hilfe Schweiz (Switzerland); Turkish Public Health Association, Siyah Pembe Üçgen İzmir, KAOS-GL, Istanbul-LGBTT (Turkey); Gay Alliance, Nash Mir, LiGA, Nikolaev (Ukraine); City University London, Department for Public Health, Terrence Higgins Trust and the CHAPS partners including GMFA, The Eddystone Trust, Healthy Gay Life, The Lesbian and Gay Foundation, The Metro Centre London, NAM, Trade Sexual Health, Yorkshire, MESMAC (United Kingdom).
European Collaborating Partners: International Gay and Lesbian Organization (ILGA); European AIDS Treatment Group (EATG); PlanetRomeo.com; Manhunt and Manhunt Cares.
We thank the following persons for contributing national surveillance data and prevalence estimates: Vratislav Nemecek, Hana Zákoucká, Marek Malý (Czech Republic); Matthias an der Heiden (Germany); Eline Op de Coul, Ard I. van Sighem (Netherlands); Helena Cortes Martins (Portugal); Torsten Berglund (Sweden); Sarika Desai (United Kingdom).
The EMIS project was funded by EAHC - Executive Agency for Health and Consumers, EU Health Programme 2008–2013 (funding period: 14.3.2009 - 13.9.2011).
CEEISCat - Centre d’Estudis Epidemiològics sobre les ITS/HIV/SIDA de Catalunya (2009–2012); Terrence Higgins Trust (CHAPS) for Department of Health for England (2009–2012); Maastricht University (2009–2012); Regione del Veneto (2009–2012); Robert Koch Institute (2009–2012); BZgA (Bundeszentrale für gesundheitliche Aufklärung, Köln: 2010–2011); German Ministry of Health (2010); Finnish Ministry of Health (2010); Norwegian Institute of Public Health (2010–2011); Swedish Board of Health and Welfare (2010–2011).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.