We propose calculating the Survey-Surveillance Discrepancy (SSD) as a method to compare self-reported data on newly diagnosed HIV between countries, and with surveillance system-derived data. Internet access rates are one important factor to take into account when comparing self-reported data from internet samples, as previous internet surveys conducted in different countries of Western Europe between 2003 and 2010 demonstrate. However, the data from these repeated surveys already suggest that survey promotion strategies may also have an impact on the SSD. The general promotion strategy for EMIS was the same in all countries, with some country-by-country variation in offline promotion activities and in the proportion of recruitment via individualized invitation messages. These differences were not adequately captured by our simplifying assumption that household internet access is the principal determinant of the survey-surveillance discrepancy. Although calculating the SSD from an empirically observed relation between access rates and SSD clearly distinguishes between geographically defined groups of countries with similar political, historical, and cultural characteristics, additional variability of the SSD within these country groups is likely.
A limitation of this approach is the lack of corresponding longitudinal data from countries in Eastern and Central Europe (WHO classification). It is therefore not possible to rule out that, in addition to household internet access, factors other than those operating in Western European countries modify the SSD in those countries.
If we look at the results of SSD and population size calculations for individual countries (see Table 1, column Npop calculated with SSD), the estimated MSM population size remains well below 1% of the adult male population for eight countries (Bulgaria, Belarus, Macedonia, Poland, Romania, Russia, Turkey, and Ukraine), given the calculated SSD. For these countries the ratio of EMIS participants who reported having been diagnosed in 2009 to the number of MSM diagnosed in 2009 and reported within the national surveillance system was larger than 0.2 (Additional file 1: Table S1, column N). This means that more than 1 in 5 newly diagnosed MSM in these countries would have participated in EMIS, although the respective country samples were among the smallest in EMIS and did not exceed 4 per 10,000 male adults, or 4% of the MSM population if we assume that only 1% of the male adults in these countries are MSM. However, after risk re-distribution of cases with unknown transmission risk, the ratio in Poland and Russia dropped to a level comparable with other countries (Additional file 1: Table S1, column O). This strongly suggests that surveillance data are also unreliable in terms of transmission risk categorization (or number of reported cases) in Bulgaria, Belarus, Romania, Turkey, and Ukraine.
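The arithmetic behind the sample-fraction argument can be sketched as follows. This is a minimal illustration using only the threshold values quoted above (4 per 10,000 male adults, an assumed 1% MSM share, and a diagnosis coverage ratio of 0.2). The function names are hypothetical, and reading the ratio of the two fractions as an implied over-representation factor is an illustrative interpretation, not a calculation reported in the paper.

```python
def sample_share_of_msm(sample_per_adult_male: float, msm_share: float) -> float:
    """Fraction of the assumed MSM population that took part in the survey."""
    return sample_per_adult_male / msm_share


def implied_overrepresentation(dx_coverage: float, sample_share: float) -> float:
    """How much more completely newly diagnosed MSM are covered by the
    survey than MSM overall (illustrative interpretation, see lead-in)."""
    return dx_coverage / sample_share


# Threshold values taken from the text; both are upper/lower bounds there.
share = sample_share_of_msm(4 / 10_000, 0.01)   # sample size relative to MSM population
factor = implied_overrepresentation(0.2, share)  # coverage of diagnosed men vs. all MSM

print(f"Sample share of assumed MSM population: {share:.1%}")
print(f"Implied over-representation of newly diagnosed men: {factor:.1f}")
```

Under these bounds, a sample covering at most 4% of the assumed MSM population would contain at least 20% of the reported newly diagnosed MSM, a five-fold gap that is hard to reconcile with complete surveillance reporting.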
When comparing the number of EMIS participants diagnosed with HIV in the years before the survey (i.e. between 2000 and 2010) with HIV diagnoses among MSM reported in the national systems [16], the data for Poland, Romania and Bulgaria suggest a disproportionately high participation of men diagnosed with HIV in 2009. This would be compatible with an SSD higher than calculated from household internet access for these countries, and might be explained by targeting of EMIS survey promotion to MSM recently diagnosed with HIV. However, at least for Romania, even allowing for a disproportionate participation of newly diagnosed men in EMIS, the data suggest either considerable underreporting of newly diagnosed HIV or a high level of misclassification of MSM into other transmission groups (the number of men diagnosed with HIV in 2009 in the Romanian EMIS dataset was 2.7-fold the national notification rate of MSM in 2009; see Additional file 1: Table S1, columns E and G). Misclassification should be taken into account, particularly since the level of social discrimination against gay men is high in all three countries [20]. However, in Romania at least, the high proportion of females among newly diagnosed HIV cases (41% in 2009) suggests substantial heterosexual transmission. Thus, either cases among MSM were underreported or heterosexual transmission is to a considerable degree accounted for by women having sex with bisexual men.
In Macedonia the high ratio (of EMIS participants diagnosed in 2009 to the reported number of diagnosed MSM in 2009) and the very low estimate for the total MSM population may be a consequence of a low number of cases and the chance event that one of three reported cases may have participated in the survey.
The most likely explanation for the findings in all these countries except Macedonia is underreporting of MSM cases in the national surveillance systems, as others have already suspected [21, 22]. At least in Poland and Romania, and possibly also in other countries, there may be additional issues with the SSD factor, which may be higher than calculated from internet access because of targeted promotion of EMIS to MSM (recently) diagnosed with HIV. Alternatively, not only the transmission group reporting but also the total number of newly diagnosed and reported infections may be too low in these two countries.
In addition to these eight countries, the estimates for the MSM population size for another six countries seem questionably low when compared to other countries from the same sub-region. These countries are the Czech Republic, Luxemburg, the Netherlands, Italy, Spain, and to a lesser degree Germany.
For Italy and Spain, the most likely explanation may be errors in the calculation of notification rates of newly diagnosed infections among MSM. In both countries the reporting system does not cover the whole country and misses some regions (although we explicitly tried to take this into account when calculating national notification rates). In addition, transmission risk group assignment may underestimate MSM and overestimate heterosexual cases because physicians do not ask about, and patients do not report, sexual preferences. While this is true for all countries, it may apply more strongly in countries in which same-sex behaviours are more stigmatized. According to data collected in EMIS [8] this could be the case for Italy, but less so for Spain. In Luxemburg, the problematic parameter may be the number of EMIS participants living in the country who reported having been diagnosed with HIV in 2009. A major proportion of the HIV-positive survey participants from Luxemburg may have been diagnosed not in Luxemburg but in the surrounding countries (e.g. Belgium, Germany, or France) or in their countries of origin. The proportion of MSM with a migration history is very high in Luxemburg (~50%) [8], and more than 90% of men diagnosed with HIV whose current country of residence is Luxemburg report having had sex in other countries in the previous 12 months. If this hypothesis is correct, the calculated MSM population size would increase.
For Germany and the Netherlands, MSM population sizes closer to 3% of the adult male population would be expected, and for the Czech Republic closer to 2%. As long as the surveillance data are assumed to be correct (it could be argued that adjustments for the number of newly diagnosed MSM reported in the surveillance system are too low), the SSD factor would have to be higher in those countries than expected from the household internet access calculation. For the Netherlands, there is indeed some indication of an increased selection bias, and thus an increased SSD value [6], possibly attributable to the yearly internet surveys in this country, which may result in survey fatigue that affects HIV-uninfected and untested men disproportionately more than men diagnosed with HIV. Germany may be a similar case, because internet surveys addressing MSM have increasingly been launched in recent years.
On the other hand, as shown in Table 1, given the calculated SSD the estimated MSM population size exceeds 3% of the adult male population in three of four Scandinavian countries and in Belgium, France, the UK, and Slovenia. For the three Scandinavian countries Denmark, Finland, and Norway, as for Slovenia, exceeding the 3% range may just be a chance event due to the small sample sizes. With one or two more EMIS participants diagnosed with HIV in 2009 in each of those countries, the MSM population estimates would be below 3%, within the expected range. In particular, among Danish and Finnish EMIS respondents only four and two respondents, respectively, reported having been diagnosed in 2009, while the mean annual numbers of men diagnosed in the previous four years were 9.5 and 6.3, and the surveillance data from Denmark and Finland showed no corresponding decline in the numbers of new HIV diagnoses among MSM during these years.
For Belgium, France, and the UK, there are different possible explanations for this finding: (1) The SSD may be lower than expected in these countries. This would mean that men diagnosed with HIV, for whatever reason, participated in EMIS to a lesser extent than such men in other countries. Such reasons would have to reduce the gap between the willingness of men diagnosed with HIV and men not diagnosed with HIV to participate in EMIS. (2) The adjustments made at national level for the number of diagnosed HIV infections among MSM may be too high, i.e. the real numbers may be closer to the numbers reported to ECDC. (3) The MSM population size is indeed larger than 3% of the adult male population, e.g. due to disproportionate immigration of MSM from other countries, as suggested by Dougan et al. [23].
In the EMIS samples from Bosnia-Herzegovina, Cyprus, Estonia, Moldova, and Malta no respondent reported having been diagnosed with HIV in 2009. It should be noted that the samples for Bosnia-Herzegovina, Moldova, and Malta were also quite small (≤150). In addition, no case was reported within the national surveillance system in Estonia. When we tested how many cases of newly diagnosed MSM would be required in the EMIS sample to arrive at a total MSM population of at least 1% of the adult male population (Bosnia-Herzegovina, Moldova) or at a percentage range similar to neighbouring countries (Cyprus, Estonia), the values ranged from 0.1 (Bosnia-Herzegovina) to 1 (Cyprus). Thus, having no men diagnosed with HIV in 2009 in the respective country sample would not be unexpected considering the small sample sizes and the low diagnosis numbers. Assuming one man diagnosed with HIV in 2009 had participated in EMIS in Estonia, the number of MSM diagnosed with HIV in that year and not reported as MSM in the national surveillance system would be expected to lie between 5 and 10 (from a total of 243 HIV diagnoses in males in 2009).
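Why zero observed cases is unsurprising at these values can be illustrated with a short sketch. It assumes, purely for illustration, that the required case numbers quoted above (0.1 to 1) can be read as expected counts and that the number of newly diagnosed respondents in a country sample follows a Poisson distribution. Neither assumption is made in the paper, and the function name is hypothetical.

```python
import math


def prob_zero_cases(expected_cases: float) -> float:
    """Poisson probability of observing no case when `expected_cases`
    are expected (Poisson model is an assumption of this sketch)."""
    return math.exp(-expected_cases)


# Expected counts taken from the range reported in the text.
for country, expected in [("Bosnia-Herzegovina", 0.1), ("Cyprus", 1.0)]:
    p0 = prob_zero_cases(expected)
    print(f"{country}: P(no newly diagnosed respondent) = {p0:.2f}")
```

Under this model, a sample with no newly diagnosed respondent would occur in roughly 9 out of 10 cases at an expected count of 0.1, and still in more than 1 out of 3 cases at an expected count of 1.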
Austria reports neither the number of new diagnoses of HIV per year nor the proportion in each transmission group. Assuming a similar size of the MSM population and a similar SSD value in Austria as in the neighbouring countries Germany and Switzerland, the number of newly diagnosed MSM in Austria in 2009 would be estimated at 248 men. In a 2013 publication of the Austrian Cohort Study, 507 new HIV diagnoses in Austria were reported for 2009 [24]. Among them, approximately 150–200 might be estimated to be MSM.
Turkey reported 341 new HIV diagnoses in males and 126 new HIV diagnoses in females in 2009. A heterosexual mode of transmission was reported for 216 cases, sex between males for 2 cases. With the SSD calculated for Turkey based on household internet access, and assuming a proportion of 1% MSM among the adult male population, we would expect approximately 80 MSM among the newly diagnosed cases in 2009 (the number would be lower if the SSD were higher, and higher if the proportion of MSM were larger).
To summarize, under the assumption that the SSD depends mainly (but not exclusively) on household internet access, we estimated SSD values as a function of household internet access based on earlier repeated internet surveys among MSM from several Western and Central European countries. When MSM population sizes are estimated using these calculated SSD values, differences within and between country groups become apparent that require further explanation and analysis. These differences may partly reflect the impact of additional factors on the SSD calculation, such as the intrinsic lack of data validity resulting from sample size limitations and uncertain transmission risk attribution in reported national surveillance data. However, even beyond those countries where these additional factors may explain implausible MSM population size estimates, there are still some countries with large enough samples and relatively reliable surveillance systems for which the population size estimates lie above the assumed upper limit of 3% MSM in the adult male population. For these countries it must be assumed either that the MSM population size is in fact larger than 3%, or that factors other than household internet access have measurable impacts on the SSD. Such factors would have to equalize the willingness of HIV-positive MSM to participate in EMIS compared with HIV-negative or untested MSM.
After transformation into comparable formats, self-reported incidence of HIV diagnosis in a large internet convenience sample of European MSM correlates strongly with surveillance system-reported diagnosis incidence. This argues against any large survey-surveillance discrepancies caused by sampling biases of the survey, despite large differences in (relative) sample sizes, and supports a high inter-country comparability of self-reported incidence indicators from this first pan-European internet survey for most of the countries. However, for some countries with low household internet access in East and South-East Europe the survey-surveillance discrepancy factor may be two to three times as high as for the other countries, which has to be considered when comparing self-reported new HIV diagnoses. Unfortunately, for the same countries the surveillance data regarding MSM are particularly unreliable.
The strong correlation between EMIS-derived self-reported HIV diagnosis incidence and surveillance system-derived data critically depends on an estimated parameter, the size of the respective MSM populations. The SSD, calculated to assess whether similar parts of the MSM population were reached by the internet survey, also depends on the estimated size of the MSM population: if other parameters remain the same and the estimate for the size of the population increases, the SSD also increases. Unfortunately, for most countries no studies have been conducted to generate empirical data on the size of this population. We would like to emphasize that our definition of the MSM population is strictly behavioural (men who had sex with men in the previous twelve months) and describes a population which is sufficiently and effectively connected to contribute to the HIV epidemic among MSM in a measurable way. Thus our definition would not include men who, for example, would like to have sex with other men but did not have the opportunity to do so, and it may not cover MSM subpopulations marginalized and isolated due to a lack of gay venues, internet access and other means of communication.
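The interdependence between the SSD and the assumed population size can be made concrete with a minimal sketch. It assumes that the SSD is the self-reported diagnosis rate in the survey divided by the surveillance-derived diagnosis rate in the MSM population, a definition that is not restated in this section. All input numbers below are hypothetical and chosen only for illustration, as are the function and parameter names.

```python
def ssd(survey_dx: int, survey_n: int,
        surveillance_dx: int, msm_population: float) -> float:
    """Assumed definition: (survey_dx / survey_n) / (surveillance_dx / msm_population)."""
    self_reported_rate = survey_dx / survey_n
    surveillance_rate = surveillance_dx / msm_population
    return self_reported_rate / surveillance_rate


def msm_population_from_ssd(ssd_value: float, survey_dx: int,
                            survey_n: int, surveillance_dx: int) -> float:
    """Same relation solved for the population size, given an SSD value."""
    return ssd_value * surveillance_dx * survey_n / survey_dx


# Hypothetical inputs: 20 of 2,000 respondents diagnosed in the reference
# year, 200 MSM diagnoses reported by surveillance.
small = ssd(20, 2000, 200, msm_population=50_000)
large = ssd(20, 2000, 200, msm_population=100_000)
print(f"SSD with N = 50,000:  {small:.2f}")
print(f"SSD with N = 100,000: {large:.2f}")

# Round trip: an externally estimated SSD yields a population size estimate.
print(f"N implied by SSD = {small:.2f}: {msm_population_from_ssd(small, 20, 2000, 200):,.0f}")
```

With everything else held fixed, doubling the assumed population size doubles the SSD in this formulation, which is why an SSD estimated independently (here, from household internet access) can be turned into a population size estimate.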
Thus, we assume that low levels of internet access as well as social, cultural, political and economic factors such as restrictive laws, high levels of social discrimination, and the lack of a commercial gay infrastructure can restrict the size of the MSM population as defined here. All of these factors prevent people from connecting with each other and reduce their opportunity to find sexual partners. However, we would like to emphasize that deliberately using such measures to restrict or reduce the size of MSM populations (as currently attempted by “anti-gay propaganda” laws in Russia) in order to prevent the spread of HIV is a violation of human rights. It will also be counterproductive in the longer run, because it will undermine the capacities of MSM communities to mount an effective response to the HIV epidemic.
An additional factor we did not take into account is the difference in the starting points of the HIV epidemics in Western versus Central and Eastern Europe, which results in different age ranges for the (HIV-vulnerable) MSM populations. A self-reported recent diagnosis rate of 6.7 per 1000 participants in the German EMIS sample may have to refer to an MSM population comprising 3% of the 15 to 64 year old male population, while a self-reported recent diagnosis rate of 13.3 per 1000 participants in the Polish sample may have to refer to an MSM population comprising 1% of the 15 to 49 year old male population.
Thus, if we compare the incidence of HIV diagnoses in MSM communities between different countries and regions in Europe, we should be aware that the relative size of the denominator populations may be different.