The Delphi survey collected expert opinions on the contents of SGs created more than 10 years ago to meet the surveillance objectives of SurSaUD®. Although consensus was reached for only part of the submitted SGs, codes proposed by the participants were added to others.
Selection and participation of experts
SGs are indicators based on clinicians’ consultations and perceptions. It was therefore important to link the surveillance objectives with clinicians’ coding practices; clinicians were included in the indicator review process because they provide the medical expertise needed to define SGs [8]. Almost all the SOS Médecins survey experts and just over half of the experts in the OSCOUR® survey said they knew about the SurSaUD® system. This knowledge of the SurSaUD® system may have contributed to a proper understanding of the surveillance objectives by the experts and influenced their opinion in choosing the composition of SGs. However, it is difficult to assess its (positive or negative) impact on the survey.
This knowledge of the network may also have helped to maintain a high participation rate from the first round, especially for the SOS Médecins survey, and in the 2nd and 3rd rounds, despite time frames longer than initially planned. Among the volunteers initially enrolled, more than 3 out of 4 experts participated in the SOS Médecins survey and 1 out of 2 in the OSCOUR® survey. Among the respondents to the 1st round, the participation rate in the 2nd and 3rd rounds exceeded 80% for both surveys.
The survey was anonymous, and the results of each round could not be used to identify any individual’s responses, so as not to influence respondents in their subsequent choices.
Survey process
The survey took place in three separate rounds in 2019, over a period of 5 months for SOS Médecins and 9 months for OSCOUR®. The number of codes submitted in each survey differed greatly, because the thesauruses differ in their content and in the number of codes available. The OSCOUR® SG survey involved a very large number of codes and subcodes, which required upstream discussion of how to display them, so as to make the survey easier to read and understand while limiting the time needed to respond. To this end, developments that were not initially planned were necessary to allow user-friendly display of subcodes in tooltips when hovering over diagnostic codes (the same display method was used to return the results after each round). This approach probably had a positive impact on maintaining the participation rate over the course of the survey. In addition, the OSCOUR® survey was interrupted during the summer holidays (2 months in total), as some areas experience an increase in tourism-related activity, leaving the ED physicians involved little time to respond to this type of survey. Extending the duration of the survey had negative impacts, such as a higher number of reminders (for SOS Médecins only). It may also have led to memory bias among participants, even if this was partially offset by the bar chart corresponding to the response selected in the previous round.
Delphi method for compiling syndromic surveillance indicators
Although syndromic surveillance has existed for several years and is widely used [16, 17], there is no reference definition for SGs, which would facilitate the exchange or comparison of data between systems and the evaluation of performance [8].
To our knowledge, this is the first time that the Delphi method has been used to work on the definition and composition of SGs. In existing publications, the method most often used is a group consensus reached after a discussion meeting [7, 8]. Using the online Delphi method, a panel of experts working in different geographical areas could be consulted without needing to schedule or travel to any meetings. In addition, given the large number of SGs to be reviewed, several discussion sessions would have been required to reach a result for all SGs. This would likely have been a barrier to the participation of several clinical and international experts, whose workload would not have allowed them to be as closely involved in this type of project.
Finally, this approach also measured a consensus percentage, which was a more objective decision-making aid for the codes to be maintained or not in each SG.
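As an illustration of how a consensus percentage can support the decision to keep or drop a code, the sketch below computes, for one diagnostic code, the share of experts who voted to keep it and classifies the outcome. The vote labels, the function names and the 80% threshold are assumptions chosen for the example; they are not taken from the survey protocol.

```python
def consensus_percentage(votes):
    """Return the percentage of experts who voted to keep a code.

    `votes` is a list of strings, each either "keep" or "remove"
    (hypothetical labels used only for this illustration).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    keep = sum(1 for v in votes if v == "keep")
    return 100.0 * keep / len(votes)


def classify_code(votes, threshold=80.0):
    """Classify a code as 'kept', 'rejected' or 'no consensus'.

    The threshold is an assumed value, not the one used in the survey.
    """
    pct_keep = consensus_percentage(votes)
    if pct_keep >= threshold:
        return "kept"
    if 100.0 - pct_keep >= threshold:
        return "rejected"
    return "no consensus"


if __name__ == "__main__":
    example = ["keep"] * 9 + ["remove"]
    print(consensus_percentage(example), classify_code(example))
```

A code for which neither option reaches the threshold stays in the "no consensus" category and would be resubmitted in the following round.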
Consensus level reached
The SGs for which consensus on codes was reached as early as the first round had a specific surveillance objective.
In syndromic surveillance, sensitivity is used to detect the highest number of patients likely to be in the early stage of a disease that is not yet characterized (with few specific signs at presentation), while specificity is used to refine investigations if a large number of cases with similar symptoms are identified [8]. In studies conducted on the performance of SGs, a better positive predictive value is observed when the surveillance objective is specific rather than sensitive [7, 8].
The diagnostic codes for SGs used for winter surveillance (bronchiolitis, gastroenteritis) were all kept from the 1st round. The specific objective and the small number of codes they comprised probably facilitated consensus among the experts. These SGs have been used every season for many years in region-based, multi-source surveillance, which is widely communicated through published weekly reports and meetings with data provider partners, during which the SGs were regularly reworked. This visibility can also help healthcare professionals see the value of seasonal surveillance, as it aims to contribute to the organisation and adaptation of care provision, directly benefiting clinicians in their daily activity. These hypotheses could also explain the results for the hyperthermia/heat stroke and insect bite SGs, traditionally used in summer surveillance, even if both SGs reached a poorer consensus, especially the latter, and for the trauma and abdominal pain SGs, which correspond to the diagnoses found most frequently among the top 10 reasons for recourse to emergency medicine.
Consensus was difficult to reach for SGs with a sensitive objective, which more often incorporate symptoms (diarrhoea, abdominal pain, anxiety disorders, stress, etc.). Among them, some codes of the impaired general condition SG (OSCOUR®) were rejected while others did not reach consensus. This SG reflects a clinical picture that can have variable aetiologies and accompany several pathologies, which means that it may be perceived differently from one clinician to another. This example showed that the surveillance objectives were not always sufficiently explicit, or that the responses of experts were not always in line with the expectations of epidemiologists. The participants’ responses focused on end-of-life support or the condition of the elderly, whereas, for epidemiologists in charge of health surveillance, this SG aimed to measure the sudden deterioration in a patient’s condition (with or without clearly identified aetiology).
Another example is the lower respiratory infection SG, for which most of the codes did not reach consensus. However, this SG relates to several issues, particularly in the surveillance of winter respiratory diseases [18].
SGs with a sensitive objective are composed of a multitude of diagnostic codes, which can be a barrier to reaching a consensus. In addition, as not all codes and subcodes were displayed, this certainly influenced participants’ choices and could partly explain the lack of consensus for some codes. It is not certain that all participants saw the tooltip displayed when hovering over certain codes with the mouse, meaning that they may have read only part of the subcodes when making their selection in response to the set surveillance objectives.
There was little discrepancy between the responses of international experts and those of the ED specialists. Despite the lack of a reference definition, these results suggest that the development of indicators coincides between countries and thus allows for the comparison of observations between international systems, which is a strong point in the case of international threats.
More generally, the outputs highlighted two limitations of using the complete ICD-10 classification (40,000 codes) to code medical diagnoses in EDs. First, this classification contains symptoms, which should not be used for coding a medical diagnosis [19]. These symptoms would be more relevant for coding the chief complaint. The presence of symptom codes in the definition of SGs may partly explain why consensus was not reached for these SGs. Furthermore, it has been shown that unconstrained data sets with a large number of codes available for EDs give poor data [20]. The usability of a system is an important factor in the quality of the data collected [21]. Based on these observations, a study was conducted by the ED syndromic surveillance system in the UK. With a panel of experts, they proposed a limited list of about 1200 codes based on the SNOMED ontology, without any symptoms in this list [22].
In France, a similar process was launched to revise the format for collecting ED data. In particular, a new format for ED data proposed three major changes. For coding medical diagnoses, a list of 1500 ICD-10 codes was defined, instead of the entire ICD-10 classification; symptom codes are removed from this list. A thesaurus was also proposed for coding the chief complaint (in the current format, this information is collected in free text, without a thesaurus). Finally, an additional item would be collected to indicate the circumstances of the ED visit. This new format is still under discussion and may be implemented soon.
Review of SGs and implications for epidemiological surveillance
Although the survey made it possible to add to SGs diagnostic codes that were initially absent and were proposed by the experts, other diagnostic codes were not selected, and the differences should therefore be discussed with the expert group.
The epidemiological impact of the results of this survey on the composition of SGs should be analysed by comparing the temporal dynamics of the former and the new composition for each SG. It would also be relevant to assess the performance of SGs by calculating their sensitivity and specificity with regard to the diagnoses actually mentioned in the patient records. However, such studies at national level are burdensome and expensive and can only be considered over a small scope, whether geographic or in terms of the SGs selected.
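The performance assessment mentioned above can be sketched as follows, assuming that each ED visit has been classified both by the SG (visit included or not) and by a review of the patient record taken as the reference. The function name and the counts in the usage example are hypothetical and serve only to illustrate the standard definitions of sensitivity, specificity and positive predictive value.

```python
def sg_performance(tp, fp, fn, tn):
    """Compute standard performance measures for one SG.

    tp: visits included in the SG and confirmed by the patient record
    fp: visits included in the SG but not confirmed
    fn: confirmed visits missed by the SG
    tn: visits neither included nor confirmed
    Returns None for a measure whose denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(sg_performance(tp=80, fp=20, fn=20, tn=880))
```

Running the same computation on the former and the new composition of an SG, over the same set of reviewed records, would give a direct comparison of the two definitions.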