
Table 1 Attributes included in the development of a classifier for statistical alarms recorded by a PHE multi-system syndromic surveillance service

From: Machine learning to refine decision making within a syndromic surveillance service

| Field Name | Description | Entries | Missing | Unique | Ip | Values | p-value |
|---|---|---|---|---|---|---|---|
| Class Variables | | | | | | | |
| Decision | Decision taken by syndromic surveillance analyst | 592 | 66,913 | 3 | 1.0000 | Alert, Monitor, No-action | |
| Attribute Variables; from event | | | | | | | |
| Year | Year of the alarm | 67,505 | 0 | 3 | 0.0001 | 2013, 2014, 2015 | 9.7 × 10^−2 |
| Q | Quarter | 67,505 | 0 | 4 | 0.0002 | Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec | 3.3 × 10^−2 |
| D | Day of the week | 67,505 | 0 | 7 | 0.0006 | Sun, Mon, Tue, Wed, Thu, Fri, Sat | 6.9 × 10^−8 |
| Alarm | Was the event a statistical alarm? | 67,505 | 0 | 3 | 0.0014 | Yes, No, Unknown | < 10^−10 |
| System | The system that alarmed | 67,505 | 0 | 5 | 0.0006 | NHS111, NHS24, EDSSS, GPOOHSS, or GPIHSS | 4.4 × 10^−9 |
| IndicatorS | Indicator that alarmed | 67,505 | 0 | 53 | 0.0041 | 1 of 53 different syndromes | < 10^−10 |
| IndicatorG | Coarse grained version of IndicatorS | 67,505 | 0 | 8 | 0.0013 | Cardiac, Impact of Cold, Gastrointestinal, Impact of Heat, Influenza-like illness, Respiratory, Other & Unspecified | < 10^−10 |
| IndicatorP | Specific/general indicator | 67,505 | 0 | 2 | 0.0001 | Specific, General | 2.0 × 10^−3 |
| IndicatorL | Indicator severity | 67,505 | 0 | 5 | 0.0002 | Consultation, Admitted, Severe, High Dependency Unit/Intensive Care Unit, Mortality | 1.0 × 10^−10 |
| Region | PHE Region | 67,505 | 0 | 13 | 0.0037 | 1 of 13 PHE regions | < 10^−10 |
| LocationP | Geography of alarm | 67,505 | 0 | 3 | 0.0037 | Local, Regional, National | < 10^−10 |
| Experience | Is syndromic surveillance analyst experienced? | 67,505 | 0 | 2 | 0.0001 | Yes, No | 2.1 × 10^−2 |
| Attribute Variables; from first stage risk assessment | | | | | | | |
| Excess | Size of the alarm | 66,406 | 1,099 | 4 | 0.0115 | 0, 1, 2, 3 | < 10^−10 |
| Repeated | Is the alarm a repeat? | 65,766 | 1,739 | 4 | 0.0026 | 0, 1, 2, 3 | < 10^−10 |
| Multi-system | Is the alarm in multiple systems simultaneously? | 65,742 | 1,763 | 4 | 0.0094 | 0, 1, 2, 3 | < 10^−10 |
| Nattrend | Is the alarm counter to the national trend? | 65,771 | 1,734 | 4 | 0.0003 | 0, 1, 2, 3 | 2.3 × 10^−5 |
| Score1 | Sum of scores from first stage risk assessment | 65,795 | 1,710 | 13 | 0.0277 | 0–12 | < 10^−10 |
| BInitial | Does first stage analyst engage consultant epidemiologist to perform second stage? | 67,505 | 0 | 2 | 0.0357 | Yes, No | < 10^−10 |
| Attribute Variables; from second stage risk assessment | | | | | | | |
| Season | Is the alarm counter to the seasonal trend? | 573 | 66,932 | 3 | 0.0258 | Yes, No, Missing | < 10^−10 |
| Geography | Does the alarm show an atypical geographical clustering? | 572 | 66,933 | 3 | 0.0259 | Yes, No, Missing | < 10^−10 |
| Age | Is the alarm centred on a particular age group? | 572 | 66,933 | 3 | 0.0264 | Yes, No, Missing | < 10^−10 |
| Severity | Is there an unusual increase in illness severity associated with the alarm? | 571 | 66,934 | 3 | 0.0259 | Yes, No, Missing | < 10^−10 |
| BScore | Are the second stage scores subsequently completed? | 67,505 | 0 | 2 | 0.0130 | Yes, No | < 10^−10 |
| Score2 | Sum of scores from second stage risk assessment | 67,505 | 0 | 15 | 0.0325 | 1–15 | < 10^−10 |
| Bsummary | Presence of text in summary field | 67,505 | 0 | 2 | 0.0041 | Yes, No | < 10^−10 |
Notes:
  1. Ip is the amount of information obtained about the decision by observing the attribute (the mutual information between the attribute and the decision).
  2. The p-value is the significance obtained from a Pearson χ² test of the association between the variable and the Decision.
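
The two measures described in the notes can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy records are hypothetical, and dividing the mutual information by the entropy of the decision is an assumption (it would be consistent with Decision scoring 1.0000 against itself, but the table notes do not state the normalisation). The sketch estimates the mutual information from paired categorical observations and computes the Pearson χ² statistic with its degrees of freedom. The p-value is then the upper-tail probability of a χ² distribution with those degrees of freedom.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired categorical observations."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs only; unobserved pairs contribute zero.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def normalised_information(attribute, decision):
    """Assumed normalisation: I(attribute; decision) / H(decision)."""
    return mutual_information(attribute, decision) / entropy(decision)


def pearson_chi_square(xs, ys):
    """Pearson chi-square statistic and degrees of freedom for X against Y."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            stat += (pxy[(x, y)] - expected) ** 2 / expected
    dof = (len(px) - 1) * (len(py) - 1)
    return stat, dof


# Hypothetical toy records, not taken from the study.
alarm = ["Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No"]
decision = ["Alert", "Monitor", "No-action", "No-action",
            "Alert", "No-action", "Monitor", "No-action"]

info = normalised_information(alarm, decision)
stat, dof = pearson_chi_square(alarm, decision)
print(f"normalised information = {info:.4f}, chi-square = {stat:.2f}, dof = {dof}")
```

In the real data, records with a missing value for either the attribute or the decision would have to be dropped or coded as their own category before these functions are applied. The table does not say which approach was used.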