
Table 1 Attributes included in the development of a classifier for statistical alarms recorded by a PHE multi-system syndromic surveillance service

From: Machine learning to refine decision making within a syndromic surveillance service

| Field Name | Description | Entries | Missing | Unique | Ip | Values | p-value |
|---|---|---|---|---|---|---|---|
| Class Variables | | | | | | | |
| Decision | Decision taken by syndromic surveillance analyst | 592 | 66,913 | 3 | 1.0000 | Alert, Monitor, No-action | |
| Attribute Variables; from event | | | | | | | |
| Year | Year of the alarm | 67,505 | 0 | 3 | 0.0001 | 2013, 2014, 2015 | 9.7 × 10⁻² |
| Q | Quarter | 67,505 | 0 | 4 | 0.0002 | Jan-Mar, Apr-Jun, Jul-Sep, Oct-Dec | 3.3 × 10⁻² |
| D | Day of the week | 67,505 | 0 | 7 | 0.0006 | Sun, Mon, Tue, Wed, Thu, Fri, Sat | 6.9 × 10⁻⁸ |
| Alarm | Was the event a statistical alarm? | 67,505 | 0 | 3 | 0.0014 | Yes, No, Unknown | < 10⁻¹⁰ |
| System | The system that alarmed | 67,505 | 0 | 5 | 0.0006 | NHS111, NHS24, EDSSS, GPOOHSS, or GPIHSS | 4.4 × 10⁻⁹ |
| IndicatorS | Indicator that alarmed | 67,505 | 0 | 53 | 0.0041 | 1 of 53 different syndromes | < 10⁻¹⁰ |
| IndicatorG | Coarse-grained version of IndicatorS | 67,505 | 0 | 8 | 0.0013 | Cardiac, Impact of Cold, Gastrointestinal, Impact of Heat, Influenza-like illness, Respiratory, Other & Unspecified | < 10⁻¹⁰ |
| IndicatorP | Specific/general indicator | 67,505 | 0 | 2 | 0.0001 | Specific, General | 2.0 × 10⁻³ |
| IndicatorL | Indicator severity | 67,505 | 0 | 5 | 0.0002 | Consultation, Admitted, Severe, High Dependency Unit/Intensive Care Unit, Mortality | 1.0 × 10⁻¹⁰ |
| Region | PHE Region | 67,505 | 0 | 13 | 0.0037 | 1 of 13 PHE regions | < 10⁻¹⁰ |
| LocationP | Geography of alarm | 67,505 | 0 | 3 | 0.0037 | Local, Regional, National | < 10⁻¹⁰ |
| Experience | Is syndromic surveillance analyst experienced? | 67,505 | 0 | 2 | 0.0001 | Yes, No | 2.1 × 10⁻² |
| Attribute Variables; from first stage risk assessment | | | | | | | |
| Excess | Size of the alarm | 66,406 | 1,099 | 4 | 0.0115 | 0, 1, 2, 3 | < 10⁻¹⁰ |
| Repeated | Is the alarm a repeat? | 65,766 | 1,739 | 4 | 0.0026 | 0, 1, 2, 3 | < 10⁻¹⁰ |
| Multi-system | Is the alarm in multiple systems simultaneously? | 65,742 | 1,763 | 4 | 0.0094 | 0, 1, 2, 3 | < 10⁻¹⁰ |
| Nattrend | Is the alarm counter to the national trend? | 65,771 | 1,734 | 4 | 0.0003 | 0, 1, 2, 3 | 2.3 × 10⁻⁵ |
| Score1 | Sum of scores from first stage risk assessment | 65,795 | 1,710 | 13 | 0.0277 | 0–12 | < 10⁻¹⁰ |
| BInitial | Does first stage analyst engage consultant epidemiologist to perform second stage? | 67,505 | 0 | 2 | 0.0357 | Yes, No | < 10⁻¹⁰ |
| Attribute Variables; from second stage risk assessment | | | | | | | |
| Season | Is the alarm counter to the seasonal trend? | 573 | 66,932 | 3 | 0.0258 | Yes, No, Missing | < 10⁻¹⁰ |
| Geography | Does the alarm show an atypical geographical clustering? | 572 | 66,933 | 3 | 0.0259 | Yes, No, Missing | < 10⁻¹⁰ |
| Age | Is the alarm centred on a particular age group? | 572 | 66,933 | 3 | 0.0264 | Yes, No, Missing | < 10⁻¹⁰ |
| Severity | Is there an unusual increase in illness severity associated with the alarm? | 571 | 66,934 | 3 | 0.0259 | Yes, No, Missing | < 10⁻¹⁰ |
| BScore | Are the second stage scores subsequently completed? | 67,505 | 0 | 2 | 0.0130 | Yes, No | < 10⁻¹⁰ |
| Score2 | Sum of scores from second stage risk assessment | 67,505 | 0 | 15 | 0.0325 | 1–15 | < 10⁻¹⁰ |
| Bsummary | Presence of text in summary field | 67,505 | 0 | 2 | 0.0041 | Yes, No | < 10⁻¹⁰ |

Notes:
  1. Ip is the amount of information obtained about the Decision by observing the attribute (the mutual information between the attribute and the Decision).
  2. The p-value is the significance obtained from a Pearson χ² test of the association between a variable and the Decision.
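
The two measures described in the notes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' code: the function names (`entropy`, `mutual_information`, `chi_square`) are hypothetical, the toy data are invented, and logarithms are taken to base 2. The Decision row reports Ip = 1.0000 against itself, which may mean that Ip is scaled by the entropy of the Decision; the table does not say so, and the sketch shows that ratio only as one possible reading.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def mutual_information(xs, ys):
    """Mutual information (in bits) between two paired categorical sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs only; unobserved pairs contribute zero.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def chi_square(xs, ys):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            stat += (pxy[(x, y)] - expected) ** 2 / expected
    dof = (len(px) - 1) * (len(py) - 1)
    return stat, dof


# Invented toy data: an attribute that perfectly predicts the decision.
attribute = ["Yes", "Yes", "No", "No"]
decision = ["Alert", "Alert", "No-action", "No-action"]

mi = mutual_information(attribute, decision)
ratio = mi / entropy(decision)  # assumed normalisation, see lead-in
stat, dof = chi_square(attribute, decision)
print(mi, ratio, stat, dof)
```

Turning the statistic into a p-value requires the chi-squared survival function, which the standard library does not provide; with SciPy installed, `scipy.stats.chi2.sf(stat, dof)` is one way to obtain it. How rows with a missing Decision or attribute were handled is not stated in the table, so the sketch assumes complete pairs.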