Completeness and Reliability of the Republic of South Africa National Tuberculosis (TB) Surveillance System

Background: Accurate surveillance data are paramount to effective TB control. The Republic of South Africa’s National TB Control Program (NTP) has conducted TB surveillance since 1995 and adopted the Electronic TB Register (ETR) in 2005. This evaluation aimed to determine the completeness and reliability of data in the Republic of South Africa’s TB Surveillance System.
Methods: Three of nine provinces, three subdistricts per province, and 54 health facilities were selected by stratified random sampling. At each facility, 30 (or all if <30) patients diagnosed in Quarter 1 2009 were randomly selected for review. Patient information was evaluated across two paper and four electronic sources. Completeness of program indicators between paper and electronic sources was compared with chi-square tests. The kappa statistic was used to evaluate agreement of values.
Results: Over one-third (33.7 %) of all persons with presumptive TB recorded as smear positive in the TB Suspect Register did not have any records documenting notification, treatment, or management for TB disease. Of 1339 persons with a record as a TB patient at the facility, 1077 (80 %) were recorded in all data sources. Over 98 % of records contained complete age and sex data. Completeness varied for HIV status (53-86 %; p < 0.001) and DOT during the intensive phase of treatment (17-54 %; p < 0.001). Agreement for sex was excellent across sources (kappa 0.94); moderate for patient type (0.78), treatment regimen (0.79), treatment outcome (0.71); and poor for HIV status (0.33).
Conclusions: The current evaluation revealed that one-third of persons diagnosed with TB disease may not have been notified of their disease or initiated on treatment (‘initial defaulters’). The ETR is not capturing all TB patients. Further, among patients with a TB record, completeness and reliability of information in the TB Surveillance System are inconsistent across data sources. Actions are urgently needed to ensure that all diagnosed patients are treated and managed and to improve the integrity of surveillance information.


Background
Accurate surveillance data are crucial for planning, implementing, and evaluating TB control programs. South Africa is ranked by the World Health Organization (WHO) as the 6th highest among the top 22 high-burden countries for TB in the world, with an estimated 329,000 persons diagnosed with TB each year (incidence rate 860/100,000) [1]. South Africa's National TB Control Program (NTP) has been monitoring TB case rates and treatment outcomes since 1995 [2]. In 2005, the Electronic TB Register (ETR) was implemented nationally as a vehicle to monitor key indicators essential to understanding the tuberculosis burden and management in South Africa [3-5]. The ETR has approximately 400 users at the sub-district, district, provincial and national levels of the TB program and contains over one million patient records [6,7]. As part of the strategy to integrate TB and HIV services [8], the ETR also captures basic information on HIV status and HIV-related treatment among TB patients.
Since the ETR was introduced, there has not been a systematic evaluation of the National TB Surveillance System. Periodic evaluations of surveillance systems are critical to ensure the data are accurate and reliable and to guide public health programs [9]. Previous studies in specific communities reported incongruence between diagnosed TB cases in laboratory records and TB cases reported in the surveillance system [10]. Limitations of the utility of the ETR for health facilities and the local TB program have also been reported [11].
A WHO-led review of the NTP in July 2009 emphasized the need to systematically evaluate the TB Surveillance System in South Africa [12]. Specifically, the reviewers noted variability in the completeness and quality of records; a backlog of data entry; and incomplete understanding of TB indicators among some staff responsible for recording and reporting TB data. The committee recommended a comprehensive validation of the TB Surveillance System.
This project aimed to systematically evaluate the completeness and reliability of the TB Surveillance System in South Africa.

Study design
A retrospective data audit was performed to evaluate the accuracy of the TB Surveillance System for identifying persons with TB disease; the completeness of information from different sources; and the reliability of data from different sources.

Study Population and selection of sites
The sampling strategy was determined in consultation with the National TB Program. The Republic of South Africa has nine provinces, each divided into districts and subdistricts (or local governmental units (LGUs) in provinces with no subdistricts). Provinces were divided into tertiles of cure rate for 2008 in the National ETR surveillance database, and one province was selected randomly from each tertile [13]. Within each province, subdistricts (or LGUs) were categorized according to the cure rate as above. One subdistrict from each tertile of cure rate was randomly selected. A full listing of NTP facilities managing TB patients was obtained for each subdistrict, and facilities were categorized as rural or urban based on National Statistics and consultation with district TB program managers [14]. One urban and one rural district/Level 1 hospital, community health center (CHC), and primary health clinic were each selected at random in each subdistrict (6/subdistrict). If a subdistrict did not have a facility in a particular category, a facility was randomly selected from the remaining facility categories.
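The tertile-based selection described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure the evaluation team actually ran: the unit labels and cure rates are hypothetical placeholders, and the helper names `tertiles` and `pick_one_per_tertile` are introduced here only for illustration.

```python
import random

def tertiles(units, key):
    """Rank units by `key` and split them into low, middle, and high thirds."""
    ranked = sorted(units, key=key)
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return [ranked[:cut1], ranked[cut1:cut2], ranked[cut2:]]

def pick_one_per_tertile(units, key, rng):
    """Randomly select one unit from each non-empty tertile."""
    return [rng.choice(group) for group in tertiles(units, key) if group]

# Hypothetical cure rates (%) for nine placeholder units; NOT the 2008 ETR values.
cure_rate = {"P1": 48.0, "P2": 52.5, "P3": 57.1, "P4": 60.3, "P5": 63.8,
             "P6": 66.0, "P7": 70.2, "P8": 73.9, "P9": 78.4}

rng = random.Random(2009)  # fixed seed so the illustration is repeatable
selected = pick_one_per_tertile(list(cure_rate), cure_rate.get, rng)
print(selected)  # one unit from each of the low, middle, and high tertiles
```

The same two helpers could be applied a second time to the subdistricts (or LGUs) within each selected province, since the text describes the identical tertile logic at that level.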

National TB program surveillance system overview
The National TB Surveillance System spans multiple levels of TB care and program management (Fig. 1). The NTP provides forms to health facilities for recording and reporting information on persons with presumptive TB and persons with TB disease, including the:
1) TB suspect register, a logbook to record baseline information on persons with presumptive TB disease based on NTP Guidelines at the health facilities [2]. Persons who are positive for the presence of acid-fast bacilli (AFB) on smear microscopy are subsequently recorded in the TB Register.
2) TB Blue Card, the primary medical record for persons diagnosed with TB disease who initiate treatment.
3) TB Register, a spreadsheet to record key TB information on all persons diagnosed with TB disease.
Separate from this paper-based system, the Electronic TB Register (ETR) is a software program the NTP uses to quantify, monitor, and evaluate TB burden and treatment outcomes. It is installed on computers at all subdistrict, district, provincial, and national offices.
Health facilities record information on persons with presumptive TB in the TB suspect register. Specimens collected from persons with presumptive TB are sent to a centralized laboratory, where they are tested for TB using smear microscopy and mycobacterial culture; a paper copy of the laboratory results is transmitted to the clinic via courier (the laboratory system is not directly linked with the surveillance system). A patient file, or TB Blue Card, is initiated for all persons diagnosed with TB disease, and key information is also recorded in the paper TB Register. The TB Registers are sent to the subdistrict (or local government unit) office, where the information is entered into the initial ETR. The database file from the ETR is transferred to the district TB program, where the data from all subdistricts within that district are merged. District ETR databases are provided to the provincial office, where again they are merged to represent the entire province. The provincial TB data are then sent to the NTP, where they are used to generate annual reports.

Data audit and collection
Cross-check of persons with presumptive TB, TB Register, and TB patient management
To evaluate whether the TB Surveillance System is capturing persons with presumptive TB who are newly diagnosed with TB disease, the first 30 (or all if <30) individuals documented as smear positive in the TB suspect register at each facility during Quarter 1 (Q1) 2009 were cross-checked with TB Blue Cards and the paper TB Register.
Standardized forms were utilized to record: date of smear positivity, TB registration number, presence of TB Blue Card, patient listing in TB Register, and if the patient had died or was lost to follow up before initiating treatment. Only diagnostic smears were considered; date of sputum collection was cross-checked with date of initial presentation when the person was identified as having presumptive TB disease.
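The logic of this cross-check can be sketched as a set-membership comparison. The audit itself was carried out on paper with standardized forms; the Python below is only an illustration, and the person identifiers, field names, and function names are hypothetical.

```python
def cross_check(smear_positive_ids, blue_card_ids, tb_register_ids):
    """For each smear-positive person, flag which facility records exist."""
    blue_cards = set(blue_card_ids)
    register = set(tb_register_ids)
    return [
        {"person": pid,
         "has_blue_card": pid in blue_cards,
         "in_tb_register": pid in register}
        for pid in smear_positive_ids
    ]

def without_any_record(rows):
    """Persons with neither a TB Blue Card nor a TB Register entry."""
    return [r["person"] for r in rows
            if not (r["has_blue_card"] or r["in_tb_register"])]

# Hypothetical identifiers, for illustration only.
suspects = ["S001", "S002", "S003", "S004"]
blue_cards = ["S001", "S003"]
tb_register = ["S001", "S002"]

rows = cross_check(suspects, blue_cards, tb_register)
print(without_any_record(rows))  # persons needing follow-up
```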

Data audit: TB patient records
For the data audit, 30 (or all if <30) TB Blue Cards were randomly selected from TB patients diagnosed in Q1 2009. If Blue Cards were not available, patients were sampled from the TB Register.
Sociodemographic and clinical patient information was collected from each data source (TB Blue Card, TB Register, initial ETR, district ETR, provincial ETR, and national ETR) using standardized forms. Variables included: age, date of birth, patient category (new, retreatment after failure, retreatment after default, other retreatment, relapse), disease classification (pulmonary or extrapulmonary), pretreatment smear date and results, culture results, conversion smear date and results, HIV status, HIV treatments, whether or not the patient received DOT, treatment outcome date and treatment outcome.

Statistical analysis
All analyses were conducted using Stata 13.0 (StataCorp, College Station, Texas, USA). Frequencies and proportions were used to describe the facilities and patients included in the evaluation.

Proportion of TB cases identified and recorded
Overall counts were tallied to determine the total number of persons with presumptive TB documented in the suspect registers for Quarter 1 2009 at selected facilities, the number with sputum results recorded, and the number with a positive sputum smear for TB disease. The number of TB cases detected was divided by the total number of persons with presumptive TB in the suspect register with a recorded sputum smear result to calculate the proportion of cases among persons recorded as having presumptive TB and tested for TB disease. The proportion of persons with presumptive TB diagnosed with smear-positive TB based on the TB suspect register who had a record in each of a) the TB Blue Card file and b) the TB Register was calculated.

Completeness
The number of selected patients with a record in each data source was divided by the total number of persons with TB to yield a proportion for completeness of records for each source. Completeness for individual TB indicator variables was evaluated using the subset of TB patients with a record available in all data sources. For each variable of interest, the proportion of records with a non-missing value recorded was calculated, and the chi-square statistic was used to compare proportions across data sources.
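As an illustration of the completeness calculation and the chi-square comparison, the following Python sketch computes the proportion of non-missing values for a variable and the Pearson chi-square statistic for a sources-by-(complete, missing) table. The analyses in this evaluation were run in Stata; this is a stand-in under assumptions, and the field name `hiv_status` and all counts are hypothetical.

```python
def completeness(records, variable):
    """Proportion of records with a non-missing value for `variable`."""
    filled = sum(1 for r in records if r.get(variable) not in (None, ""))
    return filled / len(records)

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts.
    Assumes every row and column total is greater than zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical records from one source: two of four have HIV status recorded.
records = [{"hiv_status": "positive"}, {"hiv_status": None},
           {"hiv_status": ""}, {"hiv_status": "negative"}]
print(completeness(records, "hiv_status"))  # 0.5

# Hypothetical counts of (complete, missing) for two data sources.
table = [[90, 10],   # e.g. a paper source
         [60, 40]]   # e.g. an electronic source
stat, dof = chi_square(table)
print(stat, dof)  # 24.0 1
```

The statistic would then be compared against the chi-square distribution with `dof` degrees of freedom to obtain a p-value, a step left out here to keep the sketch free of external libraries.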

Reliability
Using the subset of TB patients with a record available in all sources, reliability of the actual value recorded for key TB indicators was examined across data sources. The intraclass correlation coefficient was used to measure reliability for continuous variables (age), and Cohen's kappa coefficient was used for categorical variables. Reliability for age was examined for all pairwise combinations. The kappa statistic was calculated for all pairwise comparisons; for 3-level comparisons of a) TB Blue Card, TB Register, and initial ETR; b) TB Register, initial ETR, and provincial ETR; and c) initial ETR, provincial ETR, and national ETR; and d) overall. For all comparisons, at least one data source was required to have a non-missing value, such that credit for agreement was not granted when all sources were missing. An overall weighted kappa value was calculated to summarize overall reliability (weighted across all data sources) and account for the absence of subdistricts in some provinces.
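The pairwise kappa calculation, including the rule that pairs missing in both sources earn no credit for agreement, can be sketched as follows. This is an illustrative Python version, not the Stata code used in the evaluation. It assumes that a value missing in only one source is treated as its own category (and therefore counts as a disagreement), which is one plausible reading of the rule above; the data are hypothetical.

```python
from collections import Counter

MISSING = None  # marker for a value not recorded in a source

def cohen_kappa(source_a, source_b):
    """Cohen's kappa for two sources rating the same patients.
    Pairs that are missing in BOTH sources are dropped first, so that
    shared missingness is not counted as agreement (assumption: a value
    missing in only one source is kept as a separate category)."""
    pairs = [(a, b) for a, b in zip(source_a, source_b)
             if not (a is MISSING and b is MISSING)]
    n = len(pairs)
    observed = sum(1 for a, b in pairs if a == b) / n
    counts_a = Counter(a for a, _ in pairs)
    counts_b = Counter(b for _, b in pairs)
    # Chance agreement from the marginal category frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources constant and identical
    return (observed - expected) / (1 - expected)

# Hypothetical sex values for six patients in two sources.
blue_card = ["M", "F", "F", "M", MISSING, "F"]
register = ["M", "F", "M", "M", MISSING, "F"]
kappa = cohen_kappa(blue_card, register)
print(round(kappa, 3))  # 0.615
```

In this toy example the fifth patient is excluded because the value is missing in both sources, leaving four agreements among five pairs.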

Ethical review
This project was reviewed and approved by the Ethics Committee of the South African Medical Research Council. The evaluation was also reviewed by the U.S. Centers for Disease Control and Prevention and determined to be non-research, thereby not requiring approval by the institutional review board for research in human subjects. The Republic of South Africa National Department of Health NTP and all provincial and district TB program offices provided written approvals prior to evaluation.

Results
The three provinces selected for the study were Gauteng (GAU), KwaZulu-Natal (KZN), and Mpumalanga (MPU); eighteen facilities were selected from each province to yield a total of 54 facilities.

Case detection
Using simple counts, a total of 8409 persons with presumptive TB were logged in the TB suspect registers at the selected facilities. Of these, 6853 (81.5 %) had a smear result recorded on the suspect register, with 857 (12.5 %) of those having a result recorded as a positive smear for TB (Table 1).

Cross-check of TB suspects with the TB register and TB blue card
A total of 721 persons with presumptive TB recorded as smear positive in the TB Suspect Register (i.e., diagnosed with and documented as having TB disease) were selected for more in-depth comparisons and analysis. Of these, 355 (49.2 %) had TB Blue Cards available and 457 (63.4 %) were documented in the paper TB Register at the facility (Table 2). Of 250 patients without a TB Blue Card and not recorded in the TB Register, 3 (1.2 %) were identified as lost to follow-up and 4 (1.6 %) patients were noted as having died before starting treatment. In total, over one-third (33.7 %) of all persons with presumptive TB diagnosed with TB and recorded as smear positive in the TB Suspect Register did not have any records documenting notification, treatment, or management for TB disease.
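The percentages in this paragraph can be reproduced with simple arithmetic. The sketch below assumes that the 33.7 % figure is obtained by subtracting the persons known to have died or been lost to follow-up before treatment from the 250 with no Blue Card and no TB Register entry, then dividing by the 721 selected; the variable names are introduced here for illustration.

```python
selected = 721              # smear positive in the TB Suspect Register
with_blue_card = 355
in_tb_register = 457
no_card_no_register = 250
lost_to_follow_up = 3
died_before_treatment = 4

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Persons with no documentation of notification, treatment, or management.
unaccounted = no_card_no_register - (lost_to_follow_up + died_before_treatment)

print(pct(with_blue_card, selected))                    # 49.2
print(pct(in_tb_register, selected))                    # 63.4
print(pct(lost_to_follow_up, no_card_no_register))      # 1.2
print(pct(died_before_treatment, no_card_no_register))  # 1.6
print(unaccounted, pct(unaccounted, selected))          # 243 33.7
```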

Data audit of TB patient records
A total of 1339 persons with TB disease were selected for inclusion in the data audit of TB cases. Approximately one-third of the total records were reviewed from each province (Fig. 2). Records were almost equally distributed from rural and urban facilities in KZN and MPU, but most records in GAU were sampled from urban facilities. Over half (52.1 %) of all TB patients were sampled from clinics, one-third from hospitals (34.1 %), and under 15 % from community health centers (CHCs).

Key TB indicators
Using only TB patients with records in all sources (n = 1077), data on age or date of birth, sex, and patient registration type were over 98 % complete across all data sources. [...] Register was not reflected in the ETR. Discrepancies were evident across electronic sources from the initial ETR to the national ETR, with the proportion of records with a non-missing value for a given variable generally declining with increasing levels of management. However, the national database had slightly more records with a value recorded for sex, treatment start and outcome dates, and treatment regimen than the initial ETR database.
Table footnotes: Died before treatment start and lost to follow-up before treatment start ("early defaulters") are as noted on the suspect register. Percentages are of those without a TB Blue Card or in the TB Register. d = a - (b + c).

Reliability
Data on patient age and sex demonstrated high consistency across all data sources evaluated, with pairwise intraclass correlation coefficients ranging from 0.93 to 1.0 for age, and an overall kappa value of 0.94 for sex (Tables 4a-l and 5). Information on patient category and disease classification varied slightly between paper sources and electronic files, but was moderately reliable (kappa range 0.57-0.99). Information on the initial smear result showed inconsistencies from the TB Blue Card to the TB Register (kappa 0.62), from the TB Register to the initial ETR (kappa 0.58), and across electronic databases (kappa range 0.55-0.96). Documentation of the initial treatment regimen varied slightly between the TB Register and the initial ETR (kappa 0.80), but had excellent agreement across the ETR sources (kappa range 0.94-0.99). Information on HIV status was only available in the TB Blue Card, TB Register, and initial ETR database: agreement was poor overall (kappa 0.33) and for all pairwise comparisons (kappa range 0.11-0.41). There was poor consistency across sources for DOT coverage during the intensive treatment phase and at the end of treatment, and likewise for smear conversion results for new and retreatment patients. Information noting treatment outcome had moderate agreement between paper sources, largely due to outcome not being recorded on the Blue Card (kappa 0.66).

Observational findings
Several features of the surveillance system were noted during the audit. Individual patients do not have a unique TB registration number, so multiple patients share the same number once the data are merged across facilities, subdistricts, and districts. In addition, the system is not networked, which inhibits the ability to track patients who transfer or move. Finally, the system is not directly linked to the laboratory or other relevant surveillance systems, such as HIV, and there are no consistent mechanisms in place to reconcile patient information across systems.
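The registration-number problem can be illustrated with a small Python sketch of what happens when registers from different offices are merged. The record layout (`reg_no`, `facility`, `name`), the values, and the function names are hypothetical and do not reflect the actual ETR file format.

```python
from collections import defaultdict

def merge_registers(*registers):
    """Concatenate register extracts, as when subdistrict files are merged."""
    merged = []
    for register in registers:
        merged.extend(register)
    return merged

def colliding_numbers(records):
    """Registration numbers that refer to more than one distinct patient."""
    by_number = defaultdict(set)
    for rec in records:
        by_number[rec["reg_no"]].add((rec["facility"], rec["name"]))
    return {no: people for no, people in by_number.items() if len(people) > 1}

# Hypothetical extracts from two subdistricts that each start numbering at 001.
subdistrict_a = [{"reg_no": "001/09", "facility": "Clinic A", "name": "Patient X"}]
subdistrict_b = [{"reg_no": "001/09", "facility": "Clinic B", "name": "Patient Y"},
                 {"reg_no": "002/09", "facility": "Clinic B", "name": "Patient Z"}]

merged = merge_registers(subdistrict_a, subdistrict_b)
collisions = colliding_numbers(merged)
print(sorted(collisions))  # ['001/09']
```

One possible remedy, offered only as a design note, is a composite key such as facility code plus local number, which would keep records distinct after merging without changing how facilities number their registers.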

Discussion
The current evaluation revealed that information in different components of the South African National TB Surveillance System is often incomplete and inconsistent.

Table notes: Values represent intraclass correlation coefficients for the continuous variable of age, and kappa statistics for categorical variables. Comparisons with district/2nd-level ETR only include Gauteng and Mpumalanga provinces, as the initial ETR in KwaZulu-Natal is at the district level, and the data are included as initial ETR for KwaZulu-Natal.
Over one-third of patients documented as smear positive in the Suspect Register (i.e., confirmed laboratory diagnosis) were not registered in the paper TB Register and did not have a TB Blue Card on file at the facility. This absence of recording and reporting suggests there are a number of persons with presumptive TB identified as having TB disease who may not be aware of their disease status and who may not be receiving TB treatment ('initial defaulters'). This represents a missed opportunity for TB control, as these are persons who already entered the public health system but were not followed up or managed. These individuals are likely to experience increased morbidity and continue to spread TB in the community.
Additionally, this study demonstrated that different data sources reflect different numbers of TB cases. Underestimating the true number of TB patients managed at facilities inhibits the ability of local programs and the NTP to properly allocate the human and logistical resources necessary to manage and treat TB patients. Among patients with records in all data sources, sociodemographic data appear to be largely retained between paper and electronic files. However, almost half the initial ETR database records were missing values for HIV status, and over three-quarters of records were missing information on patient DOT. This evaluation found that even when information is available in all sources, the values often differ. Agreement in the pairwise analyses of patient category, disease classification, pretreatment smear result, treatment regimen, and treatment outcome data ranged from satisfactory to excellent, which may be a testament to their programmatic relevance. However, pairwise and multi-level comparisons of information on DOT, two- and three-month smear results, and HIV-related data demonstrate a failure of the surveillance system and demand action for resolution. Overall, the data across electronic sources (i.e., the ETR system at all programmatic levels) are more consistent with one another than when paper sources are included. This may be attributed to differences in data quality management of paper and electronic systems or may be due to algorithms and decision rules that are part of the ETR computer system programming.
The current evaluation also identified multiple challenges that may inhibit the linear flow and transfer of information as the TB Surveillance System is designed (Fig. 3). Though the NTP guidelines indicate all persons with presumptive TB who are diagnosed with TB are to be recorded in the TB Register, our findings revealed that these confirmed TB patients are often not recorded or reported. The system also lacks the capacity to track or reconcile information on patients who move or are transferred. Individual TB patients are not assigned a unique TB number; therefore it is not possible to link information if a patient is listed twice after seeking care at more than one facility. These observations challenge the NTP's capacity to accurately monitor the TB burden and evaluate the management and outcomes of TB patients.

Conclusions
These findings suggest that one third of persons diagnosed with TB are not started on treatment or notified of their disease. Further, the information for persons with a TB record who are being managed at a health facility will differ according to different levels of management and will have different implications for guiding program activities. Because the ETR is fully implemented, items of information about the same patient on the blue card, the TB register, and all levels of the ETR are expected to be identical and redundant. The ETR is expected to replace other sources of redundant data except for paper sources at the facility. The sources of the discrepancies identified in the current study are unclear; however, differences between paper sources and the initial ETR may be due to data entry errors or updates made to the ETR without revision or documentation in the paper register. Inconsistencies between levels of the ETR may also be due to information being updated at one level without ensuring other levels of the ETR are updated, or due to problems with the merging process. It is evident that a well-structured quality control and assurance process is needed to improve the reliability of the TB Surveillance System. The information in the national ETR is the basis for evaluating, prioritizing needs, and allocating resources for the entire NTP and for generating annual statistics. Implementing measures to ensure all persons diagnosed with TB are properly retained and managed, and unifying paper, electronic, and laboratory systems may improve the integrity of the TB Surveillance System and also help to control and prevent the spread of TB in South Africa.

Competing interests
The authors declare that they have no competing interests.
Authors' contributions
LJP, APet, and LDM contributed to the development and design of the evaluation protocol. LJP, NB, CB, and LEB developed and piloted all data collection instruments and forms; created and finalized standard operating procedures and the SOP manual; and trained field staff. NB and CB led study field teams and were responsible for verifying and monitoring all data collected and monitoring data entry. LJP, CB, and LEB were responsible for data management and all statistical analyses. All authors provided assistance with interpretation of results. LJP, CB, and LEB were primarily responsible for the writing of the manuscript; NB, APet, APym, and LDM reviewed and made substantial edits and contributions to the final manuscript. All authors read and approved the final version of the manuscript.