Sensitivity and specificity of routine diagnostic work-up for tuberculosis in lung clinics in Yogyakarta, Indonesia: a cohort study

Background: Establishing a correct diagnosis is challenging. We aimed to investigate the sensitivity and specificity of routine tuberculosis (TB) diagnostic work-up in lung clinics in Indonesia, a country with the third highest TB burden and the second highest gap between notifications of TB cases and the best estimate of incident cases in the world.

Methods: In the lung clinics of the Province of Yogyakarta, Indonesia, we recruited all consecutive patients with symptoms suggesting TB, aged ≥18 years. Routine TB examination consisted of clinical evaluation, sputum smear microscopy, and chest radiography. For research purposes, we added sputum culture, Human Immunodeficiency Virus (HIV) testing, and follow-up for 1.5 years or 2.5 years if culture results disagreed with the initial clinical diagnosis. The initial diagnosis was considered incorrect if patients did not respond to treatment. We calculated sensitivity and specificity of the TB routine examination using culture and a composite reference standard (CRS – a combination of routine examination, culture, and follow-up) as the reference standards. All analyses were conducted with IBM SPSS Statistics 25 (IBM Corp., Armonk, NY, USA).

Results: Between 2013 and 2015, we included 360 participants, and 21 were excluded due to incomplete data. Among those analyzed, 115 were initially diagnosed with smear-positive TB, 12 with smear-negative TB, and 212 non-TB. In 15 study participants, the diagnosis was changed after median 45 (range: 14–870) days; 14 participants initially not diagnosed with TB were later diagnosed with TB, while one subject initially diagnosed with TB actually did not have TB. Compared with culture and CRS, TB routine examination had sensitivity of 85% (95%CI: 77–91) and 90% (95%CI: 84–94), and specificity of 86.3% (95%CI: 81–91) and 99.5% (95%CI: 97–100), respectively.

Conclusions: A combination of clinical evaluation with sputum microscopy and chest radiography provided high sensitivity and specificity in diagnosing TB in lung clinics; in only 4.4% the diagnosis was incorrect. There is a need to improve routine TB diagnostic work by using clinical evaluation, sputum smear microscopy, and chest radiography all together in other settings, such as in primary health centers.

Trial registration: NCT02219945, clinicaltrials.gov. Registered 19 August 2014 (retrospectively registered).


Background
Indonesia has the third largest TB burden in the world and the second largest gap between notified TB cases and the best estimate of incident cases [1]. As in many low-resource settings, TB is usually diagnosed by clinical examination, sputum smear microscopy, and chest radiography (CXR). Among the 442,172 new and relapsed pulmonary TB cases, only 54% were confirmed bacteriologically by microscopy or, less often, by culture [1]. Culture examination is reserved for suspected drug resistance in patients failing to respond to treatment or in patients who relapse [2]. Other diagnostic tools such as Polymerase Chain Reaction (PCR) and Gene-Xpert are still difficult to access and, without subsidy, are unaffordable for the majority of patients.
The World Health Organization (WHO) criteria for pulmonary TB diagnosis include clinical symptoms and a CXR suggestive of TB, isolation of Mycobacterium tuberculosis (MTB) by culture or acid-fast bacilli by sputum smear microscopy if culture is unavailable [3]. Clinicians might need to make a tentative diagnosis if clinical symptoms and CXR suggest TB while microscopy remains negative, or if the patient does not respond to the treatment of an alternative pulmonary diagnosis [3]. Patients with a clinical response to TB treatment are likely to have suffered from TB, although occasionally clinical improvement may occur in patients with sarcoid, cryptogenic organizing pneumonia or non-tuberculous mycobacteria (NTM) infection.
Establishing a correct diagnosis is challenging [4]. Symptoms have low sensitivity and specificity; the CXR appearance of NTM lung disease and lung cancer may mimic TB, and bacteriological examinations sometimes fail [4][5][6]. Only a few studies have evaluated the sensitivity and specificity of TB diagnosis in the routine settings of low- and middle-income countries [7,8].
The government of Indonesia established the latest national strategy for TB control in 2016, emphasizing reinforcement of TB programs' leadership, escalation of quality of TB services (prevention, diagnosis and treatment), management of risk factors for TB, enhancement of collaborations and community participation, and reinforcement of TB program management [9].
The sensitivity and specificity of a routine work-up for suspected TB in Indonesia, particularly in lung clinics, have not been studied. Such a study could inform the government about the quality of routine TB services and how to improve them, and thus support decision making and raise the quality of the national TB program. We included culture examination and long-term follow-up in order to evaluate the sensitivity and specificity of the diagnostic algorithm under service conditions.

Study aim and design
We aimed to investigate the sensitivity and specificity of routine TB diagnostic work-up in lung clinics in Indonesia. This research was a cohort study of patients suspected to have TB in Yogyakarta, Indonesia.

Study setting
Yogyakarta Province had a population of 3,679,176 in 2015 [10]. It had 5 public lung clinics that provide services for lung diseases, predominantly TB, and where more than half of suspected TB patients in Yogyakarta were screened [11]. In 2016, after our study was completed, the lung clinics were integrated into one lung hospital. Patients were either self-referred or referred by primary health centers. The lung clinics used microscopy, CXR, and HIV Voluntary Counseling and Testing services. The province has 21 primary health centers that mostly have sputum smear microscopy facilities; few among them have CXR facilities [12]. The total TB case notification rate in Yogyakarta in 2015 was 73/100,000 [10].

Study population
This study was part of a research project investigating the diagnostic sensitivity and specificity of an electronic nose in Yogyakarta, Indonesia (eNose study, NCT02219945). The study population consisted of TB suspects aged 18 years and older who agreed to participate in the eNose study. They were enrolled from October 2013 to December 2015.

Study parameters
As part of routine examination, spot-morning-spot smear microscopy and CXR were conducted. For research purposes, we added sputum culture, HIV testing, and follow-up of the study participants over time. Following the normal procedures, all study participants attended the lung clinics on two consecutive days to have their diagnosis established by the lung clinics' physicians. On the first day, patients underwent clinical examination, CXR, microscopy examination of a spot sputum specimen, and HIV testing. On the second day, patients collected morning and spot sputum specimens for microscopy and, for research purposes, another morning sputum specimen for microscopy and culture in the Microbiology Laboratory, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Indonesia. As part of the ongoing research, all TB suspects were prospectively followed up at their home, at the lung clinics, or by phone for 1.5 years after diagnosis. Patients were followed up for 2.5 years when the culture results disagreed with the initial clinical diagnosis (i.e. culture was positive for MTB but the clinical diagnosis was non-TB, or culture was negative but the patient was diagnosed with TB).
We recorded information about previous TB treatment, demographics (age, sex, and Body Mass Index, BMI), housing conditions, bacteriological examination, CXR reading, follow-up of clinical symptoms, and comorbidities (HIV or Type 2 Diabetes Mellitus, T2DM). T2DM diagnosis was based on the national guidelines [13]. A.M.S. and F.S. double-entered all data into a database and ensured that there were no missing data or typing errors.
Sputum microscopy and culture examinations with Löwenstein-Jensen (LJ) medium were conducted according to the WHO laboratory guidelines [14]. For research purposes, one independent physician (TsW, a pulmonologist) re-read the CXRs, which were scored as suggestive for TB, possible TB, abnormal but no TB, or normal. Patients who were lost to follow-up at any point in time, or who had incomplete results of any diagnostic test, were excluded from the study. Results of TB routine examination were available to those seeing the patients during follow-up, but not to the laboratory personnel.
The initial diagnosis (TB or non-TB) was considered incorrect and was revised in the referral health centers when patients did not respond to treatment, or if an alternative diagnosis was made during follow-up. A composite reference standard (CRS) [15,16] which consists of symptoms, sputum microscopy and culture, CXR, and follow-up determined the final diagnostic classification (TB or non-TB).
A TB case was defined as: (1) a patient with bacteriological confirmation and clinical illness or CXR suggestive of TB, and responding to TB treatment; or (2) a patient without bacteriological confirmation, but with clinical illness and CXR suggestive of TB, and responding to TB treatment. At the end of the study period, the patients were divided into two groups: (1) patients whose diagnoses were not revised, and (2) patients who had their definitive diagnosis changed compared to their initial diagnosis.
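The two-part case definition above can be read as a small decision rule. The following Python sketch is illustrative only: the class and function names are hypothetical, and it assumes that "clinical illness or CXR suggestive of TB" in the first criterion means that either finding suffices, whereas the second criterion requires both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Findings:
    """Hypothetical record of the findings used in the case definition."""
    bacteriological_confirmation: bool
    clinical_illness: bool
    cxr_suggestive: bool
    responded_to_tb_treatment: bool


def is_tb_case(f: Findings) -> bool:
    """Apply the two-part TB case definition (one possible reading)."""
    # Both criteria require a response to TB treatment.
    if not f.responded_to_tb_treatment:
        return False
    if f.bacteriological_confirmation:
        # Criterion 1: confirmation plus clinical illness OR suggestive CXR.
        return f.clinical_illness or f.cxr_suggestive
    # Criterion 2: no confirmation, so clinical illness AND suggestive CXR.
    return f.clinical_illness and f.cxr_suggestive


# A smear-negative, culture-negative patient with symptoms, a suggestive CXR,
# and a response to treatment counts as a TB case under criterion 2.
print(is_tb_case(Findings(False, True, True, True)))
```

Under this reading, a patient without bacteriological confirmation and with only one of the two supporting findings is not classified as a TB case, even after responding to treatment.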

Data analysis
Previous studies revealed that among all TB symptoms, cough, weight loss, and night sweats were independent predictors of TB with or without HIV coinfection [17][18][19]; we therefore confined the analysis to these symptoms. The CXRs were dichotomized: CXRs scored as suggestive for TB or possible TB were classified as "suggestive of TB", while CXRs scored as abnormal but no TB or as normal were classified as "not suggestive of TB". We calculated the proportion of revised diagnoses, and the sensitivity, specificity, and positive and negative predictive values (PPV and NPV) of the TB routine examination using culture and the CRS as the reference tests. We also investigated which factors were associated with the revision of diagnosis from non-TB into TB by calculating the relative risks (RR) and their 95% confidence intervals (95%CI). The RR was considered significant if the 95%CI did not include 1.00. All analyses were conducted with IBM SPSS Statistics 25 (IBM Corp., Armonk, NY, USA).
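As an illustration of the relative-risk analysis described above, the sketch below computes an RR and its 95%CI with the standard log-transformation (Katz) method. This is an assumption: the study reports only that SPSS was used, not which interval method was applied. The function names and the example counts are hypothetical and are not taken from the study tables.

```python
import math


def relative_risk(exposed_events, exposed_total,
                  unexposed_events, unexposed_total, z=1.96):
    """Return (RR, lower, upper) using the log-transformation method."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR); requires non-zero event counts in both groups.
    se_log_rr = math.sqrt(1 / exposed_events - 1 / exposed_total
                          + 1 / unexposed_events - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def is_significant(lower, upper):
    """Significance rule from the text: the 95%CI must not include 1.00."""
    return not (lower <= 1.0 <= upper)


# Hypothetical counts: 30/100 revised diagnoses among exposed patients
# versus 10/100 among unexposed patients.
rr, lower, upper = relative_risk(30, 100, 10, 100)
print(f"RR = {rr:.2f} (95%CI: {lower:.2f}-{upper:.2f}), "
      f"significant: {is_significant(lower, upper)}")
```

With very small numbers of events, as in this study, such intervals become wide, which is consistent with the decision not to attempt a multivariate analysis.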

Results
In all, 360 consecutive TB suspects were prospectively included. Twenty-one subjects were excluded for various reasons (Fig. 1). Most of the subjects were male, with median age 46 (range: 18-87) years, and normal BMI (18.5-25 kg/m²). Most subjects (79%) lived in a crowded neighborhood or poorly ventilated housing, 7% of them had TB index cases in their surroundings (household members, colleagues, or close neighbors), 12% of them had a history of previous TB, 2.4% had HIV infection, 7.7% had T2DM, and 32% were current smokers.
Out of 339 participants, 115 were initially diagnosed with smear-positive TB, 12 had smear-negative TB, and 212 had diagnoses other than TB (asthma, pneumonia, bronchiectasis, chronic bronchitis, Chronic Obstructive Pulmonary Disease, Obstructive Syndrome Post TB, lung fibrosis, lung abscess, lung cancer, pleural effusion, and polycystic lung disease). Of the eight patients who had HIV infection, 4 patients tested positive for TB in smear microscopy, 2 patients were diagnosed with TB but were smear negative, and the other 2 patients were diagnosed as non-TB. Table 1 shows that when culture and follow-up were used in parallel with TB routine examination to establish the final diagnosis, the final diagnosis changed for 15 (4.4%) study participants; 14 study participants initially diagnosed as non-TB turned out in retrospect to have TB, while only one subject initially diagnosed with TB actually did not have TB.
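The two percentages quoted for revised diagnoses share the same denominator (the 339 analyzed participants) but use different numerators. A minimal Python check, using only the counts reported above (the variable and function names are illustrative):

```python
# Counts reported in the Results; names are illustrative.
analyzed = 339
revised_non_tb_to_tb = 14   # initially non-TB, later diagnosed with TB
revised_tb_to_non_tb = 1    # initially TB, later considered non-TB


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


any_revision = percent(revised_non_tb_to_tb + revised_tb_to_non_tb, analyzed)
missed_tb = percent(revised_non_tb_to_tb, analyzed)
print(any_revision, missed_tb)  # 4.4 4.1
```

The first figure counts all incorrect initial diagnoses, while the second counts only TB cases missed at the initial work-up.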
Fourteen patients were initially diagnosed as non-TB (7 were diagnosed with bacterial pneumonia, 3 with chronic bronchitis, 2 with post TB sequelae, 1 with bronchiectasis, and 1 with bronchopneumonia), but their clinical conditions then deteriorated and, after a median of 83 (range: 14-870) days, 13 out of 14 patients were diagnosed with pulmonary TB, and 1 patient was diagnosed with Multi Drug Resistant Tuberculosis (MDR-TB). One of these pulmonary TB patients was HIV co-infected. One patient who was initially diagnosed with TB suffered from drug-induced liver toxicity at week 6 of treatment, and his TB drugs had to be stopped. He was referred to the hospital and received other non-TB drugs for one month; afterwards, when sputum microscopy was negative, the treatment was stopped and he has remained well since. We therefore concluded that he probably did not have TB. Two other TB patients were considered cured, but then relapsed and received a re-treatment TB regimen.
Among non-TB patients whose diagnoses were revised into TB, six patients with CXR suggesting TB developed TB after around one month, and one patient with CXR suggesting TB developed TB after 26 months. Seven patients who had CXR not suggesting TB developed TB around 9 months after their initial diagnostic work-up.
Seven non-TB patients had MTB cultured in a sputum specimen, but they did not develop any TB symptoms during follow-up. Thus, we suspected them to have false-positive culture results. Three out of these 7 patients had a history of previous TB. Among 30 TB patients whose diagnoses were not revised, 19 patients had negative culture, and in 11 patients, sputum culture grew NTM. They responded to TB treatment and all of them were considered cured. Therefore, these patients were suspected to have false-negative cultures. In two patients, a clerical error probably occurred, with results exchanged between them. Samples of these two patients were processed on the same day; one had positive smear microscopy for TB, while the other had a negative microscopy test. The patient who was smear-positive and culture-negative was treated with anti-TB drugs and subsequently recovered, whilst the other patient, whose smear was negative and whose culture was positive, did not get any TB treatment and did not deteriorate over time. Moreover, we noticed from the laboratory notes that three non-TB patients with negative culture whose diagnoses were revised into TB had collected saliva instead of sputum, and that their specimens were too small (1 ml, while > 2 ml is required). Most patients who were suspected to have false-negative cultures had collected saliva instead of sputum, and in insufficient volume (< 2 ml).
The TB routine examinations in lung clinics in Yogyakarta had sensitivity of 85% (95%CI: 77-91) and specificity of 86.3% (95%CI: 81-91) compared with culture, and sensitivity of 90% (95%CI: 84-94) and specificity of 99.5% (95%CI: 97-100) compared with the CRS (Table 2). Sensitivity and specificity of individual symptoms and diagnostic tests varied when the CRS was used as the reference test (Table 3). Table 4 shows that patients with TB index cases in their surroundings were 12 times more likely to have a revision of diagnosis from non-TB into TB than patients without TB index cases in their surroundings. Patients with a positive culture were 29 times more likely to have a revision of diagnosis than patients with a negative culture, and patients with CXR suggesting TB were 4 times more likely to have a revision of diagnosis than patients with CXR not suggesting TB. The number of patients whose diagnoses were revised was too low to conduct a multivariate analysis.
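The comparison against the CRS can be checked from the counts given in the text. The Python sketch below rebuilds the 2x2 table under the assumption that the final CRS classification differs from the initial diagnosis only in the 15 revised cases. The Wilson score interval is used for illustration; the study does not state which confidence interval method was applied, and the function names are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (illustrative choice)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total
                         + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


def diagnostic_accuracy(tp, fp, fn, tn):
    """Point estimates and intervals for the four standard measures."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "PPV": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        "NPV": (tn / (tn + fn), wilson_interval(tn, tn + fn)),
    }


# Counts from the text: 115 + 12 initially TB, 212 initially non-TB;
# 1 TB diagnosis revised to non-TB, 14 non-TB diagnoses revised to TB.
initial_tb, initial_non_tb = 115 + 12, 212
tb_to_non_tb, non_tb_to_tb = 1, 14
tp = initial_tb - tb_to_non_tb        # routine TB, CRS TB
fp = tb_to_non_tb                     # routine TB, CRS non-TB
fn = non_tb_to_tb                     # routine non-TB, CRS TB
tn = initial_non_tb - non_tb_to_tb    # routine non-TB, CRS non-TB

results = diagnostic_accuracy(tp, fp, fn, tn)
for name, (estimate, (low, high)) in results.items():
    print(f"{name}: {estimate:.1%} (95%CI: {low:.1%} to {high:.1%})")
```

Under these assumptions, the point estimates for sensitivity (126/140) and specificity (198/199) agree with the reported 90% and 99.5%.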

Discussion
TB routine examinations in Yogyakarta lung clinics had high sensitivity and specificity. Only 4.1% of the 339 consecutively enrolled study participants were initially not diagnosed with TB but later turned out to have TB. To our knowledge, this study is the first report addressing sensitivity and specificity of TB diagnosis under routine conditions in lung clinics in Indonesia. Follow-up as a part of a composite reference standard to assess diagnostic test characteristics has been successfully used in many different settings [16,17,20,21]. The diagnostic sensitivity and specificity in this study are higher than those in the study from KwaZulu-Natal, South Africa [8]. All 12 patients in our study who were bacteriologically negative but clinically diagnosed with TB had clinical improvement during TB treatment. Boehme et al. found that only 67 out of 138 patients who were clinically diagnosed with TB despite negative bacteriology improved after TB treatment [7]. Using culture as the reference test, a study in smear-negative TB patients in Pakistan showed that the clinical diagnosis was correct in 80% [22]. In a retrospective study in Taiwan, 87% of smear-negative patients who were clinically diagnosed with TB were confirmed to have TB based on their response to therapy [6]. Several factors may explain the higher diagnostic accuracy in our study: Indonesia has higher TB prevalence, which results in higher pretest probability; the lung clinics conducted multiple tests in the first examination, so the physicians had more data to support a correct diagnosis; and Indonesia has lower HIV prevalence [1], which contributes to the higher sensitivity of sputum microscopy and CXR. In Yogyakarta, Indonesia, the lung clinics had the facilities needed to establish a TB diagnosis, whereas some of the primary health centers did not have laboratory facilities for sputum smear microscopy or CXR and therefore needed to send samples or patients to the lung clinics or higher-level hospitals for microscopy or CXR examinations.
In addition, the lung clinics conducted both microscopy and CXR examinations in the first visit, while primary health centers followed the national diagnostic algorithm that recommends CXR examination only when sputum smears are negative. Therefore, the diagnostic delay for TB in primary health centers was significantly longer than in lung clinics [23], and the primary health centers may not have sufficient tools to establish the correct diagnosis. However, no studies have investigated the sensitivity and specificity of TB diagnosis in the primary health centers in Indonesia, so we could not compare their results with those of the lung clinics. Furthermore, the proportion of patients lost in primary health centers was higher than in lung clinics because of this longer diagnostic process [23], although the distances between patients' houses and primary health centers were generally shorter than the distances between their houses and lung clinics. Patients often needed to visit the primary health centers more frequently than the lung clinics to get diagnosed, which meant more costs and time spent and made patients reluctant to finish the diagnostic procedures [23]. Indeed, there is a large gap (estimated at around 47%) between notified and estimated incident TB cases due to undetected and underreported cases in Indonesia, including Yogyakarta [1]. Therefore, if the TB diagnostic process in the lung clinics could be replicated in the primary health centers, it would improve the detection rate and reduce further TB transmission in the community. Presence of all symptoms showed fairly good sensitivity and specificity, as reported earlier from Tanzania [18], and it helped to establish an accurate diagnosis. Persistent cough of at least 2 weeks' duration had the highest sensitivity, although other studies have reported low sensitivity and high specificity for persistent cough [17,24].
Night sweats were associated with the highest specificity. Indeed, night sweats have generally been considered as a specific symptom for TB though the pathophysiology is poorly understood; [25,26] night sweats are limited to the upper trunk in TB [27].
It is reasonable that a revision of diagnosis from non-TB into TB was associated with the presence of TB index cases in the patient's surroundings, as exposure to TB index cases is one of the risk factors for contracting TB [1,9]. Culture examination is indeed considered the gold standard for the diagnosis of TB; 10 out of 14 patients whose diagnoses were revised had positive culture but negative smear. Nonetheless, the solid medium used for culture in our study typically has a slow turn-around time, considerably longer than cultures in liquid media, which have the additional advantage of improved sensitivity. Liquid culture may, however, have lower specificity due to higher contamination rates [28][29][30]. To increase the sensitivity of LJ culture, 2-3 specimens per patient should preferably be examined to improve the diagnostic yield [31,32]. In only 34 participants in our study were multiple specimens submitted for culture. CXR has higher specificity and sensitivity compared to clinical symptoms, which corresponds with previous studies [24,33]. CXR in TB may resemble other pulmonary conditions [4,5]; however, almost all non-TB patients whose diagnoses were revised into TB and who had CXR suggesting TB developed TB after a short time. Our data illustrate that CXR is a helpful tool to screen for active TB. A previous study reported that ≥80% of patients with pulmonary TB have at least one among five different radiographic appearances [19].
According to the national TB guidelines in Indonesia, patients with negative smear should return or be followed up to investigate their responses to the antibiotics given [2]. However, in reality, patients often do not return to the health centers, and the patients who are lost are not tracked [23]. Many smear-negative patients in our study did not visit the lung clinics or other health centers after their initial diagnosis. In this situation, we note  [1]. Follow-up also provides an affordable way to evaluate the sensitivity and specificity of the diagnostic process in low-resource settings. This study is the first report on the proportion of false-positive cultures in Indonesia. We suspect that the rate of false (positive and negative) culture results was 11.2%, while our estimate of the false-positive culture rate is 2%, which is comparable with earlier reports in high- and middle/low-income countries [34][35][36]. To confirm suspected cross-contamination resulting in false-positive diagnoses, fingerprinting techniques help to identify similarity of strains from patients that have no epidemiological link. For resource-poor settings like Indonesia, these techniques are currently unaffordable, but clinical laboratories can improve operational procedures by enhancing adherence to protocols [31] and participating in External Quality Assessment programs. Although fingerprinting analysis was unavailable, we assessed the possibility of false-positive culture in our study using all available data. Insufficient quality and quantity of samples might have influenced the quality of culture in these suspected false culture cases. Therefore, laboratory and other health care workers should instruct patients how to produce sufficient samples to increase the sample quality and detection rate [37]. In a similar setting in Java, Indonesia, the education provided to patients on sputum collection was suboptimal [38].
To address the diagnostic delay, a rapid, reliable, and point-of-care diagnostic tool is needed. However, clinicians need to be conscious of the possibility of false-positive test results of any diagnostic tool, and clinical judgment remains important. Some tools are now under development, such as the loop-mediated isothermal amplification test for TB [1], and electronic nose [39]. In the settings of lung clinics and primary health centers in Yogyakarta and elsewhere in Indonesia, Gene-Xpert TB/RIF is not readily available; it is available in some hospitals [40], and although the use of Gene-Xpert could increase the TB positivity rate [40], the relatively high cost of cartridges precludes extensive use if suspicion for drug resistance is low. False-positive test results for TB may also occur with PCR [41] or Gene-Xpert [42].
While waiting for the implementation of a rapid, reliable, point-of-care TB diagnostic tool, optimizing the current routine TB diagnostic work-up is a reasonable option. The current national diagnostic algorithm, which recommends CXR examination only when sputum smear results are negative, was shown to delay the TB diagnosis and to be one of the factors causing patient loss, although it was originally developed to improve the sensitivity and specificity of TB diagnosis [23]. Our study showed that the diagnostic process in the lung clinics, which employs clinical evaluation, sputum smear microscopy, and chest radiography all together, had high sensitivity and specificity and at the same time reduced the delay in TB diagnosis, as indicated by a previous study [23]. Therefore, an attempt should be made to replicate the diagnostic process of the lung clinics in other settings, such as primary health centers. If laboratory facilities cannot be provided in the primary health centers, there should be a prompt referral system that enables patients to get microscopy and CXR examinations on the same day. This effort would reduce the rate of patient loss, thus reducing the number of undetected cases and ongoing TB transmission.
There are some limitations in this study; in particular, it was performed in Yogyakarta province alone. The organization of lung clinics in Yogyakarta is however typical and representative for lung clinics in Indonesia and we therefore assume that the diagnostic sensitivity and specificity of TB routine examination is comparable to other lung clinics in Indonesia. Another limitation was that we used solid (LJ) culture medium, which has lower sensitivity compared to liquid (e.g. Mycobacteria Growth Indicator Tube) culture, and we used only one specimen for culture. We did not perform fingerprinting so that we could not further confirm the suspected false positive culture results, and we did not perform Drug Susceptibility Testing (DST) on every specimen; however, we did not detect study participants failing to respond to TB treatment and the MDR-TB prevalence in Indonesia is low [1], thus for that matter, the lack of DST did not confound our assessment.

Conclusion
In summary, a combination of clinical evaluation with sputum microscopy and chest radiography in lung clinics provided high sensitivity and specificity in diagnosing TB; in only 4.4%, the diagnosis was incorrect. While waiting for an implementation of point-of-care, fast, accurate, and easy-to-use TB diagnostic tool, there is a need to improve routine TB diagnostic work by using clinical evaluation, sputum smear microscopy, and chest radiography all together in other settings, such as in primary health centers.