We evaluated agreement between health administrative data and self-report for the ascertainment of chronic disease in a population of community-dwelling Ontario residents, using linked population-based data. With the exception of acute myocardial infarction and stroke, prevalence estimates were higher with health administrative data than with self-report data. In general, we found a good level of agreement between data sources only for diabetes and hypertension. For the remaining diseases examined, there was considerable discordance in ascertainment that could only be partially explained by individual and disease characteristics. There are likely multiple reasons for this discordance, including disease-specific biases, misclassification arising from the disease definitions used, and the prevalence of the disease.
Okura proposes that diseases which are less familiar to patients and have nonspecific or intermittent symptoms, such as heart failure or chronic lung disease, may be particularly prone to underreporting by patients. Conversely, administrative data may be more likely to identify chronic diseases requiring ongoing contact with the health care system [10, 12]. This is in keeping with our results, in which the prevalence of most diseases was higher by health administrative data. Our finding that the self-reported prevalence of stroke or myocardial infarction was higher than the prevalence from administrative data is also consistent with other studies [11, 12, 22]. These two diseases are commonly known in the community, which may lead patients to falsely attribute their symptoms to them. False-positive rates of self-reported stroke ranging from 5% to 15% have even been reported from specialized stroke units, mostly among patients admitted with transient ischaemic attacks. Rosamond et al. found a 40% false-positive rate of self-reported myocardial infarction among patients in a coronary care unit, primarily due to hospitalization for unstable angina.
The particular question used in a survey and the case definition employed in administrative data can also affect ascertainment. In general, health administrative case definitions are restricted to patients with a hospitalization or repeated health care contacts for a disease, and have a limited look-back period. This can lead to under-ascertainment by administrative data, which is particularly important for “event-based” diseases such as stroke and myocardial infarction, where “silent” events, events not requiring hospitalization, or events that occur outside the look-back period are not identified. The wording of the survey question can likewise affect self-reported ascertainment. For example, in this study the survey question for stroke was “do you suffer from the consequences of a stroke”, whereas the health administrative definition identified all persons admitted to hospital with a diagnosis of stroke or transient ischemic attack (Additional file 1: Appendices B and C).
There is no agreement about which concordance measure is most valid when comparing ascertainment between data sources. The level of agreement in this study varied widely depending on the measure used, particularly for low-prevalence diseases. For example, while stroke concordance was very high when comparing raw prevalence estimates (1.0% and 1.7% for administrative and self-report data, respectively), it was only fair according to kappa (κ = 0.36). Some concordance measures have known limitations that are important in this context: sensitivity and specificity are less valid when no gold standard for diagnosis exists, and the kappa statistic is unreliable in the setting of a significant imbalance in the 2 × 2 table. In a recent review evaluating the quality of health administrative data, Benchimol et al. proposed that a minimum of four statistical measures be used to assess the accuracy and validity of an administrative data source, to help mitigate these limitations. Others have similarly recommended that, when measuring agreement in administrative data, researchers report kappa, the prevalence, positive agreement, negative agreement, and the relative frequency of each cell (a, b, c and d). While there are other measures of agreement, such as the prevalence-adjusted kappa, these may not be as accurate in the setting of low-prevalence conditions. We agree with these general guidelines, and we found that examining the raw counts in the 2 × 2 table often revealed the patterns of discordance for a particular disease most clearly. Until the patterns of concordance for specific diseases are more clearly understood, summary concordance measures (including prevalence estimates) used alone may obscure the underlying patterns and should be avoided. In addition, the concordance measures selected will need to take into account the particularities of the disease and the population sampled.
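To illustrate why the choice of measure matters for a rare condition, the following minimal sketch (in Python; the cell counts are hypothetical and do not reproduce our study data) computes raw agreement, Cohen's kappa, the prevalence- and bias-adjusted kappa (PABAK), and positive and negative agreement from a 2 × 2 table:

```python
# Hypothetical (illustrative) 2x2 counts for a rare condition -- not study data.
# a: positive in both sources; b: administrative only; c: self-report only; d: negative in both.
a, b, c, d = 60, 40, 110, 9790
n = a + b + c + d

p_o = (a + d) / n                                       # observed (raw) agreement
p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2    # chance-expected agreement
kappa = (p_o - p_e) / (1 - p_e)                         # Cohen's kappa
pabak = 2 * p_o - 1                                     # prevalence- and bias-adjusted kappa
pos_agree = 2 * a / (2 * a + b + c)                     # positive (specific) agreement
neg_agree = 2 * d / (2 * d + b + c)                     # negative agreement

print(f"raw agreement = {p_o:.3f}, kappa = {kappa:.2f}, PABAK = {pabak:.2f}")
print(f"positive agreement = {pos_agree:.2f}, negative agreement = {neg_agree:.2f}")
```

With these hypothetical counts, raw agreement exceeds 98% and negative agreement is close to 1, while kappa and positive agreement remain below 0.5; this is why a single summary measure, or prevalence alone, can be misleading for low-prevalence conditions.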
We were particularly interested in the relationship between morbidity and agreement. This study is the first to present and explore the relationship of the HUI, a validated self-reported measure of overall disease burden, to disease ascertainment. As anticipated, we found that for some conditions (myocardial infarction, stroke and heart failure), cases identified by administrative data had higher median HUI scores (and thus lower reported morbidity) than cases identified by self-report. For these conditions, health administrative data therefore tended to identify healthier patients than those found through self-report. While the HUI is an overall measure of morbidity rather than a disease-specific measure of severity, it is probable that the severity of the underlying disease relates strongly to overall morbidity. Our finding underscores the need for researchers to consider the clinical significance of cases identified by different data sources.
Stroke and congestive heart failure had the lowest HUI scores, the largest differences in median HUI scores between ascertainment methods, and poor concordance between the two disease ascertainment methods, whereas diabetes and hypertension had high HUI scores and high concordance in both median HUI and disease ascertainment. Previous research has, in general, found that comorbidity is associated with poorer agreement in ascertainment [10–12]. Our findings confirm that care should be taken in the interpretation of disease estimates in populations with high levels of disease burden.
We acknowledge that there is no clear reference standard for the ascertainment of chronic diseases. While clinical charts are often used to assess ascertainment accuracy, even this approach is not a gold standard. For example, clinic chart review for diabetes can miss cases who are not receiving glucose-lowering medications, are not regular clinic attendees, or have their diabetes care provided by other practitioners. In our view, disease ascertainment is usually linked to disease severity, with less severe disease often poorly ascertained. Therefore, case ascertainment (the likelihood of truly being diagnosed with a disease), disease severity, and the health burden from disease are all intertwined. The paucity of disease-specific severity measures that use administrative data reveals an important knowledge gap in our efforts to improve the accurate ascertainment of diseases using population-based data.
This study excludes a number of key chronic diseases for which we do not yet have validated algorithms, but we do not feel that this affects the implications of our findings. It is clear that the relationships between ascertainment, disease, and patient characteristics are complex. Future analyses should consider multivariable methods to explore the effects of these factors.
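As an illustration only, such a multivariable analysis might take a form like the following sketch (the dataset, file name and variable names are hypothetical; it assumes a person-level linked file with one record per respondent and an indicator of discordant ascertainment for a given disease):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical person-level dataset: one row per respondent, with an indicator of
# discordant ascertainment and candidate individual and morbidity characteristics.
df = pd.read_csv("linked_cohort.csv")  # assumed columns: discordant, age, sex, hui_score, n_comorbidities

# Model the odds of discordant ascertainment as a function of patient characteristics.
model = smf.logit(
    "discordant ~ age + C(sex) + hui_score + n_comorbidities",
    data=df,
).fit()
print(model.summary())
```

Such a model would allow the independent contributions of age, sex, overall morbidity and comorbidity burden to discordance to be examined jointly rather than one factor at a time.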