Validation of data quality in the Swedish National Register for Breast Cancer

Löfgren, Lars; Eloranta, Sandra; Krawiec, Kamilla; Asterkvist, Annette; Lönnqvist, Charlotta; Sandelin, Kerstin

doi:10.1186/s12889-019-6846-6

Research article
Open access
Published: 02 May 2019

Lars Löfgren ORCID: orcid.org/0000-0002-1221-3199¹,
Sandra Eloranta²,
Kamilla Krawiec³,
Annette Asterkvist³,
Charlotta Lönnqvist³ &
Kerstin Sandelin⁴
On behalf of the steering group of the National Register for Breast Cancer

BMC Public Health volume 19, Article number: 495 (2019)

Abstract

Background

The National Breast Cancer Register (NBCR) of Sweden was launched in 2008 and is used for quality assurance, benchmarking, and research. Its three reporting forms encompass Notification, Adjuvant therapy and Follow-up. Target levels are set by national and international guidelines. This national validation assessed data quality of the register.

Methods

Data recorded through the Notification form were evaluated for completeness, timeliness, comparability and validity. Completeness was assessed by cross-linkage to the Swedish Cancer Register (SCR). Comparability was analyzed by comparing registration routines in NBCR with national and international guidelines. Timeliness was defined as the difference between the earliest date of diagnosis and the reporting date to NBCR. Validity was assessed by re-abstraction of medical chart data for 800 randomly selected patients diagnosed in 2013.

Results

The completeness of the NBCR was high with a coverage across regions and years (2010–2014) of 99.9%. Of all incident cases reported to the NBCR in 2013 (N = 8654), 98.5% were included within 12 months and differences between health regions were essentially negligible. Coding procedures followed guidelines and were uniformly adhered to. The proportion of missing values was < 5% for most variables and reported information generally had high exact agreement (> 90%).

Conclusions

Completeness of data, comparability and agreement in the NBCR were high. For clinical quality purposes and benchmarking, improved timeliness is warranted. Assessment of validity has resulted in a thorough review of all variables included in the Notification form with clarifications and revision of selected variables.

Background

The national cancer inquiry in 2005 concluded that cancer care in Sweden, although keeping a high standard, had several inequities in its structure, process and outcome [1]. Regional cancer centers were established through the Association of Local Authorities and Regions to build up national cancer registers [2]. For the most prevalent cancers, regional registers were already established and formed the basis for many outcome studies.

The National Breast Cancer Register (NBCR) has been operating since 2008 and collects data in a common national database. It encompasses the diagnostic and therapeutic processes and outcome for all primary invasive and in situ breast cancer cases. Registration is performed via the web-based INCA platform (Information Network for Cancer care). Quality indicators proposed by the National Board of Health and Welfare mirror the care process. Coding routines follow national and international classification guidelines. Cancer staging and TNM classification followed the AJCC Cancer Staging Manual, 7th edition, and the TNM Classification of Malignant Tumours, UICC 7th edition [3, 4].

The register consists of three sections: Notification (including planned adjuvant therapy), Adjuvant therapy and Follow-up. Target levels are set by national and international guidelines. Continuous revisions and updates of the variables make it a dynamic work tool. The responsibility for reporting lies with the individual health care providers, and data are further monitored by the six Regional Cancer Centers located across Sweden’s health care regions. The NBCR steering committee has national multi-professional and multi-disciplinary team members and representatives from the breast cancer survivor group. Individuals can actively opt out of registration, although this is extremely rare.

Based on register data, the National Board of Health and Welfare has previously published reports for several cancer diagnoses that assess and follow up on defined quality indicators such as compliance and timeliness [5]. The reports serve as audits, quality assurance and benchmarks. Other stakeholders such as the public, patient representatives, purchasers of health care and decision-makers make use of reported data. Register data also provide a resource for clinical and epidemiologic research.

In 2013, the NBCR steering committee decided to conduct a nationwide validation of the recorded data based on a manual (AKI) developed by the working group for quality registers and INCA [6]. The manual builds upon the validation strategy for cancer registry data proposed by Parkin and Bray [7, 8] and includes the following four quality dimensions: timeliness, completeness, comparability and validity. This study presents the results of the nationwide validation of the NBCR and aims to describe how the results have been instrumental for improving the register through revision of the included variables, the reporting forms and its manual, and to assist in training of data managers.

Methods

For evaluation of timeliness, all incident cases reported to the NBCR in 2013 were included (N = 8654), and the difference in time between the earliest date of diagnosis and the reporting date in the registry was calculated.
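The timeliness measure described above can be sketched as follows. This is a minimal illustration, not the register's own implementation: the function name `timeliness`, the day cut-offs used to approximate 3, 6 and 12 months, and the example dates are all assumptions introduced here.

```python
from datetime import date

# Assumed day cut-offs approximating 3, 6 and 12 months (not from the paper).
CUTOFFS = {"3 months": 91, "6 months": 183, "12 months": 365}


def timeliness(cases, cutoffs=CUTOFFS):
    """Return the percentage of cases reported within each cut-off.

    `cases` is a list of (earliest_diagnosis_date, reporting_date) pairs.
    """
    # Delay in days between earliest diagnosis and reporting to the register.
    delays = [(reported - diagnosed).days for diagnosed, reported in cases]
    return {
        label: 100 * sum(d <= limit for d in delays) / len(delays)
        for label, limit in cutoffs.items()
    }


# Hypothetical cases for illustration only; these are not NBCR records.
example_cases = [
    (date(2013, 1, 10), date(2013, 2, 20)),
    (date(2013, 3, 5), date(2013, 8, 1)),
    (date(2013, 6, 1), date(2014, 4, 15)),
    (date(2013, 9, 12), date(2014, 11, 1)),
]

print(timeliness(example_cases))
```

Grouping the same calculation by health care region would give a regional breakdown of the percentages.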

Completeness was assessed by comparing the cases in the NBCR with registrations in the Swedish Cancer Registry (SCR) [9], to which reporting is mandatory according to the National Board of Health and Welfare’s regulations (SOSFS2006:15). Data from the period 2010–2014 were used. The completeness of the SCR is secured because every diagnosed cancer case is reported both by the clinician and by the pathology laboratory after verification by morphological examination, i.e. biopsies and autopsies. Two publications describe the process in detail [10, 11].
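In its simplest form, a cross-linkage of this kind amounts to checking which cases in the reference register also appear in the quality register. The sketch below assumes that both registers share a case identifier; the identifiers, the function name `completeness` and the example data are hypothetical, and the actual linkage procedure is not described in this text.

```python
def completeness(nbcr_ids, scr_ids):
    """Return (coverage in percent, SCR cases with no NBCR match).

    Coverage is the share of cases in the reference register (SCR)
    that are also found in the quality register (NBCR).
    """
    scr = set(scr_ids)
    nbcr = set(nbcr_ids)
    if not scr:
        raise ValueError("reference register is empty")
    missing = scr - nbcr  # cases known to the SCR but absent from the NBCR
    coverage = 100 * (len(scr) - len(missing)) / len(scr)
    return coverage, sorted(missing)


# Hypothetical identifiers for illustration only.
scr_cases = ["A1", "A2", "A3", "A4"]
nbcr_cases = ["A1", "A2", "A4", "B9"]

coverage, missing = completeness(nbcr_cases, scr_cases)
print(coverage, missing)
```

Returning the unmatched cases alongside the percentage makes it possible to follow up individual records with the reporting unit.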

Comparability refers to the recording and coding practices, which should be clear, nationally uniform and in line with international guidelines to enable comparisons between regions and countries. Inclusion criteria are: location (primary breast cancer); sex (women and men); age (all ages); morphology (invasive breast cancer and carcinoma in situ); basis for diagnosis (all cases except diagnosis at autopsy).

Two control functions secure comparability. Firstly, the manual and the report form are unique documents. Secondly, monitoring is performed at the regional cancer centers, whereby adherence to the inclusion criteria is checked and any erroneously reported data or ambiguities are corrected.

Comparability concerning the workflow was assessed by a questionnaire addressing how different breast units handled reporting routines, involved staff, time allotted, and management support [12].

To assess validity, re-abstracted data from medical records was compared to the reported data via an independent review process. Eight hundred cases recorded between September 2013 and January 2014 were randomly selected using a two-stage cluster sampling plan.

Two hospitals offering breast cancer services (ranked according to size) from each health care region were selected. Within each region (cluster), a subsample of all breast cancer patient records in the 12 selected hospitals was drawn with a probability proportional to the size of the region and hospital. The sampling plan was chosen to ensure national representation as well as participation from both large and small breast cancer units.
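One way to picture the second stage is to allocate the total sample across the selected hospitals in proportion to their caseload and then draw a simple random sample within each hospital. The sketch below is a simplification under that assumption and is not necessarily the exact plan used in the study; the function names, hospital labels and record counts are hypothetical.

```python
import random


def allocate(total, sizes):
    """Split `total` across units in proportion to `sizes` (largest remainder)."""
    grand = sum(sizes.values())
    raw = {unit: total * size / grand for unit, size in sizes.items()}
    alloc = {unit: int(share) for unit, share in raw.items()}
    leftover = total - sum(alloc.values())
    # Hand out the remaining slots to the units with the largest remainders.
    by_remainder = sorted(raw, key=lambda u: raw[u] - alloc[u], reverse=True)
    for unit in by_remainder[:leftover]:
        alloc[unit] += 1
    return alloc


def draw_sample(records_by_hospital, total, seed=0):
    """Draw a size-proportional random sample of records from each hospital."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sizes = {h: len(recs) for h, recs in records_by_hospital.items()}
    alloc = allocate(total, sizes)
    return {h: rng.sample(records_by_hospital[h], alloc[h])
            for h in records_by_hospital}


# Hypothetical hospitals and record identifiers for illustration only.
records = {
    "H1": list(range(0, 300)),
    "H2": list(range(300, 400)),
    "H3": list(range(400, 500)),
}
sample = draw_sample(records, total=50, seed=1)
print({h: len(s) for h, s in sample.items()})
```

Proportional allocation keeps large units from being under-represented while still guaranteeing that small units contribute records.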

Re-abstraction of medical records took place in the second part of 2014 and was performed by three specialist nurses with previous experience in register validation and monitoring, henceforth referred to as validators. The re-abstracted information was entered into a specially designed module and subsequently merged with the originally recorded data to calculate exact data agreement. Exact agreement corresponds to the proportion of women for whom the data recorded in the NBCR is the same as in the validation data set. Missing observations were also included in the calculations of exact agreement to account for the plausible situations when 1) data had been reported to the NBCR but could not be found in the medical records, 2) the information was available in the medical records but had not been reported to the NBCR. Strength of agreement was measured by Cohen’s Kappa (κ) scores for categorical variables, including 95% confidence intervals (CI), and Pearson correlation coefficients (r) for numerical variables.
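The three agreement measures named above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: missing values are represented by `None` and treated as a category of their own, the function names are introduced here, and the confidence intervals reported in the study are not computed.

```python
from collections import Counter
from math import sqrt


def exact_agreement(registered, reabstracted):
    """Share of records with identical values; None (missing) counts as a value."""
    pairs = list(zip(registered, reabstracted))
    return sum(a == b for a, b in pairs) / len(pairs)


def cohen_kappa(registered, reabstracted):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(registered)
    p_observed = exact_agreement(registered, reabstracted)
    reg_counts = Counter(registered)
    val_counts = Counter(reabstracted)
    # Chance agreement from the marginal category frequencies of each source.
    p_expected = sum(reg_counts[c] * val_counts[c] for c in reg_counts) / n ** 2
    if p_expected == 1:
        return float("nan")  # kappa is undefined when only one category occurs
    return (p_observed - p_expected) / (1 - p_expected)


def pearson_r(x, y):
    """Pearson correlation coefficient for two numerical sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only.
reg = ["yes", "yes", "no", "no"]
val = ["yes", "no", "no", "no"]
print(exact_agreement(reg, val), cohen_kappa(reg, val))
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

For numerical variables such as dates, the values would first be converted to numbers (for example days since a reference date) before calling `pearson_r`.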

Results

Timeliness

The proportion of cases reported within 3 months varied widely between regions, ranging from 30.2% in the Southeast to 77.4% in the Uppsala-Örebro region. In 2013, 83.8% of all incident cases had been reported to the NBCR within 6 months. At 12 months, 98.5% had been reported (Fig. 1). Differences between the regions were essentially negligible after 1 year (Table 1).

Table 1 Timeliness of reporting to the National Breast Cancer Register (NBCR) by health care region in Sweden in 2013. Percentage of cases reported within 3, 6 and 12 months

Completeness

The average coverage across all healthcare regions during 2010–2014 was 99.9%. The difference in coverage between regions was small for all calendar years (Table 2).

Table 2 Completeness (%) of reporting to the National Breast Cancer Register (NBCR) 2010–2014 as compared to reporting to the National Cancer Register by health care region in Sweden

Comparability

Eleven of 12 units responded to the questionnaire about the workflow in the reporting process. Nurses and doctors were mainly responsible for reporting to the NBCR, and responses indicated local and regional differences concerning the workflow and routines. Notably, when asked whether time was allotted for registration, five units answered yes and six answered no. Regarding support from the departmental leadership, the respondents described strong formal support but weak actual support and insufficient resource allocation.

Validity

Of the 800 patients selected for re-abstraction of medical records, one individual had been incorrectly registered and was excluded. The number of missing observations and the exact agreement between the originally recorded and the re-abstracted data are summarized in Table 3.

Table 3 Missing data and exact agreement for variables included in the validation of the notification section of the Swedish National Register for Breast Cancer

A detailed summary of each variable included in the validation can be found in the publicly available report (Swedish only) [12].

Lead times

The recorded variables (“Date first contact”, “Date first visit to breast unit”, “Date of first diagnosis”, “Date for care plan”, “Date of surgery”) were close to complete in the NBCR (≥ 99.9%). There have been historical ambiguities regarding the definition of the variable “Date first contact” which refers to the date of first contact with the specialist clinic/breast unit, but the extracted data in this material correlated highly with the information recorded in the NBCR (r = 0.91) (Fig. 2). Overall, the correlation coefficients were 0.90 or higher for all lead time variables referring to dates prior to start of treatment.

The completeness of the variables “Date of first postoperative histopathological result” and “Date of adjuvant therapy decision” in the NBCR was slightly lower (92.5 and 92.7%, respectively). However, for both variables the correlation with the information recorded in the medical records was high (r > 0.95).

Diagnostics

While the variable “Detection within the screening program” showed good validity (exact agreement 95%, κ = 0.90, 95% CI: 0.82–0.96), the strength of agreement for the variable “Malignant diagnosis verified at first visit (to the breast cancer unit)” was weaker (exact agreement 83%, κ = 0.58, 95% CI: 0.51–0.64) (Fig. 3a).

The exact agreement for the variable “Multidisciplinary Treatment Conference” was high (95%), but κ was zero because all of the original records stating that no multidisciplinary meeting had taken place (n = 17) were classified differently in the re-abstracted data (Fig. 4).
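A κ of zero alongside 95% exact agreement follows from the way κ corrects for chance: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed and p_e the chance-expected agreement. The sketch below reproduces the pattern with a constructed table. Only the 17 discordant "no" records and the 95% agreement come from the text; the total of 340 records and the assumption that every re-abstracted record was coded "yes" are hypothetical choices made so that the numbers work out.

```python
from collections import Counter


def cohen_kappa(registered, reabstracted):
    """Cohen's kappa from two equally long lists of category labels."""
    n = len(registered)
    p_observed = sum(a == b for a, b in zip(registered, reabstracted)) / n
    reg_counts = Counter(registered)
    val_counts = Counter(reabstracted)
    p_expected = sum(reg_counts[c] * val_counts[c] for c in reg_counts) / n ** 2
    return p_observed, p_expected, (p_observed - p_expected) / (1 - p_expected)


# Hypothetical table: 323 records coded "yes" in both sources and 17 records
# coded "no" in the register but "yes" at re-abstraction.
registered = ["yes"] * 323 + ["no"] * 17
reabstracted = ["yes"] * 340

p_o, p_e, kappa = cohen_kappa(registered, reabstracted)
print(p_o, p_e, kappa)
```

Because one source uses a single category in this construction, the chance-expected agreement equals the observed agreement, so the numerator of κ vanishes even though almost all records agree.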

Stage

In order to fulfill the Swedish Cancer Registry requirements, the notification form should include information on clinical stage (TNM) at the time of diagnosis.

The agreement for clinical TNM classification was moderate although TNM classification is mandatory to report. The T (tumour) category showed lowest agreement (exact agreement 70%, κ = 0.54, 95% CI: 0.49–0.57), followed by the N (lymph nodes) (exact agreement 93%, κ = 0.73, 95% CI: 0.66–0.79) and M (distant metastasis) (exact agreement 99%, κ = 0.67, 95% CI: 0.60–0.73) categories (Figs. 3a and 4).

Planned care

Recommended primary treatment (surgery, neoadjuvant chemo-, radio- or endocrine therapy), as discussed during the Multidisciplinary Treatment Conference, was reported with good validity. Conversely, indications for non-primary surgical treatment (exact agreement 99%, κ = 0.36, 95% CI: 0.24–0.46) and whether other therapies were planned (exact agreement 99%, κ = 0.41, 95% CI: 0.20–0.62) showed weaker agreement (Fig. 3a).

Surgery

Final result of breast surgery (breast conservation/mastectomy/subcutaneous mastectomy/no breast surgery) and number of supplementary surgeries showed high validity, with the exception of contralateral prophylactic surgery (exact agreement 96.2%, κ = 0.11, 95% CI: 0.10–0.16) (Fig. 3a). For the variable “Axillary surgery” the exact agreement was high (91%) but the strength of agreement was relatively low (κ = 0.48, 95% CI: 0.42–0.54), due to inconsistencies in the re-abstracted data. After re-recording of validation responses (i.e. axillary surgery = “yes” if the re-abstracted number of lymph nodes > 0) the κ increased to 0.89. Reasons for supplementary surgery of the axilla had poor validity (exact agreement 58%, κ = 0.34, 95% CI: 0.19–0.48). The variable coding for complications after surgery (“Additional interventions performed due to surgical complications within 30 days”) exhibited low agreement (exact agreement 97%, κ = 0.25, 95% CI: 0.18–0.31).

Histopathology

While the variables “Invasiveness” and “Type of invasive histopathology” demonstrated high validity, the variable “Multifocality” was associated with poor agreement (exact agreement 77%, κ = 0.28, 95% CI: 0.12–0.42) (Fig. 3b). This is likely due to ambiguities in the manual, since similar information is coded in the variable “Number of invasive tumors”, which shows a good result (exact agreement 94%, κ = 0.79, 95% CI: 0.74–0.84).

The validity of the receptor status variables, ER (Estrogen receptor) status (exact agreement 91%, κ = 0.75, 95% CI: 0.70–0.79), PR (Progesterone receptor) status (exact agreement 85%, κ = 0.72, 95% CI: 0.67–0.76) and Ki67 (antigen Ki-67) (exact agreement 74%, κ = 0.64, 95% CI: 0.60–0.68), was good when they were classified as binary variables (No/Yes). Moreover, the correlation between the percentages of immunohistochemical staining reported to the NBCR and those recorded in the medical records was also strong (ER: r = 0.97, PR: r = 0.94, Ki67: r = 0.97). The agreement for the variable HER2-neu (Human Epidermal Growth Factor receptor 2) was, however, weaker (exact agreement 73%, κ = 0.48, 95% CI: 0.43–0.51). HER2 analysis is more time consuming and the results arrive later. In 116 cases, the validators recorded that a HER2 analysis had not been carried out although it had been reported as negative, reflecting a failure to report the test result or to note it distinctly in the patient’s file. The validity for “HER2-neu” was heterogeneous across healthcare regions [12]. Regarding the biologic tumor variables, in situ cases were overrepresented among records with missing values (the proportion of missing values in the NBCR is 30–31% for ER, PR and Ki67 and 34% for HER2-neu).

Postoperative treatment

Planned postoperative treatment variables showed generally poorer validity (Fig. 3b).

Discussion

Register data contain an abundance of valuable information useful for improvement of care on a population basis. This validation study showed that data from the Notification form are of high quality and that the validity is generally good. To our knowledge, there are few examples of validation concerning process and outcome data of cancer quality registers [13,14,15]. However, several validation studies have been published on national cancer registers [16,17,18,19]. Studies based on cancer registries comparing time from diagnosis to treatment between different cancer forms have also been published [20, 21]. An enquiry commissioned by the Ministry of Health and Welfare on waiting times for diagnosis and treatment, derived from three Swedish quality registers, found large variations between the diagnoses breast, colorectal and prostate cancer [22].

For some aspects of the investigated quality dimensions, obvious options for improvement were identified. Variables with information considered insignificant were also identified. While the registers aim to deliver process data, they also serve as a source for clinical research, where the demand for detailed information is high. Striking a balance between sufficient and relevant data that serves both purposes is challenging. The main findings of each of the four quality dimensions are discussed below.