Incorporating competing risk theory into evaluations of changes in cancer survival: making the most of cause of death and routinely linked sociodemographic data

Background Relative survival is the most common method used for measuring survival from population-based registries. However, the relative survival concept of ‘survival as far as the cancer is concerned’ can be biased due to differing non-cancer risk of death in the population with cancer (competing risks). Furthermore, while relative survival can be stratified or standardised, for example by sex or age, adjustment for a broad range of sociodemographic variables potentially influencing survival is not possible. In this paper we propose Fine and Gray competing risks multivariable regression as a method that can assess the probability of death from cancer, incorporating competing risks and adjusting for sociodemographic confounders.

Methods We used whole of population, person-level routinely linked Western Australian cancer registry and mortality data for individuals diagnosed from 1983 to 2011 for major cancer types combined, female breast, colorectal, prostate, lung and pancreatic cancers, and grade IV glioma. The probability of death from the index cancer (cancer death) was evaluated using Fine and Gray competing risks regression, adjusting for age, sex, Indigenous status, socio-economic status, accessibility to services, time sub-period and (for all cancers combined) cancer type.

Results When comparing diagnoses in 2008–2011 to 1983–1987, we observed substantial decreases in the rate of cancer death for major cancer types combined (N = 192,641, −31%), female breast (−37%), prostate (−76%) and colorectal cancers (−37%). In contrast, improvements in pancreatic (−15%) and lung cancers (−9%), and grade IV glioma (−24%) were less and the cumulative probability of cancer death for these cancer types remained high.

Conclusion Considering the justifiable expectation for confounder adjustment in observational epidemiological studies, standard methods for tracking population-level changes in cancer survival are simplistic. This study demonstrates how competing risks and sociodemographic covariates can be incorporated using readily available software. While cancer has been focused on here, this technique has potential utility in survival analysis for other disease states.


Background
Public funding of prevention, screening, treatment programs and medical research for cancer is often justified by citing improved population health. While clinical trial data are important for showing efficacy for a selected population in a trial setting, only population-based data allows the overall and incremental impact of these initiatives on the burden of cancer to be evaluated.
Survival statistics are popular for tracking the progress of cancer initiatives and spending on cancer because they appear easy to interpret (i.e. increased survival is a measure of success) [1]. However, other measures need to be considered. For example, the influence of changing incidence can be factored in by assessing age-standardised incidence-based mortality [2]. While incidence has changed with time, so has the mortality rate from non-cancer causes, these being 'competing risks' to cancer as a cause of death. Sociodemographic factors influencing death, both from cancer and from competing risks, also vary over time, and this change should be accounted for if attempting to isolate changes in the probability of death from cancer due to cancer control strategies.
Relative survival is the most common method used for measuring survival from population-based cancer registries. This metric is the ratio of the observed survival rate of the patients to the expected survival rate in the general population. Relative survival therefore gives an estimate of mortality in the cancer population over and above what is observed for the general population, termed excess mortality. Relative survival can introduce bias if the non-cancer mortality risk differs between the group with cancer and the general population (e.g. if alcohol consumption differs markedly from that of the general population), or as the proportion of people with cancer in the general population increases (e.g. for older persons) [3,4]. These biases may be less important for cancer types with, on average, short follow-up time in registry data (e.g. lung cancer) [5]. While traditional survival analysis allows stratification of measures, for example by age group or sex, stratification or adjustment for sociodemographic covariates is difficult, requiring the availability or construction of specialised life tables (e.g. used in [6]).
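The ratio described above can be sketched in a few lines of Python. This is an illustration only: the function name and the example proportions are hypothetical and are not taken from the study data.

```python
def relative_survival(observed_survival: float, expected_survival: float) -> float:
    """Ratio of observed survival in the cancer cohort to the expected
    survival of a comparable general population (both as proportions)."""
    if not 0.0 <= observed_survival <= 1.0:
        raise ValueError("observed_survival must lie in [0, 1]")
    if not 0.0 < expected_survival <= 1.0:
        raise ValueError("expected_survival must lie in (0, 1]")
    return observed_survival / expected_survival


# Hypothetical 5-year proportions, for illustration only.
rs = relative_survival(observed_survival=0.54, expected_survival=0.90)
print(f"relative survival: {rs:.2f}")
```

If the expected survival used here does not match the true non-cancer survival of the cancer cohort, the ratio is biased; this is the situation the competing risks approach is intended to avoid.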
Fine and Gray multivariable regression models [7] provide a means to consider the probability of death from cancer over time, allowing for incorporation of competing risks and adjustment for sociodemographic variables [8]. The aim of this study was to evaluate the utility of Fine and Gray competing risks regression to analyse the probability of death from cancer in the context of changing cancer incidence in Western Australia (WA) from 1983 to 2011.

Methods
The reporting of this population-based retrospective cohort study is based on the REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement [9].

Data sources and linkage
Person-level routinely linked data for individuals diagnosed with cancer in WA between 1 January 1983 and 31 December 2011 were extracted from the WA Cancer Registry, WA Death Registrations and the WA Hospital Morbidity Data Collection, via the WA Data Linkage System [10]. These data were used in a previous study evaluating changes in prevalence of cancer [11].

Description of participants
Incident cancers were included on the basis of tumour site code, morphology code and behaviour type. WA residents with a diagnosis of any invasive primary cancer (excluding metastases from a previous primary cancer, benign and in situ neoplasms) were included. Individual cancer types were classified using the International Classification of Diseases for Oncology code [12], or morphology / tissue type code (for grade IV glioma) provided in the cancer registry data (Additional File 1). The earliest record (index cancer) was included if the same primary cancer type was recorded more than once for the same individual.
A mix of high- to low-incidence cancers with variable survival profiles was selected: major cancer types combined, and separately: female breast, colorectal, lung, prostate and pancreatic cancers, and grade IV glioma.
The cancer types constituting major cancer types combined in this analysis constitute the bulk of cancer types reported through the registry. Different cancer types recorded for the same individual were included for the major cancer types combined analysis and considered as separate overlapping events with the incident record identified for each cancer type.

Outcomes, exposure and covariates
The available years of diagnosis were divided into six sub-periods (five of five years and a final one of four years): 1983-1987; 1988-1992; 1993-1997; 1998-2002; 2003-2007; 2008-2011. Cause of death was divided into 'primary cancer' (hereafter referred to as cancer death) or 'other' (which included non-cancer causes, or primary cancers in a different site) using a dedicated cancer registry variable. Follow up was from the date of diagnosis to the first of recorded date of death or 18 November 2012, inclusive.
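The sub-period allocation and outcome coding described above could be derived along the following lines. This is a minimal sketch: the function names and the status codes (0 = censored, 1 = cancer death, 2 = death from another cause) are assumptions made for illustration, not the coding used by the registry.

```python
from datetime import date
from typing import Optional, Tuple

# Sub-periods of diagnosis and end of follow-up as stated in the text.
SUB_PERIODS = [(1983, 1987), (1988, 1992), (1993, 1997),
               (1998, 2002), (2003, 2007), (2008, 2011)]
END_OF_FOLLOW_UP = date(2012, 11, 18)


def sub_period(year_of_diagnosis: int) -> str:
    """Return the sub-period label containing the year of diagnosis."""
    for start, end in SUB_PERIODS:
        if start <= year_of_diagnosis <= end:
            return f"{start}-{end}"
    raise ValueError(f"year {year_of_diagnosis} is outside the study period")


def follow_up(diagnosis: date, death: Optional[date],
              died_of_index_cancer: bool) -> Tuple[int, int]:
    """Return (days of follow-up, status).

    Status codes are hypothetical: 0 = censored at end of follow-up,
    1 = death from the index cancer, 2 = death from any other cause.
    """
    if death is None or death > END_OF_FOLLOW_UP:
        return (END_OF_FOLLOW_UP - diagnosis).days, 0
    return (death - diagnosis).days, 1 if died_of_index_cancer else 2


print(sub_period(1995))
print(follow_up(date(2010, 1, 1), date(2010, 1, 11), True))
```

Keeping the competing event as its own status code, rather than folding it into censoring, is what allows it to be handled separately in the regression.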
Age at diagnosis, sex, Indigenous status, census-specific postcode-based Socio-economic Index for Areas (SEIFA) quintiles of relative socioeconomic disadvantage [13], and access to health services using the Accessibility and Remoteness Index of Australia [14] were also extracted. Comorbidity was ascertained from the hospital record most closely aligned with the date of the incident cancer record using the Multipurpose Australian Comorbidity Scoring System (MACSS, incorporating 102 comorbid conditions) [15]. Comorbidity was specified as the number of the conditions specified in MACSS, excluding cancer, as a continuous variable. Hospital data are available in the WA Data Linkage System for all separations from 1970; however, since the data used here were part of a larger study this was limited to the analysis of hospital use occurring on or after 1 January 1998 [11].

Statistical analysis
Patient characteristics at diagnosis were compared using the Kruskal-Wallis test for continuous variables and the Chi-squared test for categorical variables. Sex-specific age-standardised incidence rates for each time period were calculated using WA population estimates from the first sub-period as a reference [16]. Age- and sex-adjusted Poisson regression was used to assess differences relative to the first time period.
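Direct age standardisation of the kind described above weights each age-specific rate by the share of that age group in the reference population. The sketch below assumes direct standardisation (the text does not name the method explicitly) and uses made-up counts; the function name is hypothetical.

```python
from typing import Sequence


def age_standardised_rate(cases: Sequence[float],
                          person_years: Sequence[float],
                          reference_population: Sequence[float],
                          per: float = 100_000) -> float:
    """Directly standardised rate: sum over age groups of the age-specific
    rate multiplied by the reference population weight for that group."""
    if not (len(cases) == len(person_years) == len(reference_population)):
        raise ValueError("all inputs must have one entry per age group")
    total_reference = sum(reference_population)
    rate = 0.0
    for d, py, ref in zip(cases, person_years, reference_population):
        rate += (d / py) * (ref / total_reference)
    return rate * per


# Two hypothetical age groups, for illustration only.
asr = age_standardised_rate(cases=[10, 50],
                            person_years=[100_000, 50_000],
                            reference_population=[80_000, 20_000])
print(f"age-standardised rate per 100,000: {asr:.1f}")
```

Using the same reference weights for every sub-period means that differences between the resulting rates are not driven by population ageing.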
Fine and Gray competing risks regression [7] was performed to estimate the cumulative incidence of cancer death, for major cancer types combined and for specific cancer types separately using death from the index cancer as the primary failure event and death from any other cause, including subsequent cancer diagnoses, as the competing risk. Since hospital data were only available from 1 January 1998, modelling for cancer death with and without comorbidity was undertaken for the following sub-time periods 1998-2002; 2003-2007; 2008-2011. All models were adjusted for age at diagnosis, sex, Indigenous status, socio-economic status, accessibility to services and calendar time period of incident cancer diagnosis. Cancer type was also adjusted for in the model for major cancer types combined.
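For readers unfamiliar with the model, its standard formulation can be written as follows. The notation is ours and is not taken from the study: T is the time to death, ε the cause (ε = 1 for death from the index cancer), x the covariate vector and β the coefficients.

```latex
% Sub-distribution hazard for cause 1 (cancer death): subjects who have
% already died of another cause remain in the risk set.
\lambda_1(t \mid x) = \lim_{\Delta t \to 0}
  \frac{P\bigl(t \le T < t + \Delta t,\ \varepsilon = 1 \,\bigm|\,
        T \ge t \ \text{or}\ (T < t,\ \varepsilon \ne 1),\ x\bigr)}{\Delta t}

% Proportional sub-distribution hazards model; \exp(\beta_j) is the SHR
% for covariate j.
\lambda_1(t \mid x) = \lambda_{10}(t)\,\exp\!\bigl(x^{\top}\beta\bigr)

% Implied cumulative incidence function, with baseline F_{10}(t).
F_1(t \mid x) = P(T \le t,\ \varepsilon = 1 \mid x)
             = 1 - \bigl[1 - F_{10}(t)\bigr]^{\exp(x^{\top}\beta)}
```

The last line is what links a covariate's SHR directly to the probability of cancer death over time.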
Competing risks analysis accounts for the potential imbalance observed in standard Cox regression when subjects who are lost to follow up and those who have died of non-cancer causes (or not the cancer of interest) are considered equivalent from a statistical perspective. Fine and Gray proposed an alternative model that keeps subjects who experience competing events 'at risk' in the model so that they can be counted as not having any chance of failing, rather than treating these subjects as censored [7]. Austin and Fine have published on the practical application of Fine and Gray competing risks regression, including interpretation of model outputs [17]. Where competing risks are present, the use of standard Cox regression may over-estimate the incidence of the outcome of interest [17]. Fine and Gray competing risks regression allows direct estimation of the effect of model covariates on the cumulative incidence function (i.e. probability of the outcome of interest over time), making this more appropriate for prediction [18]. The model was modified to enable robust standard errors to account for correlation within multiple records of the same person using clustering on the unique person identifier. The assumption of proportionality of sub-distribution hazards for the Fine and Gray model was tested by evaluating the log(−log) transformation of the non-parametric cumulative incidence function estimators stratified by exposure variable (i.e. time sub-period).
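The over-estimation that arises when competing deaths are treated as censored can be illustrated without any regression. The sketch below is not the Fine and Gray model: it compares the non-parametric cumulative incidence estimator (Aalen-Johansen) with the complement of a Kaplan-Meier estimate that censors competing events. The function name, status codes and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def cumulative_incidence(times: Sequence[float], status: Sequence[int],
                         cause: int = 1) -> List[Tuple[float, float, float]]:
    """Return (time, CIF, naive estimate) at each distinct event time.

    status: 0 = censored, any other integer = cause of death.
    CIF is the Aalen-Johansen estimate for `cause`; the naive estimate is
    1 - Kaplan-Meier with deaths from other causes treated as censored.
    """
    data = list(zip(times, status))
    event_times = sorted({t for t, s in data if s != 0})
    overall_survival = 1.0   # all-cause survival just before the current time
    km_cause = 1.0           # Kaplan-Meier survival for `cause` only
    cif = 0.0
    results = []
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)
        d_cause = sum(1 for u, s in data if u == t and s == cause)
        d_any = sum(1 for u, s in data if u == t and s != 0)
        cif += overall_survival * d_cause / at_risk
        overall_survival *= 1.0 - d_any / at_risk
        km_cause *= 1.0 - d_cause / at_risk
        results.append((t, cif, 1.0 - km_cause))
    return results


# Toy cohort: 1 = cancer death, 2 = other-cause death, 0 = censored.
for t, cif, naive in cumulative_incidence([1, 2, 3, 4, 5, 6],
                                          [1, 2, 1, 2, 1, 0]):
    print(f"t={t}: CIF={cif:.4f}, 1-KM={naive:.4f}")
```

In this toy cohort the naive estimate is never below the cumulative incidence estimate, because it implicitly assumes that people who died of other causes could still go on to die of cancer.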
The sub-distribution hazard ratios (SHRs) for model covariates indicate the relative rate of cancer death among those still alive or who have experienced a competing event [17]. The models were then used to determine changes in the probability of cancer death (using the cumulative incidence function) at various times after diagnosis (using actual follow up and limited, modelled out-of-sample extrapolation, where necessary) for those diagnosed with: (i) all major cancer types combined; and (ii) each specific cancer, according to calendar time period of diagnosis and holding other covariates at their mean. The cumulative probability of cancer death is directly related to the SHRs and is more intuitive than the SHRs to interpret.
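The link between an SHR and the cumulative probability of cancer death can be made concrete. Under proportional sub-distribution hazards, the cumulative incidence for a comparison group equals 1 − (1 − baseline cumulative incidence) raised to the power of the SHR. The sketch below uses hypothetical function names and a hypothetical baseline probability; the SHR of 0.69 is simply the value that would correspond to a 31% lower rate of cancer death.

```python
def cif_given_shr(baseline_cif: float, shr: float) -> float:
    """Cumulative incidence implied by a baseline cumulative incidence
    and a sub-distribution hazard ratio under the Fine and Gray model."""
    if not 0.0 <= baseline_cif <= 1.0:
        raise ValueError("baseline_cif must lie in [0, 1]")
    if shr <= 0.0:
        raise ValueError("shr must be positive")
    return 1.0 - (1.0 - baseline_cif) ** shr


def percent_change_in_rate(shr: float) -> float:
    """Express an SHR as a percentage change in the rate of cancer death."""
    return (shr - 1.0) * 100.0


baseline = 0.40   # hypothetical probability of cancer death, reference period
shr = 0.69        # hypothetical SHR for a later period versus the reference
print(f"change in rate: {percent_change_in_rate(shr):.0f}%")
print(f"implied probability of cancer death: {cif_given_shr(baseline, shr):.3f}")
```

The relative change in the probability is smaller than the relative change in the rate, which is one reason the cumulative probability is reported alongside the SHRs.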
All p-values are two sided. Statistical analyses were performed using Stata SE (Version 14.1, College Station, Texas).

Results
Over the study period 192,641 individual cancer diagnoses were included; 88% of the total 218,203 for all cancers reported in WA (Table 1, Additional File 2). Age was missing for 5 (0.00%) and Indigenous status missing for 89 (0.05%) of total diagnoses; these cases were dropped from the Fine and Gray models. The cohort were more likely to be male (55%), aged 65-84 years (48%), and live in an area highly accessible to healthcare services (85%). Exceptions to this trend were for female breast cancer where the highest proportion of diagnoses were among women aged 45-64 years and grade IV glioma, where those aged 45-64 and 65-84 years at diagnosis each accounted for 44% of cases. Lung cancer diagnosed from 2008 to 2011 had approximately equal proportions in the higher and lower socioeconomic groups, representing a shift from earlier time periods where the more disadvantaged groups were proportionately higher.
The age-standardised incidence rate of major cancer types combined increased by 9.5% overall over the study period with greater increases observed in males (+14.2%) compared with females (+4.5%) (Table 2). For individual cancers, the largest increases in age-standardised incidence were observed for prostate cancer (+153.5%), lung cancer in females (+34%) and female breast cancer (+31%). Lung cancer showed the largest disparity across sexes, being the cancer having the largest increase in females (+34.3%) and largest reduction in males (−40.7%). Disparate changes in the incidence of colorectal cancer were also observed, with a modest reduction observed in females (−10.8%, p < 0.001) but no change in males (+2.9%, p = 0.534). Similarly, pancreatic cancer showed a large increase in incidence in females (+21%, p = 0.015).
A decreased rate of cancer death for major cancer types combined of 31% (p < 0.001) was observed when comparing the most recent sub-period (2008-2011) to the earliest sub-period (1983-1987) (Table 3). While there were also observed relative decreases for all individual cancer types, this reduction was markedly less for lung (9%, p < 0.001) and pancreatic cancers (15%, p = 0.035), and grade IV glioma (24%, p = 0.024). For individual cancers, the largest decreases were observed for prostate cancer death (76%, p < 0.001) and female breast cancer death (37%, p < 0.001). A 37% decrease in the rate of colorectal cancer death was also observed (p < 0.001).
Because comorbidity could only be adjusted for in the most recent three sub-periods, SHRs for models with and without comorbidity as a covariate have been provided in Additional File 3. In comparing to the baseline sub-period (1998-2002), adding comorbidity moved SHRs below 1 towards the null for the 2008-2011 sub-period for major cancer types combined, lung, prostate and pancreatic cancers, with only a minor increase for colorectal (0.86 to 0.88) and a decrease in the SHR for
Figure 1 shows the adjusted probability of cancer death against time following a primary cancer diagnosis, in the presence of competing risks from other causes. For the majority of cancers the largest change in the probability of cancer death (i.e. slope of the cumulative incidence function curve) between time periods occurred in the first year following diagnosis. For lung and pancreatic cancers and grade IV glioma the between-sub-period changes are less overall, with steeper increases in the cumulative probability of cancer death with time compared to other cancer types. Figure 2 compares the probabilities of cancer death at 1, 5, 10 and 20 years following diagnosis by cancer types across sub-periods. One to four years of out-of-sample extrapolation is included in panels b, c and d to allow 5, 10 and 20 year probabilities to be reported for the respective time periods of 2008-2011, 2003-2007 and 1993-1997. The probability of cancer death was lowest for female breast and prostate cancer, relative to other cancer types, in stark contrast to pancreatic and lung cancers, and grade IV glioma where the cumulative probability of death was uniformly high.
For all cancer types except for lung cancer, there were reductions in the probability of death at or before 1 year post-diagnosis. There was little change beyond 5 years post-diagnosis for any cancer type.
Table 3 Competing risk regression analysis* of Western Australia cancer-specific mortality by time period of diagnosis for major cancer types combined# and six selected cancer types. SHR = sub-distribution hazard ratio of cancer death (of specific cancer types for models with one cancer type). *Fine and Gray competing risks regression model. All p-values are two-sided. #Analysis for all incident cancers was also adjusted for cancer type.

Discussion
Unlike similar studies from different settings [19,20], we have analysed changes in cancer incidence alongside changes in the probability of cancer death. This is in the context of competing risks of death from other causes, and adjusting for sociodemographic confounders.
With reference to relative survival, Talback and Dickman [3] have demonstrated that biases introduced by differential mortality risk profiles among people with cancer can be large for older age groups or common cancers, and when combined can produce substantial error. While the major advantage of a relative survival approach is that cause of death information is not required, where this information is available incorporating competing risks provides a conceptually improved way of analysing temporal changes. While this study did not aim to prove quantitative superiority of competing risks regression, an assessment of the face validity of changes observed is helpful to correlate observed trends with known changes to cancer prevention, detection and management.
The largest observed increases in age-standardised incidence and reduction in the probability of cancer death were seen in prostate cancer, with the greatest change seen between those diagnosed in 1988-1992 and those diagnosed in 1993-1997. There was increased transurethral resection of the prostate for symptomatic benign prostatic hyperplasia post 1994, with the potential for increased incidental diagnoses of early prostate cancer [21]. In addition, public funding of Prostate Specific Antigen (PSA) tests via Medicare became available from November 1993 [22]. Though its net public health value is extensively debated [23,24], PSA testing has been commonly used in the community. Both changes increase the likelihood of earlier diagnosis and an increased incidence, potentially via diagnosis of cancers which would not otherwise be clinically significant [22,25]. Since there were no substantive treatment advances during this period, changes in lead-time may be the predominant factor associated with the observed reduction in the probability of cancer death. The further marked decrease in the probability of prostate cancer death between 1998-2002 and 2003-2007 is consistent with widespread introduction of adjuvant androgen deprivation for high risk early prostate cancer [26]; these changes mirror those seen in similar high-income countries [27].
Population-based screening of women for breast cancer in this jurisdiction began in 1989 [28]. Between 1985 and 1990, data emerged demonstrating the efficacy of tamoxifen as adjuvant therapy for early breast cancer [29-31]. The advent of screening is likely reflected in the > 10% increased incidence of female breast cancer observed in the second and to a lesser extent third sub-periods, and more importantly in the substantial reduction in the probability of death from female breast cancer seen for patients diagnosed in this period. The reduction in the probability of female breast cancer death is most marked from 5 to 20 years post-diagnosis, suggesting that changes in diagnosis or management have prevented the development of metastatic, or fatal, breast cancer rather than merely delaying its onset. For patients diagnosed in subsequent periods, further reductions in cancer death may reflect the use and refinement of adjuvant chemotherapy for early breast cancer (reviewed in [32]).
Fig. 1 Adjusted* cumulative probability of death from index cancer# in Western Australia for major cancer types combined and selected cancers diagnosed 1983 to 2011, by sub-period. *Age, sex, period, Indigenous status, socioeconomic quintile, accessibility to health services and, for the major cancer types combined analysis, cancer type. Covariates are held at the mean of the observations in the respective cancer cohort used in the model. Thus the probability of cancer death is adjusted for these factors across each time period. Note curves show within sample estimations (i.e. no extrapolation of the probability of death beyond the follow-up time is shown in this figure). #Index cancer refers to the first invasive primary cancer of each type.
For colorectal cancer the most marked reduction in the probability of cancer death parallels the introduction in 1990 of adjuvant chemotherapy for stage III colon cancer with 5-fluorouracil-based treatments [33], with subsequent uptake and long-term mortality reductions being delayed for the impact of 'cure' in disease that would have otherwise become metastatic. Treatment of stage II colon cancer with adjuvant chemotherapy became more widespread from 1995 onwards [34,35]. Oxaliplatin-based adjuvant chemotherapy was introduced in 2004 [36] and this appears to be associated with further modest incremental reductions in the probability of death seen in periods after this time point. The growth of informal screening via colonoscopy and later introduction of formal faecal occult blood test screening is also likely to have played a role [37].
Lung cancer patients continued to present with advanced stage disease with poor prognosis [38]. The change in sex distribution observed in our study has previously been described [39-41] and is explained by the later peak prevalence for smoking for females relative to males in Australia [42]. A small reduction in the probability of death from lung cancer was observed after 2002, aligning with the introduction of adjuvant chemotherapy for resected non-small cell lung cancer [43], but this may also be influenced by the increasing use of doublet platinum-based systemic therapy at that time [44]. The small reduction in the probability of lung cancer death, seen both at 1 year after diagnosis and at later time points (5, 10 and 20 years after diagnosis), suggests that both improved adjuvant and metastatic treatment are contributing.
Our finding of increased incidence and improved, though still high, probability of pancreatic cancer death over time may be explained by concurrent increases in computed tomography use, which has increased incidental discovery of less-advanced disease. Even so, the high probability of cancer-related death reflects late-stage presentation. The reduced probability of cancer death at 1 year following diagnosis may be partly explained by chemotherapy, which showed a small survival benefit in the late 1990s [45] and/or adjuvant chemotherapy following resection [46]. Similarly, in grade IV glioma the cumulative incidence of cancer-related death did not decrease until 2003-2007; even then improvement was modest. This coincides with the introduction of combined chemoradiotherapy and adjuvant temozolomide (publicly funded in Australia in 2005 [47]), which showed improved survival [48].
WA provides an ideal location to study changes in the burden of cancer. Only 3% of individuals have any health records in another Australian state or territory [49] and there is good population data capture [10]. This means that although these data were not collected with this study aim in mind, they were adequate to support it. Newly implemented screening programs, new medical technology and new drugs are federally funded and made available concurrently for the whole population of Australia (even if roll-out is not uniform [50]), enhancing generalisability.
This study had some limitations. The allocation of sub-periods was uniform, rather than cancer-specific. While clinical practice changes have not occurred simultaneously for all cancer types, allocation of uniform sub-periods facilitated inter-cancer comparisons via the cumulative incidence function. No staging information was available in the data analysed. Cause of death coding is unlikely to be 100% accurate, especially for older persons [51]. For example, 593 of the 3,687 people dying in 2012 (0.3% of the study cohort) were censored as they had not yet been assigned a cause of death flag by the WA Cancer Registry at the time of data extraction. Coding is subject to quality assurance [52] and was cross-referenced with the ABS monthly until 2005 [53]; thus our broad interpretation at a population level is unlikely to be affected. Among administrative data sources, it remains the best available for achieving the study aim. Finally, factors such as over-diagnosis may account for some of the documented changes [54]. Detailed assessment of this is beyond the scope of this study, though it should be considered when interpreting the study findings.

Conclusions
Considering the justifiable expectation for confounder adjustment in observational epidemiological studies, standard methods for tracking population-level changes in cancer survival and death are simplistic. This study demonstrates how competing risks and sociodemographic covariates can be incorporated using readily available software. These estimates are conceptually more likely to reflect changes to cancer prevention, detection and management, compared with other survival measures where accounting for these factors is more difficult. While cancer has been focused on here, this technique has potential utility in survival analysis for other disease states with outcomes of interest subject to competing risks.
Additional file 1. Cancer types included for major cancer types combined cohort.
Additional file 2. Full characteristics of the cohort.
Additional file 3. Competing risk regression analysis of Western Australia cancer-specific mortality by time period of diagnosis for major cancer types combined and six selected cancer types with (A) and without (B) adjustment for comorbidities at diagnosis.
Abbreviations SHR: Sub-distribution hazard ratio; WA: Western Australia