Modelling excess mortality among breast cancer patients in the North East Region of Peninsular Malaysia, 2007–2011: a population-based study

Background Measuring the burden of breast cancer and identifying the factors that influence it help in developing public health policy and strategy against the disease. This study aimed to examine the variability of the excess mortality of female breast cancer patients in the North East Region of Peninsular Malaysia. Methods This retrospective cohort study used data on breast cancer cases diagnosed between 2007 and 2011 from the Kelantan Cancer Registry, together with Kelantan general population mortality data. The cases were followed up until 2016, for a minimum of 5 years. Out of 598 cases, 549 met the study criteria and were included in the analysis. Excess mortality was modelled using Poisson regression. Results Excess mortality of breast cancer varied according to age group (50 years old and below vs above 50 years old, Adj. EHR: 1.47; 95% CI: 1.11, 1.96; P = 0.008), ethnicity (Malay vs non-Malay, Adj. EHR: 2.31; 95% CI: 1.31, 4.09; P = 0.004), and stage (stage III and IV vs. stage I and II, Adj. EHR: 5.75; 95% CI: 4.24, 7.81; P < 0.001). Conclusions Public health policy and strategy aimed at improving cancer survival should focus more on patients diagnosed at age 50 years and below, patients of Malay ethnicity, and patients presenting at a later stage.


Background
Breast cancer is the leading cause of cancer-related mortality and morbidity among women globally [1]. The Asia Pacific region accounted for about 24% of breast cancer cases worldwide in 2012 [2]. A study in 2014 reported that breast cancer accounted for 25% of cancer-related deaths in Malaysia, which was higher than in neighbouring countries such as Indonesia (22%), Singapore (20%) and Brunei (17%). In terms of the mortality-to-incidence ratio, Malaysia stood at 0.49, whereas neighbouring countries such as Indonesia (0.41), Singapore (0.24), and Brunei (0.23) had lower ratios [2]. Additionally, according to the Malaysian National Cancer Registry report, breast cancer was the most common cancer among Malaysians between 2007 and 2011, and 99% of cases were female [3].
Survival statistics are the most commonly used measures of the prognosis and burden of cancer [4]. There are two approaches to survival analysis: the relative survival approach and the cause-specific survival approach. The relative survival approach is considered standard practice in population-based cancer studies [5]. The main challenges in conducting a population-based study are the availability and the quality of the data. The cause of death recorded in most cancer registries is unreliable and is sometimes not available at all. A cause-specific survival approach is therefore not appropriate in this setting, whereas the relative survival approach is justified because it does not require information on the exact cause of death.
Over the last 10 years, most research in Malaysia on the burden of breast cancer and its prognostic factors has used local hospital registry data and the cause-specific survival approach [6][7][8][9][10][11][12][13], and only a few of these were population-based studies [11][12][13]. Given the scarcity of breast cancer studies among Malaysian residents at the population level, this study was conducted to identify the prognostic factors of excess mortality among female breast cancer patients in one of the regions of Malaysia using data from a population-based cancer registry.

Study site and population
The North-East Region of Peninsular Malaysia consists of three states: Kelantan, Terengganu, and Pahang. This study was conducted in Kelantan, where the majority of residents were Malay (94.6%), followed by Chinese (3.3%), Indian (0.3%), and others (1.8%) [3]. Two sources of data were used for the relative survival approach: 1) the expected population, derived from the general population mortality data, and 2) the observed population, derived from the breast cancer data.
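As a minimal illustration of how the two data sources combine under the relative survival approach, the sketch below computes a relative survival ratio and an excess hazard. The function names and all numeric values are hypothetical and are not taken from the Kelantan data.

```python
def relative_survival(observed_survival, expected_survival):
    """Ratio of the patients' observed survival to the survival expected
    in a comparable group from the general population."""
    if not 0 < expected_survival <= 1:
        raise ValueError("expected survival must be in (0, 1]")
    return observed_survival / expected_survival


def excess_hazard(observed_hazard, expected_hazard):
    """Excess hazard: the observed hazard among patients minus the
    background hazard expected from the general population."""
    return observed_hazard - expected_hazard


# Hypothetical values, for illustration only.
rs = relative_survival(observed_survival=0.60, expected_survival=0.95)
eh = excess_hazard(observed_hazard=0.12, expected_hazard=0.02)
print(f"relative survival = {rs:.3f}, excess hazard = {eh:.3f} per person-year")
```

Neither quantity needs the cause of death: the background mortality comes entirely from the population life table.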

General population mortality data
General population mortality data for Kelantan was obtained from the Department of Statistics, Malaysia (DOSM). To be able to conduct the relative survival analysis, the general population mortality data must be in the form of a complete life table. Because only abridged life tables were available for Kelantan, they were expanded into complete life tables.

Study design and patient selection
This was a retrospective cohort study of female breast cancer using data from the Kelantan Cancer Registry. Breast cancer cases were identified by International Classification of Diseases for Oncology (ICD-O) codes in the C50 series. To be included, a case had to be diagnosed between 1st January 2007 and 31st December 2011 and the patient had to be a Kelantan resident. Male patients and patients with incomplete data for any variable were excluded. All breast cancer cases had follow-up records until 31st December 2016. Out of 598 cases, 46 cases with missing information on cancer staging and three cases of male breast cancer were excluded. Thus, 549 breast cancer cases met the study criteria and were included in the analysis.

Statistical analysis and software
This study used MORTPAK for Windows version 4.3 [14] to expand the abridged life tables. R version 3.6.0 [15] was used for data cleaning and manipulation, descriptive statistics, and univariable and multivariable Poisson regression.

Expanding abridged life table
The abridged life tables of Kelantan population mortality were expanded into complete life tables using the UNABR application in the MORTPAK software [14]. The UNABR application uses the Heligman-Pollard model for this expansion. The variant of the model used in UNABR is:
$${}_{1}q_{x} = A^{(x+B)^{C}} + D\,e^{-E(\ln x - \ln F)^{2}} + \frac{G H^{x}}{1 + G H^{x}}$$
where ${}_{1}q_{x}$ denotes the probability of dying within the one-year interval starting at age x, and A, B, C, D, E, F, G, and H are the parameters to be estimated. Several studies have found that the Heligman-Pollard model fits the Malaysian population considerably well [16][17][18]. The two inputs required by the UNABR application were age and the probability of dying between age x and age x + n, both taken from the abridged life tables of Kelantan population mortality. The complete life table produced by the application contained the following variables: age in yearly intervals, central mortality rate between age x and age x + 1, probability of dying between age x and age x + 1, survivors at exact age x, and life expectancy.
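The following sketch shows how single-year probabilities of dying and the survivors column of a complete life table could be generated from an eight-parameter Heligman-Pollard curve. It assumes the variant whose senescent term is G·H^x / (1 + G·H^x); the parameter values are hypothetical, chosen only to give a plausible-looking curve, and are not the values fitted by UNABR for Kelantan.

```python
import math


def heligman_pollard_qx(x, A, B, C, D, E, F, G, H):
    """Probability of dying between exact ages x and x + 1.

    Assumed form (eight-parameter Heligman-Pollard variant):
        q_x = A**((x + B)**C)                  childhood mortality
            + D * exp(-E * (ln x - ln F)**2)   accident hump
            + G * H**x / (1 + G * H**x)        senescent mortality
    """
    childhood = A ** ((x + B) ** C)
    # The accident-hump term involves ln(x), so it is taken as 0 at age 0.
    hump = D * math.exp(-E * (math.log(x) - math.log(F)) ** 2) if x > 0 else 0.0
    gh = G * H ** x
    senescent = gh / (1.0 + gh)
    return min(childhood + hump + senescent, 1.0)


# Hypothetical parameter values (assumption, not fitted to any real data).
params = dict(A=0.0005, B=0.01, C=0.10, D=0.0008, E=8.0, F=22.0, G=0.00005, H=1.10)

survivors = 100_000.0  # radix l_0 of the life table
for age in range(0, 101):
    qx = heligman_pollard_qx(age, **params)
    if age % 20 == 0:
        print(f"age {age:3d}: qx = {qx:.5f}, lx = {survivors:9.0f}")
    survivors *= 1.0 - qx  # l_{x+1} = l_x * (1 - q_x)
```

In practice the eight parameters are estimated so that the curve reproduces the abridged n-year probabilities; that fitting step is what UNABR performs and is not reproduced here.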

Descriptive statistics
The numerical variables were checked for normality visually using histograms and quantile-quantile (Q-Q) plots. A variable was considered normally distributed if its histogram approximated a bell-shaped curve and its points followed the 45° line in the Q-Q plot. Numerical variables were presented as mean and standard deviation (SD) if normally distributed, and as median and interquartile range (IQR) otherwise. Categorical variables were presented as frequency and percentage (%). The survival time was presented as range, minimum value, maximum value, median, and IQR.

Poisson regression
This analysis was conducted using the relsurv package [19] in R. The excess hazard was modelled using Poisson regression as proposed by Dickman et al. [5]:
$$\ln\left(\mu_{jk} - d^{*}_{jk}\right) = \ln\left(y_{jk}\right) + x\beta$$
where $\mu_{jk}$ = number of deaths (the Poisson mean) for observation j in interval k, $y_{jk}$ = person-time at risk for observation j in interval k, $d^{*}_{jk}$ = number of deaths in the expected population comparable to observation j in interval k, and $x\beta$ = a vector of covariates x with coefficients β, assumed to act multiplicatively on the excess hazard.
In univariable Poisson regression, the survival times were split into several time intervals. The time intervals were set according to the recommendation of the United Kingdom and Ireland Association of Cancer Registries (UKIACR): monthly up to 6 months, 3-monthly up to 2 years, 6-monthly from 2 to 5 years, and yearly up to 10 years [20]. For the variables radiotherapy and chemotherapy, however, different time intervals were used because the univariable models for both variables did not converge with the recommended intervals. The intervals used for these two variables were monthly up to 6 months, 3-monthly up to 2 years, 6-monthly from 2 to 5 years, yearly up to 7 years, and 3-yearly up to 10 years. All variables with a p-value below 0.25 were included in the multivariable Poisson regression.
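To make the interval splitting concrete, the sketch below builds the UKIACR interval boundaries (in months) and splits one patient's follow-up into per-interval person-time and a death indicator, which is the data layout a piecewise Poisson model needs. The function names and the example patient are hypothetical; this is not the code used in the study, which relied on the relsurv package in R.

```python
def ukiacr_breakpoints_months():
    """Interval boundaries in months: monthly to 6 months, 3-monthly to
    2 years, 6-monthly to 5 years, then yearly to 10 years."""
    cuts = list(range(0, 6, 1))        # 0, 1, ..., 5
    cuts += list(range(6, 24, 3))      # 6, 9, ..., 21
    cuts += list(range(24, 60, 6))     # 24, 30, ..., 54
    cuts += list(range(60, 121, 12))   # 60, 72, ..., 120
    return cuts


def split_follow_up(survival_months, died, cuts):
    """Split one patient's follow-up into (start, end, person_months, death)
    rows, one row per interval in which the patient was at risk."""
    rows = []
    for start, end in zip(cuts[:-1], cuts[1:]):
        if survival_months <= start:
            break
        time_at_risk = min(survival_months, end) - start
        # The death is counted in the interval that contains the event time.
        event = int(died and survival_months <= end)
        rows.append((start, end, time_at_risk, event))
    return rows


cuts = ukiacr_breakpoints_months()
print(len(cuts) - 1, "intervals")

# Hypothetical patient who died 7.5 months after diagnosis.
for row in split_follow_up(7.5, died=True, cuts=cuts):
    print(row)
```

Summing person-time and deaths over patients within each interval and covariate pattern gives the y and observed-death counts that enter the model; a death occurring after the last boundary is treated as censored at 10 years.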
The multivariable Poisson regression was first fitted using the time intervals recommended by the UKIACR. The variable with the highest p-value above 0.05 was removed, one variable at a time. Once the variables for the multivariable Poisson regression were confirmed, the number of time intervals was reduced to achieve a more parsimonious model. Models were compared using the deviance (−2 log-likelihood), the Akaike Information Criterion (AIC), and the most significant p-value for each variable.
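The selection logic described above can be sketched as follows, assuming the usual definition AIC = −2 log-likelihood + 2k for a model with k estimated parameters. The deviances, parameter counts, and p-values are hypothetical placeholders, not results from this study.

```python
def aic(minus_two_log_lik, n_parameters):
    """AIC = -2 log-likelihood + 2 * (number of estimated parameters)."""
    return minus_two_log_lik + 2 * n_parameters


def backward_step(p_values, threshold=0.05):
    """Return the variable to drop next: the one with the largest p-value
    above the threshold, or None if every variable is at or below it."""
    name, p = max(p_values.items(), key=lambda item: item[1])
    return name if p > threshold else None


# Hypothetical fits: (-2 log-likelihood, number of parameters).
candidates = {
    "23 intervals": (812.4, 28),
    "5 intervals": (820.1, 10),
}
for label, (dev, k) in candidates.items():
    print(label, "AIC =", round(aic(dev, k), 1))

# Hypothetical p-values from one multivariable fit.
print("drop next:", backward_step({"age": 0.008, "radiotherapy": 0.41, "chemotherapy": 0.19}))
```

With these made-up numbers the coarser model has a slightly larger deviance but a lower AIC, which is the kind of trade-off that favours reducing the number of time intervals.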
Finally, the final model was tested for all possible two-way interactions between the variables, for non-proportional excess hazards, and for overdispersion. A p-value below 0.05 indicated a significant two-way interaction term. The test for non-proportional excess hazards is available in the relsurv package; a p-value below 0.05 for any variable indicated that its excess hazard was not proportional over time since diagnosis. For such a variable, an interaction term between the variable and the time interval was included in the model to adjust for the non-proportionality. Overdispersion was tested by comparing the deviance statistic with a chi-squared distribution on the corresponding degrees of freedom; a p-value below 0.05 indicated significant overdispersion in the model.
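A rough check of the overdispersion test can be written with the standard library alone. The sketch below uses the Wilson-Hilferty normal approximation to the chi-squared upper tail (an assumption made here for self-containment; a statistics package would use the exact distribution) and applies it to the deviance and degrees of freedom reported in the Results, 201.94 on 250.

```python
import math


def chi2_upper_tail(stat, df):
    """Approximate P(X >= stat) for X ~ chi-squared(df) using the
    Wilson-Hilferty normal approximation (reasonable for large df)."""
    cube_root = (stat / df) ** (1.0 / 3.0)
    mean = 1.0 - 2.0 / (9.0 * df)
    sd = math.sqrt(2.0 / (9.0 * df))
    z = (cube_root - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def overdispersion_test(deviance, df, alpha=0.05):
    """Return (deviance / df, approximate p-value, overdispersed?)."""
    p = chi2_upper_tail(deviance, df)
    return deviance / df, p, p < alpha


# Deviance and degrees of freedom as reported in the Results section.
ratio, p, flagged = overdispersion_test(deviance=201.94, df=250)
print(f"deviance/df = {ratio:.2f}, p = {p:.3f}, overdispersed = {flagged}")
```

A deviance-to-df ratio below 1 and a p-value close to 1 are what one would expect when there is no evidence of overdispersion.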

Results
After exclusion of 49 cases, the remaining 549 cases were included in the analysis. The descriptive statistics are presented in Table 1. For univariable Poisson regression, age was dichotomised at 50 years, cancer morphology was categorised into two subgroups (infiltrating ductal carcinoma and other types of morphology), and ethnicity was categorised into Malay and non-Malay. The results of the univariable Poisson regression are presented in Table 2.
The final multivariable Poisson regression model is presented in Table 3. In this model, cancer staging was further categorised into early stage (stage I and II) and late stage (stage III and IV) to help the model converge. The model also included interaction terms between surgery and time interval and between morphology and time interval, since the excess hazard was not proportional for either variable. There were no significant two-way interactions between the variables, and there was no overdispersion in the model before adjustment for the significant non-proportional excess hazards (Chi-square (df) = 201.94 (250), P-value = 0.989).
Five prognostic factors were found to be significant in this study: age at diagnosis, ethnicity, stage, morphology, and surgery. Breast cancer patients diagnosed at age 50 years and younger had a 47% higher excess hazard of death than those diagnosed at an older age. Malay breast cancer patients had an excess hazard of death 2.31 times that of non-Malay patients. Additionally, late-stage breast cancer patients had an excess hazard of death 5.75 times that of early-stage patients.
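The link between a reported excess hazard ratio, its confidence interval, and its p-value can be checked with a short calculation. The sketch below assumes the 95% interval is symmetric on the log scale (a Wald interval) and uses the stage estimate reported in the abstract (Adj. EHR 5.75; 95% CI 4.24, 7.81); the function name is hypothetical.

```python
import math


def wald_from_ratio(ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-scale estimate, standard error, z statistic and
    two-sided p-value from a ratio and its 95% confidence interval,
    assuming the interval is symmetric on the log scale."""
    beta = math.log(ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p


# Late vs early stage, as reported in the abstract.
beta, se, z, p = wald_from_ratio(5.75, 4.24, 7.81)
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p < 0.001: {p < 0.001}")

# An excess hazard ratio of 1.47 corresponds to a 47% higher excess hazard.
print(f"percent increase for EHR 1.47: {(1.47 - 1) * 100:.0f}%")
```

The same function can be applied to any ratio and interval in Table 3 to confirm that an estimate, its interval, and its p-value are mutually consistent.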
The excess hazard for breast cancer morphology was not proportional over time since diagnosis. In Table 3, for example, patients with infiltrating ductal carcinoma, not otherwise specified (NOS), in the second interval had 3.3 times the excess hazard of patients with infiltrating ductal carcinoma, NOS, in the first interval, whereas those in the fourth interval had only an 11% higher excess hazard than those in the first interval. The non-proportional effect of morphology over survival time is broken down further in Table 4. The excess hazard for patients with infiltrating ductal carcinoma, NOS, was lower than for those with other types of morphology for most of the survival time, but it was higher between one and three years following diagnosis. The same pattern was observed for surgery. In Table 3, patients who received surgery had 8.44 times the excess hazard in the fourth interval compared with the first interval, whereas the corresponding ratio for the fifth interval compared with the first interval was only 0.64. The non-proportional effect of surgery is broken down further in Table 5. Generally, breast cancer patients who received surgery had a lower excess hazard of death than those who did not for most of the survival time. Between three and six years following diagnosis, however, patients who received surgery had a higher excess hazard of death than those who did not.

Discussion
This study found that younger breast cancer patients had a higher excess hazard than older patients. In contrast, one study in Malaysia observed a higher hazard of death in older breast cancer patients [12], while another did not find age at diagnosis to be a significant prognostic factor [11]. Both were population-based studies but used a cause-specific approach in the analysis, which may explain the difference in findings. Two hospital-based studies in Malaysia, which also used a cause-specific approach, likewise reported that age at diagnosis was not a significant prognostic factor [10,21]. Our finding, however, is consistent with studies that concluded that breast cancer patients diagnosed at a younger age present with more advanced and severe tumours and thus have a higher risk of mortality [22][23][24][25]. Several studies outside Malaysia also found age to be a significant prognostic factor of breast cancer. A study in Singapore using medical records from the National Cancer Centre Singapore found that, among patients treated with breast-conserving therapy (BCT), those aged 40 years and below had twice the risk of mortality of older patients [26]. Two other studies in the US found that breast cancer patients aged 40 years and below had a higher hazard of death than older patients [27,28].
Ethnicity was a significant prognostic factor of breast cancer in the current study, in agreement with several other studies [10,11,21,29,30]. Malay breast cancer patients have been observed to present with larger and more aggressive tumours than other ethnic groups [31,32]. A similar finding was reported in neighbouring Singapore, where Malay ethnicity is a poor prognostic factor of breast cancer [33]. Additionally, a study covering both Malaysia and Singapore, which used the Singapore-Malaysia Breast Cancer Registry, found that Malay breast cancer patients had the poorest survival of all ethnic groups [32]. A study involving six public hospitals across Malaysia reported that Malay ethnicity was significantly associated with the use of complementary and alternative medicine (CAM), which has been observed to significantly delay the presentation and diagnosis of breast cancer [34]. These findings may explain the excess hazard of death observed among Malay breast cancer patients compared with other ethnic groups.
Several studies have reported that cancer staging is a significant prognostic factor of breast cancer [10,11,21]. In our study, cancer staging was combined into early stage and late stage because of a convergence issue. Presentation at an advanced stage is a significant contributing factor to breast cancer mortality, especially in low- and middle-income countries [35]. A similar trend in prognosis by stage has been observed in Singapore. According to the Singapore Cancer Registry, between 2011 and 2015 the five-year age-standardised relative survival of breast cancer patients was lowest for stage 4 at 23%, followed by stage 3 at 72%, stage 2 at 89%, and stage 1 at 100% [36]. Additionally, late-stage presentation of breast cancer in Malaysia may be explained by factors such as social and cultural beliefs, the use of CAM, lack of awareness, and inaccessibility of health care services [37].
Breast cancer morphology was also a significant prognostic factor in this study, although its effect was not proportional over time since diagnosis. Other studies have reported that different types of breast cancer, such as infiltrating lobular carcinoma (ILC), metaplastic carcinoma of the breast, and medullary breast carcinoma, have different survival rates [38,39]. The non-proportionality of the excess hazard of breast cancer morphology in our study could be explained by factors such as the occurrence of metastases and lymph node involvement. For example, a study in the United Kingdom reported that infiltrating ductal carcinoma (IDC) and ILC each had a distinct pattern of lymph node involvement and that IDC had a lower tendency to metastasise [40]. Unfortunately, this additional information was not available in this study.
Our study found that surgery was a significant prognostic factor of breast cancer, although its excess hazard was not proportional over survival time. This finding is consistent with another population-based study in Kelantan, despite the difference in the survival analysis approach [11]. Another population-based study, conducted in the East of England, similarly reported that surgery is a significant prognostic factor of breast cancer, but without non-proportionality over survival time [41]. Breast cancer patients who received surgery in the early period following diagnosis were most probably those with more advanced tumours, while those who received surgery in the later period were most probably patients diagnosed with less advanced tumours. Thus, differences in patient characteristics between time intervals may explain the varying effect of surgery on breast cancer patients.
Radiotherapy and chemotherapy were not significant prognostic factors in this study. Several studies have reported similar findings [8,11,42]. In contrast, a population-based study in the East of England found that radiotherapy and chemotherapy were significant prognostic factors of breast cancer [41]. Additionally, a multicentre study conducted in Malaysia, Singapore, and Hong Kong concluded that adjuvant radiotherapy was associated with survival among breast cancer patients younger than 40 years old, but not among older patients [43]. Radiotherapy and chemotherapy appear to have a more complex association with survival, involving other types of treatment and other factors, and this complexity could not be observed in our population-based data. Admittedly, the majority of breast cancer patients in this study did not receive these two treatments. A more focused study in Malaysia should therefore be conducted to evaluate the association between combinations of different types of treatment and breast cancer mortality, so that the benefit of each treatment, alone and in combination with others, can be clearly observed.
There are a few limitations to our study. Since this study used secondary data from a cancer registry, the information available was limited to what the registry records. Important information such as tumour size, degree of metastases, and lymph node involvement was not available. In addition, Poisson regression under the relative survival approach cannot handle zero deaths in an interval subgroup, which makes model convergence difficult. For example, the levels of cancer staging in our study needed to be combined to obtain a converged model. Lastly, a complete life table of general population mortality was not available for the Kelantan population and was therefore expanded from an abridged life table. Other researchers may use a different method of expansion, leading to a lack of standardisation in relative survival analysis across studies.

Conclusions
The relative survival approach is considered standard practice in population-based studies, especially in cancer research. It provides a better alternative when the cause of death is unreliable or unavailable. A population-based study gives a perspective that is beneficial for public health planning and policymaking. This population-based study found three poor prognostic factors significantly associated with breast cancer mortality: age 50 years and below, Malay ethnicity, and late stage.