Demographic, health, and prognostic characteristics of Australians with liver cancer: a cohort study of linked data in New South Wales for informing cancer control

Background Australian age-standardized incidence and death rates for liver cancer are lower than world averages, but are increasing, as in other economically advanced western countries. The World Health Organization emphasizes the need to address sociodemographic disparities in cancer risk. Liver cancer risk in New South Wales (NSW) was profiled by sociodemographic characteristics and diagnostic stage in more detail than is possible with NSW Cancer Registry (NSWCR) data alone, by incorporating linked data from the Australian Bureau of Statistics (ABS). The purpose was to inform targeting and monitoring of cancer services.
Methods The ABS manages the Multi-Agency Data Integration Project (MADIP), which includes a wide range of health, educational, welfare, census, and employment data. These data were linked at person level to NSWCR liver cancer registrations for the period from the 2016 census to December 2018. De-identified data were analyzed. Sex-specific age-adjusted odds ratios (95% CIs) of liver cancer were derived using logistic regression by age, country of birth, residential remoteness, proficiency in spoken English, household income, employment status, occupation type, educational attainment, sole person household, joblessness, socioeconomic status, disability status, multimorbidity, and other health-related factors, including GP consultations. These data complement the less detailed sociodemographic data available from the NSWCR, with alignment of numerators and population denominators for accurate risk assessment.
Results The results indicate that liver cancer disproportionately affects population members already experiencing excess social and health disadvantage. Examples where odds ratios of liver cancer and their 95% confidence intervals were elevated included poor English-speaking proficiency, limited education, housing authority tenancy, living in a sole-person household, having a disability, having multiple medicated conditions, and being a carer of a person with a disability. Also, odds of liver cancer were higher in more remote regions outside major cities, and in males, with higher odds of more advanced cancer stages (degrees of spread) at diagnosis in more remote regions.
Conclusions Linked data enabled more detailed risk profiling than previously possible. This will support the targeting of cancer services and benchmarking.
Supplementary Information The online version contains supplementary material available at 10.1186/s12889-023-16809-y.


Background
Incidence and mortality from liver and intra-hepatic bile duct cancer (ICD-10 C22) are lower in Australia than the world average by 34% and 52% respectively, and much lower than in Africa and Asia [1]. Nonetheless, as in other economically advanced western countries, increases have occurred in recent decades [1][2][3]. Between 1982-1989, when national incidence data were first available, and 2015-2019, the age-standardized Australia-wide incidence increased by about 313%, with a corresponding mortality increase of 169% [3]. In 2021, an estimated 2,832 Australians (72% males) were diagnosed with liver cancer and 2,424 (66% males) died from this cancer [4]. The World Health Organization emphasizes the need to address sociodemographic disparities in cancer incidence, as reported in Australia for liver cancer [3][4][5]. This is reflected in national and international declarations and strategies [6][7][8], and supported by the Public Health Association of Australia, the Australian and NSW Governments, and the NSW Cancer Plan [9][10][11].
Viral hepatitis B and C infections are major causes of liver cancer and contribute to upward incidence trends [5]. Research in Australia indicates that almost half of these cancers are associated with hepatitis B or C infection [5]. Other risk factors include type 2 diabetes mellitus, overweight and obesity, high alcohol consumption, tobacco smoking, illicit drug use, unprotected sex, and medical conditions such as metabolic dysfunction-associated fatty liver disease and hereditary haemochromatosis [5,6]. In some low-income countries, humidity and suboptimal storage of foodstuffs may also contribute to risk through increased aflatoxin contamination of food products [6].
As with most cancers, the incidence of liver cancer in Australia increases with age and survival decreases [3]. Males are more frequently affected, with a male-to-female ratio of around 3.3 to one [3]. The age-sex standardized rate varies by socioeconomic status, being about 52% higher in the most disadvantaged than in the least disadvantaged socioeconomic quintile of residential areas [7]. Compared with the non-Indigenous population, the age-standardized incidence rate has been reported to be 132% higher for Aboriginal and Torres Strait Islander people [7]. Residents of major cities and remote/very remote country areas are reported to be at a 20-25% higher risk than those living in regional country areas [7].
The study aim is to examine sociodemographic disparities in risk of liver cancer, and of more advanced stage at diagnosis, in greater detail to inform the planning and benchmarking of NSW cancer services. This has become possible through linking NSW Cancer Registry (NSWCR) data with Australian Bureau of Statistics (ABS) data extracts for the post-2016 census diagnostic period.

Study design
A retrospective cohort design covered all NSW residents included in the 2016 census when aged 18+ years. The study period was from September 2016 to December 2018. The study was designed to gain more detailed evidence of sociodemographic disparities in risk of liver cancer, and of more advanced diagnostic stage, to inform the planning and benchmarking of NSW cancer services. In particular, the purpose was to indicate groups at elevated risk of liver cancer who may require additional attention in service planning and delivery. This was not a causal study.

Data sources
NSWCR liver cancer registrations were linked with sociodemographic and health data obtained through the Australian Bureau of Statistics (ABS) Multi-Agency Data Integration Project (MADIP) [12] for the study period. NSWCR provided dates of liver cancer diagnoses and stages (degree of spread) at diagnosis. Sociodemographic data were obtained through MADIP from the 2016 Australian census and other administrative sources, including data on health and educational status, ethnicity, household income, and employment [12]. Universal health insurance claims data from the Medicare Benefits Schedule (MBS) and Pharmaceutical Benefits Scheme (PBS) were also obtained [12].

Data management
MADIP included a unique Person Linkage Spine (PLS) for any person recorded in the Australian Medicare Consumer Directory, Centrelink or Taxation datasets between 2006 and 2016, which enabled the ABS, as the accredited Integrating Authority, to link multiple datasets. The present study included records for all adults aged 18 years or more in NSW at the time of the Census (August 2016) and recorded on the PLS. Exclusions comprised those without a PLS, and those with a first invasive cancer diagnosis other than cancer of the liver and intrahepatic bile ducts (ICD-10-AM C22) occurring between September 2016 and December 2018 [13].

Data variables
The NSWCR provided data on degree of cancer spread (local, regional, and distant/unknown). Census records provided socio-demographic data on age, sex, geographic residential remoteness using the Accessibility and Remoteness Index of Australia (ARIA), Aboriginal self-identification, country of birth, and household composition [12]. Census data further indicated socio-economic position as indexed by the ABS Socio-economic Indexes for Areas (SEIFA) and, specifically, the Index of Relative Socio-Economic Disadvantage (IRSD) [14,15].
PBS records available through MADIP were used for each person for the 12 months before liver cancer diagnosis, or the 12 months before the census enumeration for those without a cancer diagnosis. PBS extracts included the Anatomical Therapeutic Classification (ATC) of prescribed medications, enabling categorization of medicated conditions using the Rx Risk comorbidity index [16]. To assess the potential for selection bias within the entire enumerated population, we examined census variables for systematic variations according to whether a person was recorded on the PLS.
The linked NSW data with identifiers removed were stored in a high-security ABS repository (DATALAB) for analysis by remote access [12]. The primary outcome variable was first diagnosis of liver cancer following the census, from September 2016 to December 2018 (= 1), as opposed to no first cancer diagnosis (= 0). The secondary outcome variable was the degree of spread among those diagnosed with liver cancer, classified as distant/unknown disease (= 1) compared with localized/regional extent of disease (= 0). These binary classifications were used to gain sufficient numbers to avoid prohibitively small cell counts, and to increase interpretability of results. Unknown degree of spread was combined with distant spread because its disease-specific survival was previously shown to be lower than for localized/regional spread (i.e., its survival was more akin to survival for distant spread) [17].
Socio-demographic variables included age at census (arranged in categories for tabling results and as a continuous measure in multivariable models), sex, geographic remoteness (major city, inner regional, outer regional/remote), ancestry (based on country of birth grouped as Australia; China; Greece; Italy; Lebanon; New Zealand; the Philippines; the United Kingdom; Vietnam; "other mainly English speaking" and "other mainly non-English speaking" countries), and lone occupant household. Socio-economic disadvantage covariates included area-level IRSD quintiles based on Statistical Areas (SA2).
We included each of the discrete variables underpinning the IRSD as dichotomized variables following ABS methods [17]. Those variables included poor English language proficiency, low household income, core function-limiting disability, employment status and occupation (drivers and laborers), educational attainment, a household with children, resident household numbers, and rental through a housing authority. Additional data on housing, such as overcrowding, household internet connection, and car ownership, were not available through MADIP.
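The two binary outcome codings described above can be sketched as follows. This is a minimal illustration in Python, not the authors' Stata code; the function names, the string labels for degree of spread, and the use of an ICD-10 code prefix as input are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical labels for registry degree-of-spread categories (assumed for illustration).
ADVANCED = {"distant", "unknown"}
EARLY = {"localized", "regional"}


def primary_outcome(first_cancer_site: Optional[str]) -> int:
    """Return 1 if the first post-census cancer is liver cancer (ICD-10 C22), else 0.

    People whose first cancer was at another site are assumed to have been
    excluded upstream, so the input is either a C22 code or None (no cancer).
    """
    if first_cancer_site is not None and first_cancer_site.startswith("C22"):
        return 1
    return 0


def secondary_outcome(degree_of_spread: str) -> int:
    """Return 1 for distant/unknown spread, 0 for localized/regional spread."""
    label = degree_of_spread.strip().lower()
    if label in ADVANCED:
        return 1
    if label in EARLY:
        return 0
    raise ValueError(f"unexpected degree of spread: {degree_of_spread!r}")
```

Raising an error on an unrecognized label, rather than silently coding it as 0, keeps miscoded registry values from being counted as localized/regional disease.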

Data quality
Registry data achieve high quality standards, as recommended for liver cancer by the International Agency for Research on Cancer (Cancer Incidence in Five Continents, Volume 11, for 2008-2012 diagnoses) [13]. A key index of data accuracy is the percentage of cancers microscopically verified (MV%), which is high for Australia by world standards [13]. Within Australia, NSW reported a higher MV% for liver cancer than other States and Territories for males and females collectively, and higher than for Australia overall both for males and females [13].

Statistical analysis
Analyses for each outcome were undertaken for males and females separately in two steps. In step 1, cross-tabulations, and odds ratios with 95% confidence intervals derived from multiple logistic regressions, were used to compare characteristics of cohort members according to whether they were subsequently diagnosed with liver cancer. Also, within cohort members diagnosed with liver cancer, similar comparisons were made by degree of spread (i.e., distant/unknown degree of spread compared with localized/regional) [18,19]. This was undertaken for each socio-demographic and health status variable. Odds ratios were age-adjusted, as initial examination showed age was strongly associated with each variable (e.g., 93% of the linked liver cancer cohort compared with 44% of the unlinked were aged 50+ years; and proportions of liver cancers with distant/unknown degree of spread increased from 35.2% for ages < 50 years to 50.6% for ages 80+ years for males and females in aggregate) (Tables 1 and 4).
In step 2, multivariable analyses were undertaken, starting with all covariates, then purposefully removing the least-contributing covariates (using Wald statistic p-values > 0.2 as a guide), and refitting the model with the remaining covariates until deriving a main effects model in which each retained covariate contributed substantially [18,19]. Potential for collinearity among predictor variables was tested using variance inflation factors to ensure it was within accepted limits. Data preparation and analyses were undertaken using Stata 17 within the ABS DATALAB facility.
Step 1 and 2 analyses were undertaken for all variables where data were collected prior to the liver cancer diagnoses (Tables 2-4). Preliminary age-adjusted analyses were also undertaken for supplementary variables shown in Appendix A.
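As a worked illustration of how an odds ratio and its 95% confidence interval follow from a logistic regression coefficient, the sketch below exponentiates a log-odds coefficient and its Wald limits. It is written in Python for illustration only (the study analyses were run in Stata); the function names are assumptions, and the back-calculation of a standard error from a reported interval assumes a Wald interval that is symmetric on the log scale.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta: float, se: float) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def se_from_ci(lower: float, upper: float) -> float:
    """Recover the log-scale standard error implied by a Wald 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


# Example using an age-adjusted estimate reported for males born in China:
# 1.87 (1.36, 2.57).
se = se_from_ci(1.36, 2.57)
or_, lo, hi = odds_ratio_ci(math.log(1.87), se)
print(f"OR {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

Because the interval is symmetric on the log scale, the point estimate should sit close to the geometric mean of the two limits, which gives a quick consistency check on any reported odds ratio.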
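The covariate-removal loop described above can be sketched as follows. This is a simplified backward elimination in Python, not the authors' Stata procedure: `fit_pvalues` is a hypothetical callable standing in for a model fit that returns one Wald p-value per covariate, keeping age in every model is an assumption based on the age adjustment used throughout, and the p-values in the demonstration are made up.

```python
from typing import Callable, Dict, List, Sequence


def backward_eliminate(
    covariates: Sequence[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    threshold: float = 0.2,
    forced: Sequence[str] = ("age",),
) -> List[str]:
    """Drop the least-contributing covariate, refit, and repeat.

    Stops when every removable covariate has a Wald p-value <= threshold.
    """
    retained = list(covariates)
    while True:
        pvalues = fit_pvalues(retained)  # refit with the current covariate set
        removable = [c for c in retained if c not in forced]
        if not removable:
            return retained
        worst = max(removable, key=lambda c: pvalues[c])
        if pvalues[worst] <= threshold:
            return retained
        retained.remove(worst)


# Stand-in for a real model fit: fixed, made-up p-values for demonstration only.
def demo_fit(current: List[str]) -> Dict[str, float]:
    made_up = {
        "age": 0.001,
        "remoteness": 0.03,
        "household_size": 0.45,
        "occupation": 0.25,
    }
    return {c: made_up[c] for c in current}


selected = backward_eliminate(
    ["age", "remoteness", "household_size", "occupation"], demo_fit
)
print(selected)
```

In a real analysis the p-values change at each refit, which is why the model is refitted inside the loop rather than ranking covariates once.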

Flow diagram for cohort selection
Census enumeration in NSW included 6,120,982 adults, 89.9% of whom were assigned a PLS link (Fig. 1). Proportions without a PLS link were higher for males (11.1% of males versus 9.2% of females), the youngest (11.9%) and oldest adults (13.1%), those in remote areas (15.5%), and those from an unspecified, non-English speaking country of birth (29.1%). People first diagnosed with a cancer other than liver cancer were excluded. The resulting linked liver-cancer cohort (n = 1,185) was compared with other MADIP linked people in NSW for whom there was no evidence of liver cancer from the Cancer Registry (n = 5,422,133) (Fig. 1; Table 1).

Males - Adjusting only for age
Odds ratios for liver cancers in males are shown in Table 2 by: country of birth, with higher ratios for China at 1.87 (1.36, 2.57), New Zealand, Philippines and Vietnam in aggregate at 1.86 (1.41, 2.46), and "other non-English speaking" countries at 1.25 (1.05, 1.50), compared with Australian-born; and for residents with poor English-speaking proficiency, low-income households, educational achievement of less than year 12, those recording no education, and those with occupations recorded as labourer or driver. Elevated odds ratios were also seen for those renting from housing authorities, those living in a sole person household, or in a jobless household with children. Less socioeconomic disadvantage was associated with reduced odds of liver cancer. Other associations with elevated odds of liver cancer occurred when having a disability of at least 6 months duration when aged < 70 years, increased numbers of medicated conditions, receiving affective and antipsychotic medications, and with increased numbers of general practitioner consultations, evidence of health plans and mental health plans, and elevated numbers of chronic disease plans (Appendix A).

Multivariable age-adjustment
Results indicated substantively elevated odds of liver cancer by country of birth for China at 2.25 (1.56, 3.24), and New Zealand, Philippines and Vietnam in aggregate at 1.80 (1.35, 2.41), when compared with Australian-born (Table 2). Elevated odds also applied for residents with poor English-speaking proficiency, low-income households, low educational attainment of under year 12, occupations of labourer or driver, renting from a housing authority, having a sole person household, having a disability of 6+ months when aged < 70 years, and having higher numbers of conditions medicated.

Females - Adjusting only for age
Odds ratios for liver cancer are shown in Table 3. A lower odds ratio at 0.54 (0.32, 0.83) was indicated for regional/remote residential areas than major cities. Other differences included elevated odds ratios by country of birth for China at 2.22 (1.31, 3.77); New Zealand, Philippines, and Vietnam collectively at 2.19 (1.38, 3.45); and other "mainly non-English speaking" countries at 1.59 (1.19, 2.13), when compared with Australian-born. Elevated age-adjusted odds ratios also applied for those with poor English-speaking proficiency, a low educational attainment below year 12, those recorded as having "no education", those renting from a housing authority, those living in a jobless household with children, those having a disability of 6+ months duration when aged < 70 years, and the more socioeconomically disadvantaged, measured at individual household or residential SA2 level. Odds of liver cancer also increased with numbers of medicated conditions.

Multivariable age-adjustment
These analyses indicated substantively elevated odds of liver cancer by country of birth for China at 1.88 (1.01, 4.48), and New Zealand, Philippines, and Vietnam collectively at 1.88 (1.16, 3.05), when compared with Australian-born (Table 3). Other elevations in odds ratios were indicated for poor English-speaking proficiency, renting from a housing authority, having a disability of 6+ months duration when aged < 70 years, and numbers of medicated conditions.

Distribution of stage (degree of spread) by age
Localized/regional spread became less common and distant/unknown spread became more common with increasing age at diagnosis (p = 0.004 for males; p = 0.038 for females) (Table 4).

Age-adjusted odds (95% CI) of distant/unknown versus localized/regional degree of spread

Males - Adjusting only for age
Results indicated an elevated odds ratio of 1.54 (1.14, 2.09) for regional and remote residential areas compared with a major city (Table 5).

Multivariable age-adjustment
Results were similar, with substantively elevated odds of 1.63 (1.20, 2.22) for regional and remote areas compared with a major city. Also, an elevation of 1.72 (1.10, 2.70) applied when having a disability of 6+ months duration at age < 70 years.

Females - Adjusting only for age
Results did not point to variations in odds of distant/unknown stage for the independent variables when adjusting only for age (Table 6).

Multivariable age-adjustment
This revealed elevated odds of 2.91 (1.25, 6.79) when 8+ other conditions were recorded, compared with the reference of 3 conditions. By comparison, poor English-speaking proficiency was associated with reduced odds of distant/unknown degree of spread at 0.39 (0.18, 0.80).

Discussion
The present results complement those from previous studies with a broader range of sociodemographic and health characteristics that are associated with liver cancer and more advanced stage at diagnosis in NSW. These data will inform service planning and targeting. Repeating the process on a periodic basis, potentially in relation to future censuses, will indicate changes in risk profiles that may inform adjustments to service plans and priorities.
Previous studies have shown liver cancer rates in Australia to be associated with older age, male sex, lower area-based socioeconomic status, countries of birth outside Australia in Asia, and more recent time periods [1][2][3][7]. The linked MADIP variables generally confirmed these earlier findings, supporting the likely validity of these data and the study design.
Results indicated higher age-adjusted odds of liver cancer by residential location in a major city, and by country of birth, with elevations for China, the Philippines and Vietnam and, to a lesser extent, for Greece, Italy, Lebanon, and New Zealand. Higher age-adjusted odds also applied to those with poor English-speaking proficiency, low-income households, educational attainment below year 12, occupations of labourer or driver, those renting accommodation from housing authorities, sole person households, sole parent households (in females), jobless households with children, and those with a disability of six months or longer when aged under 70 years. These characteristics suggest that liver cancer generally occurs more frequently among residents experiencing other social and health disadvantage, thereby potentially compounding inequality. Irrespective of whether assessed at individual or SA2 residential area level, more socioeconomic disadvantage was associated with higher age-adjusted odds of liver cancer. The odds also were higher in those with a high number of medical conditions under medication.

Table 3
Age-adjusted odds ratios (95% CI) of females having a liver cancer diagnosed during Sept 2016 - Dec 2018, according to sociodemographic predictors and medicated conditions: linked NSW Cancer Registry and MADIP data

Other results pointed to a higher proportion, after age adjustment, of distant or unknown degree of spread at diagnosis for residents of regional and more remote residential areas as opposed to a major city, those experiencing prolonged disability at age < 70 years, and, in females, those with high numbers of other concurrent conditions. There were also some unexpected findings, including lower age-adjusted odds of liver cancer among those caring for a person with disability. These findings need further investigation and, ideally, confirmation with data from other jurisdictions. They highlight the fact that correlates, while of potential value for service planning, may not have causal significance.
Data for Aboriginal and Torres Strait Islander residents were not presented in this paper due to small numbers. Larger numbers will be pursued in a further project in multi-jurisdictional analyses.
Preventive opportunities exist by targeting high-risk groups. These should be informed by the scientific literature. Examples would include, where relevant, the use of hepatitis B vaccination and promotion of healthy lifestyles to address risks from diabetes mellitus, overweight and obesity, high alcohol consumption, tobacco smoking, illicit drug use, and unprotected sex [5,6]. Carriers of hepatitis B and C infection should receive guidance on how best to avoid transmission of infection to uninfected partners and family members. Notably, around half the burden of liver cancer in Australia has been attributed to hepatitis B and C infections [5], such that treating and curing HCV and suppressing HBV infection with drugs would markedly reduce the risk of hepatocellular carcinoma [20][21][22].
The current study moves beyond behavioral risk profiles to document demographic and social characteristics that accompany liver disease. Those characteristics can be mapped throughout the community using census and other records and can inform local area, preventive and early detection interventions. Local conversations on prevention can then focus on people and their circumstances rather than personal behavior at the outset.
Survival outcomes for liver cancer are poor, with a five-year relative survival approximating 22% in Australia [3,4]. The potential to increase survival through earlier detection is indicated by evidence that cancers diagnosed through surveillance are smaller and more likely to receive curative treatment, with higher survival persisting after adjustment for lead time and related biases [22]. In addition, it would be desirable to seek care early, where possible from specialist clinical units experienced in the management of this disease. Five-year relative survival is higher with less extensive spread of this cancer at diagnosis, as indicated by NSW and international data, including USA SEER data indicating survival of 35% for localized stage, 12% for regional spread, and 3% where distant metastases apply [23].
Policy priorities for liver cancer include increasing awareness of risk factors, optimizing hepatitis B vaccine coverage for high-risk populations, including universal coverage for Aboriginal and Torres Strait Islander people, and for those migrating to Australia from the high-risk Asian and other countries identified in this report, and increasing access to early treatment and support for people infected with hepatitis B and C.
Positive features of this study were the ease with which deidentified linked routinely collected data could be used to better indicate the risk profiles of NSW population groups at increased risk of liver cancer, and those being diagnosed at a more advanced stage. NSW (and Australia) has long had well-developed population-based cancer and cancer management databases [24], well-defined data linkage protocols, and a network of data linkage units and remote-access laboratories to use these data safely [25,26]. This framework needs further development to provide more comprehensive population-wide evidence to inform service planning, delivery and evaluation.
Limitations of this study include the combining of categories where numbers were low and point estimates were similar. This was a result-guided activity born of necessity to build numbers. Larger numbers should be pursued through multi-jurisdictional studies to gain more precise results without the need to combine categories. Another limitation was the lack of access in this study to linked population-wide data on prevalence of hepatitis B and C infection, hepatitis B vaccination, and risk factors such as overweight and obesity, diabetes, excess alcohol consumption, tobacco smoking, illicit drug use, unprotected sex, and medical conditions such as metabolic dysfunction-associated fatty liver disease, cirrhosis and haemochromatosis [5,6]. Linked data on diagnostic and clinical care pathways and supportive care were not available for use in this study to investigate disparities in service utilization and timeliness. Population-based data from biobanks and genomic databases also were not available; these could have increased the value of the linked data for exploring, at population level, the effects of biological factors at cellular and sub-cellular level. Future data collection and data linkage activity should address these limitations.

Table 5
Age-adjusted odds ratios (95% CI) of males having distant/unknown liver cancer spread compared with localized/regional spread diagnosed during Sept 2016 - Dec 2018, according to sociodemographic predictors: linked NSW Cancer Registry and MADIP 2016 data

The present study included many comparisons, most of which were consistent in showing disparities in liver cancer incidence and staging patterns across population subgroups, and many of which would have added to social hardship and health disadvantage. Confirmatory data from other states and territories would have strengthened these findings, especially where numbers were small and more open to random variation.
This study describes subgroups at increased risk of liver cancer and advanced diagnostic stages to inform service planning and benchmarking in NSW. It was not designed to investigate causation or produce novel epidemiological insights, although with broader data linkages there would be the potential to do so.

Conclusions
This study demonstrated the ease with which linked routinely collected data could be used to identify sociodemographic disparities in liver cancer risk and more advanced stage in NSW. MADIP data holdings provide value for this purpose that adds to that already available from the NSW Cancer Registry and Australia-wide through the Australian Institute of Health and Welfare and Australasian Association of Cancer Registries. The data assist insights into the social determinants and burdens of these cancers. In this study the data indicate that liver cancer places a heavier burden on those sectors of the NSW population already likely to be disproportionately affected by social hardships and health disadvantage. Less favorable stage-related prognostic profiles were observed in more remote populations and those already living with disability and numerous health conditions. These NSW-wide data are available to assist the planning of health and welfare services. NSW has advanced data linkage systems that can be used for this purpose. Capacity for data linkage is being extended across the Australian population to cover cancer management from primary prevention and screening through to cancer care and support along the cancer pathway. Further data development will support an evidence-based approach to cancer control, including assisting with priority setting, targeting of services, and establishing benchmarks.

Data Availability
The original data for this study were provided by the Australian Bureau of Statistics, the Australian Department of Health and the NSW Ministry of Health following approval by relevant ethics committees. These data may be available to other researchers meeting the relevant data access and ethical requirements. Requests and enquiries on the data processing and analyses code for this article can be made to DB.

Table 1
Age distribution of NSW liver cancer cases (Sept 2016-Dec 2018) and NSW community controls (2016 census) *

Table 2
Age-adjusted odds ratios (95% CI) of males having a liver cancer diagnosed during Sept 2016 - Dec 2018, according to sociodemographic predictors and medicated conditions: linked NSW Cancer Registry and MADIP data

Table 4
Number (%) of liver cancer cases by degree of spread (DOS) at diagnosis: NSW Cancer Registry, September 2016 to December 2018

Table 6
Age-adjusted odds ratios (95% CI) of females having distant/unknown liver cancer spread compared with localized/regional spread diagnosed during Sept 2016 - Dec 2018, according to sociodemographic predictors: linked NSW Cancer Registry and MADIP 2016 data