  • Research article
  • Open access

Healing, surviving, or dying? – projecting the German future disease burden using a Markov illness-death model



In view of the upcoming demographic transition, there is still no clear evidence on how increasing life expectancy will affect future disease burden, especially regarding specific diseases. In our study, we project the future development of Germany’s ten most common non-infectious diseases (arthrosis, coronary heart disease, pulmonary, bronchial and tracheal cancer, chronic obstructive pulmonary disease, cerebrovascular diseases, dementia, depression, diabetes, dorsal pain and heart failure) in a Markov illness-death model with recovery until 2060.


The disease-specific input data stem from a consistent data set of a major sickness fund covering about four million people, the demographic components from official population statistics. Using six different scenarios concerning an expansion and a compression of morbidity as well as increasing recovery and effective prevention, we can show the possible future range of disease burden and, by disentangling the effects, reveal the significant differences between the various diseases in interaction with the demographic components.


Our results indicate that, although strongly age-related diseases like dementia or heart failure show the highest relative increase rates, diseases of the musculoskeletal system, such as dorsal pain and arthrosis, will still be responsible for the majority of the German population’s future disease burden in 2060, with about 25–27 and 13–15 million patients, respectively. Most importantly, for almost all considered diseases a significant increase in the burden of disease can be expected even in the case of a compression of morbidity.


A massive caseload is building up for the German health care system, which can only be alleviated by more effective prevention. Immediate action by policy makers and health care managers is needed, as otherwise the prevalence of widespread diseases will become unsustainable from a capacity point of view.



The development of future patient numbers is an important concern for many stakeholders in the health system. Rational decisions about the planning of hospital capacities, pharmaceutical investments, career choices of (future) healthcare professionals as well as the development of future health care expenditures themselves depend on precise knowledge of the future development of specific diseases.

Germany is one of the fastest ageing countries in the world due to constantly low fertility rates since the 1970s and a continuously increasing life expectancy. The literature offers rival theories and hypotheses on how an increasing life expectancy will affect the disease burden and the related health care expenditure. Gruenberg (1977) [1] and Verbrugge (1984) [2] hypothesise that rising longevity goes hand in hand with an increase in years spent in illness and therefore with an expansion of morbidity in older age groups. In contrast, Fries (1980) [3] assumes that an increasing life expectancy leads to a compression of morbidity. Given these somewhat contradictory hypotheses, the influence of proximity to death and treatment spending as a function of remaining life expectancy remain controversial among health economists [4,5,6,7].

However, even less evidence exists today concerning the (more epidemiological) question of specific diseases’ future development in the light of the different hypotheses. A systematic literature review on PubMed, searching for projections (or synonyms) in the context of demography and using the keywords prevalence, incidence or burden of disease for specific or chronic non-infectious diseases in general, yields 160 relevant publications. The studies fall into three categories by projection methodology: trend extrapolations (99/160), multistate models (57/160) and studies using both methodologies (4/160). In 54 of the studies using trend extrapolation (103/160), current prevalence or incidence rates are simply transferred to population projections, which precludes a specific modelling of the various hypotheses. This so-called status quo analysis is also commonly used in projections of health expenditures (Footnote 1). Of the 61 studies using multistate modelling (61/160), 17 (17/61) are based on the classical structure of an illness-death model (even if only 7 explicitly define it that way). However, only nine of the studies (9/61) focus on an explicit modelling of a compression of morbidity, eight of them (8/9) related to dementia. Furthermore, just seven studies (7/61) compare the development of more than two different diseases, only one of them modelling compression scenarios [9] (see the appendix for more detailed information and results on the systematic database search).

In our paper, we present projections for ten common non-infectious diseases (arthrosis, coronary heart disease, pulmonary, bronchial and tracheal cancer, chronic obstructive pulmonary disease, cerebrovascular diseases, dementia, depression, diabetes, dorsal pain and heart failure). The selected diseases represent the intersection between the most common and most expensive disease patterns in Germany [10]. For the projections we use a time-discrete Markov illness-death model with recovery. Our model allows us to regard the different hypotheses in context of demographic transition and to quantify the influence of potentially changing variables (disease-specific survival, incidence and recovery rate) on the future frequency of diseases. In addition, we show the influence of successful prevention on long-term prevalence of the different diseases.

The population-related components used for modelling stem from Destatis, the German Federal Statistical Office, whereas the disease-specific components are computed on the data of a major sickness fund covering approximately four million insureds during the period from 2009 to 2017. Our data set is unique as we calculated the input data ourselves using disease-specific validation criteria selected for this purpose (shown in section Dataset). Hence, our study is one of the few that use insurance data (7/160), although the resulting treatment prevalence is of particular importance for decision makers and payers in the health care system. Data sources from other studies of the systematic literature review are surveys or other epidemiological studies (61/160), a literature review for the different input factors (34/160), registries (28/160) or mixed data sources (30/160).

The paper is organised as follows: we start with the presentation of our time-discrete Markov illness-death model with recovery as well as our data set. Then, we show our results for the future development of the ten diseases (average prevalence rates and number of patients) in different populations and scenarios, also considering the results of other publications. This is followed by a discussion of the results in view of the current state of research and the limitations, finishing with a concluding summary.


Markov illness-death model with recovery

We calculate the future number of patients and the future average prevalence rates for the total population from 2018 to 2060 (Footnote 2) using a time-discrete Markov illness-death model with recovery. The model is based on the cohort-component method [11], which is widely used for (official) population projections. In terms of epidemiologic modelling, it can be traced back to the work of Fix & Neyman (1951) [12] and is closely related to the models of Manton et al. (1984), Brookmeyer et al. (1998), Brinks et al. (2012), and Andersson et al. (2015) [13,14,15,16], but differs in the level of detail of the rich routine data set used. The specific cohort data by age and gender with correspondingly detailed diagnoses allow us to vary different variables over time (future development of the disease-specific survival rate, incidence rate and recovery rate). In contrast to most other studies using an illness-death approach (16/17), including the work of Milan & Fetzer (2019) [17], on which our modelling is based, the model also includes the possibility of recovery.

The starting point of our model is the number of patients \( P_{a,g,T} \) (differentiated by age \( a \) between 0 and 100 and gender \( g \), which is men or women) in our starting year \( T \). It results from the prevalence rate \( p_{a,g,T} \) multiplied by the cohort size \( K_{a,g,T} \).

$$ {\boldsymbol{P}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}}={\boldsymbol{K}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}}{\boldsymbol{p}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}} $$

In models extrapolating current prevalence rates (status quo analysis), \( p_{a,g,T} \) is assumed to be constant over time and only the future cohort sizes determine the future development of patient numbers. In contrast, for all following years our model uses age- and gender-specific incidence and recovery rates as well as the mortality rates of patients to calculate the (future) number of patients \( P_{a,g,T+t} \). At this point we distinguish between the surviving patients of the previous year \( {\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} \) and the group of newly diseased patients \( I_{a,g,T+t} \).

$$ {\boldsymbol{P}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}={\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}+{\boldsymbol{I}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}} $$

In order to calculate the surviving patients of the previous year \( {\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} \) we use the disease-specific mortality difference \( md_{a-1,g,T+t-1} \), which is subtracted from the survival rate of each cohort \( sr_{a-1,g,T+t-1} \) (Footnote 3). We also consider disease-specific recovery rates \( r_{a-1,g,T+t-1} \) as follows (Footnote 4):

$$ {\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}={\boldsymbol{P}}_{\boldsymbol{a}-\mathbf{1},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}\left({\boldsymbol{sr}}_{\boldsymbol{a}-\mathbf{1},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}-{\boldsymbol{md}}_{\boldsymbol{a}-\mathbf{1},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}\right)\left(\mathbf{1}-{\boldsymbol{r}}_{\boldsymbol{a}-\mathbf{1},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}\right) $$

To determine the number of new patients \( I_{a,g,T+t} \), the number of surviving non-diseased from the previous year is calculated in a first step:

$$ {\boldsymbol{ND}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}={\boldsymbol{K}}_{\boldsymbol{a}-\mathbf{1},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}{\boldsymbol{sr}}_{\boldsymbol{a}-\mathbf{1},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}-{\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} $$

In a second step, the number of new patients \( I_{a,g,T+t} \) results from multiplying the age- and gender-specific incidence rate \( i_{a,g,T+t} \) by the surviving non-diseased from the previous year:

$$ {\boldsymbol{I}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}={\boldsymbol{ND}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}{\boldsymbol{i}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}} $$

The total number of patients \( P_{T+t} \) in all years \( T+t \) is finally calculated as:

$$ {\boldsymbol{P}}_{\boldsymbol{T}+\boldsymbol{t}}={\sum}_{\boldsymbol{a}=\mathbf{0}}^{\mathbf{100}}\left({\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{women},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}+{\boldsymbol{I}}_{\boldsymbol{a},\boldsymbol{women},\boldsymbol{T}+\boldsymbol{t}}\right)+{\sum}_{\boldsymbol{a}=\mathbf{0}}^{\mathbf{100}}\left({\boldsymbol{D}}_{\boldsymbol{a},\boldsymbol{men},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}}+{\boldsymbol{I}}_{\boldsymbol{a},\boldsymbol{men},\boldsymbol{T}+\boldsymbol{t}}\right) $$

In our model, for all years \( T+t \) the future cohort sizes \( K_{a,g,T+t} \) as well as the future survival rates \( sr_{a,g,T+t} \) of the total population are derived from a population projection, which we calculate via the cohort-component method. Within this framework we consider the disease-specific components. The calculation of the survival rate of the patients as the difference \( sr_{a-1,g,T+t-1} - md_{a-1,g,T+t-1} \) and of the surviving non-diseased \( {\boldsymbol{ND}}_{\boldsymbol{a},\boldsymbol{g},\boldsymbol{T}+\boldsymbol{t}}^{\boldsymbol{T}+\boldsymbol{t}-\mathbf{1}} \) as the difference between all survivors of the cohort and the surviving patients from the previous period finally merges the population projection with the epidemiological developments. Thus, the design of our model also allows the use of input data from any other population projection and/or disease-specific statistic. This time-discrete approach is also more intuitive for a broader audience, such as policy makers and health care decision makers.
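The yearly transition in eqs. (2) to (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `project_cohort` and `total_patients` and all numbers are hypothetical and serve only to show how the terms fit together for a single age/gender cohort.

```python
def project_cohort(K_prev, P_prev, sr, md, r, i):
    """Advance one age/gender cohort by one year (hypothetical helper).

    K_prev, P_prev: cohort size and patients at age a-1 in year T+t-1.
    sr, md, r:      survival rate, mortality difference and recovery rate
                    (indexed a-1, T+t-1).
    i:              incidence rate (indexed a, T+t).
    Returns (K, P): cohort size and patients at age a in year T+t.
    """
    D = P_prev * (sr - md) * (1.0 - r)   # eq. (3): surviving, non-recovered patients
    ND = K_prev * sr - D                 # eq. (4): surviving non-diseased (incl. recovered)
    I = ND * i                           # eq. (5): newly diseased
    P = D + I                            # eq. (2): patients in year T+t
    K = K_prev * sr                      # surviving cohort
    return K, P


def total_patients(patients_by_cohort):
    """Eq. (6): sum of patients over all ages and both genders."""
    return sum(patients_by_cohort.values())


# Illustrative numbers only (not taken from the study data).
K, P = project_cohort(K_prev=1000.0, P_prev=100.0, sr=0.9, md=0.1, r=0.25, i=0.05)
```

With these made-up inputs, 60 of the 100 patients survive without recovering, 840 survivors are at risk, and 42 of them fall ill, giving about 102 patients in a surviving cohort of 900.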

Dividing the total number of patients by the total number of the population results in the average prevalence rate of the total population, apr, which we will present in addition to the total number of patients in the result section. Obviously, the apr highly depends on the share of the elderly and diseased within the total population. As the German demographic transition leads to an increasing proportion of elderly cohorts, we call this effect cohort effect, which can also be observed in models extrapolating current prevalence rates using the status quo analysis.
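The cohort effect can be made concrete with a small Python sketch. It assumes a hypothetical population of only two age groups with fixed, invented prevalence rates (a status quo setting); the function name `average_prevalence_rate` is likewise an assumption for illustration.

```python
def average_prevalence_rate(cohort_sizes, prevalence_rates):
    """apr: total patients divided by total population."""
    patients = sum(k * p for k, p in zip(cohort_sizes, prevalence_rates))
    return patients / sum(cohort_sizes)


# Invented age-specific prevalence rates (younger group, older group),
# held constant as in a status quo analysis.
rates = [0.02, 0.20]

apr_younger_population = average_prevalence_rate([80.0, 20.0], rates)
apr_older_population = average_prevalence_rate([60.0, 40.0], rates)
```

Although no age-specific rate changes, the apr rises when the share of the older group grows, which is the cohort effect described above.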

As for the further effects of our model, we take a closer look at the future age- and gender-related prevalence rate \( p_{T+1} \), which is obtained by dividing the number of patients (eqs. 2 to 5) by the corresponding total cohort \( K_{a,g,T+t} = K_{a-1,g,T+t-1}\, sr_{a-1,g,T+t-1} \) and is therefore independent of future cohort sizes:

$$ {\boldsymbol{p}}_{\boldsymbol{T}+\mathbf{1}}=\frac{{\boldsymbol{p}}_{\boldsymbol{T}}\left(\mathbf{1}-\boldsymbol{i}\right)\left(\mathbf{1}-\boldsymbol{r}\right)\left(\boldsymbol{sr}-\boldsymbol{md}\right)+\boldsymbol{isr}}{\boldsymbol{sr}} $$
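The intermediate steps can be written out as follows. This is a sketch with all age, gender and time indices on the rates dropped, writing \( D \), \( I \) and \( K \) for the quantities of eqs. (2) to (5) and \( p_T = P_T / K \):

$$ P_{T+1} = D + I = D + \left(K\, sr - D\right) i = D\left(1-i\right) + K\, sr\, i, \qquad D = P_T \left(sr - md\right)\left(1-r\right) $$

$$ p_{T+1} = \frac{P_{T+1}}{K\, sr} = \frac{p_T \left(1-i\right)\left(1-r\right)\left(sr-md\right) + i\, sr}{sr} = p_T \left(1-i\right)\left(1-r\right)\frac{sr-md}{sr} + i $$

The cohort size \( K \) cancels, which is why the resulting prevalence rate does not depend on future cohort sizes.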

For reasons of simplicity we use time-independent incidence, recovery and mortality rates and omit the indices of age and gender in eq. (7). The total differential can be used to determine the impact of changing incidence, recovery and mortality rates on the prevalence in year \( T+1 \).

$$ \begin{aligned}
{\boldsymbol{dp}}_{\boldsymbol{T}+\mathbf{1}}=&\left(\left(\mathbf{1}-\boldsymbol{i}\right)\left(\mathbf{1}-\boldsymbol{r}\right)\frac{\left(\boldsymbol{sr}-\boldsymbol{md}\right)}{\boldsymbol{sr}}\right)\boldsymbol{d}{\boldsymbol{p}}_{\boldsymbol{T}}\\
&+\left(\frac{{\boldsymbol{p}}_{\boldsymbol{T}}\left(\mathbf{1}-\boldsymbol{i}\right)\left(\mathbf{1}-\boldsymbol{r}\right)+\boldsymbol{i}-{\boldsymbol{p}}_{\boldsymbol{T}+\mathbf{1}}}{\boldsymbol{sr}}\right)\boldsymbol{dsr}\\
&-\left(\frac{{\boldsymbol{p}}_{\boldsymbol{T}}\left(\mathbf{1}-\boldsymbol{i}\right)\left(\mathbf{1}-\boldsymbol{r}\right)}{\boldsymbol{sr}}\right)\boldsymbol{dmd}\\
&-\left({\boldsymbol{p}}_{\boldsymbol{T}}\left(\mathbf{1}-\boldsymbol{i}\right)\frac{\left(\boldsymbol{sr}-\boldsymbol{md}\right)}{\boldsymbol{sr}}\right)\boldsymbol{dr}\\
&+\left(\mathbf{1}-{\boldsymbol{p}}_{\boldsymbol{T}}\frac{\left(\boldsymbol{sr}-\boldsymbol{md}\right)}{\boldsymbol{sr}}+{\boldsymbol{p}}_{\boldsymbol{T}}\boldsymbol{r}\frac{\left(\boldsymbol{sr}-\boldsymbol{md}\right)}{\boldsymbol{sr}}\right)\boldsymbol{di}
\end{aligned} $$

In our model specification, the variables \( p_T \), \( sr \), \( md \), \( r \) and \( i \) can take values between 0 and 1, and the disease-specific mortality difference \( md \) is less than (or in theory equal to) the survival rate of the entire population \( sr \). As eq. (8) shows, a higher prevalence rate \( p \) in year \( T \) leads to a higher prevalence rate in year \( T+1 \). The theoretical one-to-one impact of this effect is reduced by the incidence and recovery rates as well as the disease-specific mortality difference.

An increase of the survival rate \( sr \) initially leads to an increase in both the diseased and the non-diseased population. In conjunction with the incidence rate \( i \), a positive impact on the prevalence rate in year \( T+1 \) can be observed, as the rising survival rate leads to a larger “at-risk” population. In contrast, a higher mortality difference \( md \) leads to a decline in the prevalence rate in year \( T+1 \). Both effects combined can be interpreted as follows: the smaller the difference in mortality between the diseased and non-diseased, the higher the positive impact of an increasing survival rate.

The influence of the recovery rate is negative and linked to the life expectancy of the patients. The more patients survive until the following year, the more can recover again. However, the higher the incidence rate and thus the proportion of new patients, the lower the proportion of persons who could potentially recover, which mitigates the negative effect of the recovery rate.

Considering the impact of increasing incidence rates also offers a connection between the incidence and the recovery rate. A higher proportion of recovered people leads to a higher “at-risk” population. The opposite effect results from a higher prevalence rate in year T which comes along with a lower “at-risk” population.
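The signs discussed above can be checked numerically. The following Python sketch evaluates eq. (7) and compares the coefficients of eq. (8), with the recovery rate written as \( r \) in every term, against central finite differences. The function names and the parameter values are assumptions chosen purely for illustration and are not taken from the study data.

```python
def next_prevalence(p, sr, md, r, i):
    """Eq. (7): prevalence rate in year T+1 (indices dropped)."""
    return (p * (1 - i) * (1 - r) * (sr - md) + i * sr) / sr


def analytic_partials(p, sr, md, r, i):
    """Coefficients of dp, dsr, dmd, dr and di in eq. (8)."""
    s = (sr - md) / sr
    p_next = next_prevalence(p, sr, md, r, i)
    return {
        "p": (1 - i) * (1 - r) * s,
        "sr": (p * (1 - i) * (1 - r) + i - p_next) / sr,
        "md": -p * (1 - i) * (1 - r) / sr,
        "r": -p * (1 - i) * s,
        "i": 1 - p * s + p * r * s,
    }


def numeric_partial(name, args, h=1e-6):
    """Central finite difference of eq. (7) with respect to one variable."""
    up, down = dict(args), dict(args)
    up[name] += h
    down[name] -= h
    return (next_prevalence(**up) - next_prevalence(**down)) / (2 * h)


# Invented values for illustration only.
args = dict(p=0.10, sr=0.98, md=0.03, r=0.05, i=0.02)
analytic = analytic_partials(**args)
numeric = {name: numeric_partial(name, args) for name in args}
```

For these inputs the analytic and numeric values agree, and the coefficients of `md` and `r` are negative while those of `p`, `sr` and `i` are positive, in line with the verbal discussion.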


Regarding the effects outlined above, a change in one variable will always affect the future prevalence in interaction with the other components. To illustrate these effects and the sensitivity of the model, we model six scenarios of changing disease-specific variables \( md_{a,g} \), \( i_{a,g} \) and \( r_{a,g} \) for each of the ten diseases up to 2060, especially with regard to the different hypotheses of expansion and compression of morbidity (see Table 1). In all scenarios we assume increasing survival rates \( sr_{a,g} \) according to the moderately increasing life expectancy scenario L2 [18].

Table 1 Scenarios, assumptions and their effect on the future prevalence rate

In the first scenario, we hold all disease-specific variables constant over the time horizon. However, the assumption of a constant mortality difference and rising survival rates (\( sr_{a,g,T+t} > sr_{a,g,T+t-1} \)) leads to an increase in life expectancy of both the non-diseased and the diseased. In conjunction with constant incidence rates (\( i_{a,g} = const \)), this results in an increasing duration of disease. Thus, the scenario Expansion 1 can be interpreted as a type of expansion of morbidity hypothesis. This scenario serves as our baseline scenario in the following. The scenario Expansion 2 is a more extreme scenario of the expansion of morbidity hypothesis, assuming an additional 30% increase in incidence rates until 2060 (\( i_{a,g,T+t} > i_{a,g,T+t-1} \)).

The compression of morbidity hypothesis is considered in two different scenarios: In the scenario Compression 1 only the healthy population benefits from the increasing life expectancy (\( {\boldsymbol{sr}}_{\boldsymbol{a},\boldsymbol{g}}^{\boldsymbol{D}}=\boldsymbol{const} \)), which leads to a continuous increase in the mortality difference between the diseased and the healthy population. In the scenario Compression 2 a shift of diseased cases in relation to increasing life expectancy is modelled, which is in line with the “traditional” compression of morbidity hypothesis and leads to continuously decreasing incidence rates (\( i_{a,g,T+t} < i_{a,g,T+t-1} \)).

To highlight the long-term impact of effective prevention programmes, a scenario Prevention is modelled with temporarily decreasing incidence rates (\( i_{a,g,T+t} < i_{a,g,T+t-1} \)) of up to 30% until 2035. In order to simulate possible effects of better medical care, e.g. due to disease management programmes, the scenario Extended Recovery assumes recovery rates increasing by up to 50% until the year 2060 (\( r_{a,g,T+t} > r_{a,g,T+t-1} \)).

Interestingly (and as discussed in the section on the total differential of the prevalence rate), the total effect of the scenarios Compression 1 and 2 as well as of the scenarios Extended Recovery and Prevention on the future (age- and gender-related) prevalence rate is not defined a priori and depends on the numerical ratio of disease-related input data and the increase of survival rates.
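The scenario assumptions on incidence and recovery rates can be expressed as time paths of a base rate. The sketch below is hypothetical in two respects that the text does not specify: it assumes a linear change between the base year 2018 and the target year, and it assumes the rate stays at the reached level afterwards. The function name `scaled_rate` and the base rates are illustrative.

```python
def scaled_rate(base_rate, year, total_change, end_year, start_year=2018):
    """Rate in `year` if the base rate changes by `total_change`
    (e.g. 0.30 for +30%) between start_year and end_year.

    Assumption: linear path, constant after end_year.
    """
    span = end_year - start_year
    elapsed = min(max(year - start_year, 0), span)
    return base_rate * (1.0 + total_change * elapsed / span)


# Invented base rates for illustration only.
expansion2_incidence_2060 = scaled_rate(0.02, 2060, +0.30, end_year=2060)
prevention_incidence_2050 = scaled_rate(0.02, 2050, -0.30, end_year=2035)
extended_recovery_2060 = scaled_rate(0.10, 2060, +0.50, end_year=2060)
```

Under these assumptions, an incidence rate of 2% becomes about 2.6% in Expansion 2 and about 1.4% in Prevention, and a recovery rate of 10% becomes about 15% in Extended Recovery.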


The average disease-specific input data for each cohort and gender (Footnote 5) derive from a routine dataset of around four million insureds of the AOK Baden-Württemberg from 2009 to 2017 (Footnote 6). Due to the large number of people insured by the AOK in Baden-Württemberg, this population is approximately representative of the German population with regard to the disease rates within the age cohorts. Table 2 shows the specific selection criteria for each of the ten diseases. Since there are no coding guidelines for outpatient diagnoses in Germany, we use the criteria of the AOK Research Institute published in various reports [19,20,21,22]. The M2Q (Footnote 7) and M3Q criteria, for instance, only define patients as diseased if they have a confirmed diagnosis in at least two or three out of four quarters of the year, respectively. Inpatient primary and secondary diagnoses are included without additional validation criteria. We complete missing data by the following procedure: if the selection criteria are satisfied in the year before and the year after, insureds are also classified as patients in the incompletely coded year. Patients are classified as “new patients” when they did not fulfil the prevalence criteria in any of the four previous years. The days of insurance of the patients identified by diagnosis are then set in relation to those of all insureds to calculate period prevalence \( p_{a,g} \) and cumulative incidence \( i_{a,g} \) for the years 2015 to 2017 [24]. For pulmonary cancer we use a five-year pre-observation period for the derivation of the incidence. To take into account the periodic character of depression, we use additional selection criteria for new cases and divergent diagnoses to determine prevalence and incidence (Footnote 8).
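The quarter-based validation and the definition of new patients can be illustrated with a short Python sketch. It is a simplified reading of the rules described above, not the selection logic actually applied to the claims data: the function names and the data layout are hypothetical, inpatient diagnoses and the completion of incompletely coded years are left out, and years without information are treated as not prevalent.

```python
def meets_quarter_criterion(confirmed_quarters, min_quarters=2):
    """True if a confirmed diagnosis exists in at least `min_quarters`
    distinct quarters of one calendar year (2 for M2Q, 3 for M3Q)."""
    distinct = set(confirmed_quarters) & {1, 2, 3, 4}
    return len(distinct) >= min_quarters


def is_new_patient(prevalent_by_year, year, lookback=4):
    """True if the person is prevalent in `year` but in none of the
    `lookback` previous years. Missing years count as not prevalent
    (an assumption of this sketch)."""
    if not prevalent_by_year.get(year, False):
        return False
    return not any(
        prevalent_by_year.get(year - k, False) for k in range(1, lookback + 1)
    )


# Hypothetical insured person: diagnoses in quarters 1 and 3 of 2017 only.
history = {2013: False, 2014: False, 2015: False, 2016: False,
           2017: meets_quarter_criterion([1, 3])}
new_in_2017 = is_new_patient(history, 2017)
```

Repeated diagnoses within the same quarter count once, so three diagnoses in quarter 1 alone would not satisfy M2Q in this sketch.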

For the calculation of recovery rates \( r_{a,g} \), all surviving patients without a coded diagnosis in the following years are set in relation to the total of all surviving patients. For the definition of recovery we use a four-year follow-up period for diseases with realistic cure probabilities (dorsal pain, depression and CVD) and a five-year follow-up period for pulmonary cancer. The maximum follow-up period of 8 years is used for all other diseases, since there are still no cure possibilities available for their most common manifestations. Since dementia is (as of yet) characterized by an irreversible disease progression, no recovery rates are considered in these calculations (Footnote 9). For chronic diseases, the recovery rates are to be interpreted as being symptom-free. A recurrence of the disease after years of asymptomatic illness is taken into account by the incidence rate. For each cohort, we calculate mortality differences \( md_{a,g} \) as the difference between the 1-year survival rates of the diseased and all insureds in a given year and subtract them from the German population’s survival probability \( sr_{a,g} \) as described above (Footnote 10). Table 3 shows the resulting population-weighted input data for each disease as average values for different age groups and as the overall average in the base year 2018, with values differentiated by gender (female vs male) in parentheses. In addition, Table 4 illustrates the demographic characteristics of the study population as average values of all years analyzed in millions and as percentage compared to those of the entire German population in 2018 (Footnote 11).
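How the three disease-specific inputs relate to each other can be sketched as follows. The function names and numbers are hypothetical, and the sign convention for the mortality difference (survival of all insureds minus survival of the diseased, so that a positive value means excess mortality of patients) is an assumption inferred from the way \( md \) is subtracted from \( sr \) in the model.

```python
def mortality_difference(survival_all_insured, survival_diseased):
    """md: 1-year survival of all insureds minus that of the diseased
    (sign convention assumed for this sketch)."""
    return survival_all_insured - survival_diseased


def patient_survival(sr_population, md):
    """Survival rate applied to patients in the model: sr - md."""
    return sr_population - md


def recovery_rate(n_surviving_without_diagnosis, n_surviving_patients):
    """Share of surviving patients with no coded diagnosis during follow-up."""
    if n_surviving_patients == 0:
        return 0.0
    return n_surviving_without_diagnosis / n_surviving_patients


# Invented values for one cohort, for illustration only.
md = mortality_difference(survival_all_insured=0.98, survival_diseased=0.95)
sr_patients = patient_survival(sr_population=0.985, md=md)
r = recovery_rate(n_surviving_without_diagnosis=5, n_surviving_patients=100)
```

The mortality difference is measured within the insured population but applied to the survival probability of the German population, so the patient survival rate follows the official life tables shifted by the observed gap.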

Table 3 Determined disease-specific variables
Table 4 Projected average prevalence rates apr 2060 and percentage change compared to 2018

In order to derive the (future) cohort sizes \( K_{a,g} \) and survival rates \( sr_{a,g} \), we build different population projections based on input data from Destatis and statistics of [...]. Our starting point is a Stationary Population with constant absolute births and constant life expectancy, which separates the effects resulting from disease-specific (epidemiological) components from the effects of the composition of future cohort sizes on the apr. In our second population projection, Population (LE constant), we abstract from a further increase in life expectancy. This projection is based on the German population in 2018 under the assumption of a fertility rate of 1.55 children per woman of fertile age. For our third population projection, Standard Population (LE increasing), we further assume an increase of life expectancy at birth from 83.3 to 88.1 years for women and from 78.5 to 84.4 years for men according to the moderate increase scenario L2 of the 14th population projection [18]. Migration is not taken into account, as too little is known about whether disease rates of the German population are transferable to migrants [27, 28]. Hence, the Standard Population (LE increasing) shows an absolute decline in population from 83.0 to 66.2 million by 2060, accompanied by an increase in the old-age dependency ratio from 35.9 to 69.7% (Footnote 12). However, for comparability with other studies, we build a fourth population projection, Population (Migration), in which future migration is integrated according to scenario W2 of the 14th population projection [18] (Footnote 13). In this case the total population is 79.1 million people in 2060 and the old-age dependency ratio is 58.8%.


The presentation of our results starts in Table 4 with a comparison of the average prevalence rates apr (i.e. the total number of patients divided by the total number of the population) in the years 2018 and 2060 under the assumption of constant disease-specific variables over the time horizon. We use the three different population projections Stationary Population, Population (LE constant) and Standard Population (LE increasing) to separate the effects resulting from disease-specific (epidemiological) components from those arising from the demographic components (initial population structure and increasing life expectancy). The values resulting from Standard Population (LE increasing) correspond to the baseline scenario Expansion 1.

The results show a high increase in the apr for strongly age-related diseases like dementia, heart failure or CVD, with the ageing of the German population due to its current structure (Population (LE constant)) and rising life expectancy being the key factors driving the large growth rates. The ratio of people with dementia could more than double by 2060 within the Standard Population (LE increasing). In contrast, the increase of the apr of dorsal pain is mainly driven by the epidemiological effect. Regarding arthrosis and COPD, the increase of apr can be attributed to both, the epidemiological as well as the demographic effects. The smallest increase of apr emerges for diabetes and depression. For both, the epidemiological effect is comparatively low. However, an increase in the average prevalence rate is to be expected for all diseases given the baseline scenario Expansion 1. Even when abstracting from an increasing life expectancy, the ageing of the German population in conjunction with the epidemiological effects will lead to a substantial increase of all diseases.

Figure 1 presents the results for the apr in the year 2060 under the different model scenarios (see Table 1) as well as under a simple extrapolation of age- and gender-related prevalence rates for the population of 2060 (status quo (SQ) principle). For this purpose, we use the Standard Population (LE increasing). The y-axis of Fig. 1 shows the relative change of the apr between 2018 and 2060, whereas the x-axis displays the value of the apr for the different scenarios in 2060. Additionally, the x-axis depicts the apr values for 2018.

Fig. 1 Relative change in apr until 2060 in the different scenarios. Source: Own depiction. Abbreviations: Exp1 = scenario Expansion 1, Exp2 = scenario Expansion 2, Comp1 = scenario Compression 1, Comp2 = scenario Compression 2, Rec = scenario Extended Recovery, Prev = scenario Prevention, CA = pulmonary, bronchial and tracheal cancer, CHD = coronary heart disease, COPD = chronic obstructive pulmonary disease, CVD = cerebrovascular diseases, HF = heart failure

As a first result, Fig. 1 illustrates that the ranking of the ten diseases with respect to the value of the apr in 2060 is the same as in 2018, even though the relative change of the apr differs significantly between the ten diseases. This means that dorsal pain and arthrosis are expected to be the two major diagnoses in 2060, although dementia, for example, shows a significantly higher change in the apr in all scenarios.

Second, the results show a different impact of the rival hypotheses regarding the consequences of increasing life expectancy on future disease burden: The expansion of morbidity scenarios Expansion 1 and 2 lead to a soaring increase of all diseases compared to the other scenarios. Especially the scenario Expansion 2 (with an assumed increase of the incidence rate by 30% until 2060) shows a strong increase of the apr. For strongly age-related diseases such as dementia, CVD or HF, the Compression 2 scenario (shifting the incidence to higher age groups) has a stronger impact on the apr than the Compression 1 scenario, in which the life expectancy for patients is constant over time and only the healthy population benefits from the increasing life expectancy. Yet even in the compression of morbidity scenarios, an increase in all the common diseases can be expected. In other words: The increase in burden of disease due to increasing life expectancy and high incidence rates in older age groups can be mitigated but not fully compensated by a compression.

The assumption of continuously rising recovery rates (scenario Extended Recovery) has an even smaller impact on future apr, although this is also attributable to the low chances of recovery for the considered diseases in general. Only for depression an increasing recovery rate would lead to a constant prevalence rate in the long term. A diminishing effect on future long-term prevalence for all diseases can only be seen in the scenario Prevention. For diabetes and depression, the Prevention scenario even leads to a small decline in the apr. This highlights the importance of effective prevention regarding the upcoming demographic transition.

At first glance, a (simple) extrapolation of current prevalence rates should lie between the expansion and compression scenarios; our results show, however, that this is not true for all diseases. In particular, for dorsal pain, arthrosis, COPD, and cancer the status quo principle leads to an apr in 2060 that is smaller than in the Prevention scenario. Hence, our results show a wide range of future developments of the different diseases depending on the parameters chosen for modelling.

Table 5 shows the absolute results of the projection for 2040 and 2060. As the Standard Population (LE increasing) neglects future migration, the total number of people in Germany will decline between 2040 and 2060. Thus, for most scenarios and diseases the total number of patients is higher in 2040 than in 2060. However, the results for the projection Population (Migration), given in parentheses, show the opposite effect. Here we assume identical disease-related input data for migrants.

Table 5 Projected number of patients 2060 in the different scenarios

All in all, our calculations show that all of the ten diseases are expected to increase until 2060: Diseases of the musculoskeletal system like dorsal pain and arthrosis will be responsible for the majority of the future disease burden within the German population, possibly affecting about 25–27 and 13–15 million people, respectively, by 2060. Diabetes, which is closely related to other diseases like CHD, is expected to affect at least 9.5 million patients in case of expanding morbidity. With up to 7.4 million people affected in 2060, CHD will continue to be the most common cardiovascular disease. The high growth rates of primarily age-related diseases such as CVD or HF are also steep in absolute terms. Only if prevention strategies are successful can the significant increase in the number of patients be alleviated in the long run.

Our results can be compared with other recent studies for Germany. Of the 16 studies for Germany in our literature review (16/160, concerning our ten most common non-infectious diseases), only six (6/16) were published in the last 5 years, which focus on cancer (3/6), dementia (2/6) or diabetes (1/6) (see footnote 14). For diabetes, Tönnies et al. (2019) [29] calculate, using an illness-death model and assuming constant incidence rates, a higher number of 11.0 million patients for 2040. The discrepancy with our projection (10.3 million) for 2040 is probably due to their older input data, which stem from 2010. The most recent study on dementia by Alzheimer Europe (2020) [30] projects 2.7 million patients for 2050 with a status quo projection, which lies within the interval of our forecast of 2.5 to 3.0 million people affected. Milan & Fetzer (2019) [17] project 2.6 to 3.3 million dementia patients for 2060 using the same model. The slight differences from their results are attributable to more recent population statistics and disease-specific input data. A comparison of our results with the three studies focusing on cancer is difficult, as two of them consider the disease pattern of lung cancer and take a short-term perspective (up to the year 2020), whereas the third focuses on a trend projection of incidence rates (see Fig. 2, Fig. 3 and Table 6 in the appendix for more detailed information and results on the systematic database search).


A projection of ten common non-infectious diseases in concurrent scenarios, based on a rich and consistent data set, expands the literature on the development of future disease burden in light of the demographic transition. In this context, ours is one of the few studies using an illness-death approach with recovery and modelling compression of morbidity and prevention scenarios. Furthermore, due to its time-discrete specification, our model can be directly linked to any (official) population projection and can therefore be adopted by institutions in the field of policy consulting.

In contrast to a naïve extrapolation (status quo principle), our analysis highlights the importance of focusing on the interdependence between demographic and disease-specific components in projecting future disease burden. Based on six different scenarios we show the possible future range of disease burden and reveal the large differences between the various diseases in interaction with the demographic components. Considering these differences, it becomes clear that the extrapolation of prevalence rates can only reflect the cohort effect caused by population structure and not epidemiologically induced changes in the burden of disease, as observed e.g. for dorsal pain. In contrast, for CHD the status quo projection ranges, as expected, between the compression and expansion scenarios due to minor epidemiological influences.
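To make the status quo principle concrete, the following minimal Python sketch applies constant age-specific prevalence rates to a changing population structure. It is an illustration under stated assumptions only: the age groups, rates and population figures are invented and are not taken from the study's data, and `status_quo_patients` is a hypothetical helper name.

```python
def status_quo_patients(prevalence_by_age, population_by_age):
    """Number of patients implied by fixed age-specific prevalence rates."""
    return sum(prevalence_by_age[age] * population_by_age[age]
               for age in prevalence_by_age)


# Hypothetical inputs for illustration only (not study data).
prevalence = {"18-64": 0.02, "65+": 0.10}
population_base = {"18-64": 50_000, "65+": 20_000}
population_future = {"18-64": 40_000, "65+": 25_000}

patients_base = status_quo_patients(prevalence, population_base)
patients_future = status_quo_patients(prevalence, population_future)
```

In this sketch the projected number of patients changes only because the age structure changes. Incidence, recovery and patient mortality do not enter the calculation, which is why such an extrapolation cannot capture epidemiologically induced changes.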

With regard to the likelihood of the different hypotheses on future disease burden, the evidence remains inconclusive. Chatterji et al. (2015) [31] show in their detailed review of studies from across the world how much the findings on observed compression or expansion in recent years vary. However, studies looking only at the prevalence of chronic diseases (and not, e.g., at quality of life) more frequently found an expansion. Considering diseases very similar to those in our study in connection with proximity to death, Beltrán-Sánchez et al. (2016) [32] show for the United States that those who died in recent times had a higher prevalence of chronic diseases in periods far from death, especially of those chronic diseases with low mortality and high frequency.

Interestingly, even among international studies there are only a few projections for the two most common diseases, dorsal pain and arthrosis (1/160 dorsal pain, 10/160 arthrosis or joint replacement procedures), although these diseases are expected to increase the most in total numbers of patients according to our calculations. Our results can be compared with those of Kingston et al. (2018) [33], who use a population sample to model multimorbidity and the prevalence of similar diseases for people over 65 in England until 2035. In line with our findings, they predict a significant increase for all diseases considered except depression, with the largest increases for cancer, diabetes and respiratory diseases. Likewise, the only study that also compares different compression scenarios, albeit with regard to disability due to similar diseases in the UK, by Jagger et al. (2006) [9], concludes that improvements in population health cannot fully compensate for the effect of population ageing and that there will still be an increase in the number of older people with disabilities.

Of course, our results are also subject to limitations. The Markov assumption of the illness-death model implies that the transition probabilities depend only on the current state and are not influenced by past events. Addressing this caveat would require complex long-term studies, e.g. on the probability of relapse after a successful recovery, which are not available for such a large number of insureds. However, regarding the fit with observed incidence or prevalence rates, multistate models used in a retrospective analysis of epidemiological study data (in contrast to regression models) score well [34, 35].
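The Markov assumption can be illustrated with a minimal Python sketch of a one-year transition for a single cohort. This is not the exact specification of our model: the function name `step`, the ordering of events within the year and the example values are assumptions made for illustration. The only modelling detail taken from the text is that recovered persons still face the survival rate of the diseased in the year of recovery.

```python
def step(healthy, diseased, incidence, recovery, mort_healthy, mort_diseased):
    """Advance one cohort by one year; all rates are annual probabilities.

    The next state depends only on the current counts (Markov property),
    not on how long anyone has already been healthy or diseased.
    """
    surviving_healthy = healthy * (1.0 - mort_healthy)
    # Recovered persons are still subject to the mortality of the diseased
    # in the year of recovery.
    surviving_diseased = diseased * (1.0 - mort_diseased)
    new_cases = surviving_healthy * incidence      # assumed: onset after survival
    recovered = surviving_diseased * recovery
    next_healthy = surviving_healthy - new_cases + recovered
    next_diseased = surviving_diseased + new_cases - recovered
    return next_healthy, next_diseased


# Hypothetical example values (not study data).
next_h, next_d = step(1000.0, 100.0, incidence=0.01, recovery=0.10,
                      mort_healthy=0.02, mort_diseased=0.05)
```

Because `step` receives only the current numbers of healthy and diseased persons, a person who has recovered is treated exactly like someone who was never ill, which is the simplification that relapse studies would be needed to relax.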

Even if our discrete model has certain advantages, modelling in discrete time might overestimate epidemiological effects. By comparing the results of a discrete-time model with those of a continuous model, Brinks & Landwehr (2014) [36] show that a projection in discrete time can overestimate future prevalence. However, the authors also state that smaller projection intervals lead to smaller deviations. Our chosen one-year interval leads to about a 10% overestimation in their model.
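The direction of this effect can be reproduced in a deliberately simplified toy model, sketched below in Python. It is neither the model of Brinks & Landwehr nor ours: it assumes that healthy persons do not die, that rates are constant, and that new cases are exposed to mortality only from the step after onset. The function name and all parameter values are hypothetical.

```python
import math


def prevalence_after(years, steps_per_year, incidence_rate, mortality_rate):
    """Prevalence among survivors in a toy model: healthy -> diseased -> dead."""
    dt = 1.0 / steps_per_year
    p_onset = 1.0 - math.exp(-incidence_rate * dt)  # onset probability per step
    p_death = 1.0 - math.exp(-mortality_rate * dt)  # death probability per step (diseased)
    healthy, diseased = 1.0, 0.0
    for _ in range(years * steps_per_year):
        new_cases = healthy * p_onset
        # New cases are added after mortality is applied, so they are not
        # exposed to death within their step of onset (source of the bias).
        diseased = diseased * (1.0 - p_death) + new_cases
        healthy -= new_cases
    return diseased / (healthy + diseased)


# Hypothetical rates; three nested step lengths (yearly, monthly, finer).
yearly = prevalence_after(30, 1, 0.02, 0.10)
monthly = prevalence_after(30, 12, 0.02, 0.10)
fine = prevalence_after(30, 120, 0.02, 0.10)
```

In this toy setting the yearly step yields the highest prevalence and the finest step the lowest, which matches the direction of the effect described above. The magnitudes are not comparable to the roughly 10% reported for their model.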

Nonetheless, this overestimation effect might be somewhat offset by the conservative estimates generated by using insurance data, which constitutes another limitation of our approach. Insurance or routine data is primarily collected for invoicing medical services when patients visit a physician. Thus, the resulting prevalence and incidence rates can only be interpreted as treatment rates and are usually slightly lower than those obtained by surveys. In conjunction with the required validation procedures, the actual population incidence could be underestimated. Due to the incomplete coding observed for some diseases, it is also questionable whether the documented onset of illness corresponds to the real date of incidence.

A third limitation could be our data set: The rates determined from the AOK Baden-Württemberg might differ from the rates of the total German population. However, regarding gender-specific differences or frequencies in older cohorts that are particularly relevant for this analysis, various studies indicate that large AOK data sets are representative [37,38,39].

Further insights could be obtained by including multi-morbidity in our model (see footnote 15). Comorbidity analyses could also provide more detailed insights into causes of mortality differences, which would help to limit the range of possible future scenarios. Despite the limitations mentioned, our results can offer an important guide to rational decisions in health care, especially due to the timeliness and level of detail of the data used. Although the strongly age-related diseases such as dementia or heart failure show the highest relative increase rates, the enormous prevalence of musculoskeletal diseases and depression should not be ignored. Most importantly, for almost all considered diseases a significant increase in burden of disease can be expected even in case of a compression of morbidity.


We think that our approach is useful for consulting health care professionals and politicians in preparing for the upcoming pressure on health care capacities. As the current COVID-19 crisis is showing, health care capacities are quite scarce. Even in our most optimistic scenario we would have the same pressure – at least in numbers – from chronic diseases as currently experienced during the pandemic. The lesson from our analysis is clear: A massive case-load is emerging on the German health care system, which can only be alleviated by more effective prevention. Immediate action by policy makers and health care managers is needed, as otherwise the prevalence of widespread diseases will become unsustainable from a capacity point-of-view.

Availability of data and materials

Population data is available at Destatis, Germany’s official statistical office. The aggregate claim data from the German sickness fund is available upon request, depending on the permission of the data donor.


  1. See for example the Ageing Report published by the European Commission [8].

  2. We chose the year 2060 as the end point of the projection as the official population projection of the German Federal Statistical Office also ends in 2060.

  3. This mortality difference can be interpreted as the difference between the mortality rates of the diseased persons \( mr_{a,g}^{D} \) and the population \( mr_{a,g} \) or as the (reverse) difference between the corresponding survival rates \( sr_{a,g} \) and \( sr_{a,g}^{D} \).

  4. In the respective year under consideration, we still assume the survival rate of the diseased for the recovered persons before they are transferred to the healthy population in the following year.

  5. An exception are age cohorts between 95 and 100 years, whose disease rates were determined in groups because of relatively few data points.

  6. The disease-specific input data is determined in the pseudonymised database environment of the AOK Baden-Württemberg via SQL scripts, resulting in only anonymised rates being used for the model calculations. Further calculations are executed using Microsoft Excel.

  7. In Germany, this methodology is also used for allocating insureds to risk groups as part of the morbidity-based risk-adjustment scheme in the Statutory Health Insurance [23].

  8. Insureds with single diagnoses F34.1 or F38.1 (short depressive episodes) or isolated outpatient diagnosis in the previous year are not excluded from incidence calculation in order to identify new cases with a documented beginning depressive episode in the pre-observation year.

    Table 2 Diseases and selection criteria
  9. However, for dementia we will assume emerging recovery rates in the scenario Extended Recovery for reason of comparability to the other diseases.

  10. According to other studies, no mortality difference was found for arthrosis and dorsal pain [25, 26].

  11. The group of 0–17-year-olds is left out because the considered diseases are very rare in these cohorts.

  12. Age a is limited to between 0 and 100 years and, with regard to gender, a distinction is made between male and female cohorts. We model our own population projection as Destatis does not publish a scenario without a future shift in migration. For this purpose, we use the data of to model the survival rate for persons older than 100 years and calibrate the data on the life tables published by Destatis for the L2 scenario. In a last step we aggregate the numbers for all persons older than 100 years, as our disease-specific input data has only few data points for cohorts of age 100 and older.

  13. In line with the W2 scenario published by Destatis, we assume an average positive net migration of 220,000 persons and consider their composition of age groups published by Destatis.

  14. The calculations of the studies mentioned must be compared with the results of the scenario Expansion 1 with consideration of migration, because in all studies the disease rates are also transferred to migrants.

  15. For example, Kingston et al. (2018) [33] use a dynamic microsimulation model to project not only prevalence but also the number of diseases per patient and predicted an increase in complex multimorbidity with more than four diseases over the next 20 years.
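
The equivalence of the two readings in footnote 3 can be written out in one step. The derivation below assumes that survival and mortality rates are complements over the same one-year interval, which is an assumption made here for illustration:

```latex
\begin{align}
sr_{a,g} &= 1 - mr_{a,g}, \qquad sr_{a,g}^{D} = 1 - mr_{a,g}^{D} \\
mr_{a,g}^{D} - mr_{a,g} &= \left(1 - sr_{a,g}^{D}\right) - \left(1 - sr_{a,g}\right) = sr_{a,g} - sr_{a,g}^{D}
\end{align}
```

Under this assumption, an excess mortality of the diseased is numerically identical to their shortfall in survival relative to the population.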



Pulmonary, bronchial and tracheal cancer


Coronary heart disease


Chronic obstructive pulmonary disease


Cerebrovascular diseases


Heart failure


Status quo


  1. Gruenberg EM. The failures of success. The Milbank Memorial Fund quarterly. Health and Society. 1977;55:3–24.

  2. Verbrugge LM. Longer life but worsening health? Trends in health and mortality of middle-aged and older persons. The Milbank Memorial Fund quarterly. Health and Society. 1984;62:475–591.

  3. Fries JF. Aging, natural death, and the compression of morbidity. N Engl J Med. 1980;303:130–5.

  4. Fuchs VR. "though much is taken": reflections on aging, health, and medical care. The Milbank Memorial Fund quarterly. Health and Society. 1984;62:143–66.

  5. Seshamani M, Gray AM. A longitudinal study of the effects of age and time to death on hospital costs. J Health Econ. 2004;23:217–35.

  6. Zweifel P, Felder S, Meiers M. Ageing of population and health care expenditure: a red herring? Health Econ. 1999;8:485–96.

  7. Breyer F, Lorenz N. The "Red Herring" after 20 Years: Ageing and Health Care Expenditures. Munich: CESifo Group; 2019.

  8. European Commission. The 2018 ageing report: economic & budgetary projections for the 28 EU member states (2016–2070). Luxembourg: Publications Office; 2018.

  9. Jagger C, Matthews R, Spiers N, Brayne C, Comas-Herrera A, Robinson T. Compression or expansion of disability?: forecasting future disability levels under changing patterns of diseases 2006. London.

  10. Federal Statistical Office. Federal Health Reporting. 2019. Accessed 15 Mar 2019.

  11. Whelpton PK. An empirical model of calculating future population. J Am Stat Assoc. 1936;31:457–73.

  12. Fix E, Neyman J. A simple stochastic model of recovery, relapse, death and loss of patients. Hum Biol. 1951:205–41.

  13. Manton KG, Liu K. Projecting chronic disease prevalence. Med Care. 1984;22:511–26.

  14. Brookmeyer R, Gray S, Kawas C. Projections of Alzheimer's disease in the United States and the public health impact of delaying disease onset. Am J Public Health. 1998;88:1337–42.

  15. Brinks R, Tamayo T, Kowall B, Rathmann W. Prevalence of type 2 diabetes in Germany in 2040: estimates from an epidemiological model. Eur J Epidemiol. 2012;27:791–7.

  16. Andersson T, Ahlbom A, Carlsson S. Diabetes prevalence in Sweden at present and projections for year 2050. PLoS One. 2015;10:e0143084.

  17. Milan V, Fetzer S. Die zukünftige Entwicklung von Demenzerkrankungen in Deutschland – ein Vergleich unterschiedlicher Prognosemodelle. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2019;62:993–1003.

  18. Destatis. Bevölkerung Deutschlands bis 2060: Ergebnisse der 14. koordinierten Bevölkerungsvorausberechnung. Wiesbaden; 2019.

  19. Günster C, Altenhofen L, editors. Versorgungs-Report 2011: Schwerpunkt: Chronische Erkrankungen. Stuttgart: Schattauer GmbH; 2011.

  20. Günster C, Klose J, Schmacke N, editors. Versorgungs-Report 2012. Stuttgart: Schattauer GmbH; 2012.

  21. Klauber J, Günster C, Gerste B, Robra B-P, Schmacke N, editors. Versorgungs-Report 2013/2014: Schwerpunkt: Depression. Stuttgart: Schattauer GmbH; 2014.

  22. Günster C, Klauber J, Robra B-P, Schmacke N, Schmuker C, editors. Versorgungs-Report Früherkennung. Berlin: Medizinisch Wissenschaftliche Verlagsgesellschaft; 2019.

  23. Drösler S, Garbe E, Hasford J, Schubert I, Ulrich V, van de Ven W, et al. Sondergutachten zu den Wirkungen des morbiditätsorientierten Risikostrukturausgleichs. Bonn; 2011.

  24. Swart E. Health care utilization research using secondary data. In: Janssen C, Swart E, von Lengerke T, editors. Health care utilization in Germany: theory, methodology, and results. New York: Springer; 2014. p. 63–86.

  25. Veronese N, Cereda E, Maggi S, Luchini C, Solmi M, Smith T, et al. Osteoarthritis and mortality: a prospective cohort study and systematic review with meta-analysis. Semin Arthritis Rheum. 2016;46:160–7.

  26. Kuperman EF, Schweizer M, Joy P, Gu X, Fang MM. The effects of advanced age on primary total knee arthroplasty: a meta-analysis and systematic review. BMC Geriatr. 2016.

  27. Schouler-Ocak M, Aichberger MC. Versorgung von Migranten. Psychother Psychosom Med Psychol. 2015;65:476–85; quiz 485.

  28. Robert Koch-Institut. Migration und Gesundheit: Schwerpunktbericht der Gesundheitsberichterstattung des Bundes. Berlin; 2008.

  29. Tönnies T, Röckl S, Hoyer A, Heidemann C, Baumert J, Du Y, et al. Projected number of people with diagnosed type 2 diabetes in Germany in 2040. Diabet Med. 2019.

  30. Alzheimer Europe. Dementia in Europe Yearbook 2019: Estimating the prevalence of dementia in Europe; 2020.

  31. Chatterji S, Byles J, Cutler D, Seeman T, Verdes E. Health, functioning, and disability in older adults—present status and future implications. Lancet. 2015;385:563–75.

  32. Beltrán-Sánchez H, Jiménez MP, Subramanian SV. Assessing morbidity compression in two cohorts from the health and retirement study. J Epidemiol Community Health. 2016;70:1011–6.

  33. Kingston A, Robinson L, Booth H, Knapp M, Jagger C. Projections of multi-morbidity in the older population in England to 2035: estimates from the population ageing and care simulation (PACSim) model. Age Ageing. 2018;47:374–80.

  34. Barendregt JJ, Ott A. Consistency of epidemiologic estimates. Eur J Epidemiol. 2005;20:827–32.

  35. Binder N, Balmford J, Schumacher M. A multi-state model based reanalysis of the Framingham heart study: is dementia incidence really declining? Eur J Epidemiol. 2019;34:1075–83.

  36. Brinks R, Landwehr S. Age- and time-dependent model of the prevalence of non-communicable diseases and application to dementia in Germany. Theor Popul Biol. 2014;92:62–8.

  37. Geyer S, Kowalski C. GKV-Routinedaten in der onkologischen Versorgungsforschung. Onkologie heute. 2018;2018:X70–2.

  38. Jaunzeme J, Eberhard S, Geyer S. Wie "repräsentativ" sind GKV-Daten? Demografische und soziale Unterschiede und Ähnlichkeiten zwischen einer GKV-Versichertenpopulation, der Bevölkerung Niedersachsens sowie der Bundesrepublik am Beispiel der AOK Niedersachsen. Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2013;56:447–54.

  39. Hartmann J, Weidmann C, Biehle R. Validierung von GKV-Routinedaten am Beispiel von geschlechtsspezifischen Diagnosen. Gesundheitswesen. 2016;78:e53–8.


We thank Jana Wolf (Hochschule Aalen) for the detailed language review and the two anonymous reviewers for their valuable comments to improve the quality of our manuscript.


This research did not receive any grant or external funding. Open Access funding enabled and organized by Projekt DEAL.

Author information


VM was responsible for selecting the disease-specific input data and calculating the scenarios; SF and CH supervised the calculations. Furthermore, VM conducted the literature survey according to the criteria chosen by all three authors. SF provided the population projection data. All authors were involved in developing the model and drafting the manuscript. The final version was read and approved by all authors.

Corresponding author

Correspondence to Valeska Milan.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

Valeska Milan is an employee of the AOK Baden-Württemberg, the donor of the data set. The theses and opinions shared do not represent those of the AOK Baden-Württemberg, but solely those of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



Table 6 Bibliography and study characteristics

Fig. 2 Search filter PubMed

Fig. 3 Flow chart of the literature selection process

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Milan, V., Fetzer, S. & Hagist, C. Healing, surviving, or dying? – projecting the German future disease burden using a Markov illness-death model. BMC Public Health 21, 123 (2021).
