Healing, surviving, or dying? – projecting the German future disease burden using a Markov illness-death model

Background In view of the upcoming demographic transition, there is still no clear evidence on how increasing life expectancy will affect future disease burden, especially regarding specific diseases. In our study, we project the future development of Germany’s ten most common non-infectious diseases (arthrosis, coronary heart disease, pulmonary, bronchial and tracheal cancer, chronic obstructive pulmonary disease, cerebrovascular diseases, dementia, depression, diabetes, dorsal pain and heart failure) in a Markov illness-death model with recovery until 2060. Methods The disease-specific input data stem from a consistent data set of a major sickness fund covering about four million people, the demographic components from official population statistics. Using six different scenarios concerning an expansion and a compression of morbidity as well as increasing recovery and effective prevention, we can show the possible future range of disease burden and, by disentangling the effects, reveal the significant differences between the various diseases in interaction with the demographic components. Results Our results indicate that, although strongly age-related diseases like dementia or heart failure show the highest relative increase rates, diseases of the musculoskeletal system, such as dorsal pain and arthrosis, still will be responsible for the majority of the German population’s future disease burden in 2060, with about 25–27 and 13–15 million patients, respectively. Most importantly, for almost all considered diseases a significant increase in burden of disease can be expected even in case of a compression of morbidity. Conclusion A massive case-load is emerging on the German health care system, which can only be alleviated by more effective prevention. Immediate action by policy makers and health care managers is needed, as otherwise the prevalence of widespread diseases will become unsustainable from a capacity point-of-view.


Background
The development of future patient numbers is an important concern for many stakeholders in the health systems. Rational decisions about the planning of hospital capacities, pharmaceutical investments, career choices of (future) healthcare professionals as well as the development of future health care expenditures itself depend on the precise knowledge of the future development of specific diseases.
Germany is one of the fastest ageing countries in the world due to constantly low fertility rates since the 1970s and a continuously increasing life expectancy. In the literature there are different rival theories and hypotheses how an increasing life expectancy will particularly affect the disease burden and the related health care expenditure. Gruenberg (1977) [1] and Verbrugge (1984) [2] hypothesise that a rising longevity goes hand in hand with an increase in years spent in illness and therefore with an expansion of morbidity in older age groups. In contrast, Fries (1980) [3] assumes that an increasing life expectancy leads to a compression of morbidity. Given these somehow contradictory hypotheses, the influence of proximity to death and treatment spending as a function of remaining life expectancy are controversially discussed among health economists [4][5][6][7].
However, even less evidence exists today concerning the (more epidemiological) question of specific diseases' future development in the light of the different hypotheses. A systematic literature review on PubMed searching for projections (or synonyms) in context of demography and using the keywords prevalence, incidence or burden of disease for specific or chronic noninfectious diseases in general shows 160 relevant publications. There are three categories of studies by their projection methodology: trend extrapolations (99/160), multistate models (57/160) and studies using both methodologies (4/160). In 54 of the studies using trend extrapolation (103/160) indeed current prevalence or incidence rates are transferred to population projections, which excludes a specific modelling of the various theses. This so-called status quo analysis is also commonly used in projections of health expenditures 1 . Out of the 61 studies using multistate modelling (61/160), 17 (17/ 61) are based on the classical structure of an illnessdeath model (even if only 7 explicitly define it that way). However, only nine of the studies (9/61) focus on an explicit modelling of a compression of morbidity, of them eight (8/9) related to dementia. Furthermore, just seven studies (7/61) compare the development of more than two different diseases, only one of them modelling compression scenarios [9] (see the appendix for more detailed information and results on the systematic database search).
In our paper, we present projections for ten common non-infectious diseases (arthrosis, coronary heart disease, pulmonary, bronchial and tracheal cancer, chronic obstructive pulmonary disease, cerebrovascular diseases, dementia, depression, diabetes, dorsal pain and heart failure). The selected diseases represent the intersection between the most common and most expensive disease patterns in Germany [10]. For the projections we use a time-discrete Markov illness-death model with recovery. Our model allows us to regard the different hypotheses in context of demographic transition and to quantify the influence of potentially changing variables (disease-specific survival, incidence and recovery rate) on the future frequency of diseases. In addition, we show the influence of successful prevention on long-term prevalence of the different diseases.
The population-related components used for modelling stem from Destatis, the German Federal Statistical Office, whereas the disease-specific components are computed on the data of a major sickness fund covering approximately four million insureds during the period from 2009 to 2017. Our data set is unique as we calculated the input data ourselves using disease-specific validation criteria selected for this purpose (shown in section Dataset). Hence, our study is one of the few that use insurance data (7/ 160), although the resulting treatment prevalence is of particular importance for decision makers and payers in the health care system. Data sources from other studies of the systematic literature review are surveys or other epidemiological studies (61/160), a literature review for the different input factors (34/160), registries (28/160) or mixed data sources (30/160).
The paper is organised as follows: we start with the presentation of our time-discrete Markov illness-death model with recovery as well as our data set. Then, we show our results for the future development of the ten diseases (average prevalence rates and number of patients) in different populations and scenarios, also considering the results of other publications. This is followed by a discussion of the results in view of the current state of research and the limitations, finishing with a concluding summary.

Markov illness-death model with recovery
We will calculate the future number of patients and the future average prevalence rates for the total population from 2018 to 2060 2 using a time-discrete Markov illness-death model with recovery. The model is based on the cohort-component-method [11], which is widely used for (official) population projections. Regarding epidemiologic modelling, it can be attributed to the work of Fix & Neyman (1951) [12] and is closely related to those of Manton et al. (1984), Brookmeyer et al. (1998), Brinks et al. (2012), and Andersson et al. (2015) [13][14][15][16], but differs inthe detail level of the rich routine data set used. The specific cohort data by age and gender with corresponding detail diagnosis allows us to vary different variables over time (future development of the diseasespecific survival rate, incidence rate and recovery rate). In contrast to most other studies using an illness-death approach (16/17) including the work of Milan & Fetzer (2019) [17], on which our modelling is based, the model also includes the possibility of recovery.
The starting point of our model is the number of patients P a,g (differentiated by age a between 0 and 100 and gender g which is men or women) in our starting year T. It results from the prevalence rate p a,g,T multiplied by the cohort size K a,g,T . P a;g;T ¼ K a;g;T p a;g;T ð1Þ In models extrapolating current prevalence rates (status quo analysis) p a,g,T is assumed to be constant over time and only the future cohort sizes determine the future development of patients. In contrast to this, for all following years, age-and gender-specific incidence and recovery rates as well as the mortality rates of patients are used in our model to calculate the (future) number of patients P a,g,T + t . At this point we distinguish between the group of patients which are comprised of the surviving patients of the previous year D T þt − 1 a;g;T þt and the group of newly diseased patients I a,g,T + t .
In order to calculate the surviving patients of the previous year D T þt − 1 a;g;T þt we use the disease-specific mortality difference md a − 1,g,T + t − 1 which is subtracted from the survival rate of each cohort sr a − 1,g,T + t − 1 3 . Also we consider disease-specific recovery rates r a − 1,g,T + t − 1 as follows 4 : To determine the number of new patients I a, g, T + t , the number of surviving non-diseased from the previous year is calculated as follows in a first step: In a second step the number of new patients I a,g,T + t , which results from the age-and gender-specific incidence rate i a,g,T + t , is multiplied with the surviving nondiseased from the previous year: The total number of patients P T + t in all years T + t is finally calculated as: In our model for all years T + t the future cohort sizes, K a,g,T + t as well as the future survival rates sr a,g,T + t of the total population are derived from a population projection, which we calculate via the cohort component method. Within this framework we consider the diseasespecific components. The calculation of the survival rate of the patients as the difference sr a − 1,g,T + t − 1 − md a − 1,g,T + t − 1 and the surviving non-diseased ND T þt − 1 a;g;T þt as the difference between all survivors of the cohort and the surviving patients from the previous period finally merge the population projection with the epidemiological developments. Thus, the design of our model also allows the use of input data from any other population projection or/and disease-specific statistic. This timediscrete approach is also more intuitive to understand for a broader audience, such as policy setters and health care decision makers.
Dividing the total number of patients by the total number of the population results in the average prevalence rate of the total population, apr, which we will present in addition to the total number of patients in the result section. Obviously, the apr highly depends on the share of the elderly and diseased within the total population. As the German demographic transition leads to an increasing proportion of elderly cohorts, we call this effect cohort effect, which can also be observed in models extrapolating current prevalence rates using the status quo analysis.
As for the further effects of our model, we will take a closer look at the future age-and gender-related prevalence rate p T + 1 , which can be obtained by dividing the number of patients (eqs. 2 to 5) by the total corresponding cohort K a,g,T + t = K a − 1,g,T + t − 1 sr a − 1,g,T + t − 1 and therefore is independent of future cohort sizes: For reasons of simplicity we use time-independent incidence, recovery and mortality rates and abstract from the indices of age and gender in eq. (7). The total derivate can be used to determine the impact of changing incidence, recovery and mortality rates on the prevalence in year T + 1.
In our model specification, the variables p T , sr, md, r and i can take on values between 0 and 1 and the disease-specific mortality difference md is less (or in theory equal) than the survival rate of the entire population sr. As eq. (8) shows, a higher prevalence rate p in year T leads to a higher prevalence rate in year T + 1. The theoretical one-to-one impact of this effect is lowered by the degree of the incidence and recovery rate as well as the disease-specific mortality difference.
An increase of the survival rate sr initially leads to an increase in both, the diseased and the non-diseased population. In conjunction with the incidence rate i, a positive impact on the prevalence rate in year T + 1 can be observed as the rising survival rate leads to a higher "at risk" population. In contrast to this, a higher mortality difference md leads to a decline in the prevalence rate in year T + 1. Both effects combined can be interpreted as follows: The smaller the difference in mortality between the diseased and non-diseased, the higher the positive impact of an increasing survival rate.
The influence of the recovery rate is negative and linked to the life expectancy of the patients. The more patients survive until the following year, the more can recover again. However, the higher the incidence rate and thus the proportion of new patients, the lower the proportion of persons who could potentially recover, which mitigates the negative effect of the recovery rate.
Considering the impact of increasing incidence rates also offers a connection between the incidence and the recovery rate. A higher proportion of recovered people leads to a higher "at-risk" population. The opposite effect results from a higher prevalence rate in year T which comes along with a lower "atrisk" population.

Scenarios
Regarding the effects outlined above, a change of one variable will always affect the future prevalence in interaction with the other components. To illustrate these effects and the sensitivity of the model, we model six scenarios of changing disease-specific variables md a,g , i a,g and r a,g for each of the ten diseases up to 2060, especially regarding the different hypotheses of expansion and compression of morbidity (see Table 1). In all scenarios we assume increasing survival rates sr a,g according to the moderately increasing life expectancy scenario L2 [18]).
In the first scenario, we hold all disease-specific variables constant over the time horizon. However, the assumption of a constant mortality difference and rising survival rates (sr a,g,T + t > sr a,g,T + t − 1 ) leads to an increase in life expectancy of both the non-diseased and the diseased. In conjunction with constant incidence rates (i ag = const), this results in an increasing duration of disease. Thus, the scenario Expansion 1 can be interpreted as a type of expansion of morbidity hypothesis. This scenario serves as our baseline scenario in the following. The scenario Expansion 2 is a more extreme scenario of the expansion of morbidity hypothesis, assuming an additional 30% increase in incidence rates until 2060 (i a,g,T + t > i a,g,T + t − 1 ).
The compression of morbidity hypothesis is considered in two different scenarios: In the scenario Compression 1 only the healthy population benefits from the increasing life expectancy ( sr D a;g ¼ const ) which leads to a continuous increase in the mortality difference between the diseased and the healthy population. In the scenario Compression 2 a shift of diseased cases in relation to increasing life expectancy is modelled which is in line with the "traditional" compression of morbidity hypothesis and leads to continuously decreasing incidence rates (i a,g,T + t < i a,g,T + t − 1 ).
To highlight the long-term impact of effective prevention programmes, a scenario Prevention is modelled with temporarily decreasing incidence rates (i a,g,T + t < i a,g,T + t − 1 ) up to 30% until 2035. In order to simulate possible effects of better medical care, e.g. due to disease management programmes, the scenario Extended Recovery assumes increasing recovery rates up to 50% until the year 2060 (r a,g,T + t > r a,g,T + t − 1 ).
Interestingly (and as discussed in the section on the total differential of the prevalence rate), the total effect of the scenarios Compression 1 and 2 as well as of the scenarios Extended Recovery and Prevention on the future (age-and gender-related) prevalence rate is not defined a priori and depends on the numerical ratio of disease-related input data and the increase of survival rates.

Dataset
The average disease-specific input data for each cohort and gender 5 derives from a routine dataset of around four million insureds of the AOK Baden-Württemberg from 2009 to 2017 6 . Due to this large number of people insured by the AOK in Baden-Württemberg, this population is approximately representative of the German population regarding the disease-rates within the age cohorts. Table 2 shows the specific selection criteria for each of the ten diseases. Since there are no coding guidelines for outpatient diagnoses in Germany, we use the criteria of the AOK Research Institute published in various reports [19][20][21][22]. The M2Q 7 /M3Q criterion, for instance, only defines patients as diseased if they have a confirmed diagnosis in at least two and three out of four quarters of the year, respectively. Inpatient primary and secondary diagnosis are included without additional validation criteria. We complete missing data by the following procedure: If the selection criteria are satisfied the year before and the year after, insureds are classified as patients also in the incompletely coded year. Patients are classified as "new patients" when they fail to fulfil the prevalence criteria in any of the four previous years. The days of insurance of the patients identified by diagnosis are then set in relation to those of all insureds to calculate period prevalence p a,g and cumulative incidence i a,g for the years 2015 to 2017 [24]. For pulmonary cancer we use a five-year pre-observation period for the derivation of the incidence. To take into account the periodic character of depression, we use additional selection criteria for new cases and divergent diagnoses to determine prevalence and incidence. 8 For the calculation of recovery rates r a,g all surviving patients without a coded diagnosis in the Increasing sr a,g according to L2 scenario Increasing md a,g corresponding to increasing sr Effect -+ Increasing sr a,g according to L2 scenario Shift of i a,g corresponding to increasing sr a,g resulting in a continuous decrease of i Increasing sr a,g according to L2 scenario Linearly decreasing i a,g of 30% until 2035 Effect Increasing sr a,g according to L2 scenario Linearly increasing r a,g of 50% until 2060 Effect + -Source: Own depiction following years are set in relation to the total of all surviving patients. For the definition of recovery we use a four-year follow-up period for diseases with realistic cure probabilities (dorsal pain, depression and CVD) and a five-year follow-up period for pulmonary cancer. The maximum follow-up period of 8 years is used for all other diseases since there are still no cure possibilities available for their most common manifestations. Since dementia is (as of yet) characterized by an irreversible disease progression, no recovery rates are considered in these calculations 9 . For chronic diseases, the recovery rates are to be interpreted as being symptom-free. A recurrence of the disease after years of asymptomatic illness is taken into account by the incidence rate. For each cohort, we calculate mortality differences md a,g as the difference between the 1-year survival rates of the diseased and all insureds in a given year and subtract them from the German population's survival probability sr a,g as described above 10 . Table 3 shows the population weighted determined input data as the average value for different age groups and overall average in the base year 2018 for each disease, in parentheses differentiated by gender (female vs male). In addition, Table 4 illustrates the demographic characteristics of the study population as average values of all years analyzed in millions and as percentage compared to those of the entire German population in 2018. 11 Inorder to derive the (future) cohort sizes K a,g and survival rates sr a,g , we build different population projections based on input data from Destatis and statistics of mortality.org. As our starting point serves a Stationary Population with constant absolute births and constant life expectancy to separate the effects resulting from disease-specific (epidemiological) components from the effects of the composition of future cohort sizes on the apr. In our second population projection Population (LE constant) we abstract from a further increase in life expectancy. This projection is based on the German population in 2018 under the assumption of a fertility rate of 1.55 children per woman of fertile age. For our third population projection, Standard Population (LE increasing), we further assume an increase of life expectancy from 83.3 to 88.1 years at birth for women and 78.5 to 84.4 at birth for men according to the moderate increase scenario L2 of the 14th population projection [18]. Migration movement is not taken into account, as too little is known about whether disease rates of the German population are transferrable to migrants [27,28]. Hence, the Standard Population (LE increasing) 9 However, for dementia we will assume emerging recovery rates in the scenario Extended Recovery for reason of comparability to the other diseases. 10 According to other studies, no mortality difference was found for arthrosis and dorsal pain [25,26]. 11 The group of 0-17-year-olds is left out because the considered diseases are very rare in these cohorts.      r (% of patients)  represents an absolute decline in population from 83.0 to 66.2 million by 2060, accompanied by an increasing old-age dependency ratio from 35.9 to 69.7%. 12 However, for reason of comparability to other studies, we build a fourth population projection, Population (Migration), where future migration is integrated according to the scenario W2 of the 14th population projection [18]. 13 In this case the total population is 79.1 million people in 2060 and the old-age dependency is 58.8%.

Results
The presentation of our results starts in Table 4  The results show a high increase in the apr for strongly age-related diseases like dementia, heart failure or CVD, with the ageing of the German population due to its current structure (Population (LE constant)) and rising life expectancy being the key factors driving the large growth rates. The ratio of people with dementia could more than double by 2060 within the Standard Population (LE increasing). In contrast, the increase of the apr of dorsal pain is mainly driven by the epidemiological effect. Regarding arthrosis and COPD, the increase of apr can be attributed to both, the epidemiological as well as the demographic effects. The smallest increase of apr emerges for diabetes and depression. For both, the epidemiological effect is comparatively low. However, an increase in the average prevalence rate is to be expected for all diseases given the baseline scenario Expansion 1. Even when abstracting from an increasing life expectancy, the ageing of the German population in conjunction with the epidemiological effects will lead to a substantial increase of all diseases. Figure 1 presents the results for the apr in the year 2060 that occur under the different model scenarios (see Table 2) as well as under a simple extrapolation of age- Source: Own Data and depiction in combination with data of Destatis and mortality.org, in parentheses differentiated by gender (female/male) Abbreviations: p prevalence rate, i incidence rate, r recovery rate, md mortality difference, CA pulmonary, bronchial and tracheal cancer, CHD coronary heart disease, COPD chronic obstructive pulmonary disease, CVD cerebrovascular diseases, HF heart failure 12 Age a is limited between 0 and 100 years and with regard to gender, a distinction is made between male and female cohorts.We model our own population projection as Destatis does not publish a scenario without a future shift in migration. For this purpose, we use the data of mortality.org to model the survival rate for persons older than 100 years and calibrate the data on the life tables publishes by Destatis for the L2 scenario. In a last step we aggregate the numbers for all persons older than 100 years as our disease specific input data has only few data points for cohorts of age 100 and older. 13 In line with the W2 scenario published by Destatis, we assume an average positive net migration of 220,000 persons and consider their composition of age groups published by Destatis.
and gender-related prevalence rates for the population of 2060 (status quo (SQ) principle). For this purpose, we use the Standard Population (LE increasing). The y-axis of Fig. 1 shows the relative change of the apr between 2018 and 2060 whereas the x-axis displays the value of the apr for the different scenarios in 2060. Additionally, the x-axis depicts the numbers of apr in 2018. As a first result, Fig. 1 illustrates that the ranking of the ten diseases with respect to the value of the apr in 2060 is the same as in 2018, even though the relative change of the apr differs significantly between the ten diseases. That means that dorsal pain and arthrosis are expected to be the two major diagnoses in 2060, although e.g. dementia offers a significantly higher change in the apr in all scenarios.
Second, the results show a different impact of the rival hypotheses regarding the consequences of increasing life expectancy on future disease burden: The expansion of morbidity scenarios Expansion 1 and 2 lead to a soaring increase of all diseases compared to the other scenarios. Especially the scenario of Expansion 2 (with an assumed increase of the incidence rate by 30% until 2060) offers a strong increase of the apr. For strongly age-related diseases such as dementia, CVD or HF, the Compression 2 scenario (shifting the incidence to higher age groups) has a stronger impact on the apr than the Compression 1 scenario, in which the life expectancy for patients is constant over time and only the healthy population benefits from the increasing life expectancy. Yet even in the compression of morbidity scenarios, an increase in all the common diseases can be expected. In other words: The increase in burden of disease due to increasing life expectancy and high incidence rates in older age groups can be mitigated but not fully compensated by a compression.
The assumption of continuously rising recovery rates (scenario Extended Recovery) has an even smaller impact on future apr, although this is also attributable to the low chances of recovery for the considered diseases in general. Only for depression an increasing recovery rate would lead to a constant prevalence rate in the long term. A diminishing effect on future long-term prevalence for all diseases can only be seen in the scenario Prevention. For diabetes and depression, the Prevention scenario even leads to a small decline in the apr. This highlights the importance of effective prevention regarding the upcoming demographic transition.
At a first glance a (simple) extrapolation of current prevalence rates should range between the expansion and compression scenarios, our results offer that this is not true for all diseases. In particular, for dorsal pain, arthrosis, COPD, and cancer the status quo principle leads to an apr in 2060 which is smaller than the scenarios of Prevention. Hence, our results show a wide range future developments of the different diseases depending on the chosen parameters for modelling. Table 5 shows the absolute results of the projection for 2040 and 2060. As the Standard Population (LE increasing) neglects future migration, the total number of people in Germany will decline between 2040 and 2060. Thus, for the most scenarios and diseases the total numbers of patients are higher in 2040 than 2060. However, the results given the projection Population (Migration) in parentheses offer the opposite effect. Hereby we assume identical disease-related input data for migrants.
All in all, our calculations show that all of the ten diseases are expected to increase up until 2060: Diseases of the musculoskeletal system like dorsal pain and arthrosis  [29] calculate with the help of an illness-death model and under the assumption of constant incidence rates a higher number of 11.0 million patients for 2040. The discrepancy to our projection (10.3 million) for 2040 is probably due to their older input data, which stem from 2010. The most recent study on dementia by Alzheimer Europe (2020) [30] project 2.7 million patients for 2050 with a status quo projection which lies in the interval of our forecast with 2.5 to 3.0 million people affected. Milan & Fetzer (2019) [17] project 2.6 to 3.3 dementia patients for 2060 by using the same model. The slight differences to their results are attributable to more recent population statistics and disease-specific input data. A comparison of our results with the three studies focusing on cancer is difficult as two of them consider the disease pattern of lung cancer and take a short-term perspective (up to the year 2020), whereas the third focuses on a trend projection of incidence rates (see Fig. 2, Fig. 3 and Table 6 in the appendix for more detailed information and results on the systematic database search).

Discussion
A projection of ten common non-infectious diseases in concurrent scenarios based on a rich and consistent data set is expanding the literature on the developmentof future disease burden in light of the demographic transition. In this context, ours is one of the few studies using an illnessdeath approach with recovery and modelling compression of morbidity and prevention scenarios. Furthermore, due to its time-discrete specification, our model could be directly linked to any (official) population projection, and therefore adapted by institutions in the field of policy consulting.
In contrast to a naïve extrapolation (status quo principle), our analysis highlights the importance of focusing on the interdependence between demographic and disease-specific components in projecting future disease burden. Based on six different scenarios we show the possible future range of disease burden and reveal the large differences between the various diseases in interaction with the demographic components. Considering these differences, it becomes clear that the extrapolation of prevalence rates can only reflect the cohort effect caused by population structure and not epidemiologically induced changes in the burden of disease, as observed e.g. for dorsal pain. In contrast, for CHD the status quo projection ranges, as expected, between the compression and expansion scenarios due to minor epidemiological influences.
With regard to the probability of the different hypotheses on future disease burden, the study situation remains inconclusive. Chatterji et al. (2015) [31] show with their detailed review of studies across the world how much the results vary for observed compression or expansion in recent years.  [32] show for the United States that those who died in recent times had a higher prevalence of chronic diseases in periods far from death, especially of those chronic diseases with low mortality and high frequency. Interestingly, even in international studies there are only a few projections for the two major common diseases dorsal pain and arthrosis (1/160 dorsal pain, 10/ 160 arthrosis or joint replacement procedures), although these diseases are expected to increase the most in total numbers of patients according to our calculations. Our results can be compared with those of Kingston et al. (2018) [33], who use a population sample to model multimorbidity and prevalence of similar diseases for over 65-year-olds in England until 2035. In line with our findings, they predict a significant increase for all diseases considered except depression, but with the largest increases for cancer, diabetes and respiratory diseases. In line with our findings, the only study that also compares different compression scenarios, but with regard to disability due to similar diseases in the UK, by Jagger et al. (2006) [9], concludes that improvements in population health cannot fully compensate the effect of population ageing and that there will still be an increase in number of older people with disabilities.
Of course, our results are also subject to limitations. The Markov assumption of the illness-death model implies that the transition probabilities depend only on the current state and are not influenced by past events. But complex long-term studies, e.g. on the probability of redisease after a successful recovery, would be necessary to heal this caveat, which are not available for such a large number of insureds. However, regarding the fit with observed incidence or prevalence rates, multistate models used in a retrospective analysis of epidemiological study data (in contrast to regression models) score well [34,35].
Even if our discrete model has certain advantages, modelling in discrete time might be overestimating epidemiological effects. By comparing the results of a discrete-time model with those of a continuous model, Brinks & Landwehr (2014) [36] show that a projection in discrete time can overestimate future prevalence. However, the authors also state that smaller projection intervals lead to smaller deviations. Our chosen one-year interval leads to about a 10% overestimation in their model.
Nonetheless, this overestimation effect might be somehow offset by the conservative estimates generated by using insurance data, which constitutes another limitation of our measure. Insurance or routine data is primarily collected for invoicing medical services when patients visit a physician. Thus, the resulting prevalence and incidence rates can only be interpreted as treatment rates and are usually slightly lower than those obtained by surveys. In conjunction with the required validation procedures, the actual population incidence could be underestimated. Due to the incomplete coding observed for some diseases, it is also questionable whether the documented onset of illness corresponds to the real date of incidence.
A third limitation could be our data set: The rates determined from the AOK Baden-Württemberg might differ from the rates of the total German population. However, regarding gender-specific differences or frequencies in older cohorts that are particularly relevant for this analysis, various studies indicate that large AOK data sets are representative [37][38][39].
Further insights could be obtained by including multimorbidity in our model 15 . Comorbidity analyses could also provide more detailed insights into causes of mortality differences, which would help limiting the range of possible future scenarios. Despite the limitations mentioned, our results can offer an important guide to rational decisions in health care, especially due to the actuality and detail level of the data used. Although the strongly age-related diseases such as dementia or heart failure show the highest relative increase rates, the enormous prevalence of musculoskeletal diseases and depression should not be ignored. Most importantly, for almost all considered diseases a significant increase in burden of disease can be expected even in case of a compression of morbidity.

Conclusion
We think that our approach is useful for consulting health care professionals and politicians in preparing for the upcoming pressure on health care capacities. As the current COVID-19 crisis is showing, health care capacities are quite scarce. Even in our most optimistic scenario we would have the same pressureat least in numbersfrom chronic diseases as currently experienced during the pandemic. The lesson from our analysis is clear: A massive case-load is emerging on the German health care system, which can only be alleviated by more effective prevention. Immediate action by policy makers and health care managers is needed, as otherwise the prevalence of widespread diseases will become unsustainable from a capacity point-of-view.