A predictive model relating daily fluctuations in summer temperatures and mortality rates

Background In the context of climate change, an efficient alert system to prevent the risk associated with summer heat is necessary. The authors' objective was to describe the temperature-mortality relationship in France over a 29-year period and to define and validate a combination of temperature factors enabling optimum prediction of the daily fluctuations in summer mortality.
Methods The study addressed the daily mortality rates of subjects aged over 55 years, in France as a whole, from 1975 to 2003. The daily minimum and maximum temperatures consisted in the average values recorded by 97 meteorological stations. For each day, a cumulative variable for the maximum temperature over the preceding 10 days was defined. The mortality rate was modelled using a Poisson regression with over-dispersion and a first-order autoregressive structure and with control for long-term and within-summer seasonal trends. The lag effects of temperature were accounted for by including the preceding 5 days. A "backward" method was used to select the most significant climatic variables. The predictive performance of the model was assessed by comparing the observed and predicted daily mortality rates on a validation period (summer 2003), which was distinct from the calibration period (1975–2002) used to estimate the model.
Results The temperature indicators explained 76% of the total over-dispersion. The greater part of the daily fluctuations in mortality was explained by the interaction between minimum and maximum temperatures, for a day t and the day preceding it. The prediction of mortality during extreme events was greatly improved by including the cumulative variables for maximum temperature, in interaction with the maximum temperatures. The correlation between the observed and estimated mortality ratios was 0.88 in the final model.
Conclusion Although France is a large country with geographic heterogeneity in both mortality and temperatures, a strong correlation between the daily fluctuations in mortality and the temperatures in summer on a national scale was observed. The model provided a satisfactory quantitative prediction of the daily mortality both for the days with usual temperatures and for the days during intense heat episodes. The results may contribute to enhancing the alert system for intense heat waves.

The weather component has mainly been considered on the basis of the temperatures recorded on various lag days. Other climatic parameters, such as humidity, wind speed and pressure, have also been considered as independent variables [17,18,22,24,25] or by constructing combined indices [21] or synoptic patterns [26]. Air pollution has also been included in some studies focusing on specific urban areas [18,20,22,23,25].
Although all the studies have concluded that long and intense heat episodes are responsible for major excess mortality, quantitative indicators that take into account both the intensity and duration of heat episodes have seldom been proposed and formally validated [7,8,21,27,28].
In August 2003, Western Europe experienced a heat wave that was exceptional in terms of its duration, intensity and geographic extent [7,8,27,29,30]. Unlike prior, less marked heat waves, its health impact attracted considerable public interest and drew attention to the need for efficient alert systems. Moreover, the Intergovernmental Panel on Climate Change predicts an increase in extreme climatic events in the twenty-first century [31], and several scenario studies, e.g. Beniston's study [32], predict that heat waves like that in 2003 may occur every two or three years on average, by the third part of this century.
In that context, the objective of this paper is to describe and model the relationship between mortality and temperature in France over a 29-year period (from 1975 to 2003) and, more generally, to propose an approach for the selection of the most predictive combination of temperature factors with a view to predicting the risk of short-term mortality in summer (June to September).

Methods
The study analysed the relationship between daily fluctuations in mortality and temperature for the whole of France, over the 122 summer days, from 1st June to 30th September of each year, from 1975 to 2003, i.e. 3,538 summer days in all.

Mortality data
The mortality data were provided by the French National Institute for Medical Research (Inserm). The daily counts of all-cause mortality (O_t) for people aged 55 years and over were analysed. The use of these mortality data in the frame of epidemiological studies has been authorised by the French National Commission for Data Protection and the Liberties (CNIL).
The yearly population estimates were supplied by the French National Institute of Statistics and Economic Studies (INSEE). Mortality was expressed as the daily mortality rate per 100,000 subjects.
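The rate calculation described above can be sketched as follows. This is a minimal illustration, assuming the yearly population estimate is used unchanged as the denominator for every day of that year; the function name and the example figures are hypothetical and are not taken from the study data.

```python
def daily_mortality_rate(deaths: int, population: float, per: int = 100_000) -> float:
    """Return the deaths on one day expressed per `per` person-days."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * per / population


# Illustrative numbers only: 850 deaths in a population of 10 million.
example_rate = daily_mortality_rate(850, 10_000_000)
```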

Climatic data
The daily minimum and maximum temperatures (Tmin and Tmax) and minimum and maximum relative humidities (Hmin and Hmax) were recorded by 97 weather stations considered representative of the climate affecting the populations of the 96 French départements by the national meteorological service (Météo-France). The national daily values of those climatic indicators were the average of those 97 values, weighted by the populations of the départements.
A 10-day moving average of the mean temperature (average of the daily minimum and maximum temperatures) was also calculated.
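The two aggregation steps above (population-weighted national value, then 10-day moving average) might be computed as in the sketch below. It assumes that the moving average is a trailing window ending on the current day and that the first days of a series use the shorter window available; the text does not specify either point, and the function names are hypothetical.

```python
from typing import List, Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Population-weighted average of the station readings for one day."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def moving_average(series: Sequence[float], window: int = 10) -> List[float]:
    """Trailing mean over the current day and up to window-1 preceding days."""
    result = []
    for i in range(len(series)):
        start = max(0, i - window + 1)  # shorter window at the start of the series
        chunk = series[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def daily_mean_temperature(tmin: float, tmax: float) -> float:
    """Mean temperature defined as the average of the daily minimum and maximum."""
    return (tmin + tmax) / 2.0
```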
A cumulative temperature variable, which was close to the total degree-days of exceedance, was constructed [27]. For each day, the cumulative minimum/maximum temperature variable (CTmin and CTmax) was defined as the sum of the number of degrees above a cut-off point from the current day t to either day t-10 or the last day with a temperature higher than the cut-off point. This variable was equal to zero if the temperature was below the cut-off point on the day considered:
$$\mathrm{CTmax}_t = \sum_{d=0}^{k} \left(\mathrm{Tmax}_{t-d} - S\right) I_{t-d}$$
in which: S is the cut-off point; k is the lower of the value 10 and the value of the first previous day on which Tmax fell below the cut-off point; I_{t-d} is equal to 1 if Tmax_{t-d} is higher than the cut-off and 0 otherwise.
The cut-off points were selected by minimising the deviance of the model including the minimum/maximum temperatures and the minimum/maximum cumulative variables over a grid of cut-off values. The cut-off point for maximum temperatures was found to be equal to 27°C (80.6°F). The cut-off point for minimum temperature was so close to 0°C that the cumulative variable for minimum temperature was very strongly correlated with the moving average of the mean temperature (0.95). Therefore, CTmin was not included in the model.
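The cumulative variable for maximum temperature can be illustrated with a short sketch. It follows the verbal definition (degrees above the cut-off, summed backwards from day t over consecutive days above the cut-off, up to day t-10); the function name and the handling of the first days of a series, where fewer than 10 preceding days exist, are assumptions.

```python
from typing import List, Sequence


def cumulative_excess(tmax: Sequence[float], cutoff: float = 27.0,
                      max_lag: int = 10) -> List[float]:
    """Cumulative degrees above `cutoff`, summed backwards from each day.

    The sum stops at lag `max_lag` or at the first preceding day that is
    not above the cut-off, whichever comes first. It is zero when the
    current day itself is not above the cut-off.
    """
    result = []
    for t in range(len(tmax)):
        total = 0.0
        for d in range(max_lag + 1):
            if t - d < 0:          # no earlier data available
                break
            excess = tmax[t - d] - cutoff
            if excess <= 0:        # run of hot days is interrupted
                break
            total += excess
        result.append(total)
    return result
```

For example, with a cut-off of 27°C, the series 26, 28, 30, 25, 29 gives 0, 1, 4, 0, 2: the run that starts on the second day accumulates until the cooler fourth day resets it.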

Statistical analysis
The daily mortality rates were modelled using a generalized estimating equations (GEE) approach, with a Poisson distribution. This model enables both specification of an over-dispersion term and a first-order autoregressive structure that accounts for the autocorrelation of the daily numbers of deaths within each summer and assumes the independence of the summers. A log-linear long-term mortality trend (Trend) and the seasonality of mortality during summer, using a quadratic time function by day (Season), were included in the model. The model was also adjusted for a dummy variable (Summer) which differentiated the 122 summer days (from June to September inclusive) from the other days of the year. The non-summer days provided useful information on the long-term trend of the baseline mortality.
The baseline model M0 was: in which PopJ was the population estimate for the year considered.
The temperature factors were added to the baseline model to yield the model M1: in which the temperature factors are the minimum and maximum temperatures (Tmin and Tmax), the moving average of the mean temperature (MA) and the cumulative variable for maximum temperatures (CTmax).
In order to distinguish the specific impact of temperatures up to 5 days before death, the lagged minimum/maximum temperatures and cumulative maximum temperature were also included in the model. Some interactions between minimum/maximum temperatures and the cumulative indicator, recorded on the day considered and the preceding two days, were also added. The full model M1 thus contained 19 different temperature indicators and 10 interactions (Table 1).
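The verbal description of the baseline and full models can be written out as follows. This is a sketch reconstructed from the text, not the printed equations: the coefficient symbols α and β, the notation X_{j,t} for the temperature indicators and interactions, and the treatment of PopJ as an offset are assumptions, and the exact way the seasonal terms enter may differ.

```latex
\begin{align*}
M_0:\quad \log E(O_t) &= \log(\mathrm{PopJ}) + \alpha_0 + \alpha_1\,\mathrm{Trend}
  + \alpha_2\,\mathrm{Season} + \alpha_3\,\mathrm{Season}^2 + \alpha_4\,\mathrm{Summer} \\
M_1:\quad \log E(O_t) &= \log(\mathrm{PopJ}) + \alpha_0 + \alpha_1\,\mathrm{Trend}
  + \alpha_2\,\mathrm{Season} + \alpha_3\,\mathrm{Season}^2 + \alpha_4\,\mathrm{Summer}
  + \sum_{j=1}^{29} \beta_j X_{j,t}
\end{align*}
```

Under this reading, the X_{j,t} stand for the 19 temperature indicators and 10 interactions of the full model.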
In a sensitivity analysis, the model was also adjusted for the daily minimum/maximum relative humidities, both as individual factors and as interactions with temperature, as confounder indicators. However, the results did not change.

Definition of temperature variables
In order to select the most predictive temperature indicators among the 29 variables used in the present paper, a "backward" method was applied to model M1.
First, the decision was taken to divide the 29 temperature variables and interactions into 17 groups, in order to ensure that the interactions between two indicators were systematically included in the model with the main effects (Table 1).
Most groups contained indicators of the same lag day (G1, GCum1). Four groups contained one temperature indicator recorded on the day considered and another indicator recorded on the preceding day (G2, G2', GCum2 and GCum2').
The 17 groups of indicators were divided into three categories. The first category contained the moving average of the mean temperatures, which reflects the climatic environment in which the subjects lived over the preceding ten days (GMA). The second category contained the minimum and maximum temperatures recorded on various lag days and thus reflected the specific exposure for each day (G1, G2 and G2'). The last category characterised the long periods of high temperatures and therefore included the cumulative indicators (GCum1, GCum2 and GCum2').
Table 1 (excerpt). Groups with maximum temperature and the cumulative variable of maximum temperature: CTmax t-1, Tmax t, CTmax t-1 × Tmax t. MA: 10-day moving average of mean temperature; Tmin t-k: minimum temperature for day t-k; Tmax t-k: maximum temperature for day t-k; CTmax t-k: cumulative variable of maximum temperature for day t-k; k = 0,..., 5.

Selection of temperature variables
With a GEE approach, common likelihood-based measures of model fit, like the AIC criterion, cannot be used. Since the objective was to identify the indicators that would provide the best prediction of daily mortality, the criterion chosen for backward elimination of the groups was based on the change in the over-dispersion measured over the period (years) used for estimation.
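Over-dispersion was defined as:
$$\hat{\phi} = \frac{1}{N - N_{var}} \sum_{t=1}^{N} \frac{(O_t - \hat{O}_t)^2}{\widehat{\mathrm{var}}(\hat{O}_t)}$$
in which: O_t was the daily observed count of all-cause mortality for subjects aged 55 years and over; \hat{O}_t the corresponding estimate and \widehat{\mathrm{var}}(\hat{O}_t) its estimated variance; N the number of observed days; and Nvar the number of variables in the model.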
At each step, the "backward" method excluded the group which decreased the over-dispersion least, until all the groups had been excluded from the model. As a sensitivity analysis, the QIC criterion, which is an extension of Akaike's information criterion for GEE models, was also used as a backward elimination criterion, but the results were little changed [33].
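The elimination loop described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: `overdispersion` uses a Pearson-type statistic with the Poisson variance taken equal to the fitted value, and `score` is a hypothetical callable that would refit the model with the given groups and return its over-dispersion.

```python
from typing import Callable, List, Sequence, Tuple


def overdispersion(observed: Sequence[float], fitted: Sequence[float],
                   n_vars: int) -> float:
    """Pearson-type over-dispersion, assuming Poisson variance = fitted mean."""
    pearson = sum((o - f) ** 2 / f for o, f in zip(observed, fitted))
    return pearson / (len(observed) - n_vars)


def backward_elimination(groups: List[str],
                         score: Callable[[List[str]], float]
                         ) -> List[Tuple[str, float]]:
    """Drop, at each step, the group whose removal costs the least.

    Returns the groups in order of exclusion, each paired with the
    over-dispersion of the model that remains after it is removed.
    """
    remaining = list(groups)
    path = []
    while remaining:
        candidates = [
            (score([g for g in remaining if g != dropped]), dropped)
            for dropped in remaining
        ]
        best_score, dropped = min(candidates)  # smallest resulting over-dispersion
        remaining.remove(dropped)
        path.append((dropped, best_score))
    return path
```

The groups excluded last are the most predictive ones, which is how the ordering of the groups is read from the elimination path.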

Adequacy and predictive performance of the model
In order to analyse the predictive performance of the model, the 29 years of the period were divided into two distinct groups: a calibration group (e.g., the 28 years from 1975 to 2002), which was used to estimate the parameters and measure the fit of the model estimates with the observations and a validation group (e.g., 2003) which was used for prediction. Using the validation group, comparison of the daily observed and predicted mortality rates enabled assessment of the predictive performance of the model.
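The calibration/validation comparison can be sketched in a few lines. The record layout (a dictionary with a `year` key) and the function names are hypothetical; the correlation is the ordinary Pearson coefficient, which is one plausible reading of the comparison between observed and predicted values.

```python
import math
from typing import Dict, Iterable, List, Sequence, Set, Tuple


def split_by_year(records: Iterable[Dict], validation_years: Set[int]
                  ) -> Tuple[List[Dict], List[Dict]]:
    """Separate daily records into calibration and validation sets by year."""
    calibration, validation = [], []
    for record in records:
        target = validation if record["year"] in validation_years else calibration
        target.append(record)
    return calibration, validation


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between observed and predicted daily values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```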
In previous studies, a non-linear relationship between minimum/maximum temperatures and mortality has been modelled using spline or smooth functions of same-day temperatures or averaging short-lag values [20,22,23]. A model including natural cubic spline functions with 3 and 6 df for minimum and maximum temperatures for the day t and the 2 preceding days was built. The results were compared with those generated with the model obtained by backward selection.

Sensitivity analyses
In order to measure the sensitivity of the results to the kind of data used, each of the 28 years from 1975 to 2002 was in turn excluded from the calibration group. Conversely, 2003 was also included in the calibration period and the model was estimated using all 29 years (1975-2003).
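The leave-one-year-out scheme can be enumerated as in the sketch below. It only builds the lists of calibration years; refitting the model on each list is left out, and the function name is hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def leave_one_year_out(years: Iterable[int]) -> Iterator[Tuple[int, List[int]]]:
    """Yield (excluded_year, calibration_years) for each year in turn."""
    all_years = list(years)
    for excluded in all_years:
        yield excluded, [y for y in all_years if y != excluded]


# The 28 alternative calibration groups for 1975-2002, plus the full period.
calibration_sets = dict(leave_one_year_out(range(1975, 2003)))
full_period = list(range(1975, 2004))
```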
Lastly, the final model was also assessed separately for men and women, for people aged 55-74 years, for people aged 75 years and over, and for subjects whose causes of death appeared to play a major role in the heat-related excess mortality, i.e., "direct" causes (heatstroke, hyperthermia and dehydration), cardiovascular disease and respiratory disease [7,22,30,34].

Descriptive analysis
The average daily mortality rate for subjects aged 55 years and over was 8.5/100,000 person-days for the four summer months (June-September) from 1975 to 2003 (table 2). The highest daily mortality rate was recorded on 12th August 2003 with 20.2 deaths/100,000 person-days.
The mean daily minimum and maximum temperatures in France for the 29 summer periods were 13.2 and 23.7°C, respectively (table 2).
As Figure 1 (left column) shows, the daily fluctuations in temperature and mortality rate were closely correlated. Marked peaks in the daily mortality rate occurred, in particular, in the summers of 1975 and 1976. The peaks were concomitant with increasing temperatures and a positive value of the cumulative variable for maximum temperature.

Selection of the most predictive climatic variables
During the backward elimination of temperature factor groups, the overall over-dispersion remained quite steady from the full model to that only containing the following four groups (table 3):
- G2: the minimum temperature for a day t, the maximum temperature on the preceding day and their interaction,
- GMA: the 10-day moving average of mean temperature,
- GCum1 t: the maximum temperature for a day t, the cumulative variable for maximum temperatures for a day t and their interaction,
- GCum1 t-2: the maximum temperature 2 days before death, the cumulative variable for maximum temperatures 2 days before death and their interaction.
Subsequently, the over-dispersion rose sharply until all the groups had been excluded from the model.
The interactions turned out to contribute strongly to explaining the variation in daily mortality (table 4), although caution is required in the interpretation of the estimated parameter coefficients, since there are correlations between covariates. Figure 1 (right column) shows the daily fluctuations in observed and estimated mortality rates for three summers (1975, 1976 and 1983) by the number of groups included in the model. Those summers were marked by excess mortality rates related to periods of extreme temperatures, which differed in terms of their intensities and temporal configurations.

Adequacy of the model
For days with usual temperatures, the estimates of the daily mortality rate generated by the model only incorporating the group of temperature factors G2 (minimum and maximum temperatures for a day t and preceding day and their interaction) were neither improved nor impaired by the inclusion of the other groups (GMA and GCum1 t ) (Figure 1).
In contrast, group G2 was not sufficient to estimate the daily mortality rates during extreme events and the inclusion of group GCum1 t in the model greatly improved the estimates. This finding was particularly marked for the heat episodes in 1975 and 1976 (Figure 1). The fourth group selected by the backward method (GCum1 t-2 ) improved the mortality estimates for the 1976 heat waves only.
For the whole 28-year period (1975-2002), the correlation between the observed and the estimated mortality ratios was 0.88 with the model with four groups.

Predictive performance of the model
For the validation year 2003, the model with 1 group provided a satisfactory prediction of the daily number of deaths, compared to the observed deaths, for the days with usual temperatures (Figure 2b). The prediction for the 2003 heat wave (from 1st to 20th August) was greatly improved when the third and fourth groups were added (Figure 2b).
The model with four groups explained 97% of the extra-Poisson variability of the daily mortality rates observed during summer 2003 (Table 3).
For the days with usual temperatures, the mortality estimates obtained with the model including cubic spline functions were close to those obtained with the model with four groups (including minimum/maximum temperatures, the cumulative indicator of maximum temperatures and their interactions). However, the model with four groups provided much better estimates of the daily mortality rates during the 2003 heat wave than the model with splines (Figure 2b).

Figure 1. Fluctuations in daily observed and estimated mortality rates during three summers (1975, 1976, 1983), France. Mortality rate: observed (Observed MR) and estimated with 1 group, 3 groups and 4 groups.

Sensitivity analysis
The sensitivity of the approach was first assessed by evaluating the change in the results when each of the 28 years was excluded in turn from the calibration group or by using calibration groups consisting of either the 14 even or 14 odd years from 1975 to 2002 (table 5). For the backward method, the three most predictive temperature groups were similar to those selected in the main analysis, irrespective of the year excluded from the calibration group. However, the regression coefficient of the temperature parameters was subject to change. In particular, the coefficient of the cumulative variable for maximum temperature was weaker when a year including an extreme climatic event, such as 1975 or 1976, was excluded.
When summer 2003 was included in the calibration period, only the estimates of the cumulative indicator parameters were higher. The daily mortality rate estimates for the 1976 and 2003 heat waves were improved but the estimates for the other days were unchanged.
Lastly, the same analysis was conducted separately for men and women, for subjects aged 55-74 years, and for subjects aged 75 years and over, for the three main medical categories of causes of heat-related excess mortality (cardiovascular disease, respiratory disease and directly heat-related deaths). The three groups, G2, GMA and GCum1 t , were again the most predictive of the daily fluctuations in mortality since 1975 (table 5).

Discussion
This paper describes the relationship between the daily fluctuations in mortality and temperatures over a 29-year period (1975-2003) in France as a whole. It also proposes and validates an approach to determining the optimum combination of temperature indicators to explain both the usual daily fluctuations in mortality and the excess mortality associated with intense summer heat episodes.
Although temperatures are heterogeneous in different places in France, the daily population-weighted average of temperatures on a national scale turned out to be highly correlated with the daily number of deaths in summer.
The major part of the daily fluctuations in summer mortality is explained by the minimum and maximum temperatures observed for a day t and the preceding days and their interaction. Both high minimum and high maximum temperatures have been shown to have a significant impact on mortality in summer [2,7,8]. Cool summer nights have been reported to allow recuperation when daytime temperatures are high.
However, the daily absolute temperatures do not appear sufficient to explain both the daily fluctuations in the usual mortality rates and the excess mortality rates related to extreme events. The interaction between the cumulative effect of temperatures above a cut-off point over a period of consecutive days and the maximum temperature appeared more predictive of the mortality during heat episodes.
Three recent studies have drawn attention to the importance of using a cumulative indicator of hot days or degrees above a cut-off to measure the magnitude of a heat wave in terms of its intensity and duration [7,27,28]. While the cumulative indicator has rarely been studied, it may be of value in predicting mortality during heat waves.
The cumulative indicator depends on the choice of the cut-off point. In this paper, the cut-off (27°C) was determined by considering the national values for daily maximum temperatures and should not be interpreted as equivalent to the cut-off in a similar analysis of a single city. If the temperature on a given day in France is, say, 25°C, then many localities will obviously have temperatures above 27°C. Moreover, the cut-off point depends on the population considered. If the cumulative indicator is to be used for another population, the cut-off needs to be adapted to the data, in order to take into account the population's specificities.
It is also important to note that the long time period considered herein contained several heat episodes (particularly in 1975 and 1976), which differed in terms of intensity, duration, temporal configuration and geographic extent. Extreme events in the calibration period are necessary in order to enable satisfactory estimation of the cumulative temperature parameter.
This study was designed to enable fine analysis of the role of temperature in the fluctuations of the daily mortality rate. From the 29 temperature indicators and interactions used in the present study, 17 groups were formed and 10 variables were finally selected as the most predictive indicators. The fact that alternative combinations of indicators in groups, possibly including other climatic indicators, might yield an equally good or better predictive performance with respect to daily mortality cannot be ruled out. However, the selected temperature indicators explained 76% of the total extra-Poisson variability, demonstrating the great importance of the selected indicators with respect to summer mortality. Even though temperature has been shown to be the main indicator of mortality, other environmental factors may also influence the fluctuations in daily mortality.
Humidity has often been studied, either as an individual factor or in the form of an index combining temperature and humidity, such as the apparent temperature or discomfort index [15,16,18,19,22,24,25]. The results of those studies were not consistent and depended on the usual climatic characteristics of the countries in which the studies were conducted. Wind speed and pressure have also been studied, but have rarely been found to be significantly associated with mortality. In the present study, humidity did not improve the daily mortality estimates. This finding may reflect the fact that, in France, from 1975 to 2003, the particularly hot days were not highly humid.
Air pollution has also been reported to have an effect on mortality during extreme climatic events, particularly in urban areas [18,20,22,23,25]. To the authors' knowledge, the relationship between temperature, air pollution and mortality has not been studied on the wide geographic scale of a whole country. The present model did not include air pollution.
Since the dramatic European heat wave in summer 2003, the awareness of the risk associated with summer heat, behavioural adaptation to high temperatures during extremely hot weather and the set-up of an alert system have probably modified the mortality-temperature relationship. A national Heat Health Watch Warning System has been created to prevent the mortality associated with extreme heat episodes. The system is operational every year from 1st June to 1st September on the national scale [35]. Thus, the model presented herein may be pertinent with respect to evaluating and, possibly, refining the existing warning system by providing a quantitative dimension to the prediction of the mortality risk on a wide geographic scale. The quantitative estimate could then be used by the health authorities to evaluate the magnitude of the impact in terms of short-term mortality when a heat wave is predicted by the meteorological services. The estimate would also enable an emergency plan and operations commensurate with the severity of the heat episode to be set up.

Conclusion
Although France is a large country with marked geographic heterogeneity both in mortality and temperatures, a strong correlation between the daily fluctuations in mortality and the fluctuations in temperatures in summer was observed on a national scale, over a 29-year period (1975-2003).

A combination consisting of the minimum/maximum temperatures and the cumulative indicator of maximum temperature recorded over short-lag days as well as their interactions was obtained using a backward method. The combination explained 76% of the total extra-Poisson variability of the mortality. The model provided a satisfactory quantitative estimation of the daily mortality both for the days with usual temperatures in summer (June to September) and for days during intense heat episodes.