A systematic review of evidence for the added benefits to health of exposure to natural environments

Background There is increasing interest in the potential role of the natural environment in human health and well-being. However, the evidence-base for specific and direct health or well-being benefits of activity within natural compared to more synthetic environments has not been systematically assessed. Methods We conducted a systematic review to collate and synthesise the findings of studies that compare measurements of health or well-being in natural and synthetic environments. Effect sizes of the differences between environments were calculated and meta-analysis used to synthesise data from studies measuring similar outcomes. Results Twenty-five studies met the review inclusion criteria. Most of these studies were crossover or controlled trials that investigated the effects of short-term exposure to each environment during a walk or run. This included 'natural' environments, such as public parks and green university campuses, and synthetic environments, such as indoor and outdoor built environments. The most common outcome measures were scores of different self-reported emotions. Based on these data, a meta-analysis provided some evidence of a positive benefit of a walk or run in a natural environment in comparison to a synthetic environment. There was also some support for greater attention after exposure to a natural environment but not after adjusting effect sizes for pretest differences. Meta-analysis of data on blood pressure and cortisol concentrations found less evidence of a consistent difference between environments across studies. Conclusions Overall, the studies are suggestive that natural environments may have direct and positive impacts on well-being, but support the need for investment in further research on this question to understand the general significance for public health.


Background
The relationship between the natural environment and human health and well-being is of current interest to a number of organisations within the public health and environmental sectors. Many have already invested resources in initiatives which use the natural environment, in some way, as a means of improving public health (e.g. British Trust for Conservation Volunteer's Green Gym; Parks Victoria's Graded Walks). These initiatives may be a means to simultaneously promote public health, tackle health inequalities and conserve biodiversity [1][2][3].
There are various possible ways in which natural areas may promote public health. A natural environment may provide an environmental setting for an activity or exercise programme, thus promoting increased physical activity [4,5]. The evidence that exercise and physical activity alone have positive impacts on health is well established. Physical activity has been shown to lead to improved physical fitness and health [6][7][8]. There is also some evidence that physical activity can have positive benefits for mental health, for instance, lowering depression. This may be through a combination of the physiological effects as well as participation in social activities and engagement with others [9][10][11]. A recent study that combined data from a range of different green exercise programs found consistent positive benefits [12].
Apart from the promotion of physical activity, it has been suggested that a natural environment may have intrinsic qualities which enhance health or well-being. Various theories have been proposed to explain these potential direct effects of nature. Kaplan and Kaplan's attention restoration theory proposes that nature provides the particular environmental stimuli to allow restoration from attention fatigue, which occurs during the performance of cognitive tasks that require prolonged maintenance of directed attention [13]. This is postulated to occur through restorative qualities of the environment that promote feelings of 'being away' from routine activities and thoughts and 'soft fascination' with features in the natural environment that attract attention without requiring effort [14]. In a complementary, 'psycho-evolutionary' theory, Ulrich has proposed that nature may allow psychophysiological stress recovery through innate, adaptive responses to attributes of natural environments such as spatial openness, the presence of pattern or structure, and water features. The theory proposes that the perception of these characteristics triggers positive emotional reactions related to safety and survival [15].
Cross-sectional studies have suggested positive relationships between green space and health [16,17]; however, identifying the causal pathway can be complex. In order to objectively assess whether or not there is an 'added benefit' from green space, research studies need to investigate if there is a difference in the health benefits of an activity in a natural environment (e.g. a park) compared with the same activity in a more synthetic environment (e.g. a gym). If it is found that the natural environment does bring added benefits to health and well-being over and above those arising from the activity being undertaken, it is important to understand what benefits are realised, by whom, and in which environments. Findings from research into such questions might enable public health planners to better target scarce resources so as to improve health and reduce health inequalities. Despite widespread discussion of this topic, a systematic and quantitative synthesis of the evidence for added benefits of nature on health has not been undertaken although narrative reviews have been produced [18][19][20].
Systematic review methodology is widely employed in medicine and public health [21,22], and more recently within environmental management [23,24], as a way of synthesising the evidence from research for the effectiveness of particular interventions. We conducted a systematic review to collate and synthesise the evidence on whether there are added benefits of activities in natural environments over and above those in more synthetic environments.

Methods
Searching for relevant data was conducted within 19 electronic libraries/databases. Articles were also searched for using web search engines and within the websites of public health and environmental organisations. A range of activity/health/well-being-associated keywords (e.g. exercise, health, restoration, depression) in combination with a range of environment-related keywords (e.g. park, green, outdoors, countryside) were used to search databases. The bibliographies of included articles were also checked for any additional references. Full details of the search strategy are available (see Additional file 1). Full background to the conduct of this systematic review can be found at http://www.environmentalevidence.org/SR40.html.
Articles were included in the review if they met the following criteria: collection of data on any measure of health or wellbeing after direct exposure to a natural environment and after exposure to a synthetic environment. 'Natural environment' was used in a broad sense to include any environment that, based on author descriptions, appeared to be reasonably 'green': this ranged from gardens and parks through to woodland and forests, and also included environments such as university campuses. Synthetic environments included non-green outdoor built environments or indoor environments. 'Direct exposure' could comprise physical presence within the environment (i.e. some form a passive/sedentary activity) or the use of the environment as a setting for a form of physical activity. We did not include studies that only compared pictures, slides or views of natural and synthetic environments. Both observational and experimental studies were included. Excluded from the review were: studies which investigated the effects of environmental hazards (e.g. air pollution), studies focusing on hypotheses regarding athlete/exercise performance, and studies that were purely descriptive. Title and abstract inclusion criteria were applied by three reviewers (DB, LBA & TK) with consultation in cases of uncertainty. Full text inclusion was repeated by two reviewers on all those identified as potentially relevant (DB & TK).
From all articles that met the review criteria, basic information was extracted into a standardised spreadsheet, which included details of the environment, activity, participants, types of outcomes being measured, and the methodology used to collect data. A methodology quality checklist was devised, guided by items from an available quality assessment tool [25]. Six binary criteria were used to summarise study quality: definition of target population or details provided on participants in study; random recruitment/third-party referral of participants (as opposed to self-selection); randomisation of participants to environments (or order of environments in the case of a crossover trial); base-line data collection to assess pretest comparability; credible data collection tools; and control of potential confounding factors between environmental settings.

Data synthesis
Quantitative synthesis was focused on any comparisons of the same activity in each environment (natural and synthetic) to investigate the specific effect of environmental setting. This was to ensure consistency in the interpretation of effect sizes from different studies. Four articles which met the review inclusion criteria were not included in the meta-analysis on this basis [26][27][28][29]. In addition, given that the review included studies measuring a broad range of different outcomes, a threshold number of four studies measuring the same outcome was chosen in order to decide whether to pursue a meta-analysis on a particular outcome.
Numeric data on health/well-being outcomes could usually be extracted from articles in the form of means and standard deviations (or standard errors) from their presentation in a table or a figure (using TechDig 2.0). If data were not available in the article, an attempt was made to contact the author by email for the relevant data. In order to ensure consistency in data extraction, the following rules were specified: in cases where individuals had been measured more than once before an activity, the values taken when individuals were still in similar environments [30,31] were extracted; in cases when individuals had been measured more than once after an activity, the values taken at a time closest to the end of the activity [30,31] were extracted. This was to enable comparison with the remaining studies, as most took measurements shortly following the end of the activity. The standardised mean difference between the outcome after activity in a natural environment versus after activity in a synthetic environment was calculated. All effect sizes were calculated using Hedges g and were corrected with the multiplication factor 1-3/(4(n 1 +n 2 )-9) where n 1 and n 2 is the sample size of groups 1 and 2 respectively to account for the known bias of this formula as a population estimator. The sign of the effect size was changed for some outcomes (e.g. anger, anxiety) to reflect the benefit on health/well-being.
In most cases, studies also presented data before exposure to each environment. These data were used to calculate a pretest effect size and we tested the effect of adjusting the posttest effect size by this value to account for any base-line differences. We present the statistics on unadjusted effect sizes and note when the adjustment affects the result. We test the sensitivity of the effect size to this adjustment rather than only presenting the adjusted effect sizes to avoid the possibility that effect sizes are only due to pretest differences, which may simply represent a return to "normal" levels in the group that started off with higher values rather than any effect of the environment.
When data within a study were presented separately for different subgroups, we calculated the effect size for each subgroup and create an average effect size for the study when combining data in the meta-analysis. Similarly, when the same outcome had been measured with more than one test (e.g. different attention tests), we calculated the effect size for each test and used their average. We calculated the overall pooled effect size and its confidence interval as a weighted average of all studies based on a random effects model. Arguably, fixed effects models could have been used when the heterogeneity test indicated an insignificant amount of between-study variance ('heterogeneity'); however, in these cases, similar results were obtained either way. We identified statistically significant effects as those where the confidence interval of the pooled effect size did not overlap zero. Heterogeneity was tested using the Q-statistic, which is calculated as the weighted sums of squares. Studies varied in a number of features (participants, design, environments, etc.), any of which could potentially explain any observed heterogeneity. Due to the low number of studies available, we limited our investigation of heterogeneity to comparator environment type (indoor or outdoor built), which represented the main dichotomy, when heterogeneity was significant (p < 0.05). Egger's tests were used to investigate any evidence for publication bias.

Results
The electronic database search yielded a large number of articles (over 20,000), which reflects the widespread discussion on nature and health. Many articles were rejected based on title and/or abstract as the articles could be classed as either clearly irrelevant, concerned with a more general discussion, or were promotional material on health and activity in nature. Based on title and/or abstract, 70 articles were deemed potentially relevant and the full text of all but 7 of these were successfully retrieved from either Bangor University Library, the British Library or from the web/ authors. After full-text viewing, 24 articles were included in the review (one article contained two relevant studies [32]). All articles identified as relevant were published in peer-reviewed journals except one charity report [33].

Methodology
Six criteria were used to summarise the methodology and reporting quality of studies. Many, but not all studies, described the characteristics of individuals participating in their study (16 studies) in terms of their age, sex, and health condition and/or amount of previous physical activity; the remaining studies only provided part of this information [26,[31][32][33]37,38,41,46,47]. Most studies recruited participants as volunteers (21 studies) rather than them being referred from a third party or independently selected [except [26][27][28]46]]. Thirteen studies were crossover trials. In ten of these, individuals were randomised and/or counter-balanced to determine the order of environments [27,30,31,34,[40][41][42][43][44][45]; while in three other studies, participants were exposed to the environments in the same order [33,36,39]. Seven other studies were randomised controlled trials [26,28,32,35,37,38,47]. Across all studies reporting randomisation, apart from one case, the method of randomisation was not described. Five other studies used an observational study design that did not involve experimental control of exposure to different environments [29,32,46,48,49]. Most studies (20 studies) took pretest measurements before exposure to the environment, which allowed investigation of the baseline comparability of participants [except [29,34,[47][48][49]. Thirteen studies were potentially affected by confounding variables in their comparison of different environments, which arose from various factors such as the presence of additional stimuli in the synthetic environment (e.g. a video of the outdoor walk [37]; internal/external stimuli received through headphones [45]). In other cases, there were differences in the activity [26][27][28][29]32], potential environment order effects in a crossover trial [33,36,39], or other potential differences arising from an observational study design [46,48,49]. However, in several of these cases, this was because the hypothesis of the study was not the effects of nature and therefore additional factors were manipulated or present according to the particular question of the study.
Different measurement tools and techniques were used to collect data on the different outcomes and there was variation in the methodological information provided. Assessment of concentration or attention was usually based on standard tests such as Digit Span Test; Symbol Digits Modalities Test; Necker Cube Pattern Control or another test e.g. proof reading task. However, in two cases, effects on attention were only based on parental perceptions (of ADD/ADHD) [48,49]. Information on emotions was based on self-reported data, obtained through use of various psychological questionnaires/ scores (using a Likert scale), which asked participants to rate how close their mood matched statements of mood.

Data synthesis
Differences between natural and synthetic environments after the activity Effect sizes were calculated for the most commonly measured outcomes, with between four and eight studies measuring the same outcome. Additional file 3 presents the effect sizes that could be calculated from each study, and where appropriate, effect sizes for different subgroups within a study, derived from data measured after activity in each environment. Self-reported emotions (energy/revitalization, tranquillity/calmness, anxiety/tension, anger/aggression, fatigue/tiredness and sadness/depression), tests of attention, blood pressure and cortisol concentrations were synthesized (see Figure  2). We analysed different self-reported emotions separately for the purposes of interpretation. Combining these effect sizes, using average data per study, provided evidence for beneficial effects of activity in a natural environment compared to the synthetic environment in terms of reduced negative emotions such as anger (Hedges g = 0.46, 95% CI = 0.23, 0.69), fatigue (Hedges g = 0.42, 95% CI = 0.07, 0.76) and sadness (Hedges g = 0.36, 95% CI = 0.08, 0.63) (Figure 2). There was a marginally positive effect on energy scores (Hedges g = 0.28, 95% CI = -0.01, 0.57). Data on anxiety (Hedges g = 0.12, 95% CI = -0.34, 0.58) and tranquillity (Hedges g = 0.39, 95% CI = -0.08, 0.86) were less consistent with greater variation in the observed effect. A positive effect was also found on tests of attention, based on the average effect across studies (Hedges g = 0.32, 95% CI = 0.06, 0.58). We also tested the effect of adjusting these effect sizes by any pretest differences (see Additional file 4). In most cases, the results were similar, which supports the comparability of participants at base-line, however, this adjustment moves the confidence intervals for fatigue so that they overlap zero (95% CI = -0.1, 1.47). This effect also occurred for the meta-analysis of attention after accounting for pretest differences (95% CI = -0.12, 0.60) although only three of the five studies present pretest data.
Synthesis of the results from blood pressure (systolic: Hedges g = 0.07, 95% CI = -0.22, 0.36; diastolic: Hedges Figure 1 The number of studies that measured health or well-being data within different categories (total number of studies = 25). 'Emotions' included self-reported emotions based on questionnaire scores; 'Attention' included tests of attention (e.g. Digit Span test) and symptoms of ADD/ADHD; 'Cardiovascular' included blood pressure and pulse; 'Endrocrine' included measurements of hormone concentrations; 'Immune function' included measurements of factors involved in immune function and 'Others' are detailed within the text. g = 0.07, 95% CI = -0.24, 0.38) and cortisol concentrations (Hedges g = 0.03 95% CI = -0.53, 0.58) found little difference in the effect of different environmental settings with confidence intervals of the pooled effect sizes overlapping zero (Figure 2) although the trends are similar across outcomes.
For studies with significant heterogeneity, which included tranquillity (Q = 17.52, df = 6, p = 0.01) and anxiety (Q = 14.14, df = 5, p = 0.02), we aimed to investigate the effect of comparator environment type (indoor or outdoor built) on effect size. Feelings of tranquillity after exposure to a natural environment were more positive than after exposure to an outdoor built environment (Q = 5.55, df = 1, p = 0.02; 95% CI = 0.18, 1.68, number of studies = 4), but not in comparison to an indoor environment (95% CI = -0.68, 0.54, number of studies = 3). All studies recording anxiety compared a natural environment with an indoor environment and so the impact of comparator type could not be investigated. There was non-significant heterogeneity for all other health or well-being outcomes (all p > 0.1). There was no evidence for publication bias as assessed with Egger's tests (all p > 0.1) but statistical power is limited by the low number of studies.

Changes before and after exposure to a natural environment
The previous analysis compared the differences in outcomes after exposure to each environment. It is possible that, in this analysis, a positive effect size for nature could arise even if the outcome declined in both environments, as long as this decline was smaller in nature. In order to investigate this possibility, we compared outcomes before and after exposure to a natural environment to investigate changes over time using the subset of studies that presented pretest data. This analysis found beneficial changes on feelings of energy, anxiety, anger, fatigue and sadness (Table 1). For other variables, which included attention, tranquillity, blood pressure and cortisol concentrations, there were no consistent changes between measurements before and after the activity in the natural environment as assessed by whether the confidence interval of the pooled effect overlapped zero. This analysis supports the interpretation that the positive effect sizes observed in selfreported emotions when comparing a natural to a synthetic environment are based on greater improvements over time in the natural environment rather than a smaller decline.

Other health or well-being outcomes
A limitation to quantitative synthesis of the studies included in this review is the variety of different health or well-being outcomes measured. Due to small numbers of studies measuring other outcomes, insufficient data points were available to attempt more powerful meta-analyses. Two studies conducted in Japan investigated the effects of walking in a forest on measures of immune function [31,36], which included measuring variables such as secretory immunoglobin A, NK activity, number of T-cells and white blood cells. Other hormones, or measures of hormone activation, apart from cortisol, have also been investigated such as adrenaline Figure 2 The pooled (weighted average) effect sizes (Hedges g) and 95% CI for each outcome. The sign of the effect size reflects the benefit on health (positive effects indicate greater attention, energy and tranquillity but lower values for the other outcomes). In brackets are shown the number of studies that was used to calculate the effect size and an asterisk is used to denote significant heterogeneity (p < 0.05) within a particular group. and noradrenalin [36,45] and salivary amylase [39,40]. Across these different studies and outcomes, their results provide mixed findings, with no clear, consistent difference emerging in the effect of different environments.
Hartig et al. [47] investigated the effects of a natural (garden) and urban environment on memory recall, and found that, despite an improvement in mood in the natural environment, there was no evidence of a difference in the recall of positive, negative or neutral memories between environments. Two cross-sectional studies used questionnaires to ask parents of children with ADHD/ADD to rate their child's symptoms after different activities and within different environmental and social settings [48,49]. Based on the parental assessment, the results support a positive impact of a natural environment compared to both an indoor and a built environment. The reliability of parental assessment as a measure of ADD/ADHD symptoms is, however, unclear. Cuvo et al. [27] compared the effects of an indoor living room and multisensory room, with outdoor activities in the grounds of an institution in a rural area on adults described as having 'profound mental retardation'. Three adult participants were observed, specifically for behaviour such as mouthing and body rocking, as well as engagement, and there was some indication of an improvement in behaviour during the outdoor activity compared to the indoor environments. In another study, Scholz and Krombholz [29] compared the motor performance of children from 10 forest kindergartens and from four 'regular' kindergartens, and concluded that the motor performance of the children from forest kindergartens was superior. In a longer-term trial, Isaacs et al. [28] compared 10 week programmes of leisure-centre based activities with instructor-led walking programmes through parks and open spaces, and also with an advice-only group. This study included follow up assessments at 10 weeks, 6 months and 1 year and measured a range of physical and mental health, and physical fitness outcomes. The results show that there was generally little difference in health/well-being benefits between the two activity groups, even in comparison with the advice-only group [28].

Principal findings
Our review identified 25 relevant studies, which measured a wide range of different health or well-being outcome measures. Meta-analysis of data from different studies on self-reported emotions provides evidence of a positive health benefit. This is manifest as lower negative emotions, such as anger and sadness, after exposure to a natural environment in comparison to a more synthetic environment. There is also some support for greater attention after exposure to a natural environment but not when effect sizes are adjusted for pretest differences. Meta-analyses of other variables, which include physiological parameters such as blood pressure and cortisol concentrations, are less supportive of a consistent difference. Each analysis was based on between four and eight studies.

Strengths and weaknesses of the available evidence
Most of the studies were experimental studies that involved a crossover design or different comparison groups and provided tests of the effect of different environments. In most cases, data were collected before and after the trial, which allowed investigation of the comparability of participants at baseline. These features may improve the internal validity of these studies, however, for a number of reasons, we would caution against generalisation of the effects observed in these studies to other contexts.
It is important to consider the possibility that there may have been differential effects not detected by only looking at pre-and posttest results; an absence of posttest effects does not necessarily mean that the environments did not affect the variable. Few studies presented data measured 'during' exposure. In a study using ambulatory blood pressure monitoring, Hartig et al. [35] found no significant posttest differences in blood pressure and the posttest means differed little from the means obtained prior to the environment. However, the measures obtained during the experiment demonstrated that systolic and diastolic blood pressure did vary as a function of environment [35].
The most common participants of these studies were college students, adult males, and physically active adults, and therefore they are not representative of all subsets of the human population. In addition, most participants were volunteers, which may have introduced self-selection bias. Given that the effects of exposure to a natural environment may vary among different subsets of the population, the effect sizes for less active individuals, children and individuals with specific health conditions warrant further investigation. In addition, many, but not all, experimental studies randomised participants between environments. It could be argued that randomised experimental exposure, and a removal of individual choice, would remove associations with leisure and enjoyment of the environment.
A number of the studies did not specifically test the hypothesis that exposure to nature is beneficial for health. Thus, in some studies, the natural environment being compared included environments such as green paths on a university campus. In these cases, it is not clear whether the environment was sufficiently green to provide a test of the effect of nature. Different types of natural environment could be hypothesised to have differential effects, which may further interact with the type of participant, but this could not be investigated with the low number of studies available in our metaanalyses. None of the studies investigated more than one type of natural environment.
The most common type of study outcome was selfreported measures of different emotions. Given these data were self-reported, they were therefore potentially open to bias depending on prior beliefs of the participants. The blinding of participants to the research question in these studies is problematic as in many cases the hypothesis could be guessed by participants based on the study design. Thus, it cannot be ruled out that findings may have been affected by participants' pretest opinions/beliefs on the likely effects of a natural environment rather than any actual changes in their mental health or well-being.
A final limitation of the studies included in the review is that most were focused on very short-term effects of different environments, making assessments of the participants shortly before and after the activity, and in some cases, during the activity. The longer-term implications of repeated exposure to different environments cannot be fully assessed but it is an assumption that repeated short-term exposure will bring cumulative health benefits. We did identify one longer-term study, which demonstrates that this sort of trial is feasible [28].
How should we test for health benefits of nature?
The results of our review are suggestive that certain types of natural environment may provide particular benefits for specific groups of people, however, this warrants further investigation. Testing for direct health benefits of nature is problematic given the variety of aspects of a natural environment and the ways in which they might impact on health. The principle factor that prevented the inclusion of many studies in this review was the necessity of a comparator group that allowed comparison of the effectiveness of a particular activity in nature with that in a different environment. However, what constitutes the most appropriate comparator is debatable and differences between a natural environment and an alternative environment could arguably be due to factors of the alternative environment rather than those of the natural environment. For instance, an outdoor built environment might provide additional stresses, such as traffic, which do not feature in a natural environment. There was some evidence of the importance of the comparator environment in this review; meta-analysis of data on tranquillity found that exposure to a natural environment was more positive when compared with an outdoor built environment but not with an indoor environment. The specific type and "quality" of the natural environment, for instance, its biodiversity value [50], could also be important, as well as the level of engagement of the individual with the environment [51] but this requires further study. Investigation of the effect of different natural environments across a range of alternative environments could aid in understanding which specific attributes of the environment are important. We did not include in our review studies which compared the effect of pictures of nature versus pictures of more synthetic environments, however, such studies have been conducted [52,53], and this approach may prove useful in providing some indication of the most relevant environmental attributes.
Consideration of the specific measures used to investigate health or well-being impacts could also be strengthened in future studies. Self-reported emotions were the most common outcomes in the studies we identified, with physiological outcomes less common and more variable in the specific outcome type. Standard measurement of relevant physiological outcomes would facilitate further meta-analyses as more datasets become available. Assessment of concentration or attention is a developing field of inquiry and there may be variation in the sensitivity of the instruments being employed. The approach of simultaneous assessment of both psychological and physiological outcomes that has been undertaken by several studies may prove useful in understanding the relationship between different outcomes, and the most relevant timescales of responses.
Hypotheses proposed to explain positive effects of nature have emphasized the role of nature in recovery from stress and mental fatigue [13,15]. Thus, effects of exposure to a natural environment may only be apparent, or at least be greater, following mental fatigue or a stressful event. These sorts of "context-dependencies" have begun to be studied, for instance, Hartig et al. [35] varied the completion of a task prior to exposing participants to different environments to investigate any interactions. Similarly, responses to natural environments may depend on past experiences; the social context [38] and the gender of an individual [37,42]. There is scope to investigate these factors further with well-designed empirical studies. Qualitative research methods may also help understand the role of the context in determining the effect of nature, and variation in effects among people [e.g. [54,55]].
The more simplistic hypothesis proposed on the impact of nature is that it simply promotes health-enhancing behaviour rather than having specific and direct benefits for health. For instance, the types of activities that occur specifically in a natural environment may be particularly beneficial, or a natural environment may encourage the initiation and continuation of physical activity, for instance jogging through a park [ [4] but see [56]]. Under this hypothesis, nature does not necessarily have a direct benefit for health itself but rather promotes health through preferences for particular environments and activities. However, our meta-analyses indicated a beneficial effect of a natural environment on well-being after controlling for activity and type of activity, which suggests that this more simplistic hypothesis cannot fully account for the patterns observed. Hartig [57] proposes that there is an "intertwining of the mechanisms" whereby the extent to which people are attracted to green spaces when taking physical activity is related to the restoration that they experience within them.

Conclusions
Public health planning needs to be informed by the evidence base for the effectiveness of interventions. This systematic review contributes a rigorous and objective synthesis of the evidence for 'added benefits' to health from activities in natural environments and has identified research which has measured specific health/wellbeing outcomes in a number of different settings. Based on self-reported measures of emotions there was some indication that an activity in a natural environment could have more positive effects than similar activities in a synthetic environment. There was also some support for greater attention after exposure to a natural environment but not after adjusting effect size for pretest differences. The evidence was weaker for any 'added value' of exposure to a natural environment on physiological outcomes however few studies were available for analysis. Public health decision-makers might wish to target resources towards interventions found, in this review, to be effective for specific outcomes for specific target groups. They might also use the review to justify a demand for more rigorous and objective evaluation of interventions which aim to use the natural environment for health promotion. Further research is necessary to investigate whether comparable effects are observed in different populations, environments and social contexts, and the longer-term significance of repeated exposure on health. Policy makers should therefore be wary of translating the findings of studies which have been conducted only in specific settings, for defined indicators and subjects, into generalised statements of universal benefits.

Additional material
Additional file 1: Details of search strategy.
Additional file 2: References of articles included in the review and summaries of their basic characteristics.
Additional file 3: Effect size (and standard error of the effect size) calculated from each article for the most commonly reported outcomes.
Additional file 4: Adjusted effect sizes (posttest -pretest effect size) and 95% CI to investigate sensitivity of results to any pretest differences.