A Walk in the Park: Models of BMI with Green Space Density and Footpaths

Background: While exposure to urban green spaces has been associated with various physical health benefits, the evidence linking these spaces to lower BMI, particularly among older people, is mixed. We suggest that dimensions of green space accessibility, which are generally unobserved in the existing literature, may be a source of this volatility in results.
Methods: We conduct a cross-sectional study combining data from The Irish Longitudinal Study on Ageing and detailed land use information. We proxy respondents' exposure to green spaces at their residential addresses in network buffers, which are adjusted to account for the density of local footpaths. Generalised linear models are used to test the association between exposure to accessible green space and BMI.
Results: Relative to the third quintile, exposure to the two lowest quintiles of green space, as measured within a 1600m accessible network buffer, is associated with slightly higher BMI (marginal effect for lowest quintile: 0.88; 95% CI: 0.24-1.53; marginal effect for second quintile: 0.61; 95% CI: 0.02-1.2). The results, however, are not robust to small changes in how green space is measured, and no statistically significant association between green spaces and BMI is found under other variants of our regression model.
Conclusion: The relationship between green spaces and BMI among older adults is highly sensitive to the characterisation of local green space. While footpath availability offers a partial explanation for some unintuitive empirical results previously found in the literature, our results suggest that there remains some other unobserved factor which mediates the relationship between green spaces and weight status.

healthcare systems across the world. While its cause is undoubtedly multifaceted (5), it is possible that the form of the modern built environment has a role in promoting negative health behaviours that ultimately result in adiposity (6). Several aspects of the urban environment might be relevant, including land use mix, the extent of urban sprawl, the food environment, crime, walkability, and access to green spaces (7). Given that two-thirds of the world's population is expected to live in urban areas by 2050 (8), it is important that research aims to understand the interconnections between urban living and health behaviours. Of particular interest in the current work is the potential association between weight status and the availability of accessible urban green spaces.
While many studies have identified positive associations between urban green space and various dimensions of individual health (9,10), the evidence linking greenness to decreased obesity rates remains equivocal. A recent review of the literature by Browning & Lee (11) finds that just 50% of reviewed analyses (n = 26) produce significant results in favour of a green space-obesity link. Indeed, some counterintuitive positive associations have also been found (12). The literature which specifically looks at associations between urban green space and obesity among older people remains limited but is equally divided. Using a large sample of those aged 45 and over in Australia, Astell-Burt et al. (13) find that higher exposure to urban green space is associated with reduced risk of obesity among women but that the protective effect is absent for men. Li et al. (14) find no association between green spaces and adiposity in a US-based sample of people aged 50-75. Using Irish data, Dempsey et al. (15) find a u-shaped relationship between urban green space and obesity in older adults, with those receiving the lowest and highest exposures to green space in the vicinity of their residential address exhibiting an increased probability of being obese.
The apparent conflict in the existing evidence could be attributable to various methodological concerns: over-reliance on cross-sectional data (16), absence of objective obesity measurements in some studies, use of aggregate rather than individual-level data, or insufficient control for potentially confounding factors (9,10). We posit that a further, relatively unexplored issue might also be relevant.
That is, while standard approaches used to objectively measure urban greenness generally quantify the availability of green spaces, they often disregard the issue of accessibility of the same spaces to individual study participants. Assuming that the primary channel through which green spaces may affect health is by facilitating physical activity, it is likely that spaces need to be easily accessible to the target population in order to effectively promote positive health behaviours. As such, the interaction between green spaces and local footpath networks may be of particular relevance. For example, living in a locality with extensive green coverage may not be associated with any physical health benefits if the same area lacks footpaths to access the green spaces on foot. Conversely, an area which has sufficient footpath access to a limited set of green spaces may effectively promote physical activity and accrue health benefits for residents, despite the fact that it is observationally 'less green'. This paper contributes to the existing literature by explicitly controlling for footpath availability in an analysis of green spaces and BMI. We exploit a novel data source that combines individual-level geocoded survey microdata with detailed land-use information from which the density of both local green spaces and footpaths can be extracted. While the analysis does rely on cross-sectional methods, the data source contains objective BMI measurements as well as a wealth of information on variables which may confound the relationship between green spaces and obesity. Our analysis thus overcomes many of the methodological challenges cited above.

Methods
This paper combines two distinct datasets in order to examine the relationship between accessible urban green space and BMI: The Irish Longitudinal Study on Ageing, and a land-use database known as Prime2. The datasets and the methods used to link and analyse them are outlined below.
The Irish Longitudinal Study on Ageing (TILDA)
TILDA is a nationally representative survey of those aged over 50 in the Republic of Ireland. Data for Wave 1 (W1), which form the basis of the analysis in the current study, were initially collected between October 2009 and July 2011. During this period, 8,175 individuals from a sample of 6,279 households were recruited to participate in the study. Respondents' spouses and partners were also invited to participate, regardless of their age, and so the full W1 sample size is 8,504. The data were primarily collected using Computer Assisted Personal Interviewing (CAPI) carried out by trained interviewers, face-to-face at each individual's home. Sensitive questions were included in a supplemental self-completed questionnaire (SCQ), which respondents returned by mail. Wave 1 respondents were also invited to attend a nurse-administered health assessment at a dedicated centre or, where attendance was infeasible or impractical, to complete a modified partial assessment in the home. Follow-up data have been collected at two-year intervals (17,18) but are not used here.
TILDA recruitment followed the RANSAM protocol (19), a method which samples households from the population of residential addresses in the Republic of Ireland. The geo-location of each respondent's residential address is thus known and can form the basis of spatial links to additional external data sources.
Outcome: Body Mass Index (BMI)
BMI, calculated as a person's weight in kilograms divided by the square of their height in metres (kg/m²), serves as the health outcome of interest in this paper. The index is widely used as a tool to classify adult obesity based on the cut-off values outlined in Table 1 (20). Objective measurements for both components of the BMI calculation were collected as part of the TILDA health assessment. After each participant had removed footwear and any heavy outer garments, SECA 240 wall-mounted rods and SECA electronic floor scales were used to record height and weight, respectively (21). Since the health assessment was an optional component of the study, a valid BMI measurement is unavailable for 2,302 respondents in our sample, necessitating their exclusion. The top half per cent of the BMI distribution (n = 30) is further excluded from the analysis as the recorded values appear biologically implausible. See Figure 1 for full details on how the final sample was constructed. The distribution of BMI values among TILDA respondents in this final sample is presented in Figure 2. The observed range of BMI scores is 16.46-47.33, with a mean value of 28.55.
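As a minimal sketch of the outcome construction described above, the following Python computes BMI from measured weight and height and trims the top of the distribution. The function names and the trimming helper are hypothetical, and the category cut-offs are the standard WHO adult thresholds (18.5, 25 and 30), assumed here because Table 1 is not reproduced in this text. None of this is taken from the TILDA processing code.

```python
import math


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms over height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def who_category(value: float) -> str:
    # Assumed cut-offs: standard WHO adult thresholds, not copied from Table 1.
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"


def trim_top(values, share=0.005):
    """Drop the highest `share` of values (hypothetical trimming rule).

    The number dropped is rounded down, so small samples may lose nothing.
    """
    n_drop = math.floor(len(values) * share)
    ordered = sorted(values)
    return ordered[: len(ordered) - n_drop]


# Illustrative respondent: 80 kg and 1.75 m.
example = bmi(80, 1.75)
print(round(example, 2), who_category(example))
```

The exact rule the authors used to identify the 30 excluded observations is not stated, so the rounding choice in `trim_top` should be read as one plausible convention rather than a replication.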
[Insert Table 1]
The geography of urban green spaces may be systematically associated with socioeconomic characteristics (22). In particular, those with favourable economic circumstances may have the ability to self-select into more attractive and potentially greener neighbourhoods (23). While the structure of our combined data source does not allow us to capture all such factors, the richness of the TILDA dataset allows us to control for many socioeconomic, demographic, and health-related factors that may jointly determine BMI and exposure to green space. Importantly, we control for income category in all our econometric models. Failure to do so could lead to overestimation of a positive relationship between greenness and health (11). Our full set of control variables closely follows Dempsey et al. (15) and includes age category, urban location, gender, income category, employment status, marital status, highest level of educational attainment, medical cover, smoking status, and a dummy variable that indicates reported difficulty walking 100m. Descriptive statistics for these variables appear in Table 2.
Consistent with the overall cohort, females are slightly over-represented, making up 54% of our final sample (21). Despite TILDA's focus on older people, the W1 cohort is relatively young and active in the labour market, with 59.7% of the sample under the age of 65 and 38% in employment at the time of interview. A broad spectrum of educational attainment and income levels is captured in the data.
Smoking habits are prevalent among the cohort with past and current smokers combined accounting for 55.1% of respondents. Mobility-limiting disabilities are relatively uncommon at W1, with 6.1% indicating that their ability to walk 100m would be impeded by some physical or mental health condition. Nevertheless, it is important to control for such difficulties as the relationship between greenness and BMI is likely mediated by an ability to access and utilise the relevant spaces.

Land Use Data: Prime2
The spatial information used to derive the amount of green space in the vicinity of TILDA residential addresses is drawn from 'Prime2', an object-oriented digital mapping model which standardises a wealth of spatial data for Ireland. The dataset was developed by Ordnance Survey Ireland (OSI), the country's national mapping agency. Prime2 includes three features that are particularly relevant to the current study: 1) detailed land-use data from which green areas can be identified, 2) a fully connected road network from which the theoretical accessibility of green areas can be imputed, and 3) a complete (albeit disjoint) set of urban footpaths from which the feasibility of walking along a particular route may be approximated. Data covering extensive areas surrounding the country's five primary urban centres (Dublin, Cork, Galway, Limerick, Waterford) were made available for the purposes of the current study. These areas, however, contain large commuting zones that may be quite rural in nature. In order to focus on distinctly urban areas, we restrict our green space analysis to regions identified as 'urban settlements' in the 2011 Irish Census. Figure 3 provides a map of the areas analysed.

Characterising Local Green Space
The strategy we employ to determine greenness of each urban TILDA respondent's locality builds on existing methods from the literature with the specific aim of accounting for urban accessibility factors, which may be omitted under traditional research designs. Broadly, we use Geographic Information Systems (GIS) to define a buffer zone around each respondent's residential address, and subsequently calculate the share of land area within the buffer that is made up of green spaces as a measure of exposure. It is ultimately an empirical question how best to specify these buffer zones such that the green space metric captures what has the greatest potential relevance to respondents' health outcomes. Indeed, past research has shown that observed associations between greenness and health can be sensitive to researchers' choice of green space characterisation (6).
Basing the analysis on circular buffers ignores various dimensions of connectivity within the urban space and may misrepresent the extent of the area accessible to a respondent. For example, if the urban landscape does not offer a straight-line path between the buffer centre and its edge, then an individual wishing to travel between the two locations necessarily traverses a distance greater than the buffer radius. In such cases, a circular buffer can capture green space that lies beyond an assumed maximum walking distance from the residential address. This issue is accentuated in regions where urban layouts do not follow grid systems (as is the case in Ireland) since straight-line paths between locations are generally uncommon. To overcome this issue, we follow a number of recent studies that have carried out green space analysis within network buffers (25-28). Such buffers are drawn based on a maximum distance travelled across a road network (see Panel A of Figure 4).
While network buffers offer an improved characterisation of the maximum accessible area around a given residential address, they cannot alone account for accessibility issues within the chosen buffer space. For example, it may be impractical to walk along certain roads even when they are proximal to one's residential address. We offer a novel solution to this issue. We produce network buffers using only roads with which footpaths are associated. Specifically, a junction-to-junction road segment is included in a network buffer in this study only if a set of footpaths, with a combined length exceeding half that of the road segment, can be identified within 25 metres of the road segment centreline. As a result, our analysis is restricted to geographic areas where the density of local footpaths is high and, on average, green spaces that are not accessible on foot are excluded. A more formal description of our methodology is provided in the Appendix.
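The segment inclusion rule and the distance-limited network search described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure from the Appendix: the tuple layout for road segments is hypothetical rather than the Prime2 schema, the footpaths passed in are assumed to have already been selected as lying within 25 metres of the segment centreline, and the search uses a standard Dijkstra shortest-path routine on junctions only.

```python
import heapq
from collections import defaultdict


def is_walkable(segment_length_m, footpath_lengths_m):
    """Inclusion rule from the text: nearby footpaths must sum to more
    than half of the road segment's length."""
    return sum(footpath_lengths_m) > 0.5 * segment_length_m


def accessible_nodes(segments, origin, max_dist_m=1600.0):
    """Return {junction: network distance} for junctions reachable from
    `origin` within `max_dist_m`, using walkable segments only.

    `segments` holds (junction_a, junction_b, length_m, footpath_lengths)
    tuples; this layout is an assumption made for the example.
    """
    graph = defaultdict(list)
    for a, b, length, paths in segments:
        if is_walkable(length, paths):
            graph[a].append((b, length))
            graph[b].append((a, length))

    # Dijkstra's algorithm, truncated at the maximum walking distance.
    dist = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, length in graph[node]:
            nd = d + length
            if nd <= max_dist_m and nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return dist


# Toy network with made-up lengths in metres.
segments = [
    ("home", "A", 400, [250]),       # 250 > 200: walkable
    ("A", "park", 900, [500, 100]),  # 600 > 450: walkable
    ("home", "B", 300, [100]),       # 100 < 150: excluded
    ("park", "C", 500, [400]),       # walkable, but beyond 1600 m from home
]
reach = accessible_nodes(segments, "home")
```

In a full GIS workflow the reachable portion of the network would also include partial segments up to the distance limit, and would then be converted into a polygon within which the green space share is measured. Those steps are omitted here.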
Even within these 'accessible network buffers', the proximity of green space to the road network itself might have a mediating role in any association between greenness and health. For example, recent work has identified explicit associations between street-side greenery and health outcomes (29). To test the relevance of such greenery in our context, we define a second set of buffer zones which restrict the classification of relevant green spaces to those that fall within 50m of accessible roads (see Panel B of Figure 4). A comparison of results using the two alternative buffer definitions will allow us to identify which set of green spaces, if any, is most associated with BMI.
The appropriate size at which to draw the accessible network buffers is also unclear. A recent survey of the literature by Browning & Lee (11) suggests that, on average, larger buffer sizes (up to 2000m) best predict dimensions of physical health, but that for studies which centre the zones on exact residential addresses (as is the case in the current study), this predictive power might plateau at a much smaller buffer size (500-1000m). Since our observed results may be sensitive to this choice, we perform our analysis using multiple buffer extents. Our main specification follows Dempsey et al. (15) in using a 1600m buffer, which creates a zone roughly appropriate for a 20-minute walk from one's home address. We then repeat the statistical analysis with a smaller 800m buffer.
Our final analysis thus utilises four varied characterisations of local green space: "Accessible network buffers" covering 1600m and 800m spaces and "accessible street-side buffers" of the same sizes. In order to preserve the anonymity of individual TILDA respondents, the final variables enter our statistical models in categorical form. Specifically, the variables used represent the quintile of green space exposure which a respondent receives. Respondents who reside in non-urban settlement areas are coded as a separate category.
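The categorical coding described above might be implemented along the following lines. This is an illustrative sketch only: it assumes that quintiles are computed among urban respondents, uses the arbitrary code 0 for the non-urban category, and breaks ties by input order, none of which is specified in the text.

```python
def green_quintiles(shares, urban):
    """Map each respondent to a green space exposure category.

    shares: green share of the buffer area per respondent (0 to 1)
    urban:  True if the respondent lives in an urban settlement
    Returns 1-5 for the quintile among urban respondents and 0 for
    non-urban respondents (the 0 label is an arbitrary choice).
    """
    urban_idx = [i for i, is_urban in enumerate(urban) if is_urban]
    ranked = sorted(urban_idx, key=lambda i: shares[i])
    n = len(ranked)
    codes = [0] * len(shares)
    for rank, i in enumerate(ranked):
        # Integer arithmetic splits the ranking into five equal-sized groups.
        codes[i] = rank * 5 // n + 1
    return codes


# Made-up exposure values: ten urban respondents and one non-urban respondent.
shares = [0.05, 0.50, 0.10, 0.40, 0.20, 0.30, 0.15, 0.35, 0.25, 0.45, 0.90]
urban = [True] * 10 + [False]
codes = green_quintiles(shares, urban)
```

With real data, tied exposure values would normally be assigned to the same quintile, so a production version would need an explicit tie-handling rule.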

Model
We test the association between urban green space and BMI using regression techniques, specifically a generalised linear model (GLM). The GLM framework offers additional model flexibility compared to traditional Ordinary Least Squares (OLS). In particular, the researcher may specify a functional form that links the outcome variable to a linear index of explanatory variables, and make a distributional assumption about the variance of the estimator. The model, as it applies to the current context, is as follows:
$$ g\left(E[BMI_i \mid green_i, X_i]\right) = \beta_0 + \beta_1 green_i + X_i'\gamma, \qquad Var(BMI_i \mid green_i, X_i) \propto \left(E[BMI_i \mid green_i, X_i]\right)^{v} $$
where g(.) is a function that links BMI to our independent variables of interest, green is a categorical representation of local green space, and the vector X contains the socioeconomic and health-related control variables discussed above. We perform a specification search to identify the most appropriate functional form for g(.) (link function) and value for v (estimator family). In the search process, we allow the link to be the identity, natural log, and square root functions, and v = 0,1,2 (equivalent to

Discussion
Our results, taken together, imply that if accessible green space and BMI are associated, then the link is highly sensitive to the characterisation of the former. While we do find that estimated exposure to the lowest two quintiles of accessible green space in a 1600m accessible network buffer is associated with higher BMI scores, it is clear that an adjustment for footpath accessibility, as we have defined it, has not offered a complete explanation for the u-shaped relationship previously identified in these data by Dempsey et al. (15). In this context, it remains possible that other unobserved elements of the urban environment, or indeed of green spaces themselves, may affect the individual decision to utilise green areas for physical activity. For example, inadequate lighting, restricted opening hours, or the presence of anti-social behaviour may, at times, impede usage of some spaces. Equally, the decision to use green spaces may be driven by individual preferences that cannot be captured through analysis of the urban environment alone (32). It remains for future work to incorporate such hypotheses into an analysis of accessible green space.
It is also striking that adjustments to the extent of the area in which green space is measured can substantially alter, and in our context statistically nullify, the association with BMI. While this volatility in results is not unusual within the literature, it serves to reaffirm the sensitivity of findings in this area to research design choices. The question of how best to characterise local green space, such that the areas analysed are those which have the greatest possible relevance to individual behaviour and ultimately health outcomes, remains broadly unanswered and should be further addressed in future work.
The current study is subject to several limitations, primarily related to the green space exposure metrics used in the analysis. The process of building a 'walkable' road network based on proximity to footpaths is one which undoubtedly contains at least some measurement error. It is possible that some road segments excluded because of a lack of identifiable footpath may actually be walkable. This, in turn, could exclude some accessible green spaces from our analysis. Conversely, our data lack detailed descriptions of individual footpaths, so our analysis can say little about the quality of the footpath network used. It is plausible that some areas marked accessible in our data could contain poor-quality paths on which it would be impractical for an older person to walk. This, in turn, may lead to an overestimate of green space exposure in the affected areas. In addition, the measure of accessibility developed in this paper utilises green spaces that are proximal to a publicly accessible road network. Given the current data, we cannot account for the ownership of these green spaces. Some green areas that lie within a respondent's accessible buffer zone may not be available for public use. This could also lead to an overestimation of green space exposure for some respondents in our analysis.
Two broader limitations are also noteworthy. First, it is unclear whether any measure of green space which is centred on a residential address can be considered an accurate proxy of the amount of exposure the resident receives. Exposure to green space may instead be determined by unobserved dimensions of one's lifestyle. For example, if a respondent has a particular preference for spending time in green spaces, they may be willing to use other forms of transport to travel to spaces that are beyond walking distance from their home. Equally, if a respondent's routine includes activities that take place away from their reported residential address, then the area in which we measure green space may not be the most relevant. Second, since we only observe land use data at one point in time, we are precluded from using the longitudinal dimension of TILDA in our analysis. We cannot, therefore, fully account for the possibility that respondents systematically self-select into areas with specific levels of green space exposure. No causality can be assigned to the results presented in this paper.

Conclusions
This study contributes to the literature by explicitly incorporating a measure of footpath availability into characterisations of local green space. The relationship between the adjusted green space variables and BMI is subsequently tested. We find suggestive evidence that being exposed to the two lowest quintiles of green space, as measured by a 1600m accessible network buffer centred on one's residential address, is associated with increased BMI scores. This result is, however, highly sensitive to changes to the area in which green space is measured, and should not be considered robust. No statistically significant relationship between green spaces and BMI is observed when the size of the accessible network buffer is halved to 800m. Similarly, focusing on green areas that are located adjacent to walkable roads produces no statistically significant association with BMI when using either 1600m or 800m buffer sizes. While the associations we report are not statistically significant in most cases, our model coefficients do broadly follow a u-shape, consistent with previous work carried out by Dempsey et al. (15) in a similar context. This similarity in our results suggests that the inclusion of footpath availability measures in the analysis does not offer a full explanation for the apparent empirical regularity identified in their paper. We suggest that future work could incorporate additional features of the built environment or dimensions of individual preferences for green space usage into a similar analysis.
Standard errors in parentheses. * p<0.1 ** p<0.05 *** p<0.01
Figure 1 Construction of the final sample.
Map of Ireland indicating regions in which 'urban' green space is analysed in this paper.
Comparison of network and street-side buffer strategies.

Supplementary Files
APPENDIX.docx