Participants
For this cross-sectional study, data were collected in six municipalities in the Netherlands (Amsterdam, Utrecht, Alphen aan den Rijn, Heerlen, Berkelland, and Roerdalen) in September 2014. These municipalities were selected based on their differences in size, population density, and geographical location in the Netherlands (i.e., more central vs. more peripheral) to have sufficient variation in presence, type, and accessibility of sports facilities.
Eighteen thousand adults (3000 per municipality), aged 18–80 years, were randomly selected from municipal population registers. An information letter was sent to their home addresses, inviting them to participate in an online survey on sports participation and the use of sports locations. The online survey was used to obtain data on sports participation characteristics, principal sports location, and the personal characteristics of the respondents. Adults were asked to report their main type of sport, that is, the sport in which they had participated most frequently during the 12 months prior to the online survey. Subsequently, they were asked where that sports activity mostly took place (e.g., a public space, a sports club, or a registered sports facility) and whether they participated as a member of a sports club, individually, or in an informal group.
In total, 1663 respondents completed the survey (9.2% response rate). Respondents were excluded from the analyses for the following reasons: participation in an inactive form of sports (e.g., bridge) (N = 20), participation in sports at home (N = 40) or at non-official sports facilities (e.g., community centres) (N = 64), unknown or incomplete data on the postal code of their home address (N = 236), and unknown or incomplete data on other sociodemographic characteristics (N = 69). Adults who were unable to participate in sports due to disabilities (N = 21) or health constraints (N = 12) were also excluded. Complete data were available for 1201 adults, and these respondents were included in further analyses.
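The sample flow can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above and assumes, as the totals imply, that the exclusion categories do not overlap; the dictionary labels are our own shorthand.

```python
# Counts as reported in the text; labels are shorthand, not survey variable names.
invited = 6 * 3000          # 3000 adults in each of six municipalities
completed = 1663

exclusions = {
    "inactive form of sports": 20,
    "sports at home": 40,
    "non-official sports facility": 64,
    "postal code unknown or incomplete": 236,
    "other sociodemographic data unknown or incomplete": 69,
    "disabilities": 21,
    "health constraints": 12,
}

response_rate = 100 * completed / invited
# Assumption: each excluded respondent is counted in exactly one category.
analytic_sample = completed - sum(exclusions.values())

print(f"response rate: {response_rate:.1f}%")
print(f"analytic sample: {analytic_sample}")
```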
Measures
Sports participation at specific locations
The dependent variable ‘sports participation at specific locations’ was based on survey questions about sports participation (at least once per month vs. less than once per month), the sports location used most often for sports participation over the past month (i.e., a public open space, a sports club, or a sports facility), and sports club membership (yes or no). It was categorized into four groups: 1) no sports participation (i.e., no sports participation at all or less than once per month), 2) sports participation in public spaces, 3) sports participation as a member of a sports club, using sports club facilities, and 4) sports participation at indoor (private or public) sports facilities without club membership, e.g., health centres/gyms and swimming pools. Sports participation in public spaces included both unorganized sports (e.g., individually, with a friend, or in a small informal group) and organized sports (e.g., in a running group but without club membership). The four types of sports participation are further referred to as no sports participation, sports participation in public spaces, sports participation at sports clubs, and sports participation at sports facilities.
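The categorization rule can be written as a small decision function. This is a minimal sketch: the function name, the location codes, and the handling of combinations that the text does not describe (returned as `None`) are illustrative assumptions, not the coding scheme actually used.

```python
def classify_participation(monthly_participant, location, club_member):
    """Assign one of the four categories described in the text.

    monthly_participant: True if sports participation is at least once per month.
    location: 'public_space', 'sports_club', or 'sports_facility' (hypothetical codes).
    club_member: True if the respondent is a member of a sports club.
    """
    if not monthly_participant:
        return "no sports participation"
    if location == "public_space":
        # Assumption: public-space participation is kept in this category
        # whether it is unorganized or organized.
        return "public spaces"
    if location == "sports_club" and club_member:
        return "sports clubs"
    if location == "sports_facility" and not club_member:
        return "sports facilities"
    return None  # combination not covered by the four categories in the text


print(classify_participation(True, "sports_facility", False))
```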
Objective physical and socio-spatial neighbourhood characteristics
The independent variables were objectively measured neighbourhood characteristics and included land-use data, number of sports facilities, and socio-spatial data. Land-use data of respondents’ home environments were obtained using ArcMap 10.3.1. The coordinates of the 6-digit postal codes of respondents’ home addresses were uploaded in ArcMap. Mean coordinates of the 6-digit postal codes (i.e., polygon features) were calculated, and Euclidean buffers of different sizes (i.e., 400, 800, 1600, and 2000 m) were drawn around these coordinates. The buffers were used to calculate the proportions of different types of land use (available from Statistics Netherlands, 2012) and the number of sports facilities (available from the Dutch Facility Monitor Sport (FMS), see [20]). The following types of land use were distinguished: roads, facilities (e.g., churches, hospitals, shops, restaurants, and educational institutes), green space (e.g., parks, allotments, forest, and moorland), and blue space (e.g., rivers, lakes, and sea). These land-use types were chosen based on associations with physical activity [21, 22] and sports participation [23] shown in previous literature, as well as on their plausible relation with sports behaviour, for instance due to the exercise-friendly design of public spaces. As there is no consensus on the buffer size for assessing associations between environmental characteristics and sports participation, we assessed models with each of the four Euclidean buffer sizes. We decided to use the 2000-m buffers for two reasons. First, the model with the 2000-m buffer had the best fit of the models tested (McFadden R², see Table 2). Second, the 2000-m buffer size corresponded best to our assumptions.
We assumed that sports participants using the public space (e.g., for running) usually go beyond the immediate neighbourhood of 400 or 800 m around their homes. For instance, a previous study found that runners use not only the public space in their neighbourhood but also areas outside the neighbourhood and out of town [24]. Moreover, our data showed that sports participants using sports clubs or private sports facilities travelled on average 3082 m (SD = 3843 m) to their sports activities (see also [16], based on the same data), and those who use the public space for sports such as running will probably use an even wider area around their homes.
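The buffer counts described above can be illustrated outside a GIS. The sketch below counts facilities within each Euclidean buffer around a single home location; it assumes projected coordinates in metres, and the function name and example coordinates are hypothetical. It is not the ArcMap workflow itself.

```python
import math

BUFFER_RADII_M = (400, 800, 1600, 2000)  # buffer sizes used in the study


def count_within_buffers(home_xy, facility_xys, radii=BUFFER_RADII_M):
    """Count facilities whose straight-line distance to home is within each radius."""
    counts = {}
    for radius in radii:
        counts[radius] = sum(
            1 for xy in facility_xys if math.dist(home_xy, xy) <= radius
        )
    return counts


# Hypothetical coordinates in metres (not study data).
home = (0.0, 0.0)
facilities = [(300.0, 0.0), (0.0, 700.0), (1000.0, 1000.0), (1500.0, 1500.0)]
print(count_within_buffers(home, facilities))
```

Because the buffers are nested, the counts can only stay equal or increase with the radius.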
The socio-spatial data included urban density, neighbourhood socio-economic status (SES), and safety. Urban density was estimated as the average number of addresses per square kilometre within a radius of one kilometre (available from Statistics Netherlands [25]). Three categories of address density were distinguished: rural (< 500 addresses per km²), hardly to moderately urbanized (500–1500 addresses per km²), and strongly to extremely urbanized (> 1500 addresses per km²). Objectively measured neighbourhood safety (at the 4-digit postal code level) was obtained from the ‘Leefbaarometer 2.0’ [26]. This measure includes items such as reported demolitions, crime, and theft. Neighbourhood safety (mean = −0.002, SD = 0.13) was defined relative to the Dutch average, which had a standardized score of zero, and was classified into three categories: safety level below the national average (score ≤ −0.05), approximately equal to the national average (score from −0.049 to 0.049), and above the national average (score ≥ 0.05). Neighbourhood SES (mean = −0.043, SD = 1.20), at the 4-digit postal code level, was obtained from The Netherlands Institute for Social Research [27]. The SES scores were based on an aggregated indicator consisting of the following variables derived from Statistics Netherlands: average neighbourhood income, proportion of residents with a low income, proportion of residents with a low education level, and proportion of unemployed residents. We categorized the SES scores relative to the Dutch average into three categories: neighbourhoods with an SES below (< −1), approximately equal to (−0.99 to 0.99), and above (> 1) the national average.
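The three categorizations can be expressed as simple threshold functions. In this sketch the function names and category labels are our own, and scores that fall in the small gaps the stated cut-offs leave open (e.g., an SES score of exactly 1) are assumed to belong to the middle category.

```python
def density_category(addresses_per_km2):
    """Urban density category from address density."""
    if addresses_per_km2 < 500:
        return "rural"
    if addresses_per_km2 <= 1500:
        return "hardly to moderately urbanized"
    return "strongly to extremely urbanized"


def safety_category(score):
    """Neighbourhood safety relative to the national average (score of zero)."""
    if score <= -0.05:
        return "below average"
    if score >= 0.05:
        return "above average"
    return "approximately average"  # assumption: also covers rounding gaps


def ses_category(score):
    """Neighbourhood SES relative to the national average."""
    if score < -1:
        return "below average"
    if score > 1:
        return "above average"
    return "approximately average"  # assumption: includes scores of exactly -1 or 1


print(density_category(1200), safety_category(-0.002), ses_category(-0.043))
```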
Confounders
We controlled for the following demographic characteristics: age, gender, education, having children who live at home (yes or no), and employment (yes or no). Education was classified into three levels based on the self-reported highest level of completed education: 1) lower education (i.e., no education, primary education, and lower professional education), 2) middle education (i.e., intermediate and higher general education), and 3) higher education (i.e., higher professional education and university).
Statistical analysis
SPSS 23.0 was used to provide descriptive statistics on respondents’ personal characteristics and objective neighbourhood characteristics (i.e., socio-spatial characteristics and proportions of land use). The influence of the determinants described above on the use of different locations for sports participation (i.e., belonging to one of the four sports participation categories) was analysed using a discrete choice modelling approach. In this approach, the type of sports participant is treated as a choice among four available alternatives: non-participants, public space participants, sports club participants, and other sports facility participants. In discrete choice modelling, individuals are assumed to choose the alternative that provides the highest utility [18, 19]. The utility of alternative j (j = 1,…,J) for individual n can be represented by the following function:
$$ U_{nj} = \beta' X_{nj} + \varepsilon_{nj} $$
(1)
Here, $X_{nj}$ is a vector of the observed characteristics (objective physical and socio-spatial neighbourhood characteristics as well as individuals’ socio-demographic characteristics); $\beta' X_{nj}$ is the deterministic part of the utility function, which in the context of this study includes only individual-specific variables; and $\varepsilon_{nj}$ is an error term, the stochastic component of the utility function. The probability that individual n will choose alternative i is the probability that the utility derived from alternative i exceeds the utility of each of the other alternatives, which can be represented by:
$$ P_{ni} = P\left(i \mid j\right) = P\left(\varepsilon_{nj} - \varepsilon_{ni} < \beta' X_{ni} - \beta' X_{nj}, \; \forall j \ne i\right) $$
(2)
Under the assumption of independently and identically distributed (IID) error terms, the logit probabilities underlying the popular multinomial logit (MNL) model become:
$$ P_{ni} = \frac{\exp\left(\beta' X_{ni}\right)}{\sum_j \exp\left(\beta' X_{nj}\right)} $$
(3)
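As a numerical illustration of Eq. (3), the sketch below computes MNL choice probabilities for one individual from the deterministic utilities of the four alternatives. The utility values are hypothetical and the function name is our own; this is not the estimated model.

```python
import math


def mnl_probabilities(utilities):
    """Eq. (3): P_i = exp(V_i) / sum_j exp(V_j), where V_j = beta' X_nj."""
    shift = max(utilities)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical deterministic utilities for: non-participant, public space,
# sports club, other sports facility.
V = [0.0, 0.5, 1.0, -0.5]
probs = mnl_probabilities(V)
for label, p in zip(["none", "public space", "club", "facility"], probs):
    print(f"{label}: {p:.3f}")
```

Only utility differences matter: adding the same constant to every utility leaves the probabilities unchanged, which is why one alternative is typically taken as the reference category.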
Central to the MNL model is the independence of irrelevant alternatives (IIA) property, which implies that the preference for an alternative is not affected by the inclusion or exclusion of other alternatives in the choice set. This property follows from the assumption of independent and identically distributed error terms (type I extreme value distributed in the MNL model) and is thus fundamentally related to the IID assumption. However, many choice situations do not comply with IIA, as alternatives often share certain attributes that are unobserved by the researcher and therefore lead to correlations in the error terms of these alternatives. Such correlations can also be present in the case of sports location choice. For example, being a ‘public space participant’ or an ‘other sports facility participant’ might reflect a shared preference of persons who are motivated to participate in individual sports but dislike joining formal sports clubs. The Hausman-McFadden test [28] offers a procedure to test the IIA hypothesis for an MNL model; applying this test to our data showed that the IIA property was violated for the estimated MNL model. In the presence of only individual-specific variables, the multinomial probit (MNP) model offers an attractive alternative specification that can handle dependence across alternatives. The MNP model assumes that the errors follow a multivariate normal distribution with mean 0 and covariance matrix Σ [18]. The probabilities can be written as:
$$ P_{ni} = P\left(i \mid j\right) = \int_{-\infty}^{\beta \ast X_{1}} \dots \int_{-\infty}^{\beta \ast X_{j-1}} f\left(\varepsilon_{i1}^{\ast}, \dots, \varepsilon_{ij-1}^{\ast}\right) \, d\varepsilon_{i1}^{\ast} \dots d\varepsilon_{ij-1}^{\ast} $$
(4)
where f(.) is the probability density function of the multivariate normal distribution.
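The integral in Eq. (4) has no closed form, but the idea can be illustrated by simulation. The sketch below is a crude frequency simulator, not the estimation routine of any package: it draws correlated normal errors, adds them to hypothetical deterministic utilities, and records how often each alternative has the highest utility. The utilities, the covariance matrix (errors of ‘public space’ and ‘other sports facility’ correlated at 0.5), and all function names are assumptions for illustration.

```python
import random


def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = (a[i][i] - s) ** 0.5
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_mnp_probabilities(utilities, cov, draws=20000, seed=1):
    """Share of draws in which each alternative has the highest total utility."""
    rng = random.Random(seed)
    low = cholesky(cov)
    n = len(utilities)
    wins = [0] * n
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated errors: eps = L z has covariance matrix cov.
        eps = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        total = [utilities[i] + eps[i] for i in range(n)]
        wins[total.index(max(total))] += 1
    return [w / draws for w in wins]


# Hypothetical values: non-participant, public space, sports club, other facility.
V = [0.0, 0.5, 1.0, -0.5]
SIGMA = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0, 1.0],
]
print(simulate_mnp_probabilities(V, SIGMA))
```

Unlike the MNL model, the off-diagonal elements of the covariance matrix allow alternatives with shared unobserved attributes to be closer substitutes for one another.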
The MNP model was estimated in R using the ‘mlogit’ package [29, 30].