Field sites
Staten Island (Richmond County) is one of five boroughs in New York City (NYC), with an estimated population of 476,000 people as of 2018 [35, 36]. The island is composed of neighborhoods exhibiting differences in housing structure and demographic and socioeconomic composition; overall, 75.2% of the population identifies as White or Caucasian, 11.7% Black or African American, 10.2% Asian, and 18.7% identifies as Hispanic or Latino [37]. Known as the “borough-of-parks”, 18% of the total area is covered by urban parks and forests [35], and an assessment of tick populations in NYC showed that most tick species were established on Staten Island and only a few focal areas in the Bronx borough [38]. The rate of locally-acquired Lyme disease cases on Staten Island has increased from 4 to 25 per 100,000 between 2000 and 2016 [39].
Three public parks were selected (see Fig. 1): Clove Lakes Park (40°37′07.2″N 74°06′41.7″W), Willowbrook Park (40°36′03.2″N 74°09′29.9″W), and Conference House Park (40°30′02.1″N 74°15′04.4″W). These parks were selected due to their previously observed high volume of park visitors and a range in tick density, with the lowest density of ticks at Clove Lakes in the north and the highest density of ticks at Conference House in the south [19, 31]. Open spaces and hiking trails were selected in each of the parks to assess human use and tick density, and these areas were selected based on site availability in the park (smaller parks had fewer open areas and trails) and presence of potential tick habitat (Additional file 1). We selected 14 sites in total: 6 sites in Clove Lakes, 4 sites in Willowbrook and 4 sites in Conference House, with 1:1 distribution between open spaces and trails. Prior to conducting tick sampling and park use assessments, we defined the boundaries of the open areas by determining the field of view from various points and identifying landmarks that could act as limits (i.e., paths, woodlines, and waterlines). For hiking trails, we identified entry/exit points or intersections with high pedestrian traffic.
Tick collections
From 20 May to 19 August 2019, we collected questing ticks using a 1 m2 white corduroy cloth (tick drag). Depending on the size of the habitat available per site, transects of 100 m were sampled, otherwise shorter transects were sampled. Attached ticks were removed every 20 m along the length of every transect [19, 31, 40]. Ticks were placed in 70% ethanol and later identified to species and sex using a dissecting microscope and appropriate taxonomic keys [41]. Drag lengths were measured using BasicAirData GPS Logger ver. 2.2.4 app for Android and GPS Tracker Pro app for iPhone 6 s, and lengths were verified using the program Garmin BaseCamp 4.8.3.
The 14 sites (Additional file 1) were sampled once a week between 8 AM and 7 PM, and during each visit, at least three drag samples were collected at each site. In open areas, drags were performed at the edges of the open area and within the open area (mowed lawn space). These edges often consisted of strips of vegetation along wood margins, waterlines, or natural paths. Impervious surfaces such as paved paths were excluded from sampling. For trail transects, drags were performed on the sides of the trail within the vegetation, and a 10 m buffer was kept between consecutive transects. Drags were not performed if vegetation was wet. Sampling sites were restricted to areas park visitors frequent to gauge risk for tick interaction (public trails or open lawn spaces, excluding inaccessible forested areas) and corresponded to the areas where human behavioral observations were performed. At Clove Lakes, two trails were unavailable for sampling after 21 July due to construction, vegetation removal, and inaccessibility.
Habitat classification
At each site, habitats were classified into five categories: maintained grass (regularly mowed lawn), unmaintained herbaceous, leaf litter, impervious, and edge (Additional file 2). Edge habitats (i.e., vegetation bordering an open area) consisted of unmaintained herbaceous or leaf litter habitats and occurred between 1) impervious and forest (i.e., strip of vegetation next to paved paths at the limit of the open area and the forest), 2) maintained grass and forest, and 3) maintained grass and water (i.e., brush/natural path at the waterline). Varying numbers of drags were performed in each site type and habitat, depending on habitat availability in each park (Additional file 3). Vegetation in unmaintained herbaceous and leaf litter habitats included a variety of species (Additional file 4). Maintained grass habitats comprised various graminoid species.
Park visitor observations
From 20 May to 19 August 2019, park usage by visitors was determined by observing each park site for 30 min within set time intervals: 9 am-12 pm, 12 pm-3 pm, 3 pm-6 pm (adapted from Goličnik & Ward Thompson) [42]. During these 14 weeks, we conducted observations Monday through Sunday at the three time intervals specified above (i.e., morning, afternoon, and evening). We collected park usage data at least two times for each day of the week (7) and each time interval (3) during the study period, reaching a minimum of 30-min observations performed for each of the 14 sites (294 observation hours total). In circumstances when observations for each site were incomplete (e.g., due to weather conditions), sites were observed again later in the season for the respective missing interval.
For open space sites, paper maps with landmark locations were used to record the specific location of participants within the site (i.e., data point or observation) (see Additional file 5). For each data point, we recorded the habitat used by the visitor and the time elapsed in minutes that the visitor spent in. A new data point was assigned every time a person being observed moved to a different location within the site (approximately more than 2 m from the previous focal point) during the 30 min period, so one person could have one observation during that time period or several depending on their movement (see T1 to T2 in Fig. 2A). If a visitor entered and exited the site in under 1 min, they were given an elapsed time of 0.1 minutes to reflect presence in the site. The following was recorded for each observed visitor: entrance/exit time of individual, dominant activity (e.g., walking, exercising, socializing, etc.), if they were with a dog, estimated age range in 10-year intervals, and observed gender. Ages were estimated in four categories: child (0-10), teen (10-20), adult (20-60), and senior (60+). For trails, entrance/exit time, age range, gender, and dominant activity was recorded. With this information, visitor counts (the number of unique visitors present during a specific observation session) by park, site type, and habitat were totaled and used for analysis. Furthermore, in open spaces where the activity of all individuals was visible, the mean number of minutes spent in each habitat for each age group and gender was calculated. During the process, observers did not engage with visitors to avoid influencing their behavior, and any individuals who approached observers were removed from the observational section of the study.
Knowledge, attitudes, and practices survey
We administered a 10 min questionnaire to assess knowledge, attitudes, and practices regarding ticks and tick prevention (Additional file 6). The questionnaire was based on a previous KAP survey tool developed for vector-borne disease assessments on Staten Island using the health belief model framework [43]. The KAP survey was first implemented in the field in 2018 and was tested in focus groups with participants from the general population and adjusted accordingly based on the 2018 results and feedback from the focus group participants. It comprised 27 questions related to park use, knowledge of ticks and tick-borne diseases, attitudes about perceived risk and severity, tick prevention behavior, and demographics. Questions involved a mix of open-ended, close-ended ordered, and close-ended unordered responses. Demographic and background questions included age, gender identity, race/ethnicity, highest level of education received, park visitation frequency, activities engaged in at the park, and source of information for ticks and tick-borne diseases. Knowledge questions included tick identification (Additional file 7), tick habitat, tick exposure, acquisition of the Lyme disease bacterium, prevention methods, and tick removal. Attitude questions included perceived severity of tick-borne diseases on Staten Island, perceived likelihood of tick encounter, reasons for not checking for ticks, and concerns about repellent use. Practice questions included frequency of repellent use, personal protection measures against ticks, and frequency of tick checks.
Participants were recruited by convenience sampling when 30 min observations were concluded so as to not interfere with the observations. Individuals who were not actively engaged in an activity (e.g., talking on the phone, running, playing sports, etc.) were approached for the survey, and all refusals and refusal reasons were recorded. We explained the purpose of the study and interviewed individuals over 18-years-old who gave oral consent. Individuals were able to stop the survey at any time, and the responses of only one visitor, if in a group, were recorded. Prior to administering the questionnaire, our team was trained on how to approach participants and avoid selection bias, to deliver the questionnaire in a uniform way, and to avoid potential ways of biasing participant responses. Prior to its implementation, the questionnaire was piloted to improve delivery length. Survey administrators wore clothing with institutional logos and name tags to improve response rate.
Open-ended responses were categorized, and since the responses were non-mutually exclusive, the responses were turned into dummy variables (i.e., given a 0 or 1 if a given response was verbalized). Closed-ended |unordered questions were also turned into dummy variables. Closed-ended ordinal questions (perceived severity of ticks and tick-borne diseases and perceived likelihood of tick encounter) were recorded on a 5-point Likert scale.
Data analysis
Analyses were performed using R 2019 (R Foundation for Statistical Computing, Vienna, Austria. URL: https://www.R-project.org/) with the following packages: ‘MASS’ [44], ‘emmeans’ [45], ‘MuMIn’ [46], ‘rms’ [47], and ‘car’ [48].
Tick density
We evaluated the association between tick counts per 100 m2 and park identity, site type (open spaces, edges of open spaces, trails), and habitat type (maintained grass, unmaintained herbaceous, leaf litter) as predictor variables. Our sampling strategy was timed to encompass the nymphal peak for all species, therefore the models only included unadjusted nymphal counts. We did not collect I. scapularis after week 10, so weeks 11 and 12 were removed from the analysis for this species. For the other tick species, data collected over 12 weeks was considered. Because tick counts were over-dispersed, a generalized linear model with a negative binomial error structure was selected for determining variables that best predicted tick counts for each species. The length of the transect was included as an offset in the model to account for differential effort. The models were performed separately for each tick species collected. Reference categories for park, site type, and habitat were selected based on the category with the most observations (number of drags) as the normative category to facilitate interpretation. For I. scapularis and A. americanum models, the park reference variable was Clove Lakes, the habitat reference was unmaintained herbaceous, and site type reference was trails. For H. longicornis models, habitat reference was unmaintained herbaceous and site type reference was trails; park was removed from the model since H. longicornis was only found in the Conference House location.
To account for model selection uncertainty, model averaging was implemented using the MuMin package by ordering competing models based on the Akaike’s information criterion (AIC) value [49]. If there was more than one model with a ∆AIC < 2 from the best ranked model, model averaging was performed [50]. Otherwise, the best ranked model was selected. Models were evaluated for multicollinearity issues by evaluating if the variance inflation factor (VIF) < 4, using the vif function from the rms package in R. If VIF > 4, we evaluated which variables were redundant, and decided on their inclusion based on our knowledge of the causal structure. The model coefficients were back-transformed from the log scale to determine the relative abundance of ticks per 100 m2 with respect to the reference category. The mean predicted tick number per 100 m2 (density of nymphs, DON) with confidence intervals was calculated for all model variables using emmeans: pairs and type = “response” in R.
Park visitor observations
Visitor counts and elapsed time were compared by park, site type, habitat, and habitat exposure time; they were examined by gender and across age groups. Counts were modeled using a generalized linear model with a Poisson error distribution with an interaction effect between age and habitat and gender and habitat. The significance of the interactions was determined with an Analysis of Deviance (type III) table and chi-square test. Interactions were analyzed using emmeans: pairwise and type = “response” with a Tukey method adjustment. To determine whether the mean elapsed time that visitors spent in different habitats differed by age group and gender, an additive linear regression model was used with the main effects of age group, gender, and habitat for each park. Subsequently, post-hoc pairwise comparisons between age groups and habitats were performed using emmeans: pairwise and type = “response” with a Tukey method adjustment.
Probability of human-tick encounter
We estimated the cumulative probability of a person encountering a nymph during a 30 min period as a measure of exposure risk per site type (open spaces vs. trails) and park (hereafter, probability of human-tick encounter). This measure was derived from the park visitor observations to assess the amount of time people spend in each specific location (or “focal point”) during the 30 min observation period and the mean nymphal density in that location (see Tick Density section). The probability of human-tick encounter was estimated for each focal point within each site assuming that the probability of human-tick encounter was independent among focal points, even for one individual recorded in multiple locations during the observation period. In other words, the probability of an individual encountering a tick in a focal point was independent of the same individual encountering a tick in another focal point. Therefore, the estimated probabilities of human-tick encounters can be considered a characteristic of the site type and the park, rather than a characteristic of the individuals. In each focal point, we estimated the proportion of time spent by any individual over the observed time period (30 min) and the mean nymphal density given the habitat, site type, and park (Fig. 2).
To estimate the probability of human-tick encounter, we used a passive sampling model that has been traditionally used to explain the species-area relationships [51]. This model assumes that the probability of finding a species in a defined area depends on the size of the “target” area and the number of randomly distributed individuals of said species (i.e. “darts”) [51]. Similarly, we can consider relative exposure time over an observation period (30 min) as the “target” and tick density as “darts”. We assumed the probability of any nymph missing a person in 30 min in a focal point was inverse to the fraction of the 30 min interval a person spent in the area (i.e., the longer the person stayed, the lower the probability they will ‘miss’ encountering a tick):
$$P(miss)=1-\frac{\Delta t}{P}$$
(1)
were ∆t is the time elapsed in that focal point (in mins) and P is the total observational period (30 mins).
The probability of missing encountering all n ticks is thus:
$$P(miss)={\left(1-\frac{\Delta t}{P}\right)}^n$$
(2)
where n is the number of ticks in a defined area, and thus we can replace n (number of ticks) by d (density of ticks):
$$P(miss)={\left(1-\frac{\Delta t}{P}\right)}^d$$
(3)
Since we recorded a different data point when the person moved to a distinct location (more than 2 m) within the site, we used the estimated tick density per 10 m2 (~ 2 m radius around the focal point) determined by the habitat, site type and park (Fig. 2).
Finally, the probability of human-tick encounter for each observed focal point in a 30 min observation period can be estimated from the observed data as:
$$P(tick)=1-{\left(1-\frac{\Delta t}{P}\right)}^d\propto \int_{t0}^{t1-t0}e(t) dt$$
(4)
where e(t) is the probability of encountering a tick, which is bounded by the time spent in a specific location [29].
This probability was estimated in each focal point where a person was observed in the 14 sites were behavioral observations and tick drags were conducted, and comparisons were conducted between site types (open spaces vs. trails) and among parks.
For trails, since we were not able to record exit times in all cases, we simulated a negative binomial distribution for the time elapsed on trails using the variance from the observed data in open spaces (i.e., the variance in time elapsed per site), to obtain a more accurate measure of variability, and we used a median of 15 min (half of estimated period). We derived the time spent in a trail as random draws from this distribution.
The probability of human-tick encounter was estimated for each tick species individually. The code for the probability of tick encounters can be found at https://piliffq.github.io/tick-risk-index/.
Knowledge, attitudes, and preventative practices of park visitors
In this study, we focused on predictors of performing tick checks among park users given that tick checks can be conducted by nearly every member of the public, irrespective of access or attitudes towards personal protective equipment, clothing, and repellent. CDC guidelines indicate tick checks are one of the most effective ways to prevent tick attachment and the transmission of tick-borne pathogens, making it the single most important tick-borne disease preventative method to practice. The KAP survey questions were organized into five sections: demographics, prior experience, knowledge, attitudes, and practices. Race/ethnicity was converted into a binomial variable (white or other) given most respondents identified as white. Education responses were grouped into “High school or less”, “Some college/Associate”, “Bachelors”, and “Graduate”. Age was categorized into six groups: 18-28, 29-39, 40-50, 51-61, 62-72, and 73-83. Questions related to prior tick experience were grouped and given a score from 0 to 4 (if yes to all, score was 4). This included whether the respondent had seen a tick before, found a tick on a pet or household member, and whether someone in the home had been diagnosed with Lyme disease. Knowledge questions were scored based on the number of correct responses, and individuals received one point per correct response. Identification knowledge score was out of sixteen points, and individuals received one point for every specimen they correctly determined was a tick and one point for every non-tick they identified correctly. Respondents received a tick habitat and bacterium acquisition score for every correct habitat they identified where ticks could be found and every correct response for how ticks can become infected with the Lyme disease bacterium. They also received a score for the total number of correct tick prevention methods they could identify, with one point for each correct method. Questions regarding knowledge and practices for tick prevention methods were open-ended. Some individuals reported practicing certain prevention methods in the practices portion of the survey but failed to report knowing about these methods in the knowledge section. In these cases, individual knowledge scores were categorized to reflect practicing the behavior. Attitude questions involving perceived severity of tick-transmitted diseases on Staten Island and perceived likelihood of tick encounter were ordered on a Likert scale and scored out of five, with five being the most severe or most likely.
Generalized linear models were used to determine which variables influenced previous tick exposure, perceived severity of tick-borne diseases, and perceived probability of tick encounter. Explanatory variables included in each model (excluding the variable being modeled) were park, park visitation frequency, perceived severity of tick-borne diseases, perceived probability of tick encounter, number of prevention methods used, knowledge scores for tick identification, habitat, and prevention methods, owning a dog, education level, gender, and age group. Linear models were performed to assess factors influencing the knowledge scores for tick identification, habitat, and prevention methods with the same explanatory variables included as described above. Emmeans was used for post-hoc comparisons for categorical variables, and p-values were adjusted using the Tukey method. Tick check frequency was converted into a binomial variable (“yes” to always or sometimes checking for ticks and “no” for never checking for ticks) and analyzed using a generalized linear model with a binomial regression. Model selection was conducted as described above.