In this study, we summarize four potential measurements of neighborhood police-reported crime exposure using geocoded crime data, drawing attention to large disparities in crime exposure based on race and ethnicity and highlighting important considerations associated with seasonal and spatial variation in reported crime levels. We estimated gestational crime exposure to demonstrate these approaches with regard to a sensitive period of human development that is directly tied to multiple health outcomes, but note that exposure could be estimated within any period of interest to public health researchers. Taken together, our findings emphasize several key challenges that epidemiologists and other public health researchers should consider when incorporating geospatial police-reported crime exposure data into work regarding the neighborhood environment.
First, our study demonstrates pronounced disparities in neighborhood crime exposure, with Black and Latinx residents of Durham exposed to systematically higher police-reported crime than White residents. Due to centuries of entrenched racial residential segregation and structural racism that continue to shape Black and Latinx neighborhoods, racial and ethnic minority communities experience higher concentrations of poverty, unemployment, police surveillance, and numerous other social stressors that contribute to disparate police-reported crime exposure. This has immediate implications for researchers who choose to model the effect of crime exposure on a health outcome while including terms for race and ethnicity in their statistical models, as they will smooth over data that fundamentally do not exist (e.g., White residents with very high police-reported neighborhood crime) and generate estimates of effect that do not correspond to any real-world police-reported crime exposure. To overcome this issue, researchers should consider fitting models stratified by participant race and ethnicity.
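One way to see where a pooled model would extrapolate is to tabulate how many participants fall in each combination of group and exposure level before fitting anything. The sketch below is a minimal illustration, not the analysis used in this study: the function names, group labels, cutpoints, and exposure values are all hypothetical and chosen only to show the idea.

```python
from collections import Counter


def exposure_support(records, cutpoints):
    """Count participants in each (group, exposure bin) cell.

    records: iterable of (group, exposure) pairs.
    cutpoints: ascending bin edges; a value lands in bin k when it is
    greater than or equal to exactly k of the cutpoints.
    """
    counts = Counter()
    for group, exposure in records:
        bin_index = sum(exposure >= c for c in cutpoints)
        counts[(group, bin_index)] += 1
    return counts


def empty_cells(counts, groups, n_bins):
    """List (group, bin) cells with no participants, i.e. no empirical support."""
    return [(g, b) for g in groups for b in range(n_bins) if counts[(g, b)] == 0]


# Made-up exposure values (crimes per km^2) for two hypothetical groups.
records = [("A", 5), ("A", 12), ("A", 30), ("B", 8), ("B", 14), ("B", 18)]
cutpoints = [10, 20]  # three bins: below 10, 10 to below 20, 20 and above

counts = exposure_support(records, cutpoints)
unsupported = empty_cells(counts, groups=["A", "B"], n_bins=len(cutpoints) + 1)
print(unsupported)
```

Any cell listed in `unsupported` marks a region where a pooled model with a group term would rely on extrapolation rather than observed data, which is the situation that stratified models avoid.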
Model stratification by race and ethnicity, while addressing violations of positivity caused by the outsize burden of police-reported crime exposure among Black and Latinx Americans, adds an additional layer of complexity to public health research utilizing reported crime data: the question of what exposure, exactly, police-reported crime actually measures. While police-reported crime may be statistically correlated with true rates of crime (particularly for more serious violent offenses such as robbery, aggravated assault, and homicide), it is also, by definition, highly correlated with police presence in a neighborhood or area. Due to the long and ongoing history of punitive policing in Black, Latinx, and other minority communities in the United States, police-reported crime exposure represents a distinct set of psychosocial and physical stressors in a neighborhood based on the neighborhood’s ethnoracial composition. For Black, Latinx, and other over-policed populations in the United States, police-reported crime data capture both exposure to crime and exposure to police in the neighborhood; both bring with them the potential for stress, violence, and even death [22, 26, 27, 28, 42, 43]. Schwartz and Jahn recently found that the incidence of fatal police violence against Black residents in the Durham-Chapel Hill metropolitan statistical area (MSA) was 3.58 (95% CI: 0.82, 15.68) times the incidence of fatal police violence against White residents. For Black and Latinx NEST participants living in the Durham-Chapel Hill MSA, fear of police violence is not an abstraction.
Furthermore, White people in the United States are less likely to be reported, charged, or prosecuted for criminal behavior than any other group, meaning that reported crime in majority-White residential areas may represent an undercount of actual criminal behavior. White people in the United States are also far less likely to suffer violence or death at the hands of police than Black or Latinx people [26, 27, 44], and are more likely to support increased funding for municipal police departments. Thus, while epidemiologic research on the effect of police-reported crime in majority-White neighborhoods may reasonably estimate stress caused by actual criminal activity in the absence of stress caused by interactions with police, the same research in majority-Black or majority-Latinx neighborhoods will capture the layered effect of exposure to crime and exposure to potential police violence. These are fundamentally non-comparable exposures, and are impossible to disentangle with police-reported crime data alone. Ideally, future studies should include participant-reported crime victimization data alongside participant-reported data on interactions with the police in order to investigate simultaneous exposure to crime and policing in minority communities. The inclusion of citizen complaints regarding police behavior may also be a useful control to capture policing rather than crime. Unfortunately, these data were not available in NEST.
Second, we demonstrate two methods to measure police-reported crime exposure in the residential area (crime within 800 m of residence and crime within the block group of residence), and two methods to estimate crime density (crime per km² and crime per 1000 people per km²). Both area definitions have been utilized in the epidemiologic literature to date, although estimated crime counts or rates within the Census tract or block group of residence appear to be more common. Both are also prone to error if researchers do not explicitly acknowledge, and attempt to control for, the effects of non-overlapping boundaries of crime data, Census tracts or block groups, and residential buffers. As demonstrated, when counting crime that occurred within the Census tract or block group of residence, crime counts or rates will be underestimated for participants in a Census tract or block group that extends beyond the city limits due to missing crime data outside of the city (or any other municipal boundary with police or sheriff recorded crime data). The same “edge effects” or “boundary problems” will occur when using a residential buffer to estimate area-level crime exposure, as any participant whose residential buffer extends beyond the limits of the available crime data will have falsely low reported crime exposure. This finding held true in our data, leading to exaggerated disparities in uncontrolled crime count exposures because White participants were more likely to live near the Durham city limits than Black and Latinx participants (as an example, 29.2% of White participants in our sample lived within 800 m of the Durham city limits, versus 27.3% of Black participants and only 14.0% of Latinx participants).
To control for this bias, we recommend that researchers calculate the area of the Census tract, block group, or residential buffer that overlaps with the available crime data, and divide crime counts by this area to generate a measurement of spatial crime density in the neighborhood environment (crime per km²). We also demonstrate how to apply similar methods to estimate the population that overlaps with the available crime data to generate a population crime density estimate (crime per 1000 people per km²).
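The area correction can be illustrated with a short Python sketch. It rests on simplifying assumptions that are ours rather than the study's: the residential buffer is a circle, the city limit near the residence is treated as a straight line, the overlapping population is taken as already estimated, and all function names and numbers are hypothetical. A real analysis would intersect the actual buffer, block group, and city polygons in GIS software.

```python
import math


def buffer_area_inside_city_km2(radius_km, dist_to_limit_km):
    """Area (km^2) of a circular buffer that lies inside the city.

    Assumption: the city limit is a straight line at perpendicular distance
    dist_to_limit_km from the residence (positive when the residence is
    inside the city, negative when it is outside).
    """
    r, d = radius_km, dist_to_limit_km
    full = math.pi * r ** 2
    if d >= r:   # buffer lies entirely inside the city
        return full
    if d <= -r:  # buffer lies entirely outside the city
        return 0.0
    # Area of the circular segment cut off beyond the city limit.
    segment = r ** 2 * math.acos(d / r) - d * math.sqrt(r ** 2 - d ** 2)
    return full - segment


def crime_per_km2(crime_count, overlap_area_km2):
    """Spatial crime density over the area covered by crime data."""
    return crime_count / overlap_area_km2


def crime_per_1000_people_per_km2(crime_count, overlap_population, overlap_area_km2):
    """Population crime density over the area covered by crime data."""
    return crime_count / (overlap_population / 1000) / overlap_area_km2


# Hypothetical participant: 800 m buffer, residence 400 m inside the city limit,
# 40 recorded crimes and an estimated 1500 residents in the overlapping area.
full_area = math.pi * 0.8 ** 2
overlap_area = buffer_area_inside_city_km2(0.8, 0.4)
naive = crime_per_km2(40, full_area)         # divides by the whole buffer
corrected = crime_per_km2(40, overlap_area)  # divides by the covered area only
per_capita = crime_per_1000_people_per_km2(40, 1500, overlap_area)
print(round(naive, 2), round(corrected, 2), round(per_capita, 2))
```

Because the overlap area is never larger than the full buffer, the corrected density is always at least as large as the naive one, which is the direction of the edge-effect bias: dividing by area with no crime data understates exposure near the city limits.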
The choice between these two crime density metrics (crime per km² versus crime per 1000 people per km²) should be made based on the hypothetical mechanism through which reported crime exposure is expected to impact health, keeping in mind that neighborhood racial and ethnic composition will modulate these effects based on experiences of police violence. As a hypothetical example, reported homicide may be so stressful that participants experience a psychosocial and physiological fear response regardless of the surrounding population density. In this context, crime per km² may be most appropriate, as participants experience a dose-response stress effect with increasing homicide exposure that is not buffered by a larger population in the surrounding area. For less physically dangerous or stressful crime, such as larceny, researchers may hypothesize that participants’ stress response is modified by the population density in such a way that participants feel more secure in busier or more populated areas. In this context, crime per 1000 people per km² would be most appropriate. As has been done before, researchers may choose to compare these approaches head-to-head. We provide these hypothetical examples to encourage public health researchers to critically engage with police-reported crime data, acknowledge how different crime density metrics may be more or less fit for particular models of crime exposure, and evaluate distinct mechanisms of effect for distinct types of crime. Hypothetical mechanisms must also account for different experiences of punitive policing and police violence in the population.
While information on the perception of neighborhood crime was not available for all participants in our study, we also recognize the relevance of using distinctly defined, community-derived neighborhood boundaries as opposed to arbitrary spaces derived from Census block groups and buffers. Such “place-based” analyses are context-specific and require community knowledge to conduct, but may provide additional insight into the effects of crime density beyond those derived from larger epidemiologic studies. The estimation methods outlined here can be used with any set of geographic boundaries, and we encourage researchers to use locally-derived and -recognized neighborhood boundaries when appropriate for their research question.
Lastly, aside from a critical evaluation of different methods to estimate crime density and exposure, we also highlight strong seasonal variation in crime that public health researchers should consider. Depending on the time of year, duration of a cross-sectional or longitudinal cohort study, and the type of crime under investigation, seasonal variation in reported crime may cause researchers to miss peak periods of crime exposure and weaken their ability to infer the effect of crime on population health. This seasonal variation, combined with the concentration of crime and policing in the city center, may also induce problems related to structural confounding, or structural violations of positivity.
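A simple way to gauge how much seasonality matters for a given design is to tabulate dated crime reports by calendar month and to compare counts across candidate exposure windows. The sketch below is illustrative only: the function names, dates, and windows are hypothetical and do not come from the Durham data.

```python
from collections import Counter
from datetime import date


def monthly_counts(crime_dates):
    """Number of reported crimes in each calendar month (1 to 12)."""
    return Counter(d.month for d in crime_dates)


def crimes_in_window(crime_dates, start, end):
    """Number of reported crimes dated within [start, end], inclusive."""
    return sum(start <= d <= end for d in crime_dates)


# Made-up report dates near one hypothetical residence.
crime_dates = [
    date(2010, 1, 15),
    date(2010, 7, 4),
    date(2010, 7, 20),
    date(2010, 8, 1),
]

by_month = monthly_counts(crime_dates)
summer = crimes_in_window(crime_dates, date(2010, 6, 1), date(2010, 8, 31))
winter = crimes_in_window(crime_dates, date(2009, 12, 1), date(2010, 2, 28))
print(dict(by_month), summer, winter)
```

Two windows of similar length can yield very different counts for the same address, so the timing of an exposure window relative to the seasonal peak should be reported alongside the exposure estimate.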
The estimated disparities in reported crime exposure based on participant race and ethnicity that we present are not without limitations. First, we are currently unable to offer concrete methodological steps for public health researchers interested in disentangling police exposure from crime exposure using only police-reported crime data. We hope that acknowledging the limitations of police-reported data as a marker of actual crime will encourage other researchers to investigate how to better model the health effects of crime and policing as separate exposures. Second, of the 2681 parent-child pairs in NEST, 1493 pairs (55.7%) were excluded. While exclusion based on participant residence outside of the city of Durham (n = 820), gestation outside of the temporal availability of crime data (n = 252), and non-singleton birth (n = 75) would not be expected to bias the representativeness of our analysis regarding gestational exposure to crime in the city of Durham, exclusion based on missing data regarding race/ethnicity (n = 73) and residential address (n = 273) may cause unpredictable bias in our results by creating a non-representative sample of retained participants. Third, the use of donut masking by the Durham Police Department means that any given crime report during our study period may have been shifted slightly in space. This randomly introduced error would be expected to cause non-differential bias in our findings, but we are unable to access unmasked data to analyze the impacts of donut masking in our sample. Fourth, we would ideally have access to both racial and ethnic data for NEST participants (i.e., non-Latinx Black, Latinx Black, non-Latinx White, etc.), but were limited to categories of “Black,” “Latinx,” and “White” in the provided data.
Collapsing racial and ethnic data into one ethnoracial construct smooths over salient health differences at the population level, as the “Latinx” label available in NEST almost certainly combines individuals who are racialized as Black, White, and more. This is further complicated in NEST due to the requirement that participants spoke English at enrollment. We encourage epidemiologic researchers collecting data on neighborhood crime exposure (or any other socioenvironmental exposure) to collect and incorporate data on both race and ethnicity (as well as sexuality, gender identity, etc.), as individuals occupying multiple marginalized identities may face the highest levels of interpersonal and neighborhood crime and violence.