More Injuries Occur Near Tourist Attractions and Subway Stops


Hypothesis

I hypothesize that more traffic injuries occur in places where pedestrians congregate. Two examples of such places in New York City are tourist attractions, such as museums, and subway stops. As a preliminary analysis, I look at the density of traffic injuries in Manhattan, taken from the NYPD Motor Vehicle Collisions dataset, at points with 3 or more injuries. On this map I overlay the locations of subway stations and of popular tourist attractions obtained from the Google Places API.



From the map alone it is difficult to tell whether the density of injuries is actually higher near tourist attractions or subway stops. To answer this question, I calculated the distance from the location of each injury to the closest subway stop or the closest tourist attraction. To determine statistical significance, I repeated the analysis with uniformly distributed random locations and compared the two distributions.
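The nearest-place distance described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: it assumes locations are (latitude, longitude) pairs in degrees and uses the haversine great-circle formula, and the function names `haversine_m` and `nearest_distance_m` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_distance_m(point, places):
    """Distance in metres from `point` to the closest entry in `places`.

    `point` is a (lat, lon) pair; `places` is a non-empty list of such pairs,
    for example all subway stops or all tourist attractions.
    """
    lat, lon = point
    return min(haversine_m(lat, lon, p_lat, p_lon) for p_lat, p_lon in places)
```

Applying `nearest_distance_m` to every injury location, and then to every random location, gives the two distance distributions that are compared. A brute-force minimum is fine for a few thousand points; a spatial index would be a reasonable swap for larger datasets.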

Random locations in Manhattan were generated by drawing latitude and longitude points uniformly at random from a bounding box around the general New York area (latitude: [40.695, 40.88], longitude: [-74.025, -73.9]) and checking whether each point lies within the Manhattan borough boundary, using a shapefile of New York City boundaries. Points inside the boundary were kept and the rest discarded, giving a uniform random distribution of longitude and latitude points that lie only within Manhattan.
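This procedure is a form of rejection sampling, and a rough sketch of it is shown below. The bounding box is the one quoted above. Everything else is an assumption for illustration: `TOY_BOUNDARY` is a hypothetical four-sided polygon standing in for the real borough boundary from the shapefile, and the point-in-polygon check is a plain ray-casting test rather than whatever GIS library was actually used.

```python
import random

LAT_RANGE = (40.695, 40.88)    # bounding box from the text
LON_RANGE = (-74.025, -73.9)

# Hypothetical stand-in polygon of (lat, lon) vertices.
# This is NOT the real Manhattan boundary.
TOY_BOUNDARY = [(40.70, -74.02), (40.71, -73.97), (40.87, -73.91), (40.88, -73.93)]


def point_in_polygon(lat, lon, polygon):
    """Ray-casting test: True if (lat, lon) falls inside `polygon`."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            lon_cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < lon_cross:
                inside = not inside
    return inside


def sample_points_in_polygon(polygon, n_points, seed=0):
    """Draw uniform points in the bounding box and keep those inside `polygon`."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_points:
        lat = rng.uniform(*LAT_RANGE)
        lon = rng.uniform(*LON_RANGE)
        if point_in_polygon(lat, lon, polygon):
            accepted.append((lat, lon))
    return accepted
```

Because rejected points are simply discarded, the accepted points stay uniform in latitude and longitude over the polygon. Over an area as small as Manhattan this is close to, though not exactly, uniform by ground area.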


Switch the Places of interest option in the app below between 'Subway' and 'Tourist attraction' to see the distances for each.



The difference between the two distributions is statistically significant, so we can reject the null hypothesis that injury locations are independent of the distance to the closest tourist attraction or subway stop.
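The test used for this comparison is not specified here. One option, shown purely as an assumed illustration, is a two-sided permutation test on the difference in mean distance between the injury locations and the random locations. The function name `permutation_test_mean_diff` and its parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_test_mean_diff(sample_a, sample_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed difference, estimated p-value).
    """
    rng = random.Random(seed)
    observed = mean(sample_a) - mean(sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so reshuffle them and recompute the statistic.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

Here `sample_a` would hold the distances from injury locations to the nearest subway stop or tourist attraction, and `sample_b` the same distances for the random locations. A small p-value indicates that the observed difference is unlikely if injuries were placed without regard to these places. A test that compares whole distributions, such as a two-sample Kolmogorov-Smirnov test, would be an equally reasonable choice.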

Interestingly, there is a small peak at large distances from subway stops or tourist attractions, which occurs because of large, geographically isolated areas such as Randalls Island. Also, the mean distance to the closest subway stop is smaller than the mean distance to the closest tourist attraction, which suggests that subway stops are more evenly distributed than tourist attractions.


What's Next

The Google Places API radar search returns many results (up to 200), but with little detail in each result. The nearby search function returns only 20, but with more detail. To continue this project, I would use the nearby search function and attempt to classify the types of business in each area. For example, areas with high concentrations of bars, such as the East Village, may see many pedestrian injuries on Friday and Saturday nights. Alternatively, areas with high concentrations of banks, such as Midtown Manhattan, may see many pedestrian injuries during the daytime on weekdays. The type of business would act as a feature in my machine learning model.

Parks are another kind of area in New York City where pedestrians may congregate, and I anticipate that the boundaries of parks have more pedestrian injuries than average.

Overall, such features may increase the accuracy of my machine learning model and suggest ways to achieve Vision Zero in New York City.