
Using geospatial data, this repository explores whether the location of popular clubs and bars in San Diego County is related to traffic collisions reported between 2015 and 2019.


SarahAmiraslani/san-diego-collisions


San Diego County Collisions Exploratory Analysis

Overview

This project explores the relationship between traffic collisions and various factors in San Diego County, focusing on data from 2015 to 2019. We investigate the correlation between collision locations and popular nightlife areas, temporal patterns of accidents, and potential police biases in traffic stops.

Table of Contents

  1. Research Questions
  2. Hypotheses
  3. Datasets
  4. Methods
  5. Key Findings
  6. Ethical Considerations
  7. Limitations
  8. Team Members
  9. Acknowledgements

Research Questions

  1. What are the most common types of traffic collisions in San Diego County?
  2. Is there a relationship between high bar density areas and traffic collision frequency?
  3. Which police beats and geographic divisions experience the most severe accidents?
  4. Are there any demographic biases in police traffic stops?

Hypotheses

  1. Minor, non-fatal accidents will be most prevalent.
  2. More collisions will occur near nightlife hotspots (e.g., Pacific Beach, Gaslamp).
  3. Lower-income neighborhoods will experience more severe accidents.
  4. Younger drivers will be stopped and questioned more frequently.

Datasets

  1. Traffic Collisions (2015-2019): 28,122 observations
  2. Police Stops (2018-2019): 179,725 observations
  3. Yelp Bars: 50 observations
    • Source: Yelp API
  4. Yelp Clubs: 49 observations
    • Source: Yelp API

Methods

  • Geospatial analysis of collision locations relative to nightlife areas
  • Temporal analysis of collision frequency by time and day
  • Demographic analysis of police stops
  • Statistical testing of hypotheses
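As an illustration of the geospatial step, here is a minimal sketch of how collisions near nightlife venues could be counted. It is not the project's actual code: it assumes both collisions and venues have already been geocoded to latitude/longitude pairs, and the function names, the 0.5 km radius, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_near_venues(collisions, venues, radius_km=0.5):
    """Count collisions that fall within radius_km of at least one venue."""
    return sum(
        1
        for (clat, clon) in collisions
        if any(haversine_km(clat, clon, vlat, vlon) <= radius_km for (vlat, vlon) in venues)
    )


# Made-up coordinates for illustration only (not taken from the datasets).
venues = [(32.7975, -117.2540), (32.7110, -117.1600)]
collisions = [(32.7980, -117.2535), (32.7115, -117.1605), (32.9000, -117.1000)]
near = count_near_venues(collisions, venues)
```

With only about 100 venues and 28,122 collisions, a brute-force comparison like this is cheap; a spatial index would matter only for much larger venue lists.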

Key Findings

  1. Most common violations: Traffic signal and sign violations
  2. Highest collision frequency: Pacific Beach (1,500 collisions)
  3. Severe accidents: Northwestern San Diego (highest average injuries), Southern San Diego (highest average fatalities)
  4. Demographics: Younger people stopped more frequently and for longer durations
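Per-division averages like those in finding 3 can be computed with a simple group-by. The sketch below shows one way to do it, not necessarily how the project did it: the field names `division` and `injured`, the helper `mean_by_group`, and the sample records are hypothetical and do not reflect the real dataset schema.

```python
from collections import defaultdict


def mean_by_group(records, group_key, value_key):
    """Return {group: mean of value_key} over a list of dict records."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in records:
        totals[row[group_key]] += row[value_key]
        counts[row[group_key]] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Made-up records for illustration only.
records = [
    {"division": "A", "injured": 2},
    {"division": "A", "injured": 0},
    {"division": "B", "injured": 3},
]
avg_injured = mean_by_group(records, "division", "injured")
worst = max(avg_injured, key=avg_injured.get)  # division with the highest average
```

Averages over divisions with few collisions can be dominated by one or two severe incidents, so it helps to report the count per group alongside the mean.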

Ethical Considerations

  • Implemented Safe Harbor protocol to protect individual privacy
  • Careful interpretation of results to avoid reinforcing stereotypes or biases
  • Consideration of socioeconomic factors in analyzing collision patterns

Limitations

  • Incomplete bar and nightclub data from Yelp API
  • Overlapping violation categories in the dataset
  • Broad geographic divisions may obscure local patterns
  • Limited timeframe (2015-2019) may not capture long-term trends

Team Members

Acknowledgements

This project was completed as part of COGS 108: Data Science in Practice at the University of California, San Diego. We thank our instructors and the San Diego Data Portal for providing resources and data.

For more information on San Diego police beats, visit the San Diego Police Department website.
