Author: Rachael Gilbert <rachaelgilbert@gmail.com>
License: Public domain
Source: https://github.com/rcg798/NEISS
This is a short script for experimenting with pandas and NumPy, using 2014 data from the National Electronic Injury Surveillance System (NEISS). The full dataset is available here: http://www.cpsc.gov/en/Research--Statistics/NEISS-Injury-Data/
I wanted to explore a dataset to get more comfortable using Python. Here, I answer the following questions:
- Which three body parts appear most frequently in this dataset?
- Which three body parts appear least frequently?
- How many injuries in this dataset involve a skateboard?
- Of those injuries, what percentage involved male patients and what percentage involved female patients?
- What was the average age of someone injured in an incident involving a skateboard?
- What diagnosis had the highest hospitalization rate?
- What diagnosis most often concluded with the individual leaving without being seen?
- What caveats apply to this analysis?
- How would I visualize the relationship between age and reported injuries?
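
As a rough illustration of how several of the questions above might be approached in pandas, here is a minimal sketch. It runs on a small made-up table, not on the NEISS data, and the column names (body_part, product, sex, age, diagnosis, disposition) and their values are assumptions chosen for readability; the real NEISS files may use different names and coded values.

```python
import pandas as pd

# Toy rows invented for illustration only; these are NOT real NEISS records.
# All column names and category labels here are hypothetical.
toy = pd.DataFrame({
    "body_part": ["Head", "Wrist", "Head", "Ankle", "Head", "Wrist"],
    "product": ["Skateboard", "Skateboard", "Bicycle",
                "Skateboard", "Ladder", "Skateboard"],
    "sex": ["Male", "Female", "Male", "Male", "Female", "Male"],
    "age": [14, 22, 35, 17, 60, 19],
    "diagnosis": ["Concussion", "Fracture", "Laceration",
                  "Sprain", "Fracture", "Fracture"],
    "disposition": ["Admitted", "Released", "Released",
                    "Released", "Admitted", "Released"],
})

# Frequency of each body part; value_counts sorts in descending order.
body_part_counts = toy["body_part"].value_counts()
most_common = body_part_counts.head(3)
least_common = body_part_counts.sort_values().head(3)

# Rows whose product is a skateboard, and how many there are.
skate = toy[toy["product"] == "Skateboard"]
n_skate = len(skate)

# Percentage of each sex among the skateboard rows.
sex_pct = skate["sex"].value_counts(normalize=True) * 100

# Mean age among the skateboard rows.
mean_age = skate["age"].mean()

# Hospitalization rate per diagnosis: the mean of a boolean column is the
# fraction of True values within each group.
admitted = toy["disposition"] == "Admitted"
hosp_rate = admitted.groupby(toy["diagnosis"]).mean()
top_diag = hosp_rate.idxmax()

print(most_common)
print(least_common)
print(n_skate, mean_age)
print(sex_pct)
print(hosp_rate, top_diag)
```

The same pattern (filter, then value_counts or groupby) should carry over to the real data once the actual column names and codes are substituted. Rates computed per diagnosis can be misleading when a diagnosis has very few rows, which is one caveat worth keeping in mind.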