python_project_2

#Question: What is the difference between people's race and gender with diabetes? Who is at a higher risk?

#Working with the CDC Diabetes Data Set was great because it was much better organized than the previous School Regents Scores data. My question was "What is the difference between race and gender with diabetes? Who is at a higher risk?". I started by checking what was in the data, then cleaned it by dropping the columns that had no values or contained only special characters. After confirming that the ID column was unique, I set it as the index of the data frame. I then removed the special character "?" from the other columns. My first attempt worked but produced a warning message; I found that passing "regex=True" gets rid of the warning and removes the special character as well. I used 'groupby' to see how the diabetes records break down by race and gender. More women than men have diabetes in this data: 54,708 records were women and 47,055 were men. Caucasians and African Americans have diabetes the most compared to other races: Caucasian 76,099 and African American 19,210. The 'groupby' really helped show the difference between race and gender: Caucasian women were at the highest risk, and Caucasian men were in 2nd place. The graph I created with 'Seaborn' shows this clearly.
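The cleaning and grouping steps described above can be sketched roughly as follows. This is a minimal illustration, not the exact project code: the rows are made up, the column names (`encounter_id`, `race`, `gender`, `weight`) are assumptions, and the escaped pattern `r"^\?$"` is one possible way to use `regex=True`.

```python
import numpy as np
import pandas as pd

# Toy rows invented for illustration; the column names are assumptions.
df = pd.DataFrame({
    "encounter_id": [101, 102, 103, 104, 105, 106],
    "race": ["Caucasian", "AfricanAmerican", "?", "Caucasian",
             "Caucasian", "AfricanAmerican"],
    "gender": ["Female", "Male", "Female", "Male", "Female", "Female"],
    "weight": ["?", "?", "?", "?", "?", "?"],
})

# Use the ID column as the index only if every value is unique.
if df["encounter_id"].is_unique:
    df = df.set_index("encounter_id")

# Turn cells that are exactly "?" into missing values.
# "?" is a regex metacharacter, so it is escaped in the pattern.
cleaned = df.replace(r"^\?$", np.nan, regex=True)

# Drop columns that have no values left after the replacement.
cleaned = cleaned.dropna(axis="columns", how="all")

# Count records per race and gender; rows with a missing key are skipped.
counts = cleaned.groupby(["race", "gender"]).size()
print(counts)
```

A table like `counts.reset_index(name="n")` could then be passed to a Seaborn bar plot (for example with `x="race"`, `y="n"`, `hue="gender"`) to compare the groups visually.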