Real-time census data could give urban policy makers timely insight into issues such as population displacement and neighborhood change. This study builds on the 2015 paper “Studying user income through language, behavior and affect in social media” by Preotiuc-Pietro et al. and shows how Twitter data can be used to predict a user's income level from the top 20 features selected by a random forest. We trained a Gaussian Process, a Support Vector Machine, and a Random Forest model, achieving a best score of 0.42 for 10-class income level prediction and 0.88 for 3-class income level prediction. The results show that Twitter user income level can be predicted from relatively few features, and they provide a road map for policy makers who want to use Twitter data to generate real-time insights. [Keywords: Twitter, natural language processing, income prediction]
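
The pipeline summarized above (rank features with a random forest, keep the top 20, then compare three models) can be sketched roughly as follows. This is an illustrative sketch and not the repository's code: it assumes scikit-learn, uses synthetic data in place of the Twitter features, treats income level as a 3-class classification target, and every hyperparameter and variable name is a placeholder.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for per-user features and 3 income classes (assumption).
X, y = make_classification(
    n_samples=300, n_features=60, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Rank features by random forest importance, using the training split only.
ranker = RandomForestClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)
top20 = np.argsort(ranker.feature_importances_)[::-1][:20]

# Fit the three model families on the 20 selected features.
models = {
    "gaussian_process": GaussianProcessClassifier(random_state=0),
    "svm": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train[:, top20], y_train)
    # score() reports mean accuracy on the held-out split.
    scores[name] = model.score(X_test[:, top20], y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.2f}")
```

Ranking the features on the training split alone keeps the held-out scores from being influenced by the selection step.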
DishT/Machine_Learning_City