Team Members:
- Amlan Mohanty
- Anwesha Mohanty
- Saurabh Badkas
- Pranav Bhushan
- I-Hung Ko
This project aimed to predict the booking rates of Airbnb listings using advanced predictive analytics. Our team developed and evaluated multiple models, focusing on feature engineering, addressing class imbalance, and data visualization to provide actionable insights for Airbnb hosts.
Developed Predictive Models:
- Created and evaluated 6 predictive models to optimize booking rates for Airbnb listings.
- Implemented XGBoost, achieving a training score of 0.8574 and a test AUC of 0.902, outperforming other groups by approximately 5% on average.
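As a rough illustration of how an AUC figure like the one reported above can be computed, here is a minimal base R sketch of the rank-based (Mann-Whitney) formulation. The function name `auc` and the toy labels and scores are made up for this example; the team's actual evaluation code is not shown in this summary.

```r
# Rank-based AUC, equivalent to the Mann-Whitney U statistic.
# labels: 0/1 vector; scores: predicted probabilities or any ranking score.
# `auc` is a hypothetical helper name, not taken from the project code.
auc <- function(labels, scores) {
  stopifnot(length(labels) == length(scores), all(labels %in% c(0, 1)))
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # average ranks, so tied scores count as half a win
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data, invented for illustration only
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8)
auc(labels, scores)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

Applying the same function to training and held-out predictions gives directly comparable numbers, which is how a training score and a test score can be set side by side.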
Addressed Class Imbalance and Feature Engineering:
- Tackled class imbalance by setting XGBoost's scale_pos_weight parameter to 3.9, which upweights the positive (minority) class during training and improved the model's predictions for that class.
- Conducted exploratory data analysis (EDA) on 50 features and engineered 29 additional features, improving model robustness and predictive power.
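A common heuristic for choosing scale_pos_weight is the ratio of negative to positive training examples. The sketch below assumes that heuristic (the summary does not say how 3.9 was chosen) and uses an invented label vector with about 20% positives, which happens to give a ratio close to 3.9. The feature matrix, nrounds value, and variable names are placeholders, and the training step runs only if the xgboost package is installed.

```r
# Invented binary target for illustration: 204 positives, 796 negatives
y <- rep(c(1, 0), times = c(204, 796))

# Heuristic (an assumption here): weight positives by the negative/positive ratio
spw <- sum(y == 0) / sum(y == 1)
round(spw, 1)  # 3.9

params <- list(
  objective = "binary:logistic",
  eval_metric = "auc",
  scale_pos_weight = spw
)

# Optional training step on placeholder features
if (requireNamespace("xgboost", quietly = TRUE)) {
  set.seed(42)
  X <- matrix(rnorm(length(y) * 3), ncol = 3)  # random noise, not real listing features
  dtrain <- xgboost::xgb.DMatrix(data = X, label = y)
  model <- xgboost::xgb.train(params = params, data = dtrain, nrounds = 20)
  pred <- predict(model, dtrain)  # predicted probabilities of the positive class
}
```

With a weight above 1, errors on positive examples contribute more to the loss, so the model is pushed away from simply predicting the majority class.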
Data Visualization and Analysis:
- Visualized data trends using word clouds, geographical maps, box plots, scatter plots, bar charts, histograms, density plots, and violin plots to identify key factors influencing booking rates.
- Analyzed market trends impacting Airbnb hosts, providing actionable insights for optimizing pricing and listing strategies, potentially increasing rental revenue by up to 20%.
Tools and Technologies:
- Programming Languages: R
- Libraries: xgboost, stats (glm), tree, glmnet, ranger, randomForest
- Visualization Tools: ggplot2, wordcloud, maps
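To give a sense of how one of the plot types mentioned above might be produced with ggplot2, here is a small sketch of a violin plot comparing price distributions by booking-rate class. The data frame `listings`, its columns `high_booking_rate` and `price`, and all values are hypothetical and generated at random; they are not the project's data. The plot is built only if ggplot2 is installed.

```r
# Synthetic listings, invented for illustration only
set.seed(1)
listings <- data.frame(
  high_booking_rate = factor(rep(c("no", "yes"), times = c(200, 100))),
  price = c(rlnorm(200, meanlog = 5.0, sdlog = 0.5),
            rlnorm(100, meanlog = 4.7, sdlog = 0.4))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Violin shows the full distribution; the narrow box plot adds median and quartiles
  p <- ggplot(listings, aes(x = high_booking_rate, y = price)) +
    geom_violin(fill = "grey85") +
    geom_boxplot(width = 0.1, outlier.shape = NA) +
    scale_y_log10() +
    labs(x = "High booking rate", y = "Nightly price (log scale)")
  # Call print(p) to render the plot
}
```

A log scale on the price axis keeps a few very expensive listings from flattening the rest of the distribution.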
Our project successfully identified key factors affecting Airbnb booking rates and provided practical recommendations for Airbnb hosts to optimize their listings. The use of advanced predictive models and comprehensive data analysis has demonstrated significant potential for improving rental revenue.
Our team won the contest with a test AUC of 0.902.