This project analyzes the impact of video content on social media engagement using a dataset of Facebook posts from Thai fashion and cosmetics retailers. The goal is to determine whether video content actually drives higher engagement than other post formats. Principal Component Analysis (PCA), k-means clustering, and logistic regression are applied to identify trends, classify posts, and derive actionable insights for social media marketing strategies.
- Engagement Insights: Comparative analysis of video content versus other post types in terms of reactions, shares, and comments.
- Principal Component Analysis: Applied PCA for dimensionality reduction and to identify the key drivers of engagement.
- Segmentation: Utilized k-means clustering to classify posts into meaningful segments based on their engagement metrics.
- Logistic Regression Models: Created models to predict whether a post is a video based on engagement data.
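
The PCA and segmentation steps listed above can be illustrated with a minimal sketch. It runs on synthetic data rather than the project dataset; the three metrics, the two retained components, and `k=2` are assumptions chosen for illustration and are not taken from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for engagement counts: reactions, comments, shares.
# (Assumed data; the real values live in Data/facebook_live_data.csv.)
rng = np.random.default_rng(0)
low = rng.poisson(lam=[20, 2, 1], size=(150, 3))       # low-engagement posts
high = rng.poisson(lam=[400, 120, 60], size=(50, 3))   # high-engagement posts
X = np.vstack([low, high]).astype(float)

# Standardize so that no single metric dominates the components.
X_scaled = StandardScaler().fit_transform(X)

# Keep two principal components (an arbitrary choice for this sketch).
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

# Segment posts in the reduced space; k=2 is assumed, not tuned.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
segments = kmeans.fit_predict(components)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("posts per segment:", np.bincount(segments))
```

In practice the number of components and clusters would be chosen from the data, for example with a scree plot and an inertia or silhouette comparison.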
- Python: Core programming language
- Pandas, NumPy: Data processing and manipulation
- Matplotlib, Seaborn: Data visualization
- Scikit-learn: Machine learning, including PCA, k-means clustering, and logistic regression
- Jupyter Notebook: Interactive development environment
|-- Data/
|   |-- facebook_live_data.csv   # Dataset with Facebook posts and engagement metrics
|   |-- Data_Article.pdf         # Reference document on engagement patterns
|-- Code/
|   |-- STEFANO_A1.ipynb         # Jupyter Notebook with full analysis and modeling
|   |-- STEFANO_A1.html          # HTML export of the Jupyter Notebook
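
As a rough sketch of how the dataset could be loaded and compared by post type, the snippet below averages engagement metrics per type. The column names (`status_type`, `num_reactions`, `num_comments`, `num_shares`) and the helper `mean_engagement_by_type` are assumptions for illustration; check them against the actual CSV header. A tiny made-up sample stands in for `Data/facebook_live_data.csv`.

```python
import io

import pandas as pd

# Assumed column names; adjust to match the real CSV header.
METRICS = ["num_reactions", "num_comments", "num_shares"]


def mean_engagement_by_type(source, type_col="status_type"):
    """Return mean engagement per post type from a CSV path or file-like object."""
    df = pd.read_csv(source)
    return df.groupby(type_col)[METRICS].mean()


# Made-up rows used only to demonstrate the call.
sample = io.StringIO(
    "status_type,num_reactions,num_comments,num_shares\n"
    "video,100,40,10\n"
    "video,300,60,30\n"
    "photo,50,4,2\n"
    "photo,150,6,0\n"
)
summary = mean_engagement_by_type(sample)
print(summary)
```

With the repository checked out, the same helper could be called as `mean_engagement_by_type("Data/facebook_live_data.csv")`, provided the column names match.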
- Video Performance: Videos show higher engagement overall, but they do not lead on every individual reaction type (for example, "likes").
- Dimensionality Reduction: PCA revealed distinct engagement dimensions related to shares, reactions, and comments.
- Segment Characteristics: Segmentation highlighted clusters with unique engagement profiles, including high-share posts and low-engagement text updates.
- Best Model: The logistic regression model built on the retained PCA components gave the best predictive performance, balancing accuracy and interpretability.
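
The idea behind the PCA-based classifier can be sketched as follows. This is not the notebook's model: the data is synthetic, videos are given more comments and shares by construction, and the number of components and the train/test split are assumed values.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic posts: label 1 = video, 0 = other (assumed data, not the project's).
rng = np.random.default_rng(42)
n = 400
is_video = rng.integers(0, 2, size=n)
reactions = rng.poisson(150, size=n)            # unrelated to the label
comments = rng.poisson(10 + 60 * is_video)      # higher for videos by construction
shares = rng.poisson(3 + 25 * is_video)         # higher for videos by construction
X = np.column_stack([reactions, comments, shares]).astype(float)

X_train, X_test, y_train, y_test = train_test_split(
    X, is_video, test_size=0.25, random_state=42, stratify=is_video
)

# Scale, project onto two principal components, then fit logistic regression.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    LogisticRegression(),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the steps in a pipeline keeps the scaler and PCA fitted on the training split only, which avoids leaking test data into the components.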
- Clone this repository to your local machine.
- Install required Python libraries (pandas, numpy, matplotlib, seaborn, scikit-learn).
- Open STEFANO_A1.ipynb in Jupyter Notebook or any compatible environment.
- Execute the notebook to reproduce the analysis, visualizations, and models.
- Review the HTML file (STEFANO_A1.html) for a static summary of the analysis.
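
Before running the notebook, a quick check like the one below can confirm that the required libraries are importable. It is a small convenience sketch, not part of the repository; the helper name `missing_packages` is hypothetical. Note that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Map each package name (as installed) to its import name.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are available.")
```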
Feel free to reach out with any questions or feedback:
Stefano Compagnone
stefanocompagnone98@gmail.com | +1 617-251-3853