Implementations of statistics and data science algorithms for machine learning and data mining in Python, using NumPy, pandas, and scikit-learn. The data were extracted from MovieLens and Google Trends.
- Data Visualization
  - Bar Plots
  - Histogram Plot
  - LogLog Plot
  - Line Plot
  - Pie Plot
- Distribution Visualization
  - Gaussian Distribution Plot
  - Power Law Plot
  - QQ Plot
- Correlation Analysis
  - Covariance
  - Pearson Correlation
  - Spearman Correlation
  - Fisher-Z Transformation
  - Kendall Correlation
  - Weighted Kendall
  - Cosine Similarity
- Anomaly Detection (Outlier detection & removal)
  - Isolation Forest
  - Local Outlier Factor
  - Elliptic Envelope
  - DBSCAN
  - PCA + DBSCAN
- Data Scaling
  - Min-Max
  - Max-Abs
  - Z-Score
  - Robust-Scaling
- Dimensionality Reduction (Image Compression Example)
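
As a rough illustration of a few of the items listed above (Pearson correlation, Fisher-Z transformation, cosine similarity, Min-Max and Z-Score scaling), here is a minimal NumPy-only sketch. The function names and the synthetic data are assumptions made for this example; they are not taken from this repository's code, which may implement these steps differently (for instance through scikit-learn).

```python
import numpy as np

# Synthetic stand-in data (hypothetical; NOT the MovieLens or Google Trends data).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)


def pearson(a, b):
    """Pearson correlation: covariance over the product of standard deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float((a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c)))


def fisher_z(r):
    """Fisher z-transformation of a correlation coefficient with |r| < 1."""
    return float(np.arctanh(r))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def min_max_scale(a):
    """Rescale values linearly to the range [0, 1]."""
    a = np.asarray(a, dtype=float)
    return (a - a.min()) / (a.max() - a.min())


def z_score_scale(a):
    """Standardize values to zero mean and unit (population) standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()


r = pearson(x, y)
print("Pearson r:", r)
print("Fisher z:", fisher_z(r))
print("Cosine similarity:", cosine_similarity(x, y))
print("Min-Max range:", min_max_scale(x).min(), min_max_scale(x).max())
print("Z-Score mean/std:", z_score_scale(x).mean(), z_score_scale(x).std())
```

Note that `min_max_scale` and `z_score_scale` as sketched here divide by zero on constant input; a real pipeline would need to guard against that case.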