This is an Anomaly Detection project that I completed during my tenure at the Infosys Springboard Internship.
- This milestone covers basic pre-processing of the dataset, such as handling missing values and replacing them using different techniques.
- Performed univariate and bivariate visualizations to gain better insight into the data.
- Used K-means and DBSCAN clustering to analyze the distinct groups (clusters) in the data.
- Used different techniques to find the optimal number of clusters, such as the Elbow Plot, variance calculation, and others.
- Performed feature engineering before applying the ML models.
- Encoded categorical columns with low cardinality (only 2 distinct values) using binary encoding, and columns with high cardinality using frequency encoding, to avoid the sparse-data problem that other encoding techniques can cause.
- Used StandardScaler to scale all the values.
- Machine Learning Models:
  1. Isolation Forest
  2. Elliptic Envelope
  3. One-Class SVM
- Visualized the results of each machine learning model: scatter plots for numerical columns to show anomalies and normal points, and bar plots and pie charts for categorical columns.
- Used a simple feed-forward neural network to detect anomalies:
  - The network consists of 5 layers.
  - Reconstruction error is used as the metric to separate normal from anomalous entries, based on a threshold.
- Visualized the results of the deep learning model: scatter plots for numerical columns to show anomalies and normal points, and bar plots and pie charts for categorical columns.
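
The encoding, scaling, and classical ML steps listed above can be sketched roughly as follows. This is only a minimal illustration and not the project's actual code: the toy dataset, the column names (`amount`, `duration`, `is_member`, `city`), the helper `encode_features`, and the `contamination` value of 0.05 are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Hypothetical toy data; the real dataset and its columns are not shown here.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "amount": rng.normal(100.0, 15.0, n),
    "duration": rng.normal(30.0, 5.0, n),
    "is_member": rng.choice(["yes", "no"], n),       # 2 distinct values
    "city": rng.choice(["A", "B", "C", "D", "E"], n),  # stands in for a high-cardinality column
})
df.loc[:4, "amount"] = 1000.0  # inject a few obvious outliers (rows 0 to 4)


def encode_features(frame: pd.DataFrame) -> pd.DataFrame:
    """Binary-encode the 2-valued column and frequency-encode the other one."""
    out = frame.copy()
    # Binary encoding: map the two categories to 1 and 0.
    out["is_member"] = (out["is_member"] == "yes").astype(int)
    # Frequency encoding: replace each category by its relative frequency,
    # which keeps a single column instead of one column per category.
    freq = out["city"].value_counts(normalize=True)
    out["city"] = out["city"].map(freq)
    return out


encoded = encode_features(df)

# Scale every feature to zero mean and unit variance.
X = StandardScaler().fit_transform(encoded)

contamination = 0.05  # assumed share of anomalies, chosen for illustration

detectors = {
    "Isolation Forest": IsolationForest(contamination=contamination, random_state=0),
    "Elliptic Envelope": EllipticEnvelope(contamination=contamination, random_state=0),
    "One-Class SVM": OneClassSVM(kernel="rbf", gamma="scale", nu=contamination),
}

# fit_predict returns -1 for points flagged as anomalies and 1 for normal points.
labels = {name: model.fit_predict(X) for name, model in detectors.items()}

for name, pred in labels.items():
    print(f"{name}: {int((pred == -1).sum())} points flagged as anomalies")
```

The resulting `labels` arrays can be attached to the original DataFrame as extra columns, which makes it straightforward to color scatter plots or group bar plots by anomaly status.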