- Analyzing Numerical & Categorical data
- Count plots of Categorical variables
- Histogram plots of Numerical variables
- Box plots of numerical variables grouped by a categorical variable
- Histograms of numerical variables grouped by a categorical variable
- Transformation of Numerical variables
- Normalization or Standardization
- Missing values of numerical variables can be imputed with the median; missing values of categorical variables can be imputed with the mode.
- We can also try several group-by methods (for example, the median within each category) to decide how to impute.
- Handling of outliers
- Dummy variable creation for categorical variables (avoiding multicollinearity)
- EDA is applied extensively, and several Scikit-Learn models are used: Logistic Regression, Support Vector Machine, K-Nearest Neighbours, and Random Forest.
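
The preprocessing steps listed above (imputation, outlier handling, dummy variables, standardization) can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the toy DataFrame and its column names (`age`, `income`, `gender`, `segment`) are hypothetical, and clipping at 1.5 × IQR is only one assumed way of handling outliers.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "age": [22.0, None, 35.0, 58.0, None, 41.0],
    "income": [7.25, 71.28, 8.05, 26.55, 8.46, 512.33],
    "gender": ["male", "female", None, "female", "male", "male"],
    "segment": ["b", "a", "b", "a", "b", "a"],
})

# Group-by imputation: fill missing ages with the median of their segment,
# then fall back to the overall median for anything still missing.
df["age"] = df["age"].fillna(df.groupby("segment")["age"].transform("median"))
df["age"] = df["age"].fillna(df["age"].median())

# Categorical imputation: fill missing values with the mode.
df["gender"] = df["gender"].fillna(df["gender"].mode().iloc[0])

# Outlier handling (assumed rule): clip values outside 1.5 * IQR.
q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["income"] = df["income"].clip(lower=lower, upper=upper)

# Dummy variables: drop_first=True removes one level per variable so the
# remaining dummy columns are not perfectly collinear.
encoded = pd.get_dummies(df, columns=["gender", "segment"], drop_first=True)

# Standardization: zero mean and unit variance for the numerical columns.
num_cols = ["age", "income"]
encoded[num_cols] = (
    encoded[num_cols] - encoded[num_cols].mean()
) / encoded[num_cols].std(ddof=0)

print(encoded)
```

Dropping the first level of each categorical variable keeps the information intact, since the dropped level is implied when all remaining dummy columns are zero. In a real workflow the imputation values and scaling statistics would normally be computed on the training split only and then applied to the test split.
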
MIT