This repository contains code for several machine learning and deep learning models for multiclass text classification. The models implemented include support vector machines (SVM), multinomial naive Bayes, logistic regression, random forests, ensemble learning, AdaBoost, gradient boosting, convolutional neural networks (CNN), and recurrent neural networks (RNN) with gated recurrent units (GRU).

- Python
- Scikit-learn
- TensorFlow
- Keras

The dataset used in this project is the bbc-text dataset, which consists of approximately 2225 text documents.
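
As a rough illustration of inspecting such a dataset, the sketch below reads labelled texts from CSV data and counts the documents per category. The column names `category` and `text`, the helper `load_rows`, and the inline sample rows are assumptions made for this example; they are not taken from this repository, so adjust them to the actual file.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the dataset file. The real file name and
# column names are assumptions, not taken from this repository.
SAMPLE_CSV = """category,text
sport,team wins the cup final
business,bank profits rise sharply
sport,striker scores twice
tech,new laptop software released
"""


def load_rows(handle):
    """Read (category, text) pairs from a CSV file-like object."""
    reader = csv.DictReader(handle)
    return [(row["category"], row["text"]) for row in reader]


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Count how many documents fall under each label.
label_counts = Counter(category for category, _ in rows)
print(len(rows), dict(label_counts))
```

To run this against a real file, pass an opened file instead of the `io.StringIO` object, for example `open(path, newline="", encoding="utf-8")`.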
The results of each model on the bbc-text dataset are as follows:
Model | Accuracy |
---|---|
Logistic Regression | 96.58% |
Support Vector Machine | 96.94% |
Multinomial Naive Bayes | 94.97% |
Random Forest | 95.15% |
GradientBoostingClassifier | 94.25% |
Ensemble Classifier | 97.12% |
AdaBoost | 94.43% |
LSTM 1-Layer | 99.22% |
LSTM 2-Layers | 97.78% |
GRU | 91.74% |
CNN+LSTM | 98.73% |
BERT | 99.60% |
XLNet | 99.46% |
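
For reference, the sketch below shows how an accuracy figure of this kind is usually computed: the fraction of test documents whose predicted label matches the true label. It assumes the table reports this standard metric; the train/test split and evaluation protocol behind the numbers above are not stated here, and the labels in the example are made up.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty inputs")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only.
y_true = ["sport", "tech", "business", "sport", "politics"]
y_pred = ["sport", "tech", "sport", "sport", "politics"]

print(f"{accuracy(y_true, y_pred):.2%}")  # prints 80.00%
```
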

A web application built on these models would allow users to enter text and receive a prediction of the most likely category or label for that text.
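
The prediction step behind such an application could look roughly like the sketch below. It is a minimal example, not the repository's implementation: the choice of TF-IDF features with logistic regression, the function name `predict_category`, and the tiny training set are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data for illustration only; a real model would be
# trained on the full labelled dataset.
train_texts = [
    "shares rise as bank profits grow",
    "firm reports strong quarterly profits",
    "team wins football match in final minute",
    "striker scores twice as team wins cup",
    "new smartphone software update released",
    "engineers release faster laptop software",
]
train_labels = ["business", "business", "sport", "sport", "tech", "tech"]

# TF-IDF features feeding a logistic regression classifier
# (an assumed model choice, one of several listed in this README).
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)


def predict_category(text):
    """Return the most likely category label for a single text."""
    return str(model.predict([text])[0])


print(predict_category("football team wins the cup"))
```

A web front end would pass the user's input to a function like `predict_category` and display the returned label.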