This project is my solution for annotating the sentiment of NPS (Net Promoter Score) comments using pre-trained language models.
This is not meant to work in a single environment. I switched between two environments (one using PyTorch, one using TensorFlow) to make everything run smoothly.
- PyTorch
- fastai
- TensorFlow 2.0
- TensorFlow Hub
This is not a fair comparison, so please treat it only as a reference. I personally like the small model (gnews-swivel) provided by TensorFlow Hub: it is very easy to train, still powerful enough to give decent results, and much easier to deploy.
Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
ULMFIT (Fastai) | 0.97 | 0.98 | 0.92 | 0.95 |
gnews-swivel (TF Hub) | 0.93 | 0.93 | 0.84 | 0.88 |
nnlm-en-50 (TF Hub) | 0.96 | 0.96 | 0.90 | 0.93 |
nnlm-en-128 (TF Hub) | 0.96 | 0.94 | 0.93 | 0.94 |
AutoML (Google) | 0.96/0.93 | 0.98/0.81 | 0.87/0.96 | 0.92/0.88 |
Note: Since the (AA custom-trained) AutoML model outputs `Neutral` in addition to `Positive` and `Negative`, an unbiased comparison is very challenging. I therefore recorded two values per metric: the first treats `Neutral` as `Negative`, and the second treats `Neutral` as `Positive`. Also, AutoML was tested against only half of the human-labelled dataset.
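The two-value scoring described in the note can be sketched in plain Python. This is a minimal illustration, not the code used to produce the table: the function name `binary_metrics`, the toy labels, and the choice of `Positive` as the scored class are all assumptions (the source does not state which class the precision and recall refer to).

```python
from typing import Dict, List


def binary_metrics(y_true: List[str], y_pred: List[str], neutral_as: str) -> Dict[str, float]:
    """Collapse "Neutral" predictions into `neutral_as`, then score the result.

    Assumption: "Positive" is treated as the positive class.
    """
    # Map every Neutral prediction onto the chosen binary label.
    mapped = [neutral_as if p == "Neutral" else p for p in y_pred]
    pairs = list(zip(y_true, mapped))

    tp = sum(t == "Positive" and p == "Positive" for t, p in pairs)
    fp = sum(t == "Negative" and p == "Positive" for t, p in pairs)
    fn = sum(t == "Positive" and p == "Negative" for t, p in pairs)
    tn = sum(t == "Negative" and p == "Negative" for t, p in pairs)

    # Guard against empty denominators so the sketch never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: human labels are binary, model predictions may be Neutral.
y_true = ["Positive", "Positive", "Negative", "Negative", "Positive", "Negative"]
y_pred = ["Positive", "Neutral", "Negative", "Neutral", "Positive", "Positive"]

as_negative = binary_metrics(y_true, y_pred, neutral_as="Negative")  # 1st value in the table
as_positive = binary_metrics(y_true, y_pred, neutral_as="Positive")  # 2nd value in the table
print(as_negative)
print(as_positive)
```

Under this assumption, moving `Neutral` into `Positive` can only add predicted positives, so recall cannot fall while precision may drop. That is the same direction as the shift between the two AutoML values in the table.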
Personally, I lean toward treating `Neutral` as `Negative`, since most `Neutral` comments are suggestions for improving the customer experience, and we should take them seriously and improve accordingly.
Shoutout to all AA ASL 2018 members!
This project is not associated with my current employer. It was initiated solely in an attempt to beat Google AutoML.