An easy-to-start code base for using BERT with ಕನ್ನಡ (the Kannada language).
It uses the pretrained multilingual BERT model to generate sentence embeddings and trains a sentence classifier on top of them.
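
The embedding-plus-classifier approach can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the checkpoint name `bert-base-multilingual-cased`, the mean pooling over tokens, the logistic-regression classifier, and the helper names `batched`, `embed_sentences`, and `train_classifier` are all assumptions, since the README does not say which of these were used. It assumes `torch`, `transformers`, `numpy`, and `scikit-learn` are installed.

```python
from typing import Iterator, List, Sequence

# Assumption: the README does not name the checkpoint; this is one public mBERT model.
MODEL_NAME = "bert-base-multilingual-cased"


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_sentences(sentences: Sequence[str], batch_size: int = 16):
    """Return one mean-pooled mBERT vector per sentence as a NumPy array."""
    # Heavy dependencies are imported here so `batched` stays usable without them.
    import numpy as np
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()  # inference only; BERT weights stay frozen

    chunks = []
    with torch.no_grad():
        for batch in batched(sentences, batch_size):
            enc = tokenizer(batch, padding=True, truncation=True,
                            max_length=128, return_tensors="pt")
            hidden = model(**enc).last_hidden_state  # (batch, tokens, hidden)
            mask = enc["attention_mask"].unsqueeze(-1).to(hidden.dtype)
            # Average over real tokens only; padding positions have mask 0.
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
            chunks.append(pooled.numpy())
    return np.concatenate(chunks, axis=0)


def train_classifier(train_sentences: Sequence[str], train_labels: Sequence[str]):
    """Fit a logistic-regression classifier on the frozen sentence embeddings."""
    from sklearn.linear_model import LogisticRegression

    clf = LogisticRegression(max_iter=1000)
    clf.fit(embed_sentences(train_sentences), train_labels)
    return clf
```

With a list of Kannada sentences and their labels, `train_classifier(sentences, labels)` returns a fitted model, and `clf.predict(embed_sentences(new_sentences))` labels unseen text. The first call downloads the checkpoint, so it needs network access.
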
Dataset was taken from here
classes | precision | recall | f1-score | support |
---|---|---|---|---|
entertainment | 0.85 | 0.93 | 0.89 | 282 |
sports | 0.85 | 0.79 | 0.82 | 177 |
tech | 0.82 | 0.64 | 0.72 | 58 |
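
As a quick sanity check on the table, each f1-score should be the harmonic mean of that row's precision and recall. The short sketch below recomputes it from the reported values; the helper name `f1_from_pr` is made up for illustration.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) copied from the table above.
report = {
    "entertainment": (0.85, 0.93, 282),
    "sports": (0.85, 0.79, 177),
    "tech": (0.82, 0.64, 58),
}

f1_by_class = {
    name: round(f1_from_pr(p, r), 2) for name, (p, r, _support) in report.items()
}
print(f1_by_class)
```

The recomputed values agree with the f1-score column to two decimals. The table also shows that `tech` has both the lowest recall and the smallest support.
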
- Use a better classifier
- Fine-tune the BERT model on a larger corpus
- Use BERT for other NLP tasks in Kannada