
(v2) Rework of the Javascript-based web application Multilabel Tagalog Hate Speech Classifier (MLTHSC)


syke9p3/retrain-mlthsc


MLTHSC JavaScript Web App (v2)

Now works online!

This repository contains all files related to the rework of the JavaScript-based web app for the Multilabel Tagalog Hate Speech Classifier (MLTHSC). It also includes the Jupyter Notebooks written for retraining the classifier model in Python.

(v1): GitHub: syke9p3/mlthsc-thesis

The project started as a college thesis proposal - a hate speech classifier that can classify Tagalog text based on different categories like age, gender, physical, etc.
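To make "multilabel" concrete: a single text can receive several labels at once, because each label is scored independently. The snippet below is only a rough sketch of that idea, not the classifier's actual code. The label names come from the description above, while `pickLabels`, the raw scores, and the 0.5 threshold are made up for illustration.

```javascript
// Illustrative only: `pickLabels`, the scores, and the threshold are
// assumptions for this sketch, not taken from the MLTHSC model or app.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Turn one raw score per label into independent probabilities and keep
// every label whose probability reaches the threshold.
function pickLabels(rawScores, labels, threshold = 0.5) {
  return labels
    .map((label, i) => ({ label, score: sigmoid(rawScores[i]) }))
    .filter(({ score }) => score >= threshold);
}

const labels = ["age", "gender", "physical"];
const result = pickLabels([2.0, -1.5, 0.3], labels);

// Two labels pass at once, which is what makes the task multilabel.
console.log(result.map((r) => r.label)); // [ 'age', 'physical' ]
```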

Most of our time was spent on writing the manuscript, gathering text data, implementing the software architecture of our model, training, testing, etc. Time was running short before the defense, so we needed to build something fast: a simple user interface that would demonstrate the functionality to the panelists. As a result, the v1 web app wasn't very polished, hence the need for a rework.

The greatest challenge we had was making the classifier work once deployed online. The original model was large (over 500 MB), and hosting sites could not accommodate it because they limit the size of uploaded files. That is why the model had to be retrained and quantized: the smaller model is lighter to load and performs inference faster, at the cost of some classification accuracy. Regardless, the original model is still available to try, and I'll be deploying a demo of it on Hugging Face Spaces soon.
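For a rough idea of why quantization shrinks a model and costs some accuracy, here is a minimal sketch of 8-bit affine quantization on a handful of made-up weights. This is not the pipeline used to retrain MLTHSC (that lives in the notebooks). The functions `quantize` and `dequantize` and the sample values are hypothetical.

```javascript
// Illustrative 8-bit affine quantization. The function names and the sample
// weights are assumptions for this sketch, not the project's actual pipeline.
function quantize(weights) {
  let min = Infinity;
  let max = -Infinity;
  for (const w of weights) {
    if (w < min) min = w;
    if (w > max) max = w;
  }
  // Map the [min, max] range onto the 256 values a byte can hold.
  // Fall back to 1 when all weights are equal, to avoid dividing by zero.
  const scale = (max - min) / 255 || 1;
  const q = new Uint8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    q[i] = Math.round((weights[i] - min) / scale);
  }
  return { q, scale, min };
}

function dequantize({ q, scale, min }) {
  const out = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) {
    out[i] = q[i] * scale + min;
  }
  return out;
}

const weights = new Float32Array([-1.2, -0.31, 0, 0.47, 1.9]);
const packed = quantize(weights);
const restored = dequantize(packed);

// Each weight now takes 1 byte instead of 4.
console.log(weights.byteLength, packed.q.byteLength); // 20 5
// The restored values are close to the originals, but not identical.
console.log(Array.from(restored));
```

The 4x size reduction is where the faster loading comes from, and the rounding step is where the small loss in accuracy comes from.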

What changed?

  • now works online (even on mobile)
  • deployed on GitHub Pages
  • hosted the classifier model on Hugging Face
  • quantized the model, because loading the original, larger model in the application took too long before any classification could actually run. This was the challenge we had from the start when trying to deploy the model online.
  • refactored the JavaScript code to make adding features faster
  • changed the input limit from 3-280 words to 15-280 characters
  • added "last classified text" section
  • enhanced appearance of saved post cards
  • changed the appearance of the buttons
  • outlines for accessibility
  • FAQ section
    • overview of the tool
    • definitions for each label
    • how the classifier works
  • source code link
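
The new input limit from the list above (15 to 280 characters) could be checked with something like the sketch below. It is not copied from the app's source: `validateInput`, the trimming, the code-point counting, and the messages are all assumptions made for illustration.

```javascript
// Hypothetical helper: the names, trimming behavior, and messages are
// illustrative and not taken from the app's source code.
const MIN_CHARS = 15;
const MAX_CHARS = 280;

function validateInput(text) {
  const trimmed = text.trim();
  // Count Unicode code points rather than UTF-16 code units, so a
  // character outside the basic plane is not counted twice.
  const length = Array.from(trimmed).length;
  if (length < MIN_CHARS) {
    return { valid: false, length, reason: `Enter at least ${MIN_CHARS} characters.` };
  }
  if (length > MAX_CHARS) {
    return { valid: false, length, reason: `Keep the text within ${MAX_CHARS} characters.` };
  }
  return { valid: true, length, reason: null };
}

console.log(validateInput("too short").valid); // false
console.log(validateInput("a".repeat(15)).valid); // true
```

Counting characters instead of words means short phrases with long words are accepted, while one- or two-word inputs that carry too little context are still rejected.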

Still working on

  • features section in FAQ
  • filtering
  • pagination
  • dark mode (restructuring CSS)
  • implementing accessibility (ARIA) standards
  • (might try to rewrite this again in React)
