Anmol2059/NLPineers

Accepted at COLING 2025, part of CHiPSAL: Challenges in Processing South Asian Languages



This repository hosts the code for the project titled "Hate Speech Detection using Ensembling of BERT-based Models" for Devanagari script languages (Hindi, Nepali). The aim is to leverage state-of-the-art techniques like BERT for hate speech detection in South Asian languages.

About the Project

This project develops an ensemble of BERT-based models for hate speech detection, tailored to languages written in the Devanagari script, such as Hindi and Nepali. The goal is to improve the detection of hate speech and offensive content in social media posts, comments, and other online text in these languages.
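As an illustration of what ensembling can look like, here is a minimal sketch of soft voting: the per-class probabilities from several classifiers are averaged and the highest-scoring class is chosen. This README does not state which ensembling strategy the project uses, so treat soft voting as an assumption. The function name `soft_vote`, the optional `weights` parameter, and the example probabilities are all hypothetical.

```python
from typing import List, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Combine per-class probabilities from several models by weighted averaging.

    model_probs[m][i][c] is the probability model m assigns to class c
    for example i. Returns the predicted class index for each example.
    """
    if not model_probs:
        raise ValueError("need predictions from at least one model")
    n_models = len(model_probs)
    if weights is None:
        weights = [1.0] * n_models  # assumption: equal weight per model
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")
    total = sum(weights)

    predictions = []
    for i in range(len(model_probs[0])):
        n_classes = len(model_probs[0][i])
        # Weighted mean probability of each class for example i.
        avg = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs)) / total
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Made-up probabilities for two examples (class 0 = non-hate, class 1 = hate).
m1 = [[0.9, 0.1], [0.4, 0.6]]
m2 = [[0.6, 0.4], [0.3, 0.7]]
m3 = [[0.2, 0.8], [0.55, 0.45]]
print(soft_vote([m1, m2, m3]))  # [0, 1]
```

In practice the probability lists would come from the softmax outputs of the fine-tuned models. Passing `weights` lets a stronger model count for more than a weaker one.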

Scripts for:

  • Data Augmentation: Augmentation techniques are applied to both the Hindi and Nepali datasets to address class imbalance and improve model performance.
  • Multiple Models: The repository includes several models (referred to as m1, m2, m3, etc.) covering different configurations and techniques. Please refer to the paper's table for a detailed description of each model.
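The augmentation techniques used by the scripts are not described here. As a point of comparison only, the sketch below shows random oversampling, a simple baseline for class imbalance that duplicates minority-class examples until every class matches the largest one. It is not the repository's method, and the function name `random_oversample` and the placeholder data are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Hashable, List, Sequence, Tuple


def random_oversample(
    texts: Sequence[str], labels: Sequence[Hashable], seed: int = 0
) -> Tuple[List[str], List[Hashable]]:
    """Duplicate randomly chosen minority-class examples until classes are balanced."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    if not texts:
        return [], []

    rng = random.Random(seed)  # fixed seed keeps the resampling reproducible
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap up to the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_texts.extend(items + extra)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Placeholder data: three non-hate examples (0) and one hate example (1).
sample_texts = ["text a", "text b", "text c", "text d"]
sample_labels = [0, 0, 0, 1]
balanced_texts, balanced_labels = random_oversample(sample_texts, sample_labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, so that duplicated examples do not leak into validation or test data.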

Setup

  1. Install Dependencies:

    pip install -r requirements.txt
  2. Dataset Configuration:

Update the dataset location to match your setup, and make sure the dataset path is set correctly in each script file before running it.

  3. Running the Models:
    cd models
    
    python m1_chipsal.py
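The dataset configuration step asks you to edit the path in each script. One way to avoid repeated edits is to read the location from an environment variable with a fallback default, as in the minimal sketch below. The variable name `CHIPSAL_DATA_DIR`, the helper `resolve_dataset_dir`, and the default directory `data` are assumptions for illustration; the scripts in this repository may handle paths differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical environment variable name; not defined by this repository.
DATA_DIR_ENV = "CHIPSAL_DATA_DIR"


def resolve_dataset_dir(
    default: str = "data", env: Optional[Mapping[str, str]] = None
) -> Path:
    """Return the dataset directory from the environment, or a default path."""
    if env is None:
        env = os.environ
    value = env.get(DATA_DIR_ENV)
    # An unset or empty variable falls back to the default directory.
    return Path(value) if value else Path(default)


dataset_dir = resolve_dataset_dir()
print(f"Using dataset directory: {dataset_dir}")
```

With this pattern, the path is set once in the shell (for example `export CHIPSAL_DATA_DIR=/path/to/data`) before running a model script.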
    

Citation

The citation will be released later.


Contact

For any queries, feel free to reach out via email 📧:
