NLPineers@ NLU of Devanagari Script Languages 2025: Hate Speech Detection using Ensembling of BERT-based Models

Accepted at COLING 2025, part of CHiPSAL: Challenges in Processing South Asian Languages

This repository hosts the code for the project titled "Hate Speech Detection using Ensembling of BERT-based Models" for Devanagari script languages (Hindi, Nepali). The aim is to leverage state-of-the-art techniques like BERT for hate speech detection in South Asian languages.

About the Project

This project focuses on developing an ensemble-based model for hate speech detection using BERT-based architectures, specifically tailored for languages that use the Devanagari script, such as Hindi and Nepali. The goal is to improve the detection of hate speech and offensive content in social media posts, comments, and other online platforms in these languages.
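This README does not say how the individual models are combined, so the following is only a minimal sketch of one common ensembling scheme, soft voting, in which the class probabilities of several models are averaged and the highest-scoring class is chosen. The function name `soft_vote`, the probability values, and the two-class label layout are assumptions made for illustration, not the repository's actual implementation.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average per-class probabilities across models and return argmax labels.

    prob_sets[m][i][c] is the probability that model m assigns class c to sample i.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    labels = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class over all models for sample i.
        avg = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        # Pick the class with the highest averaged probability.
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


# Hypothetical outputs of three models for two posts: [P(non-hate), P(hate)].
m1 = [[0.9, 0.1], [0.4, 0.6]]
m2 = [[0.6, 0.4], [0.3, 0.7]]
m3 = [[0.7, 0.3], [0.55, 0.45]]

predictions = soft_vote([m1, m2, m3])
print(predictions)
```

In this toy input the third model disagrees with the other two on the second post, but the averaged probabilities still favor the hate class, which is the kind of smoothing an ensemble is meant to provide.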

Scripts for:

Data Augmentation: Augmentation techniques are applied to both the Hindi and Nepali datasets to address class imbalance and improve model performance.
Multiple Models: The repository includes several models (referred to as m1, m2, m3, etc.), each covering a different configuration or technique. Please refer to the table in the paper for a detailed description of each model.
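The specific augmentation techniques are not listed here. As a rough illustration of how class imbalance can be reduced, the sketch below uses plain random oversampling of the minority class, which is a simpler stand-in and not necessarily what the repository's scripts do. The function name `oversample`, the sample rows, and the label encoding (0 for non-hate, 1 for hate) are all hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple


def oversample(rows: List[Tuple[str, int]], seed: int = 0) -> List[Tuple[str, int]]:
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(label for _, label in rows)
    target = max(counts.values())  # size of the largest class
    balanced = list(rows)
    for label, count in counts.items():
        pool = [row for row in rows if row[1] == label]
        # Add (target - count) extra copies drawn with replacement from this class.
        balanced.extend(rng.choice(pool) for _ in range(target - count))
    rng.shuffle(balanced)
    return balanced


# Hypothetical (text, label) rows: three non-hate posts and one hate post.
rows = [("post a", 0), ("post b", 0), ("post c", 0), ("post d", 1)]
balanced = oversample(rows)
label_counts = Counter(label for _, label in balanced)
print(label_counts)
```

Oversampling only repeats existing text; augmentation that generates new variants of minority-class posts would replace the `rng.choice(pool)` step with a transformation of the chosen row.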

Setup

Install Dependencies:
```
pip install -r requirements.txt
```
Dataset Configuration:

Update the dataset location to match your setup, and make sure the dataset path is set correctly in each script file you run.
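One way to avoid editing paths in several scripts is to resolve the dataset directory in a single place. The sketch below is an assumption about how that could look, not code from the repository: the environment variable `CHIPSAL_DATA_DIR`, the helper names, the default directory, and the file name `train.csv` are all hypothetical.

```python
import os
from pathlib import Path


def resolve_dataset_dir(default: str = "chipsal-datasets") -> Path:
    """Return the dataset directory from the (hypothetical) CHIPSAL_DATA_DIR variable.

    Falls back to `default` when the variable is not set.
    """
    return Path(os.environ.get("CHIPSAL_DATA_DIR", default)).expanduser()


def require_file(data_dir: Path, filename: str) -> Path:
    """Return the full path to `filename`, failing early with a clear message if it is missing."""
    path = data_dir / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected dataset file at {path}; set CHIPSAL_DATA_DIR or fix the path in the script."
        )
    return path


data_dir = resolve_dataset_dir()
print(f"Using dataset directory: {data_dir}")
# Example call with a hypothetical file name:
# train_path = require_file(data_dir, "train.csv")
```

Failing early with the resolved path in the error message makes a misconfigured location obvious before any model starts loading.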

Running the Models:
```
cd models
python m1_chipsal.py
```

For citation

Citation details will be released later.

Contact

For any queries, feel free to reach out via email 📧:

