- IPV stands for Intimate Partner Violence.
- It mostly manifests in physical, sexual, and emotional forms.
- The project aims to study the prevalence of violent and abusive behaviour in intimate relationships (on ChildSafeNet's end) and to create an AI tool that helps identify text potentially containing IPV on digital platforms (on our end).
- IPV is increasingly prevalent in dating relationships among teenagers and youths, especially in online contexts.
- No research has been conducted on IPV on digital platforms.
- TASK: Identify aspect terms and assign each an aspect category and a polarity.
- Studies the presence of IPV content at the phrase level.
- Specific, but may lose the surrounding context.
- TASK: Sentence-level IPV classification.
- Studies the presence of IPV content at the sentence level.
- Context is somewhat preserved, but the label may not be specific when a single sentence carries two or more sentiment-bearing components.
The project consists of two main tasks: Aspect Term Extraction and IPV Polarity Classification.
- A sequence labelling problem in which each token is mapped to its corresponding aspect category.
- Each aspect category is one of:
- Profanity
- Character assassination
- Emotional abuse
- Physical threat
- Rape threat
- General threat
- Violence based on ethnicity
- Violence based on religion
- Sexism
- Others
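
To make the sequence labelling formulation concrete, the sketch below tags each token with one of the aspect categories above. The BIO tagging scheme, the `bio_tags` helper, and the example sentence are assumptions made for illustration; the project does not state which tagging scheme it uses.

```python
# Aspect categories as listed in the project description.
ASPECT_CATEGORIES = [
    "Profanity", "Character assassination", "Emotional abuse",
    "Physical threat", "Rape threat", "General threat",
    "Violence based on ethnicity", "Violence based on religion",
    "Sexism", "Others",
]


def bio_tags(tokens, spans):
    """Return one tag per token (hypothetical helper).

    `spans` is a list of (start, end, category) tuples indexing into
    `tokens`, with `end` exclusive. Using BIO tags is an assumption.
    """
    tags = ["O"] * len(tokens)  # "O" marks tokens outside any aspect term
    for start, end, category in spans:
        if category not in ASPECT_CATEGORIES:
            raise ValueError(f"unknown aspect category: {category}")
        tags[start] = f"B-{category}"  # first token of the aspect term
        for i in range(start + 1, end):
            tags[i] = f"I-{category}"  # remaining tokens of the same term
    return tags


# Illustrative sentence, not taken from the project data.
tokens = ["I", "will", "hurt", "you", "tonight"]
tags = bio_tags(tokens, [(1, 4, "Physical threat")])
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence labelling model would then be trained to predict the tag column from the token column.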
- Here, we map the text into one of three categories: {IPV, non-IPV, Unknown}.
- The model can be trained in one of three modes:
- Text --> Polarity
- Text + aspect_term --> Polarity
- Text + aspect_term + aspect_category --> Polarity
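
The three training modes differ only in which fields are given to the classifier. A minimal sketch of how the input could be assembled is shown below; the `build_input` helper and the `" [SEP] "` separator are assumptions, not the project's actual preprocessing.

```python
# Polarity labels named in the project description.
LABELS = ("IPV", "non-IPV", "Unknown")


def build_input(text, aspect_term=None, aspect_category=None, sep=" [SEP] "):
    """Join the available fields into one classifier input string.

    Hypothetical helper: the separator and the field order are assumptions.
    """
    if aspect_category is not None and aspect_term is None:
        raise ValueError("aspect_category requires aspect_term")
    parts = [text]
    if aspect_term is not None:
        parts.append(aspect_term)
    if aspect_category is not None:
        parts.append(aspect_category)
    return sep.join(parts)


text = "example sentence"  # placeholder, not project data
mode_1 = build_input(text)                                  # Text
mode_2 = build_input(text, "example term")                  # Text + aspect_term
mode_3 = build_input(text, "example term", "Profanity")     # Text + aspect_term + aspect_category
print(mode_1, mode_2, mode_3, sep="\n")
```

Each resulting string would be paired with one label from `LABELS` as a training example.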
- Used a web-based annotation tool called WebAnno.
- Methodology:
- Created batches of 10 sentences from the curated data.
- Annotations were created by two interns at ChildSafeNet and exported to a TSV file.
- A subset (almost 15%) of the total data was annotated by both interns to measure inter-annotator agreement.
- A parser was built to convert the data into a suitable input format for the aspect extraction model and the IPV classifier.
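
The agreement check on the doubly annotated subset can be sketched as follows. The project does not say which agreement statistic was used; Cohen's kappa is assumed here because there are two annotators, and the label lists are made-up placeholders.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Placeholder labels for four shared sentences (not project data).
annotator_1 = ["IPV", "IPV", "non-IPV", "non-IPV"]
annotator_2 = ["IPV", "non-IPV", "non-IPV", "non-IPV"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.5
```

If scikit-learn is available, `sklearn.metrics.cohen_kappa_score` computes the same statistic.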
- Curated search terms (queries) for the Twitter scraper.
- Computed term frequencies over the IPV data.
- Excluded stopwords.
- Selected the highest-frequency words.
- Handpicked some low-frequency words as well.
- Used spelling variations of words:
- अलच्छिनी OR अलक्षिनि (roughly "ill-omened woman", a derogatory term)
- बलात्कारी OR बलत्कार OR बलात्कार ("rapist" / "rape")
- Total: 36 search terms.
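
The search-term selection steps above can be sketched as below. Whitespace tokenization, the `top_terms` and `or_query` helpers, and the toy corpus are assumptions for illustration; real Nepali text would likely need a proper tokenizer and a Nepali stopword list.

```python
from collections import Counter


def top_terms(texts, stopwords, k):
    """Return the k most frequent non-stopword tokens (hypothetical helper)."""
    counts = Counter(
        token
        for text in texts
        for token in text.split()  # naive whitespace tokenization (assumption)
        if token not in stopwords
    )
    return [term for term, _ in counts.most_common(k)]


def or_query(variants):
    """Combine spelling variants of one word into a single OR query."""
    return " OR ".join(variants)


# Toy corpus with placeholder tokens, not project data.
texts = ["a b b c", "b c d"]
candidates = top_terms(texts, stopwords={"a"}, k=2)
print(candidates)  # ['b', 'c']

# Spelling variants taken from the list above.
query = or_query(["बलात्कारी", "बलत्कार", "बलात्कार"])
print(query)
```

Handpicked low-frequency words would simply be appended to `candidates` before the queries are built.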
- 1st lot: IPV
- 2nd lot: non-IPV