The objective of this task is to detect hate speech in tweets. Tweets can carry negative (hateful) sentiment as well as positive sentiment, so the task is to separate negative tweets from all other tweets. We are given a training sample of tweets and labels, where label '0' denotes a negative tweet and label '1' denotes a tweet that is not negative. The goal is to predict the labels on the test dataset.
- Sentiment analysis and opinion mining is the computational study of user opinion. It uses machine learning techniques to analyze the social, psychological, philosophical, and behavioral attitudes and the perceptions of an individual or a group of people toward a product, policy, service, or specific situation. Sentiment analysis is an important research area: it identifies the sentiment underlying a text and helps in decision making about a product.
1. Data loading.
2. Checking Distribution of Data.
3. Data Preprocessing
   - Remove punctuation, special symbols, and special characters
   - Tokenization
   - Stemming
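The three preprocessing steps above can be sketched as follows. This is a minimal illustration that uses only the standard library, not the code used in this project. The function names (`clean`, `tokenize`, `stem`, `preprocess`) and the `SUFFIXES` list are hypothetical, and the suffix-stripping stemmer is a deliberately naive stand-in for an established stemmer such as a Porter stemmer.

```python
import re

# Hypothetical suffix list for a toy stemmer. A real pipeline would more
# likely rely on an established stemming algorithm (for example Porter).
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def clean(text: str) -> str:
    """Lowercase the tweet and drop punctuation, symbols, and special characters."""
    # Assumption: digits are treated as noise too, so only letters are kept.
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split cleaned text into tokens on whitespace."""
    return text.split()


def stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet: str) -> list[str]:
    """Run cleaning, tokenization, and stemming in order."""
    return [stem(tok) for tok in tokenize(clean(tweet))]


tokens = preprocess("Loving the #sunny days!!! :)")
print(tokens)  # ['lov', 'the', 'sunny', 'day']
```

The toy stemmer over-trims some words ("loving" becomes "lov"), which is typical of crude suffix stripping and one reason to prefer a tested stemmer in practice.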
1. Bag Of Words
2. TF-IDF
3. Word2Vec
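To make the three feature types concrete, here is a small standard-library sketch. The documents and the 2-dimensional `word_vectors` table are made up for illustration. The TF-IDF weighting shown is the plain textbook form, `tf * log(N / df)`, and libraries often use smoothed variants. Averaging word vectors into one tweet vector is a common way to use Word2Vec for classification, but it is an assumption here, not something this document specifies.

```python
import math
from collections import Counter

# Made-up, already-tokenized documents used only for illustration.
docs = [
    ["good", "day", "good", "mood"],
    ["bad", "day"],
    ["good", "friend"],
]

# Fixed vocabulary order: ['bad', 'day', 'friend', 'good', 'mood']
vocab = sorted({tok for doc in docs for tok in doc})


def bag_of_words(doc):
    """Raw term counts, one entry per vocabulary term."""
    counts = Counter(doc)
    return [counts[term] for term in vocab]


def tf_idf(doc):
    """Textbook TF-IDF: term frequency times log(N / document frequency)."""
    counts = Counter(doc)
    n_docs = len(docs)
    vector = []
    for term in vocab:
        tf = counts[term] / len(doc)
        df = sum(1 for d in docs if term in d)
        vector.append(tf * math.log(n_docs / df))
    return vector


# Hypothetical 2-dimensional embeddings. A trained Word2Vec model would
# supply real vectors with many more dimensions.
word_vectors = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "day": [0.0, 1.0]}


def mean_vector(doc, dim=2):
    """Average the vectors of known words; zeros if no word is known."""
    known = [word_vectors[t] for t in doc if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


bow = [bag_of_words(d) for d in docs]
tfidf = [tf_idf(d) for d in docs]
w2v = [mean_vector(d) for d in docs]

print(bow[0])  # [0, 1, 0, 2, 1]
print(w2v[1])  # [-0.5, 0.5]
```

BOW and TF-IDF give sparse vectors as long as the vocabulary, while the averaged Word2Vec vector is dense and has a fixed, much smaller dimension.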
Build Logistic Regression and Support Vector Machine (SVM) classifiers on the BOW, TF-IDF, and Word2Vec features.
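A minimal sketch of this step, assuming scikit-learn (this document does not name the library it uses). The tweets below are made-up toy data, the hyperparameters are defaults apart from `max_iter`, and only the BOW and TF-IDF features are shown. Word2Vec features would be passed to the same classifiers as dense arrays.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Made-up toy tweets; label 0 = negative, label 1 = not negative.
train_texts = [
    "i hate this awful day",
    "what a terrible rude person",
    "love this beautiful morning",
    "so happy and grateful today",
]
train_labels = [0, 0, 1, 1]
test_texts = ["awful rude day", "happy beautiful morning"]

predictions = {}
for feat_name, vectorizer in [("bow", CountVectorizer()), ("tfidf", TfidfVectorizer())]:
    # Fit the vocabulary on the training tweets only, then reuse it for the test set.
    X_train = vectorizer.fit_transform(train_texts)
    X_test = vectorizer.transform(test_texts)

    # Fresh classifier instances for every feature type.
    models = [
        ("logreg", LogisticRegression(max_iter=1000)),
        ("svm", LinearSVC()),
    ]
    for model_name, model in models:
        model.fit(X_train, train_labels)
        predictions[(feat_name, model_name)] = model.predict(X_test).tolist()

for key, preds in predictions.items():
    print(key, preds)
```

On real data, each combination would be scored on a held-out validation split so the feature types can be compared on equal terms.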
Logistic Regression and SVM trained on Word2Vec features outperform the models trained on BOW and TF-IDF features.
Thank You!