Fake news tends to outpace the truth: false rumors spread further and faster, reach more people, and penetrate deeper into social networks. The objective of this program is to detect deceptive information in Urdu-language text from digital media using Natural Language Processing techniques. The task is Fake News Detection: given a news story, classify it as fake or real.
A Naive Bayes classifier is implemented from scratch with Laplace (add-one) smoothing. The classifier uses words as features and makes a binary decision between fake and real.
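The classifier described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `train_nb` and `predict_nb`, the `alpha` parameter, and the toy English tokens are all assumptions made for the example. Real input would be tokenized Urdu news articles.

```python
import math
from collections import Counter

def train_nb(docs, labels, alpha=1.0):
    """Train multinomial Naive Bayes; alpha=1.0 gives Laplace (add-one) smoothing.

    docs is a list of token lists, labels a parallel list of class names.
    (Hypothetical helper, not taken from the project.)
    """
    vocab = {w for doc in docs for w in doc}
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d)
        # Smoothed denominator: class token count plus alpha per vocabulary word.
        total = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((counts[w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab

def predict_nb(doc, log_prior, log_likelihood, vocab):
    """Return the class with the highest log posterior; unseen words are skipped."""
    scores = {
        c: log_prior[c] + sum(log_likelihood[c][w] for w in doc if w in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Toy placeholder data standing in for tokenized articles.
docs = [
    ["miracle", "cure", "shocking"],
    ["minister", "budget", "announced"],
    ["shocking", "secret", "miracle"],
    ["budget", "parliament", "session"],
]
labels = ["fake", "real", "fake", "real"]

model = train_nb(docs, labels)
print(predict_nb(["miracle", "secret"], *model))  # fake
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that never appeared in one class from driving that class's probability to zero.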
The program also explores:
(1) The effects of stop-word filtering. This means removing common words from the training and test sets.
(2) The effects of Boolean (binary) Naive Bayes. This means removing duplicate words in each document (news article) before training.
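Both preprocessing variants are small transformations applied to each tokenized document. The sketch below is illustrative only: the helper names `remove_stopwords` and `binarize` are hypothetical, and the stop-word set is a tiny English stand-in for a real Urdu stop-word list.

```python
def remove_stopwords(tokens, stopwords):
    """Drop every token that appears in the stop-word set."""
    return [t for t in tokens if t not in stopwords]

def binarize(tokens):
    """Keep only the first occurrence of each word (Boolean Naive Bayes).

    dict.fromkeys preserves insertion order, so word order is retained.
    """
    return list(dict.fromkeys(tokens))

# Placeholder stop-word set (assumption); a real run would load Urdu stop words.
STOPWORDS = {"the", "of", "is"}

tokens = ["the", "cure", "of", "the", "century", "is", "a", "cure"]
filtered = remove_stopwords(tokens, STOPWORDS)
print(filtered)            # ['cure', 'century', 'a', 'cure']
print(binarize(filtered))  # ['cure', 'century', 'a']
```

After binarization each word contributes at most once per document, so the classifier counts how many documents contain a word rather than how often the word occurs.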
The results show that Boolean Naive Bayes gives a small improvement over the plain Naive Bayes classifier, from 75% to 76-77% accuracy. Accuracy is as follows:
Naive Bayes = 75%
Boolean Naive Bayes with stop words = 76%
Boolean Naive Bayes without stop words = 77%