
Various preprocessing steps are applied with the NLTK library to prepare the input for several models, after which a vote classifier approach is used to classify the sentiment of movie reviews.


RheagalFire/Sentiment-Analysis-of-Movie-Reviews


Sentiment Analysis of Movie Reviews

This notebook demonstrates how to implement several classifiers that label the sentiment of a review as negative or positive.

Tooling

  • Python
    • Sklearn
    • NLTK

Concepts

  • Classification Algorithms Used
    • Naive Bayes Algorithm
    • Multinomial Naive Bayes
    • Logistic Regression
    • Bernoulli Naive Bayes Classifier
    • Linear SVC classifier
    • Custom-made vote classifier
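
As a rough illustration of the vote classifier idea, the sketch below combines several classifiers by majority vote. It is not the notebook's actual code: the `VoteClassifier` and `ConstantClassifier` names are hypothetical, and the only assumption made about the wrapped models is that each exposes a `classify(features)` method, as NLTK classifiers do. The stand-in classifiers are there only so the example runs without trained models.

```python
from collections import Counter


class VoteClassifier:
    """Hypothetical majority-vote wrapper (an assumption, not the notebook's class)."""

    def __init__(self, *classifiers):
        self._classifiers = classifiers

    def _votes(self, features):
        # Ask every wrapped classifier for its label.
        return [c.classify(features) for c in self._classifiers]

    def classify(self, features):
        # Return the label that received the most votes.
        return Counter(self._votes(features)).most_common(1)[0][0]

    def confidence(self, features):
        # Fraction of classifiers that agreed with the winning label.
        votes = self._votes(features)
        top_count = Counter(votes).most_common(1)[0][1]
        return top_count / len(votes)


class ConstantClassifier:
    """Stand-in model that always predicts one label (for illustration only)."""

    def __init__(self, label):
        self.label = label

    def classify(self, features):
        return self.label


voter = VoteClassifier(
    ConstantClassifier("pos"),
    ConstantClassifier("pos"),
    ConstantClassifier("neg"),
)
label = voter.classify({"great": True})
agreement = voter.confidence({"great": True})
print(label, round(agreement, 2))
```

Using an odd number of classifiers with two labels avoids tied votes, which this sketch does not otherwise resolve in any principled way.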

Techniques in Handling Data

  1. The two given txt files have to be handled manually: each review is labeled positive or negative according to the name of the file it comes from.

  2. While reading the files, only words that are adjectives are appended to the all_words list. This is done with the pos_tag function of NLTK. Example:

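    import nltk
    from nltk.tokenize import PunktSentenceTokenizer, word_tokenize

    # Requires the NLTK tokenizer and POS-tagger data (fetched with nltk.download).
    train_text = "I love my country , I am grateful for all the things it has given me"
    sample_text = "I love biscuits. I am grateful to shops."

    # Train a Punkt sentence tokenizer on train_text, then split sample_text into sentences.
    sentence_tokenizer = PunktSentenceTokenizer(train_text)
    sentences = sentence_tokenizer.tokenize(sample_text)

    for sentence in sentences:
        words = word_tokenize(sentence)
        tagged = nltk.pos_tag(words)  # list of (word, POS tag) pairs
        print(tagged)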
    
  3. Creating feature sets, each stored as a tuple. The most common words are selected and cross-referenced against the words occurring in each document to create Boolean features like the ones below.

[Image: output 2]
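
The data-handling steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's code: the helper names (`adjectives`, `top_words`, `find_features`) and the toy reviews are hypothetical, and the input is hand-tagged so the example runs without NLTK data. In practice the `(word, tag)` pairs would come from `nltk.pos_tag`, whose default tagset marks adjectives with tags starting with `JJ`.

```python
from collections import Counter


def adjectives(tagged_tokens):
    """Keep lowercased words whose POS tag marks an adjective (JJ, JJR, JJS)."""
    return [word.lower() for word, tag in tagged_tokens if tag.startswith("JJ")]


def top_words(all_words, n):
    """Return the n most frequent words in all_words."""
    return [word for word, _ in Counter(all_words).most_common(n)]


def find_features(document_words, word_features):
    """Map each selected word to True if it occurs in the document, else False."""
    present = {word.lower() for word in document_words}
    return {word: (word in present) for word in word_features}


# Hand-tagged toy reviews (assumed data, standing in for the two txt files).
tagged_pos = [("a", "DT"), ("great", "JJ"), ("and", "CC"), ("funny", "JJ"), ("movie", "NN")]
tagged_neg = [("a", "DT"), ("boring", "JJ"), (",", ","), ("dull", "JJ"), ("movie", "NN")]

# Each document is a (words, label) pair; the label follows the source file.
documents = [
    ([word for word, _ in tagged_pos], "pos"),
    ([word for word, _ in tagged_neg], "neg"),
]

# Only adjectives go into all_words; the most common ones become the features.
all_words = adjectives(tagged_pos) + adjectives(tagged_neg)
word_features = top_words(all_words, 10)

# Each feature set is a (feature dict, label) tuple.
featuresets = [(find_features(words, word_features), label) for words, label in documents]
print(featuresets[0])
```

A list of `(feature dict, label)` tuples in this shape is what NLTK classifiers accept for training.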

Classification Results

[Image: Results]
