IMDb Review Classification

In this project, we fine-tune BERT to perform sentiment analysis on a dataset of IMDb movie reviews. The primary goal is to train a model to classify movie reviews as either positive or negative. This project will guide you through the process of text preprocessing, model training, and evaluation using BERT, a state-of-the-art transformer-based model for natural language processing.

About BERT

BERT (Bidirectional Encoder Representations from Transformers) and other Transformer encoder architectures have achieved remarkable success in various NLP tasks. BERT computes vector-space representations of natural language, capturing the context of each token in relation to all other tokens in the input text. The BERT model is pre-trained on a large corpus and can be fine-tuned for specific tasks.
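
To make "the context of each token in relation to all other tokens" concrete, here is a minimal sketch of single-head self-attention in plain Python. It is a simplification for illustration, not BERT's actual implementation: real BERT uses learned query, key, and value projections, multiple heads, and many stacked layers, all omitted here. The toy 2-dimensional embeddings are made up.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head attention with queries = keys = values = vectors.

    Assumption: no learned projection matrices, unlike real BERT.
    """
    dim = len(vectors[0])
    contextual = []
    for query in vectors:
        # Score this token against every token in the input, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new representation is a weighted average of all token vectors.
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(dim)
        ]
        contextual.append(mixed)
    return contextual


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical toy embeddings
out = self_attention(tokens)
```

Each output vector mixes information from every input position, which is why the representation of a word can change depending on the words around it.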

Model Architecture

For this project, we use a variant of BERT called Small BERT. It has the same general architecture as the original BERT but with fewer and/or smaller Transformer blocks. The architecture is as follows:

Input Layer: Accepts tokenized text input.
Pre-processing Layer: Prepares the text data for the model.
Encoder Layer: Processes the text through BERT's Transformer blocks.
Dropout Layer: Applies dropout for regularization.
Dense Layer: Final classification layer with sigmoid activation.
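
The last two layers can be sketched in plain Python to show what they compute. This is an illustration under assumptions, not the notebook's code: the pooled encoder output, the weights, the bias, the dropout rate of 0.1, and the 0.5 decision threshold are all hypothetical values chosen for the example.

```python
import math
import random


def dropout(vector, rate, training, rng):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and rescale the survivors; at inference, pass values through."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vector]


def dense_sigmoid(vector, weights, bias):
    """One-unit dense layer followed by a sigmoid, giving P(positive)."""
    logit = sum(w * x for w, x in zip(weights, vector)) + bias
    # Branch on the sign so math.exp never receives a large positive value.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    return math.exp(logit) / (1.0 + math.exp(logit))


pooled = [0.5, -1.0, 2.0]    # stand-in for the encoder's pooled output
weights = [0.4, -0.3, 0.8]   # hypothetical trained weights
bias = -0.1
rng = random.Random(0)

features = dropout(pooled, rate=0.1, training=False, rng=rng)
prob = dense_sigmoid(features, weights, bias)
label = "positive" if prob >= 0.5 else "negative"  # assumed threshold
```

Dropout is active only during training; at inference the pooled vector reaches the dense layer unchanged, so predictions are deterministic.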

Dataset

We use the Large Movie Review Dataset, which contains 50,000 labeled movie reviews from the Internet Movie Database (IMDb), split evenly into 25,000 reviews for training and 25,000 for testing. The model is trained and evaluated on this dataset.
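
As commonly distributed, the dataset archive extracts to an aclImdb directory whose train and test folders each hold pos and neg subfolders with one review per text file. The sketch below reads such a folder into (text, label) pairs. It assumes the archive has already been downloaded and extracted; the helper name load_reviews and the label encoding (1 = positive, 0 = negative) are choices made for this example, and the demo builds a tiny fake folder so it runs without the real data.

```python
import tempfile
from pathlib import Path


def load_reviews(split_dir):
    """Return (text, label) pairs from split_dir/neg and split_dir/pos.

    Label encoding is an assumption: 0 = negative, 1 = positive.
    """
    examples = []
    for folder_name, label in (("neg", 0), ("pos", 1)):
        for path in sorted(Path(split_dir, folder_name).glob("*.txt")):
            examples.append((path.read_text(encoding="utf-8"), label))
    return examples


# Demo on a throwaway directory that mimics the train/pos and train/neg layout.
with tempfile.TemporaryDirectory() as root:
    for folder_name, text in (("pos", "A wonderful film."), ("neg", "A dull film.")):
        folder = Path(root, "train", folder_name)
        folder.mkdir(parents=True)
        (folder / "0_1.txt").write_text(text, encoding="utf-8")
    demo = load_reviews(Path(root, "train"))

# With the real data extracted locally, the call would look like:
# train_examples = load_reviews("aclImdb/train")
```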

Output

The output figure shows the training and validation loss alongside the training and validation accuracy.
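
For reference, the two quantities tracked above can be computed as follows. This is a sketch assuming the model outputs a probability of the positive class and is trained with binary cross-entropy, which is the usual pairing for a sigmoid output but is not confirmed by this README; the labels and probabilities are made-up values.

```python
import math


def binary_accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the label."""
    correct = sum(
        1 for y, p in zip(labels, probs) if (p >= threshold) == bool(y)
    )
    return correct / len(labels)


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean negative log-likelihood of the labels under the predictions."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


labels = [1, 0, 1, 0]          # hypothetical ground truth
probs = [0.9, 0.2, 0.4, 0.6]   # hypothetical model outputs
acc = binary_accuracy(labels, probs)
loss = binary_cross_entropy(labels, probs)
```

Computing both on the training set and on a held-out validation set after each epoch gives the four curves; a validation loss that rises while training loss keeps falling is the usual sign of overfitting.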

Setup

To work on this project, you can use Google Colab. This allows for easy setup and provides access to GPU resources. The Colab notebook for this project is included in the repository.

Colab Notebook

Reference Colab Notebook

Learning Objectives

By the end of this project, you will know how to:

Load a pre-trained BERT model from TensorFlow Hub.
Build a custom classification model by combining BERT with a classifier.
Fine-tune a BERT model on the IMDb dataset.
Save and use the trained model.
Evaluate the performance of a text classification model.

Repository: dheerajkallakuri/ImdbReviewClassification