The goal of this project was to create a sentiment classifier API that could use various models and datasets.
It is written in Python and uses the following libraries:
- Flask: for the API
- TensorFlow & Keras: for machine learning
For more details about the project, you can refer to these slides.
So far we only use the IMDB large movie review dataset, but we plan to add more datasets later on.
Here are the required steps to get started with the API:
- Clone the repository.
- Download the IMDB dataset and place it in the data folder. We use pre-trained word embeddings from FastText, so you might want to download them to the data folder as well.
- Create a virtual environment and install the requirements from the requirements.txt file.
- Add "sentiment_classifier" to your PYTHONPATH:
export PYTHONPATH=.:$PYTHONPATH
- Train the models by running:
python sentiment_classifier/scripts/train.py
- Run the API:
python sentiment_classifier/api/wsgi.py
- Test the API:
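import requests
# Send the text to classify as a JSON payload
r = requests.post(
    "http://localhost:8000/api/classify",
    json={"text": "I love it"}
)
# Inspect the HTTP status and the raw response body
print(r.status_code, r.text)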
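The test request above can also be wrapped in a small reusable helper. The following is only a sketch: it uses the standard library's `urllib` instead of `requests` so that it runs without extra dependencies, the function names `build_request` and `classify` are hypothetical, and since the response format is not described here, the raw body is returned as text rather than parsed.

```python
import json
import urllib.request

# Endpoint taken from the test step above; adjust if the API runs elsewhere.
API_URL = "http://localhost:8000/api/classify"


def build_request(text, url=API_URL):
    """Build a JSON POST request carrying the text to classify (hypothetical helper)."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text, url=API_URL, timeout=10):
    """Send the text to the API and return (status_code, raw_body).

    The body is returned undecoded beyond UTF-8 because the response
    schema is an unknown here.
    """
    request = build_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With the API running locally, `classify("I love it")` should return the HTTP status code together with the raw response body. The timeout keeps the call from hanging if the server is not up.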
Make sure to check out this notebook to better understand how the code works: Example Model Notebook.
To train the classifiers, run the train.py script located in sentiment_classifier/scripts.
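Training fails early if the package cannot be found on PYTHONPATH, so a quick pre-flight check can save some time. The sketch below is an illustration only: `is_importable` and `train` are hypothetical helpers, not part of the repository, and the script path is assumed to be relative to the repository root.

```python
import importlib.util
import subprocess
import sys

PACKAGE = "sentiment_classifier"
# Assumed to be run from the repository root.
TRAIN_SCRIPT = "sentiment_classifier/scripts/train.py"


def is_importable(package_name):
    """Return True if Python can locate the top-level package on the current path."""
    return importlib.util.find_spec(package_name) is not None


def train():
    """Run the training script with the current interpreter (hypothetical wrapper)."""
    if not is_importable(PACKAGE):
        raise RuntimeError(f"{PACKAGE!r} is not importable; check PYTHONPATH")
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run([sys.executable, TRAIN_SCRIPT], check=True)
```

Calling `train()` from the repository root is then equivalent to running the script by hand, with a clearer error message when PYTHONPATH is not set.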
You can also refer to the documentation.