thomas-chauvet/kaggle_toxic_comment_classification

Warning: The dataset contains a lot of profanity, insults, and other offensive language. Please keep this in mind before opening the notebooks.

Slides

These notebooks were created for a presentation about NLP and Keras, to show "how easy" it can be to build neural networks for NLP problems with Keras. You can find the slides here.

Notebooks to dive into NLP

Notebooks to understand Natural Language Processing (NLP) with Python. We use data from a Kaggle challenge whose goal is to detect toxic comments on Wikipedia talk pages.

The notebooks are split into 5 parts:

Have fun!

Play with notebooks

Do not hesitate to clone this repository to play with the notebooks.

You can use the conda environment defined in the environment.yml file. Create it with the command conda env create -f environment.yml.

To avoid heavy computation for NMF and t-SNE, the data/work folder includes pickle files with precomputed NMF and t-SNE results.
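
If you want to apply the same idea to your own experiments, here is a minimal sketch of a "load the pickle if it exists, otherwise compute and save it" helper. It is not taken from the notebooks: the helper name load_or_compute, the stand-in function expensive_embedding and the file name embedding_demo.pkl are hypothetical, and the demo writes to a temporary folder rather than to data/work.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Return the pickled result at cache_path, computing and saving it if missing.

    Hypothetical helper for illustration; the notebooks may load their
    pickle files differently.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Reuse the precomputed result instead of running the heavy step again.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result


# Demo with a cheap stand-in for an expensive NMF or t-SNE fit.
calls = []


def expensive_embedding():
    calls.append(1)  # record how many times the heavy step actually runs
    return [[0.0, 1.0], [1.0, 0.0]]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "work" / "embedding_demo.pkl"  # hypothetical file name
    first = load_or_compute(cache_file, expensive_embedding)   # computes and saves
    second = load_or_compute(cache_file, expensive_embedding)  # loads from disk
```

In this demo the stand-in function runs only once; the second call reads the result back from disk. Only unpickle files you trust, since loading a pickle can execute arbitrary code.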