This repository contains the notebooks shown during the lessons of the Data Visualization and Text Mining course.
Course contents:
- NLP in practice: use spaCy and NLTK to understand NLP algorithms
- Text Classification with ML: apply Machine Learning to text data sources
- Supervised Learning in Machine Learning
- Bag-of-Words
- CountVectorizer vs TfidfVectorizer
- Text Classification project
- Topic Modeling: identify topics in an unlabeled dataset
- Latent Dirichlet Allocation
- Non-negative Matrix Factorization
- Data Visualization: see how to visualize your dataset in Python
- Dashboards: make plots dynamic by building an interactive dashboard with Dash
- Neural Networks: how to classify data with neural networks, with TensorFlow and PyTorch samples
- Embeddings: moving to a vector-based representation for each word
- LSTM: long short-term memory neural networks for text generation and classification
- Transformers: the Transformer architecture
- BERT: use a BERT model for text classification and QA
- LLM: discover how generative models work for NLP use cases
Updated: Academic Year 2024/2025
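
To give a flavour of the "CountVectorizer vs TfidfVectorizer" topic listed in the course contents, here is a minimal pure-Python sketch of the difference between raw bag-of-words counts and TF-IDF weights. It is not taken from the course notebooks: the toy corpus, the whitespace tokenizer, and the helper names `tokenize`, `bag_of_words`, and `tfidf` are all assumptions made for illustration. The smoothed IDF formula and L2 normalization follow what the scikit-learn documentation describes as the `TfidfVectorizer` defaults, but the real classes tokenize differently, so results will not match them exactly.

```python
import math
from collections import Counter

# Hypothetical toy corpus, made up for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]


def tokenize(text):
    # Simplistic assumption: lowercase and split on whitespace.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count_rows): raw term counts per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf(count_rows):
    """Reweight raw counts with a smoothed IDF, then L2-normalize each row."""
    n_docs = len(count_rows)
    n_terms = len(count_rows[0])
    # Document frequency: in how many documents each term appears.
    df = [sum(1 for row in count_rows if row[j] > 0) for j in range(n_terms)]
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = [math.log((1 + n_docs) / (1 + df_j)) + 1 for df_j in df]
    weighted_rows = []
    for row in count_rows:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        weighted_rows.append([v / norm for v in weighted])
    return weighted_rows


vocab, counts = bag_of_words(docs)
weights = tfidf(counts)

# Compare a frequent word ("the") with a rarer one ("cat") in the first document.
for term in ("the", "cat"):
    j = vocab.index(term)
    print(term, counts[0][j], round(weights[0][j], 3))
```

In this sketch "the" appears in two of the three documents while "cat" appears in only one, so TF-IDF narrows the gap between them compared with the raw counts. That down-weighting of common terms is the main practical difference between the two vectorizers.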