2110531 Data Science and Data Engineering Tools (2023/1)

pvateekul/2110531_DSDE_2023s1


2110531 Data Science and Data Engineering Tools @Chula 2023

Syllabus:

Syllabus

Code:

Week01: Intro to Numpy, Pandas

  1. Numpy: Open In Colab

  2. Pandas: Open In Colab

  3. Pandas with Youtube stat data: Open In Colab

  4. (Advanced) Pandas with Youtube stat data: Open In Colab

Assignment (Pandas with Youtube stat data): Open In Colab

Week02: Data Preparation

  1. EDA: Open In Colab

  2. Impute Missing Value: Open In Colab

  3. Split Train/Test: Open In Colab

  4. Outliers with Log: Open In Colab

  5. Outliers with Log (Titanic DataSet): Open In Colab

Assignment: Open In Colab

Week03-04: Traditional ML

  1. Decision Trees: Open In Colab

  2. Linear Regression: Open In Colab

  3. Logistic Regression: Open In Colab

  4. Neural Network: Open In Colab

  5. K Nearest Neighbors: Open In Colab

  6. SVM: Open In Colab

  7. Save and Load Model: Open In Colab

  8. K-Means: Open In Colab

  9. Market-Basket Analysis: Open In Colab

Week05-06: Intro to Deep Learning

  1. Image classification (basic): CIFAR10: Open In Colab

  2. Image classification (advanced): Animal: Open In Colab

  3. Object detection: VOCDetection: Open In Colab

  4. Semantic segmentation: CamSeq2007: Open In Colab

  5. Time series Forecasting: Stock Price: Open In Colab

Week08: Advanced ML (Transformer, GenerativeAI, Model Monitoring, Text Classification)

  1. Image classification with Hugging Face: Open In Colab

  2. Model Monitoring with MLflow: Open In Colab

  3-1. Model Monitoring with TensorBoard: Open In Colab

  3-2. Model Monitoring with Weights & Biases: Open In Colab

  4-1. Diffusion on Text-to-Image: Open In Colab

  4-2. Diffusion on Image-to-Image: Open In Colab

  5. OpenAI ChatGPT (Simulated Toyota data): Open In Colab

  6-1. Text Classification (TF-IDF): Open In Colab

  6-2. Text Classification (BERT): Open In Colab

Week07: Big Data Architecture and Data Storage

  1. Simple Example: Open In Colab

  2. Redis Assignment: Open In Colab

Week08: Data Extraction

  1. Basic Web Scraping: Open In Colab

  2. Wikipedia Scraping: Open In Colab

  3. REST API Extraction: Open In Colab

  4. Selenium: Open In Colab

  5. Excel Extraction: Open In Colab

  6. PDF Extraction: Open In Colab

Week09: Data Ingestion

The Kafka sample code cannot be opened in Colab because it requires a connection to a local Kafka server. See the week09 code section for details.

Week10: Spark

  1. Basic Spark: Open In Colab

  2. Spark SQL: Open In Colab

  3. Spark ML: Open In Colab

  4. Spark Assignment: Open In Colab

Week11: Ops Stars

The Airflow and FastAPI sample code cannot be opened in Colab because it requires local Airflow and FastAPI installations. See week11 Ops Stars for details.
