Support fine-tuning models #1174

leewyang · 2024-07-08T22:19:03Z

This PR adds support for fine-tuning (a.k.a. "continued training" or "incremental learning") existing pre-trained models on new datasets, without retraining the entire model from scratch on all of the original training datasets. This is useful if the original data is unavailable, or if the new data is customer-specific and the fine-tuned model is intended only for that customer. The alternative is to train a new model from only the new data, but this requires a "sufficient" number of pairs of CPU/GPU applications and sqlIDs.

Changes

Updated qualx README.md with fine-tuning instructions.
Added a --base_model argument to the train CLIs.
Added xgb_model.save_config() calls to export the training hyper-parameters while saving the trained models.
Added new model.json.cfg files from training.
Load the base_model and its cfg when training.
Allow multiple train/test split functions, including dataset-specific split functions in plugins.
Fixed a bug in get_dataset_platforms() to allow any paths (vs. only local relative paths).
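
As a rough illustration of the "dataset-specific split functions" item above, the sketch below shows one way a per-dataset split function could be looked up, with a fallback to a default split. It is not the qualx implementation: the registry `SPLIT_FUNCTIONS`, the `register_split` decorator, the dataset name `customer_a`, and the `appId` row field are all hypothetical names chosen for the example.

```python
import random
from typing import Callable, Dict, List, Tuple

Rows = List[dict]
SplitFn = Callable[[Rows], Tuple[Rows, Rows]]

# Hypothetical registry mapping a dataset name to its split function.
SPLIT_FUNCTIONS: Dict[str, SplitFn] = {}


def default_split(rows: Rows, test_fraction: float = 0.2, seed: int = 0) -> Tuple[Rows, Rows]:
    """Shuffle deterministically, then hold out a fraction of rows for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def register_split(dataset: str) -> Callable[[SplitFn], SplitFn]:
    """Decorator a plugin could use to register a dataset-specific split."""
    def decorator(fn: SplitFn) -> SplitFn:
        SPLIT_FUNCTIONS[dataset] = fn
        return fn
    return decorator


@register_split("customer_a")  # hypothetical dataset name
def split_by_app_id(rows: Rows) -> Tuple[Rows, Rows]:
    """Hold out every row whose (hypothetical) appId ends in '0'."""
    test = [r for r in rows if r["appId"].endswith("0")]
    train = [r for r in rows if not r["appId"].endswith("0")]
    return train, test


def split_dataset(dataset: str, rows: Rows) -> Tuple[Rows, Rows]:
    """Use the dataset-specific split if one is registered, else the default."""
    split_fn = SPLIT_FUNCTIONS.get(dataset, default_split)
    return split_fn(rows)
```

With this shape, a plugin only needs to register a function under its dataset name; datasets without a registered function use the default split.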

Test

The following commands have been tested:

External Usage:

spark-rapids train --base_model

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py train
python qualx_main.py train --base_model
python qualx_main.py evaluate
python qualx_main.py compare
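
For readers unfamiliar with the pattern, here is a minimal sketch of how a `train` subcommand with an optional `--base_model` argument could be wired up with `argparse`. It does not reproduce the actual argument parsing in `qualx_main.py` or `spark-rapids`. It assumes `--base_model` takes a path to a saved model, which the commands above do not show, and the `describe` helper is hypothetical.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the subcommands listed above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="qualx_main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    train = subparsers.add_parser("train")
    # Assumption: the argument takes a path to an existing model file.
    train.add_argument(
        "--base_model",
        default=None,
        help="path to a pre-trained model to fine-tune (omit to train from scratch)",
    )

    for name in ("preprocess", "evaluate", "compare"):
        subparsers.add_parser(name)
    return parser


def describe(argv: List[str]) -> str:
    """Hypothetical helper: report what a given command line would do."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        if args.base_model:
            return f"fine-tune from {args.base_model}"
        return "train from scratch"
    return f"run {args.command}"
```

Using an underscore in `--base_model` keeps the option name identical to the attribute `args.base_model`.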

Signed-off-by: Lee Yang <leewyang@gmail.com>

eordentlich

👍

mattahrens · 2024-07-09T13:30:43Z

Updated the PR description to make clear that the new argument is --base_model, not --base-model, following the tools' conventions for arguments.

user_tools/src/spark_rapids_tools/tools/qualx/model.py

user_tools/src/spark_rapids_tools/tools/qualx/preprocess.py

support fine-tuning models

5c384cf

Signed-off-by: Lee Yang <leewyang@gmail.com>

leewyang requested review from mattahrens and eordentlich July 8, 2024 22:19

eordentlich approved these changes Jul 9, 2024

mattahrens reviewed Jul 9, 2024

user_tools/src/spark_rapids_tools/tools/qualx/model.py

mattahrens reviewed Jul 9, 2024

user_tools/src/spark_rapids_tools/tools/qualx/model.py

mattahrens reviewed Jul 9, 2024

user_tools/src/spark_rapids_tools/tools/qualx/preprocess.py

mattahrens approved these changes Jul 9, 2024

leewyang merged commit b0b32bb into NVIDIA:dev Jul 9, 2024
15 checks passed

leewyang deleted the qualx_finetune branch July 9, 2024 20:04
