Support fine-tuning models #1174

Merged
merged 1 commit into from
Jul 9, 2024


@leewyang leewyang commented Jul 8, 2024

This PR adds support for fine-tuning (a.k.a. "continued training" or "incremental learning") existing pre-trained models on new datasets, without retraining the entire model from scratch on all of the original training datasets. This is useful when the original data is unavailable, or when the new data is customer-specific and the fine-tuned model is intended only for that customer. The alternative is to train a new model from the new data alone, but that requires a "sufficient" number of pairs of CPU/GPU applications and sqlIDs.

Changes

  1. Updated qualx README.md with fine-tuning instructions.
  2. Added a --base_model argument to the train CLIs.
  3. Added xgb_model.save_config() calls to export the training hyper-parameters while saving the trained models.
  4. Added new model.json.cfg files from training.
  5. Load the base_model and its cfg when training.
  6. Allow for multiple train/test split functions, including dataset-specific split functions in plugins.
  7. Fixed a bug in get_dataset_platforms() to allow for any paths (vs. only local relative paths).
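As a rough illustration of item 6, the sketch below shows one way dataset-specific split functions could be looked up by name with a default fallback. Every name here (`SPLIT_FUNCTIONS`, `register_split`, `split_dataset`, the `customer_a` dataset, and the row fields) is hypothetical and not taken from the qualx code; the actual plugin mechanism may look quite different.

```python
import random
from typing import Callable, Dict, List

Row = dict
SplitFn = Callable[[List[Row]], List[Row]]


def default_split(rows: List[Row], test_fraction: float = 0.2, seed: int = 0) -> List[Row]:
    """Assign each row to train or test at random, reproducibly via a seed."""
    rng = random.Random(seed)
    return [
        {**row, "split": "test" if rng.random() < test_fraction else "train"}
        for row in rows
    ]


def split_by_app_suffix(rows: List[Row]) -> List[Row]:
    """Hypothetical dataset-specific rule: app IDs ending in '0' go to test."""
    return [
        {**row, "split": "test" if row["appId"].endswith("0") else "train"}
        for row in rows
    ]


# Registry of split functions keyed by dataset name (hypothetical design).
SPLIT_FUNCTIONS: Dict[str, SplitFn] = {"default": default_split}


def register_split(name: str, fn: SplitFn) -> None:
    """Let a plugin register its own split function for a dataset."""
    SPLIT_FUNCTIONS[name] = fn


def split_dataset(dataset_name: str, rows: List[Row]) -> List[Row]:
    """Use the dataset-specific split if one is registered, else the default."""
    fn = SPLIT_FUNCTIONS.get(dataset_name, SPLIT_FUNCTIONS["default"])
    return fn(rows)


register_split("customer_a", split_by_app_suffix)

rows = [{"appId": f"app-{i}", "sqlID": i} for i in range(20)]
custom = split_dataset("customer_a", rows)  # uses the registered function
fallback = split_dataset("unknown", rows)   # falls back to default_split
```

Keeping the lookup in one place means a new dataset only needs to register a function, while datasets without special requirements keep using the seeded default.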

Test

The following commands have been tested:

External Usage:

spark-rapids train --base_model

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py train
python qualx_main.py train --base_model
python qualx_main.py evaluate
python qualx_main.py compare

Signed-off-by: Lee Yang <leewyang@gmail.com>

@eordentlich eordentlich left a comment


👍

@mattahrens

Updated the PR description to make clear that the new argument is --base_model, not --base-model, following the tools' conventions for arguments.

@leewyang leewyang merged commit b0b32bb into NVIDIA:dev Jul 9, 2024
15 checks passed
@leewyang leewyang deleted the qualx_finetune branch July 9, 2024 20:04
3 participants