
Edit tutorial comments on PEFT / LoRA #416

Merged 2 commits into mosaicml:main on Jul 3, 2023
Conversation

@vchiley (Contributor) commented Jul 3, 2023

It doesn't really address this issue, but it lets us note that FSDP + LoRA not working is a known issue.

@vchiley self-assigned this Jul 3, 2023
@danbider (Contributor) commented Jul 3, 2023

I had a hard time pushing my commit through because of the spaces, so here's my suggestion:

Can I finetune using PEFT / LoRA?

  • The LLM Foundry codebase does not directly include examples of PEFT or LoRA workflows. However, our MPT model is a subclass of HuggingFace PreTrainedModel, and Feature/peft compatible models #346 added the features required to enable HuggingFace's PEFT / LoRA workflows for MPT. MPT models with LoRA modules can be trained either with LLM Foundry or with Hugging Face's accelerate (see the PEFT sketch after this list). Within LLM Foundry, run scripts/train/train.py, adding lora arguments to the config .yaml, like so:
lora:
  args:
    r: 16
    lora_alpha: 32
    lora_dropout: 0.05
    target_modules: ['Wqkv']
  • In the current release, these features have Beta support.
  • For efficiency, the MPT model concatenates the Q, K, and V matrices in each attention block into a single Wqkv matrix that is three times wider. Currently, LoRA supports a low-rank approximation to this Wqkv matrix.
  • Known issue: PEFT / LoRA do not directly work with FSDP.
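The YAML above covers the LLM Foundry path. As a rough sketch of the plain Hugging Face PEFT path (not part of this PR), the same adapter can be attached directly with the peft library; the mosaicml/mpt-7b checkpoint and the task_type value are assumptions, and the LoRA hyperparameters simply mirror the config above.

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Assumed checkpoint; any MPT checkpoint on the Hub should work the same way.
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,      # MPT ships custom modeling code
    torch_dtype=torch.bfloat16,
)

# Mirror the YAML config: low-rank adapters on the fused Wqkv projection.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["Wqkv"],
    task_type="CAUSAL_LM",       # assumption: causal-LM finetuning
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable

The wrapped model can then be passed to a standard training loop or an accelerate script; note the known issue above before wrapping it with FSDP.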

@codestar12 (Contributor) left a comment

LGTM


@vchiley merged commit 4e6a878 into mosaicml:main on Jul 3, 2023
@germanjke commented

@vchiley
@vchiley deleted the lora_cmts branch November 9, 2023 22:19