I trained a small pythia-160m model using a small variant of the example int8 training script in trl/blob/main/examples/sentiment/scripts/gpt-neox-20b_peft/clm_finetune_peft_imdb.py. When doing inference on this LoRA model in a separate script, the results from the adapter and the base model are identical. I've tried calling both get_base_model and disable_adapter, both of which seem like they should do the right thing after scanning the code.
A sample demonstration script:
import torch
import os
from dataclasses import dataclass, field
from datasets import load_dataset
from peft import PeftModel, PeftConfig
from torch.utils.data import DataLoader
from tqdm import tqdm
from transformers import (
    HfArgumentParser,
    AutoModelForCausalLM,
    AutoTokenizer,
)
@dataclass
class ModelArguments:
    final_model: str = field(default="peft_done")
parser = HfArgumentParser(ModelArguments)
# Fun fact, even if you have only one argument class to parse, you still need
# to decompose a tuple.
(model_args,) = parser.parse_args_into_dataclasses()
peft_model_id = model_args.final_model
config = PeftConfig.from_pretrained(peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map="auto",
)
def generate(model, prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    if inputs["input_ids"].shape[1] >= (2048 - 128):
        return "Too Long"
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=50)
    input_ids = inputs["input_ids"]
    generated_tokens = outputs[:, input_ids.shape[1]:]
    return tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
print(generate(model, "I really enjoyed the "))
model = PeftModel.from_pretrained(model, peft_model_id)
print(generate(model, "I really enjoyed the "))
print(generate(model.get_base_model(), "I really enjoyed the "))
This results in the following output:
the book. I have been reading it for a while now, and I am
very excited to read it again.
I am very happy with the book. I am very excited to read it again.
I am very happy with
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
icing on the cake. I was so impressed with the way the icing was applied. I was so impressed with the way the icing was applied. I was so impressed with the way the icing was applied. I was so impressed with the
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
icing on the cake. I was so impressed with the way the icing was applied. I was so impressed with the way the icing was applied. I was so impressed with the way the icing was applied. I was so impressed with the
From what I can tell, inference from the base model before the adapter is loaded is unique, but inference on both the adapter and the base model after PeftModel.from_pretrained is exactly the same.
How does one disable an adapter or get inference results from a base model while an adapter is active?
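For reference, here is a minimal sketch of how I would expect the two calls above to be used, assuming a peft version where disable_adapter is exposed as a context manager on PeftModel (an assumption on my part, not verified against the version used here):

# Minimal sketch; assumes disable_adapter() can be used as a context manager.
peft_model = PeftModel.from_pretrained(model, peft_model_id)

# Adapter active: generation should go through the LoRA layers.
print(generate(peft_model, "I really enjoyed the "))

# Temporarily bypass the LoRA layers to get base-model behavior.
with peft_model.disable_adapter():
    print(generate(peft_model, "I really enjoyed the "))

# Note: get_base_model() returns the wrapped transformers model, but the LoRA
# modules are injected into that model in place, so calling it alone does not
# necessarily restore base-model outputs.
print(generate(peft_model.get_base_model(), "I really enjoyed the "))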
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.