load the model error #141

Open
hengran opened this issue Jul 30, 2024 · 2 comments

Comments


hengran commented Jul 30, 2024

Hi, I ran into a problem when loading the model with "tevatron/retriever/driver/encode.py". The "lora_name_or_path" checkpoint was trained with "tevatron/retriever/driver/train.py". I am confused by this error.
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in run_code
exec(code, run_globals)
File "/root/paddlejob/workspace/env_run/llm-index/src/tevatron/retriever/driver/encode.py", line 119, in
main()
File "/root/paddlejob/workspace/env_run/llm-index/src/tevatron/retriever/driver/encode.py", line 63, in main
model = DenseModel.load(
File "/root/paddlejob/workspace/env_run/llm-index/src/tevatron/retriever/modeling/encoder.py", line 175, in load
lora_model = PeftModel.from_pretrained(base_model, lora_name_or_path, config=lora_config, use_safetensors=False)
File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 356, in from_pretrained
model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 730, in load_adapter
load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
File "/usr/local/lib/python3.8/dist-packages/peft/utils/save_and_load.py", line 249, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForFeatureExtraction:
size mismatch for base_model.model.layers.0.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([11008, 8]).
size mismatch for base_model.model.layers.0.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([11008, 8]).
size mismatch for base_model.model.layers.0.mlp.down_proj.lora
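
For reference, a minimal diagnostic sketch (the checkpoint path is a placeholder, not taken from the report above) to check whether the lora_B tensors saved in the safetensors file are really empty, which would match the torch.Size([0]) shapes in the error:

```python
# Hedged sketch: inspect the saved LoRA adapter weights directly.
# "path/to/lora_checkpoint" is a placeholder for the directory passed as lora_name_or_path.
from safetensors.torch import load_file

state = load_file("path/to/lora_checkpoint/adapter_model.safetensors")
for name, tensor in state.items():
    if "lora_B" in name:
        # Shapes like (0,) here would explain the size-mismatch error above.
        print(name, tuple(tensor.shape))
```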

@MXueguang
Contributor

Hi @hengran, this is related to #118.
I think you can delete the safetensors ckpt and use the adapter_model.bin in the saved LoRA ckpt (if it was not trained/saved with the latest version).
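
A minimal sketch of that workaround, assuming the LoRA checkpoint directory contains both an adapter_model.safetensors and a valid adapter_model.bin (paths and the base model name are placeholders):

```python
import os
from transformers import AutoModel
from peft import PeftModel

lora_dir = "path/to/lora_checkpoint"  # placeholder for lora_name_or_path
safetensors_path = os.path.join(lora_dir, "adapter_model.safetensors")

# Rename (or delete) the safetensors file so PEFT falls back to adapter_model.bin.
if os.path.exists(safetensors_path):
    os.rename(safetensors_path, safetensors_path + ".bak")

base_model = AutoModel.from_pretrained("path/to/base_model")  # placeholder base model
lora_model = PeftModel.from_pretrained(base_model, lora_dir)
```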


Mr-Lnan commented Sep 20, 2024

I have the same problem. Have you solved it?
