
[BUG] _raise_timeout_error when training chatglm2-6b #2713

Closed
wangshuai09 opened this issue Nov 22, 2023 · 0 comments · Fixed by #2715

Describe the bug
Loading /home/wangshuai/models/chatglm2-6b requires to execute some code in that repo, you can inspect the content of the repository at https://hf.co//home/wangshuai/models/chatglm2-6b. You can dismiss this prompt by passing trust_remote_code=True.
Do you accept? [y/N] True False None
Loading /home/wangshuai/models/chatglm2-6b requires to execute some code in that repo, you can inspect the content of the repository at https://hf.co//home/wangshuai/models/chatglm2-6b. You can dismiss this prompt by passing trust_remote_code=True.
Do you accept? [y/N]
Traceback (most recent call last):
  File "/root/miniconda3/envs/torch_npu/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 597, in resolve_trust_remote_code
    answer = input(
  File "/root/miniconda3/envs/torch_npu/lib/python3.8/site-packages/transformers/dynamic_module_utils.py", line 577, in _raise_timeout_error
    raise ValueError(
ValueError: Loading this model requires you to execute the configuration file in that repo on your local machine. We asked if it was okay but did not get an answer. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error.

To Reproduce
Model downloaded from Hugging Face: chatglm2-6b
The training script is:

torchrun --nproc_per_node=4 --master_port=20001 fastchat/train/train.py \
    --model_name_or_path /home/xxx/models/chatglm2-6b \
    --data_path /home/xxx/datasets/evol-instruct-chinese/evol-instruct-chinese-1024-subset.json \
    --fp16 True \
    --output_dir output_chatglm \
    --num_train_epochs 5 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --learning_rate 5e-5 \
    --weight_decay 0. \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --model_max_length 512 \
    --gradient_checkpointing True \
    --lazy_preprocess True

Reason
trust_remote_code should be set to True to allow executing code from the Hub on the local machine.
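Why an explicit flag avoids the error can be sketched as follows. This is a hypothetical simplification of the decision in transformers' dynamic_module_utils.py, not its actual code: when the caller has already answered True or False, the loader never falls through to the interactive prompt that timed out in the traceback.

```python
def resolve_trust_remote_code_sketch(trust_remote_code, has_remote_code=True):
    # Hypothetical simplification: an explicit True/False answer skips the
    # interactive stdin prompt entirely.
    if not has_remote_code:
        # No custom code in the repo, nothing to trust.
        return False
    if trust_remote_code is None:
        # Without an explicit flag the loader prompts on stdin and, if no
        # answer arrives in time, raises the ValueError seen in the traceback.
        raise ValueError(
            "We asked if it was okay but did not get an answer. Set "
            "trust_remote_code=True after reviewing the repository code."
        )
    return trust_remote_code


# With the flag decided up front, loading proceeds without any prompt:
assert resolve_trust_remote_code_sketch(True) is True
```

In practice the fix amounts to passing trust_remote_code=True to the model and tokenizer loading calls in fastchat/train/train.py, after reviewing the model repository's code.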
