```
[rank0]: raise ValueError(f"Some specified arguments are not used by the HfArgumentParser: {remaining_args}")
[rank0]: ValueError: Some specified arguments are not used by the HfArgumentParser: ['--lora', '--lora_target_modules', 'q_proj,k_proj,v_proj,o_proj,down_proj,up_proj,gate_proj', '--query_prefix', 'Query: ', '--passage_prefix', 'Passage: ', '--pooling', 'eos', '--append_eos_token', '--temperature', '0.01', '--train_group_size', '16', '--query_max_len', '32', '--passage_max_len', '156']
```
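For context, this ValueError is the standard leftover-argument check: the parser only accepts flags defined by the checked-out code, and anything else is collected and rejected. A minimal stdlib sketch of the same failure mode (argparse stand-in only, not tevatron code):

```python
import argparse

# A parser built for one argument set leaves flags it does not define in
# `remaining`; tevatron's wrapper then raises on any leftovers, which is
# why --lora, --pooling, etc. show up in the error above.
parser = argparse.ArgumentParser()
parser.add_argument("--output_dir")

known, remaining = parser.parse_known_args(
    ["--output_dir", "retriever-mistral", "--lora", "--pooling", "eos"]
)
if remaining:
    # Mirrors the shape of the real error message
    print(f"Some specified arguments are not used: {remaining}")
```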
```
DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
Parameter at index 447 with name encoder.base_model.model.layers.31.mlp.down_proj.lora_B.default.weight has been marked as ready twice.
```
These are the steps I followed to set up:
```
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
git clone https://github.com/texttron/tevatron.git
cd tevatron
git checkout tevatron-v1   # also tried: git checkout main
pip install transformers datasets peft
pip install deepspeed accelerate
pip install faiss-cpu
pip install -e .
```
Then I run the following command:
```
python -m torch.distributed.run --nproc_per_node=1 -m tevatron.driver.train \
  --output_dir retriever-mistral \
  --model_name_or_path "/Mixtral-7b-instruct" \
  --lora \
  --lora_target_modules q_proj,k_proj,v_proj,o_proj,down_proj,up_proj,gate_proj \
  --save_steps 50 \
  --dataset_name Tevatron/msmarco-passage-aug \
  --query_prefix "Query: " \
  --passage_prefix "Passage: " \
  --pooling eos \
  --append_eos_token \
  --normalize \
  --fp16 \
  --temperature 0.01 \
  --per_device_train_batch_size 4 \
  --gradient_checkpointing \
  --train_group_size 16 \
  --learning_rate 1e-4 \
  --query_max_len 32 \
  --passage_max_len 156 \
  --num_train_epochs 1 \
  --logging_steps 10 \
  --overwrite_output_dir \
  --gradient_accumulation_steps 4
```
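Taken together, the errors look consistent with a branch/entry-point mismatch: flags such as `--lora` and `--pooling` belong to the newer code on `main`, while `tevatron.driver.train` is the v1 module path. A sketch of keeping the checkout and the module path in step (the `main`-branch path `tevatron.retriever.driver.train` is my assumption from the repository layout, not verified here):

```shell
# Option A: v1 checkout with the v1 entry point
# (v1 may not accept the newer flags like --lora / --pooling)
git checkout tevatron-v1 && pip install -e .
python -m torch.distributed.run --nproc_per_node=1 -m tevatron.driver.train ...

# Option B: main checkout with the assumed newer entry point
git checkout main && pip install -e .
python -m torch.distributed.run --nproc_per_node=1 -m tevatron.retriever.driver.train ...
```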
I always get this error:

```
/opt/conda/bin/python: Error while finding module specification for 'tevatron.driver.train' (ModuleNotFoundError: No module named 'tevatron.driver')
```
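To see which driver module the installed checkout actually exposes, a small diagnostic can probe both candidate paths without importing them (the second path, `tevatron.retriever.driver.train`, is my assumption for the `main` branch, not confirmed):

```python
import importlib.util

def module_exists(name: str) -> bool:
    """Check importability without importing. A missing parent package makes
    find_spec raise ModuleNotFoundError, so treat that case as missing too."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False

# Probe the v1 path and the assumed main-branch path
for mod in ("tevatron.driver.train", "tevatron.retriever.driver.train"):
    print(mod, "->", "found" if module_exists(mod) else "missing")
```

Whichever path prints `found` is the one to pass to `-m`; if both print `missing`, the `pip install -e .` step did not take effect in the active environment.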