Can you share the dataset class of SST-5, SNLI, TREC datasets? #36
Comments
Also, I tried to modify the code to support zeroth-order optimization of accuracy, a non-differentiable objective. I used the roberta-large model and the SST-2 dataset, with a batch size of 512 and learning rates of 1e-6 and 5e-7. I tried to reproduce the results in your paper, but the training results were poor. Can you share this part of your code implementation?
Hi, you can run the non-differentiable example (large models, SQuAD; also mentioned in the README) with:
MODEL=facebook/opt-13b TASK=SQuAD MODE=prefix LR=1e-2 EPS=1e-1 bash mezo.sh --non_diff --evaluation_strategy no --save_strategy no --save_model
The implementation is here: Line 734 in 552cb1b
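For readers without the repository open, the general idea of a two-point zeroth-order update on a non-differentiable objective can be sketched roughly as below. This is not the repository's code at the referenced line. The names `zo_step` and `neg_accuracy`, the toy linear classifier, the data, and the `lr`/`eps` values are all hypothetical and chosen only for illustration. Because accuracy is piecewise constant, many steps may produce a zero estimate when both perturbed evaluations give the same accuracy.

```python
import random


def zo_step(params, objective, lr, eps, seed):
    """One two-point zeroth-order update (hypothetical helper, not the repo's API).

    `objective` only needs to return a number; no gradients are required.
    `params` is a plain list of floats and is modified in place.
    """
    def perturb(scale):
        # Re-seeding regenerates the same random direction z every time,
        # so z never has to be stored alongside the parameters.
        rng = random.Random(seed)
        for i in range(len(params)):
            params[i] += scale * eps * rng.gauss(0.0, 1.0)

    perturb(+1.0)                       # theta + eps * z
    loss_plus = objective(params)
    perturb(-2.0)                       # theta - eps * z
    loss_minus = objective(params)
    perturb(+1.0)                       # back to theta (up to rounding)

    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    # Descent step along the same direction z.
    rng = random.Random(seed)
    for i in range(len(params)):
        params[i] -= lr * projected_grad * rng.gauss(0.0, 1.0)
    return projected_grad


def neg_accuracy(params, data):
    """Negative accuracy of a toy linear classifier (assumed for illustration)."""
    correct = 0
    for (x1, x2), label in data:
        pred = 1 if params[0] * x1 + params[1] * x2 + params[2] > 0 else 0
        correct += int(pred == label)
    return -correct / len(data)


if __name__ == "__main__":
    # Toy data: label is 1 when x1 > x2. Values are illustrative only.
    data_rng = random.Random(0)
    points = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(64)]
    data = [((a, b), 1 if a > b else 0) for a, b in points]

    theta = [0.0, 0.0, 0.0]
    for step in range(200):
        zo_step(theta, lambda p: neg_accuracy(p, data), lr=0.05, eps=0.1, seed=step)
    print("accuracy after training:", -neg_accuracy(theta, data))
```

Any scalar metric can be passed as `objective`; in this sketch the sign is flipped so that minimizing the objective maximizes accuracy.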
Thanks for your reply! I have already tried fine-tuning the OPT-13b model on the SQuAD dataset using MeZO, and the result is quite good. I want to try more non-differentiable examples, such as classification tasks (with accuracy as the metric). Can you share this part of your code implementation? I really appreciate your help.
Hi Ziming, I realized the feature is actually provided. It is implemented under the flag
Yes, I have resolved my issue, and I am very grateful for your enthusiastic assistance!!!
Hi, I am interested in your non-differentiable objective experiments using MeZO, but I can't find the dataset classes and prompt templates for the SST-5, SNLI, and TREC datasets. Can you share the dataset classes for SST-5, SNLI, and TREC? Thank you very much!!