Can you share the dataset class of SST-5, SNLI, TREC datasets? #36

Open
zimingyy opened this issue Aug 14, 2024 · 5 comments

@zimingyy

Hi, I am interested in your non-differentiable objective experiments with MeZO, but I can't find the dataset classes and prompt templates for the SST-5, SNLI, and TREC datasets. Could you share them? Thank you very much!

@zimingyy
Author

Also, I tried modifying the code to support zeroth-order optimization of accuracy, a non-differentiable objective. I used the roberta-large model on the SST-2 dataset, with a batch size of 512 and learning rates of 1e-6 and 5e-7. I tried to reproduce the results in your paper, but the training results were poor. Could you share this part of your implementation?

@gaotianyu1350
Member

Hi,

You can run the non-differentiable example (large models, SQuAD; also mentioned in the README) with:

MODEL=facebook/opt-13b TASK=SQuAD MODE=prefix LR=1e-2 EPS=1e-1 bash mezo.sh --non_diff --evaluation_strategy no --save_strategy no --save_model

The implementation is here:

def zo_forward_nondiff(self, model, inputs):
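
For readers without the repository open, the general idea of a zeroth-order step on a non-differentiable objective can be sketched as below. This is a minimal illustration on a plain list of floats, not the body of `zo_forward_nondiff` from the repository; the names `zo_step` and `error_rate`, the toy linear classifier, and the use of `1 - accuracy` as the loss are all assumptions made for the example.

```python
import random


def error_rate(params, examples):
    """Return 1 - accuracy of a toy linear classifier (hypothetical helper).

    params = weights followed by a bias; the value is piecewise constant
    in params, so it has no useful gradient.
    """
    w, b = params[:-1], params[-1]
    wrong = 0
    for x, y in examples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        pred = 1 if score > 0 else 0
        wrong += int(pred != y)
    return wrong / len(examples)


def zo_step(params, loss_fn, lr, eps, seed):
    """One SPSA-style zeroth-order update, in place (hypothetical sketch).

    The random direction z is never stored: it is regenerated from `seed`
    each time it is needed.
    """
    def perturb(scale):
        rng = random.Random(seed)  # same seed -> same z on every call
        for i in range(len(params)):
            params[i] += scale * eps * rng.gauss(0.0, 1.0)

    perturb(+1.0)                      # theta + eps * z
    loss_plus = loss_fn(params)
    perturb(-2.0)                      # theta - eps * z
    loss_minus = loss_fn(params)
    perturb(+1.0)                      # back to theta (up to rounding)

    # Scalar estimate of the directional derivative along z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    rng = random.Random(seed)          # replay z for the update
    for i in range(len(params)):
        params[i] -= lr * projected_grad * rng.gauss(0.0, 1.0)
    return projected_grad


if __name__ == "__main__":
    data = [([1.0], 1), ([2.0], 1), ([-1.0], 0), ([-2.0], 0)]
    theta = [-0.1, 0.0]                # one weight and a bias
    for step in range(200):
        zo_step(theta, lambda p: error_rate(p, data), lr=0.05, eps=0.5, seed=step)
    print(theta, error_rate(theta, data))
```

Because an accuracy-style objective is piecewise constant, `loss_plus` and `loss_minus` are equal whenever the perturbation flips no prediction, and the step is then exactly zero. In this sketch that makes the choice of `eps` and the number of examples per step matter more than they would for a smooth loss.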

@zimingyy
Author

Thanks for your reply! I have already tried fine-tuning the OPT-13b model on the SQuAD dataset using MeZO, and the result is quite good. I want to try more non-differentiable examples, such as classification tasks (accuracy metric). Could you share this part of your implementation? I really appreciate your help.

@gaotianyu1350
Member

Hi Ziming,

I realized the feature is actually already provided. It is implemented under the flag --optimize_acc in the medium-sized model folder.

@zimingyy
Author

Yes, I have resolved my issue, and I am very grateful for your enthusiastic assistance!!!
