Suggested DPO hyper-parameters #1003

Tejaswgupta · 2024-10-02T19:53:56Z

I fine-tuned Qwen2.5 using LLaMA-Factory, but the results aren't great and the fine-tuned model is prone to wrong answers. What are the suggested hyper-parameters for running DPO/SimPO on Qwen? I found a bunch of resources for Mistral and Llama-3.1 but none for Qwen, not even in the technical report.

I'm going by trial and error for now, but any suggestions from anyone who has done this before would be highly appreciated.

Thank you in advance.

Replies: 0 comments