Hi @pourmand1376, sorry for a slightly off-topic question: could you please share any details on how your friend managed to fine-tune LLaMA on a text-only dataset, without instructions? I'm interested in doing the same thing with Belarusian Wikipedia, but so far I've only seen tutorials on how to instruct-tune LLaMA, and Wikipedia articles as such don't contain clearly delimited prompts and responses. Could you please briefly describe the approach?
Thanks in advance for any comments.
It seems that there are others like me who would like to fine-tune LLMs for unsupported languages like Persian.
This can be the place to discuss it. As for the question asked, I only know that he used this repository as the base and changed lots of things to make it work. I will ask him to give further details.
However, I think this repo could also potentially serve as a repo for training base LLMs.
I think we need a clear guide for people like me on how to do this. From what I've seen so far, the Open-Assistant team has done a great job on SFT fine-tuning, but there seems to be no code for fine-tuning base LLMs for other languages.
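For context, fine-tuning a base LLM on plain text is essentially continued pretraining with the standard causal-language-modeling objective (next-token prediction), so no prompt/response pairs are needed. Below is a minimal sketch of what that could look like with Hugging Face transformers/datasets; the checkpoint and dataset names are placeholders (not what anyone here actually used), and hyperparameters are illustrative only.

```python
# Sketch: continued pretraining of a causal LM on raw text (no instruction format).
# Assumes a Hugging Face causal-LM checkpoint and a plain-text dataset; names are placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "huggyllama/llama-7b"  # placeholder; any causal-LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Raw Wikipedia text; no prompt/response structure is required.
# The dataset name/config below is a placeholder for a Belarusian Wikipedia dump.
raw = load_dataset("wikimedia/wikipedia", "20231101.be", split="train")

block_size = 2048  # context length used for training chunks

def tokenize(batch):
    return tokenizer(batch["text"])

def group_texts(examples):
    # Concatenate all tokenized text and split it into fixed-size blocks,
    # the usual preprocessing for next-token-prediction training.
    concatenated = {k: sum(examples[k], []) for k in examples}
    total = (len(concatenated["input_ids"]) // block_size) * block_size
    return {
        k: [v[i : i + block_size] for i in range(0, total, block_size)]
        for k, v in concatenated.items()
    }

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)
lm_dataset = tokenized.map(group_texts, batched=True)

# mlm=False -> plain causal-LM objective; labels are the inputs shifted by one token.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="llama-be-continued-pretraining",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=50,
    save_steps=1000,
)

Trainer(
    model=model,
    args=args,
    train_dataset=lm_dataset,
    data_collator=collator,
).train()
```

The key piece is the data collator with mlm=False, which turns the raw blocks into ordinary next-token-prediction targets; everything else is standard Trainer plumbing. For a new language one would likely also want to extend the tokenizer vocabulary and use LoRA/PEFT or multi-GPU sharding to keep memory manageable, but that is beyond this sketch.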
Yesterday I was talking to @andreaskoepf on Discord about how to add a new language to a base LLM, and today I saw the comment above from @somerandomguyontheweb.