This issue was moved to a discussion.

Add Unsupported Languages to Base Model #3636

Closed
pourmand1376 opened this issue Aug 6, 2023 · 0 comments

Comments

Contributor

pourmand1376 commented Aug 6, 2023

Yesterday, I was talking to @andreaskoepf on Discord about how to add a new language to a base LLM.

Today I saw this comment from @somerandomguyontheweb:

Hi @pourmand1376, sorry for a slightly off-topic question: could you please share any details on how your friend managed to fine-tune LLaMA on a text-only dataset, without instructions? I'm interested in doing the same thing with Belarusian Wikipedia, but so far I've only seen tutorials on how to instruct-tune LLaMA, and Wikipedia articles as such don't contain clearly delimited prompts and responses. Could you please briefly describe the approach?
Thanks in advance for any comments.

It seems that there are others like me who would like to fine-tune LLMs for unsupported languages like Persian.

This can be the place to discuss it. As for the question above, I only know that he used this repository as the base and changed a lot of things to make it work. I will ask him for further details.

However, I think this repo could also be used for training base LLMs.

I think we need a clear guide for people like me on how to do this. From what I've seen so far, the Open-Assistant team has done a great job on supervised fine-tuning (SFT), but there seems to be no code for fine-tuning base LLMs on other languages.

@LAION-AI LAION-AI locked and limited conversation to collaborators Aug 6, 2023
@olliestanley olliestanley converted this issue into discussion #3639 Aug 6, 2023
