
Support StableVicuna #829

Closed
iRanadheer opened this issue May 4, 2023 · 4 comments · Fixed by #2696
Labels
good first issue Good for newcomers

Comments

@iRanadheer

I've seen that you support the OpenAI 3.5 Turbo model, but I couldn't find how to call it using an API. Perhaps I missed the instructions?

Also, do you have any plans to support the StableVicuna model?

@rwl4

rwl4 commented May 5, 2023

The reference to GPT-3.5 Turbo you saw is actually a model conversation style that could be used with any model, and it isn't finished yet. I added full support for fine-tuning models using a subset of OpenAI's ChatML here: #644

@merrymercy
Member

For gpt-3.5 / gpt-4, please use these options:

parser.add_argument(
    "--add-chatgpt", action="store_true",
    help="Add OpenAI's ChatGPT models (gpt-3.5-turbo, gpt-4)")
parser.add_argument(
    "--add-claude", action="store_true",
    help="Add Anthropic's Claude models (claude-v1)")

For StableVicuna, please help us add it: https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model

@iRanadheer
Author

@merrymercy

The process of adding models to the system is not difficult, but it becomes complicated because each model has its own end-of-sequence token that it was trained with. To solve this, a template for each version of the model can be created. For example, since the StableVicuna model was trained using the Vicuna v0 format, adding a v0 template should solve the issue, as long as the model-name argument does not contain the string "vicuna"; otherwise a different template (v1.1) will be selected. See below:

def get_default_conv_template(model_name):
    model_name = model_name.lower()
    if "vicuna" in model_name or "output" in model_name:
        return conv_vicuna_v1_1

For the wizard-vicuna-13b model, since it was created with v1.1, we can use the existing template.
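The ordering pitfall described above can be sketched self-containedly; the template names returned here are hypothetical stand-ins for FastChat's conversation objects:

```python
# Sketch of the dispatch problem: substring checks must go from most to
# least specific, or a name like "stable-vicuna-13b" falls through to the
# generic "vicuna" branch and gets the wrong (v1.1) template.

def get_default_conv_template(model_name):
    model_name = model_name.lower()
    # Check the more specific name first.
    if "stable-vicuna" in model_name or "stablevicuna" in model_name:
        return "conv_vicuna_v0"   # StableVicuna was trained on the v0 format
    if "vicuna" in model_name or "output" in model_name:
        return "conv_vicuna_v1_1"
    return "conv_one_shot"        # hypothetical fallback
```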

@merrymercy
Member

merrymercy commented May 8, 2023

We did some refactoring to make adding new models easier.
Please help us add support for wizard-vicuna-13b and StableVicuna. You can see #1019 for an example.
For the name conflict, you can register a StableVicunaAdapter model adapter before the VicunaAdapter, so StableVicuna has higher priority. Then StableVicuna will get the correct template.
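A simplified sketch of the registration-order idea, assuming a first-match-wins adapter list as in FastChat's model_adapter pattern (the class and method names here are simplified, not the library's exact API):

```python
# First-match-wins adapter registry: adapters are tried in registration
# order, so the more specific StableVicunaAdapter must come first.

model_adapters = []

def register_model_adapter(cls):
    model_adapters.append(cls())

class StableVicunaAdapter:
    def match(self, model_path):
        return "stable-vicuna" in model_path.lower()
    def default_conv_template(self):
        return "vicuna_v0"      # hypothetical template name

class VicunaAdapter:
    def match(self, model_path):
        return "vicuna" in model_path.lower()
    def default_conv_template(self):
        return "vicuna_v1.1"    # hypothetical template name

# Order matters: registering StableVicunaAdapter first gives it priority.
register_model_adapter(StableVicunaAdapter)
register_model_adapter(VicunaAdapter)

def get_adapter(model_path):
    for adapter in model_adapters:
        if adapter.match(model_path):
            return adapter
    raise ValueError(f"No adapter for {model_path}")
```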

@merrymercy merrymercy changed the title Sample Code for OpenAI Turbo Model and StableVicuna Support StableVicuna May 8, 2023
@merrymercy merrymercy added the good first issue Good for newcomers label May 8, 2023