
Support StableVicuna #829

Closed
iRanadheer opened this issue May 4, 2023 · 4 comments · Fixed by #2696
Labels
good first issue Good for newcomers

Comments

@iRanadheer

I've seen that you support the OpenAI 3.5 Turbo model, but I couldn't find how to call it using an API. Perhaps I missed the instructions?

Also, do you have any plans to support the StableVicuna model?

@rwl4

rwl4 commented May 5, 2023

The reference to GPT-3.5 Turbo you saw is actually a model conversation style that could be used with any model, and it isn't finished yet. I added full support for fine-tuning models using a subset of OpenAI's ChatML here: #644

@merrymercy
Member

For gpt-3.5 / gpt-4, please use these options:

parser.add_argument(
    "--add-chatgpt", action="store_true",
    help="Add OpenAI's ChatGPT models (gpt-3.5-turbo, gpt-4)")
parser.add_argument(
    "--add-claude", action="store_true",
    help="Add Anthropic's Claude models (claude-v1)")

For StableVicuna, please help us add it: https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model

@iRanadheer
Author

@merrymercy

The process of adding models to the system is not difficult, but it becomes complicated because each model has its own end-of-sequence token that it was trained with. To solve this, a template for each version of the model can be created. For example, since the StableVicuna model was trained using the Vicuna v0 format, adding a v0 template should solve the issue, as long as the model-name argument does not contain the string "vicuna"; otherwise a different template (v1.1) will be selected. See below:

def get_default_conv_template(model_name):
    model_name = model_name.lower()
    if "vicuna" in model_name or "output" in model_name:
        return conv_vicuna_v1_1

For the wizard-vicuna-13b model, since it was created with v1.1, we can use the existing template.
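The ordering pitfall described above can be sketched self-containedly; the template names returned here are hypothetical stand-ins for FastChat's conversation objects:

```python
# Sketch of the dispatch problem: substring checks must go from most to
# least specific, or a name like "stable-vicuna-13b" falls through to the
# generic "vicuna" branch and gets the wrong (v1.1) template.

def get_default_conv_template(model_name):
    model_name = model_name.lower()
    # Check the more specific name first.
    if "stable-vicuna" in model_name or "stablevicuna" in model_name:
        return "conv_vicuna_v0"   # StableVicuna was trained on the v0 format
    if "vicuna" in model_name or "output" in model_name:
        return "conv_vicuna_v1_1"
    return "conv_one_shot"        # hypothetical fallback
```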

@merrymercy
Member

merrymercy commented May 8, 2023

We did some refactoring to make adding new models easier.
Please help us add support for wizard-vicuna-13b and StableVicuna. You can see #1019 for an example.
For the name conflict, you can register a StableVicunaAdapter model adapter before the VicunaAdapter, so StableVicuna has higher priority. Then StableVicuna will get the correct template.
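A simplified sketch of the registration-order idea, assuming a first-match-wins adapter list as in FastChat's model_adapter pattern (the class and method names here are simplified, not the library's exact API):

```python
# First-match-wins adapter registry: adapters are tried in registration
# order, so the more specific StableVicunaAdapter must come first.

model_adapters = []

def register_model_adapter(cls):
    model_adapters.append(cls())

class StableVicunaAdapter:
    def match(self, model_path):
        return "stable-vicuna" in model_path.lower()
    def default_conv_template(self):
        return "vicuna_v0"      # hypothetical template name

class VicunaAdapter:
    def match(self, model_path):
        return "vicuna" in model_path.lower()
    def default_conv_template(self):
        return "vicuna_v1.1"    # hypothetical template name

# Order matters: registering StableVicunaAdapter first gives it priority.
register_model_adapter(StableVicunaAdapter)
register_model_adapter(VicunaAdapter)

def get_adapter(model_path):
    for adapter in model_adapters:
        if adapter.match(model_path):
            return adapter
    raise ValueError(f"No adapter for {model_path}")
```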

@merrymercy merrymercy changed the title Sample Code for OpenAI Turbo Model and StableVicuna Support StableVicuna May 8, 2023
@merrymercy merrymercy added the good first issue Good for newcomers label May 8, 2023