whisper : improve handling of prompts #1981

ggerganov · 2024-03-21T06:03:49Z

fix #1960, #1961, #1962

Changed whisper_tokenize() to return the negated number of required tokens when the token buffer is too small. This can be used to determine how many tokens a given text needs:

int n_needed = -whisper_tokenize(ctx, "some text", NULL, 0);
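The same convention supports a two-call pattern: query the required size first, then allocate exactly that many tokens and tokenize again. The sketch below is self-contained and does not link against whisper.cpp. `toy_tokenize` is a hypothetical whitespace tokenizer that copies only the return convention described above (token count on success, negated required count when the buffer is too small), and `tokenize_exact` is an illustrative helper name, not part of the library.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for whisper_tokenize(): splits on whitespace and emits
// one fake token id per word. Only the return convention mirrors the PR:
//   >= 0 : number of tokens written
//   <  0 : buffer too small, value is the negated number of tokens required
static int toy_tokenize(const std::string & text, int * tokens, int n_max_tokens) {
    std::istringstream iss(text);
    std::vector<int> ids;
    std::string word;
    while (iss >> word) {
        ids.push_back(static_cast<int>(word.size())); // fake id: the word length
    }

    const int n = static_cast<int>(ids.size());
    if (n > n_max_tokens) {
        return -n; // caller can negate this to learn the required size
    }
    for (int i = 0; i < n; ++i) {
        tokens[i] = ids[i];
    }
    return n;
}

// Two-call pattern: first ask for the size, then tokenize into an exact-fit buffer.
static std::vector<int> tokenize_exact(const std::string & text) {
    const int n_needed = -toy_tokenize(text, nullptr, 0);
    std::vector<int> tokens(n_needed > 0 ? n_needed : 0);
    if (n_needed > 0) {
        const int n_written = toy_tokenize(text, tokens.data(), n_needed);
        assert(n_written == n_needed);
        (void) n_written; // silence unused warning when NDEBUG is defined
    }
    return tokens;
}
```

With the real API the shape would presumably be the same, with `whisper_token` as the element type and the context passed as the first argument.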

Also improved the docs to clarify that Whisper models can process prompts of at most n_text_ctx/2 tokens, which is 224 tokens. There is no point in providing longer prompts, as they would be truncated:

whisper.cpp/whisper.cpp

Lines 5474 to 5480 in 48a1452

    
    // if we have already generated some text, use it as a prompt to condition the next generation
    if (!prompt_past.empty() && t_cur < 0.5f && params.n_max_text_ctx > 0) {
        int n_take = std::min(std::min(params.n_max_text_ctx, whisper_n_text_ctx(ctx)/2), int(prompt_past.size()));

        prompt = { whisper_token_prev(ctx) };
        prompt.insert(prompt.begin() + 1, prompt_past.end() - n_take, prompt_past.end());
    }
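To make the truncation concrete, here is a minimal standalone sketch of the same tail-selection logic on plain integer tokens. It assumes n_text_ctx = 448 (implied by the 224 figure above) and uses -1 as a placeholder for the value of whisper_token_prev(ctx). `build_prompt` is a hypothetical helper name, and the `t_cur < 0.5f` condition from the snippet is left out for brevity.

```cpp
#include <algorithm>
#include <vector>

// Builds the conditioning prompt from the most recent tokens only.
// At most min(n_max_text_ctx, n_text_ctx/2) tokens are taken from the END of
// prompt_past; anything earlier is dropped.
static std::vector<int> build_prompt(const std::vector<int> & prompt_past,
                                     int n_max_text_ctx,
                                     int n_text_ctx,
                                     int token_prev) {
    std::vector<int> prompt;
    if (prompt_past.empty() || n_max_text_ctx <= 0) {
        return prompt; // nothing to condition on
    }

    const int n_take = std::min(std::min(n_max_text_ctx, n_text_ctx/2),
                                int(prompt_past.size()));

    prompt.push_back(token_prev); // placeholder for the "previous text" marker token
    prompt.insert(prompt.end(), prompt_past.end() - n_take, prompt_past.end());
    return prompt;
}
```

Under these assumptions, a 300-token history yields a prompt of 225 entries: the marker token followed by the last 224 tokens of the history. The first 76 tokens never reach the model.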

ggerganov · 2024-03-21T16:54:37Z

@sindresorhus lmk if this change would work for you

sindresorhus · 2024-03-24T17:24:51Z

Thanks for looking into this. That would work, although I still think wrapping it up into a separate function would make for a better API and make it more discoverable for consumers. I would never have thought of using whisper_tokenize like that.

// Some docs
int whisper_token_count(struct whisper_context * ctx, const char * text) {
   return -whisper_tokenize(ctx, text, NULL, 0);
}

sindresorhus · 2024-03-24T17:25:40Z

fix #1960, #1961, #1962

By the way, you need to make that:

fix #1960, fix #1961, fix #1962

For GitHub to do the right thing and close them all when this PR is merged.

sindresorhus · 2024-03-24T17:29:50Z

examples/main/main.cpp

@@ -207,7 +207,7 @@ void whisper_print_usage(int /*argc*/, char ** argv, const whisper_params & para
    fprintf(stderr, "  -nt,       --no-timestamps     [%-7s] do not print timestamps\n",                        params.no_timestamps ? "true" : "false");
    fprintf(stderr, "  -l LANG,   --language LANG     [%-7s] spoken language ('auto' for auto-detect)\n",       params.language.c_str());
    fprintf(stderr, "  -dl,       --detect-language   [%-7s] exit after automatically detecting language\n",    params.detect_language ? "true" : "false");
-    fprintf(stderr, "             --prompt PROMPT     [%-7s] initial prompt\n",                                 params.prompt.c_str());
+    fprintf(stderr, "             --prompt PROMPT     [%-7s] initial prompt (max n_text_ctx/2 tokens)\n",       params.prompt.c_str());


I would document the 224 number here for quick reference.

Same in

whisper.cpp/whisper.h

Line 507 in 5c2c07d

// maximum of whisper_n_text_ctx()/2 tokens are used

whisper : improve handling of prompts

5c2c07d

sindresorhus reviewed Mar 24, 2024

whisper : add whisper_token_count helper

ba69578

ggerganov merged commit 1558ec5 into master Mar 25, 2024
6 checks passed

jiahansu pushed a commit to WiseSync/whisper.cpp that referenced this pull request Apr 17, 2024

whisper : improve handling of prompts (ggerganov#1981)

6bbbccc

* whisper : improve handling of prompts * whisper : add whisper_token_count helper

viktor-silakov pushed a commit to viktor-silakov/whisper_node_mic.cpp that referenced this pull request May 11, 2024

whisper : improve handling of prompts (ggerganov#1981)

39b57a4

* whisper : improve handling of prompts * whisper : add whisper_token_count helper

iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024

whisper : improve handling of prompts (ggerganov#1981)

41707ab

* whisper : improve handling of prompts * whisper : add whisper_token_count helper

iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request Sep 23, 2024

whisper : improve handling of prompts (ggerganov#1981)

260eeb8

* whisper : improve handling of prompts * whisper : add whisper_token_count helper
