
Count seems to be off by 5-8 tokens #5

Open
drorm opened this issue Mar 16, 2023 · 6 comments

Comments

@drorm

drorm commented Mar 16, 2023

Came here from openai/openai-node#18, and I'm using this to count tokens when streaming.
It seems to be off by 5-8 tokens in either direction compared to the count the official API returns when used without streaming.
I haven't done a very deep analysis, just gave it a few prompts:

  • tell me a joke
  • tell me a n sentence story. (where n is between 5 and 12)
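One plausible source of small drift (not confirmed in this thread) is that encoding each streamed chunk separately and summing the counts can disagree with encoding the full text once, because a BPE encoder may merge characters across a chunk boundary. A toy sketch with a made-up single merge rule, not the real GPT tokenizer:

```javascript
// Toy tokenizer (NOT the real BPE): the only merge rule is that the pair
// "ab" becomes one token; every other character is its own token.
function toyEncode(text) {
  const tokens = [];
  let i = 0;
  while (i < text.length) {
    if (text.slice(i, i + 2) === "ab") { tokens.push("ab"); i += 2; }
    else { tokens.push(text[i]); i += 1; }
  }
  return tokens;
}

// Encoding each streamed chunk separately can split a merge across the
// boundary, so the summed per-chunk count exceeds the whole-text count.
const chunks = ["xa", "bx"];
const perChunk = chunks.reduce((n, c) => n + toyEncode(c).length, 0);
const whole = toyEncode(chunks.join("")).length;
console.log(perChunk, whole); // 4 3 -- per-chunk sum is larger
```

Buffering the full response and encoding it once avoids this particular source of drift.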
@syonfox
Owner

syonfox commented Mar 17, 2023

That info is mostly in the demo app. I would probably make a new repository for it.

But I will add those files to npm so that it is possible, if not ideal.

// Node 18+ is required for fetch :(
// npm update gptoken  (needs 0.1.0; edit package.json if you pin a version)

// After editing .env and installing dotenv, use this code to get started:

require('dotenv').config();
// process.env.OPENAI_API_KEY = "..."

if (!process.env.OPENAI_API_KEY) console.error("OPENAI_API_KEY is required; npm install dotenv; nano .env; require('dotenv').config()");

let streamOne = require("gptoken/demo_app/streamOne");

// streamOne.test()
// streamOne(model, prompt/messages, onToken, onData, onError, options)
// see code

function onData(token, msg, parsed, response) {
    console.log(msg.content);
}

let msgs = [{role: "system", content: "whats the best cat?"}];
streamOne("gpt-3.5-turbo", msgs, onData);






This is my current code; feel free to make some edits and improve it here in the demo app.

TODO: add a check for missing options; right now it requires the callback functions to exist.

@syonfox
Owner

syonfox commented Mar 17, 2023

https://github.com/syonfox/GPT-3-Encoder/blob/GPToken/demo_app/streamOne.js

Also, what token estimator are you using? Do you have an example of the tokenized output?

We could write a test to compare them to what is expected.

@drorm
Author

drorm commented Mar 18, 2023

I compare it to the result given by the API when not streaming.
When I make the request, I get both counts and show the results. It's typically between -2% and +2%, and the main purpose is to get an idea of costs. Since 3.5 is so cheap these days, I really don't think it's worth spending a lot of time on this.
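For cost estimation, an approximate count really is enough. A minimal sketch; the $0.002-per-1K-tokens rate was gpt-3.5-turbo's published price around the time of this thread and is an assumption here, so check current pricing:

```javascript
// Rough dollar cost from a token count. The rate is an assumption:
// gpt-3.5-turbo was priced at $0.002 per 1K tokens in early 2023.
const USD_PER_1K_TOKENS = 0.002;

function estimateCostUSD(tokenCount, ratePer1K = USD_PER_1K_TOKENS) {
  return (tokenCount / 1000) * ratePer1K;
}

// Being off by 5-8 tokens moves the estimate by about a thousandth of a cent:
console.log(estimateCostUSD(500)); // 0.001
console.log(estimateCostUSD(508));
```

At that scale, a 5-8 token error in either direction is negligible for billing purposes, which matches the conclusion above.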

@syonfox
Owner

syonfox commented Mar 20, 2023

Si, I agree that the most useful and original function is just to get a quick estimate. It would be useful if it was compatible with some of the guts.

Anyway, yeah, I think the request can give accurate token counts, but a few cents here and there never hurt much.

@syonfox
Owner

syonfox commented Mar 20, 2023

This is probably due to issue #6
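If issue #6 concerns the chat message framing, that would explain a roughly constant offset: the chat API wraps every message in role and separator tokens that a plain encode of the content never sees. A hedged sketch of the widely circulated accounting for gpt-3.5-turbo (about 4 extra tokens per message plus 3 to prime the reply; the exact constants are model-dependent and not guaranteed), with the tokenizer injected as a parameter so the sketch stays self-contained:

```javascript
// Estimate chat-completion prompt tokens from per-message content counts.
// countTokens is any function mapping a string to a token count (e.g. the
// encoder from this repo). The overhead constants are approximations.
const TOKENS_PER_MESSAGE = 4; // role/separator framing around each message
const REPLY_PRIMING = 3;      // tokens that prime the assistant's reply

function estimateChatTokens(messages, countTokens) {
  let total = REPLY_PRIMING;
  for (const m of messages) {
    total += TOKENS_PER_MESSAGE + countTokens(m.role) + countTokens(m.content);
  }
  return total;
}

// With a dummy one-token-per-word counter, a single short message already
// carries ~7 tokens of framing overhead -- the same order of magnitude as
// the 5-8 token discrepancy reported above.
const words = (s) => s.trim().split(/\s+/).length;
const msgs = [{ role: "user", content: "tell me a joke" }];
console.log(estimateChatTokens(msgs, words)); // 12 = 3 + 4 + 1 + 4
```

Counting only the content tokens would therefore undercount each request by a handful of tokens, independent of prompt length.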

@drorm
Author

drorm commented Mar 27, 2023

Just published https://github.com/drorm/gish, and you can see in the screencast the token count displayed while streaming. Thank you.
