
stream2sentence output is sometimes removing the space between words from input tokens #5

Closed
ekcrisp opened this issue Oct 31, 2024 · 6 comments


@ekcrisp

ekcrisp commented Oct 31, 2024

I am using llama-cpp-python; my generator yields one token at a time and passes it into stream2sentence. Sometimes words are combined in the output sentences. I am using the default settings with the nltk tokenizer. Notice that "thingwas" and "nervouslychuckles" appear as single words in the output below. I confirmed the spaces were present in the input tokens (using Llama 3B Instruct). I am seeing this roughly once every 50-100 tokens, and I haven't noticed a pattern to when it occurs. I can provide code to reproduce later if this isn't a known issue, and if you point me in the right direction I can try to fix it myself.

Sentence 4: That thingwas older than my aunt from Quebec, which is saying something, right?
Sentence 5: (nervouslychuckles once more) Anyway, that was the oldest car I've ever seen near the border of Canada, and I'm glad I got to see it...

@KoljaB
Copy link
Owner

KoljaB commented Oct 31, 2024

Will look into that; code to reproduce would be awesome.

@ekcrisp
Author

ekcrisp commented Nov 1, 2024

I'm running this on a Raspberry Pi 5; it seems to happen every 5 sentences or so. Thanks for taking a look.


import random
from llama_cpp import Llama
from stream2sentence import generate_sentences

chat_input = '''
<|system|>
You are a creative writer who is interested in nature. You have traveled the world and have many stories to tell.
</s>
<|user|>
Where have you traveled recently?
</s>
<|assistant|>
'''

llm = Llama(
    model_path='./Llama-3.2-3B-Instruct-Q8_0.gguf',
    n_ctx=4096,
    n_threads=4,
    verbose=False
)

# Yield the raw text of each streamed token, one at a time.
def output_generator():
    for output in llm(
        chat_input,
        stream=True,
        seed=random.randint(1, 1000000),
        max_tokens=1000
    ):
        yield output['choices'][0]['text']

# Feed the token stream into stream2sentence and print each detected sentence.
for idx, sentence in enumerate(
    generate_sentences(
        output_generator()
    ), start=1):
    print(f"Sentence {idx}: {sentence}")

@davidchi31415

I am also experiencing this issue. Any updates?

@KoljaB
Owner

KoljaB commented Nov 7, 2024

Thanks for reporting. Should be fixed now in v0.2.7. Feedback would be awesome.
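
After upgrading (e.g. via pip install --upgrade stream2sentence), a quick way to confirm the fixed version is installed, assuming the distribution name matches the PyPI package name:

from importlib.metadata import version

# Expect '0.2.7' or newer for the word-spacing fix
print(version("stream2sentence"))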

@davidchi31415

Yes, it seems perfect now! Thank you so much for this awesome library.

@ekcrisp
Author

ekcrisp commented Nov 8, 2024

Issue is fixed, thanks for updating.

ekcrisp closed this as completed on Nov 8, 2024