Fix todo: avoid relying on `logits_all == true` in `perplexity_v2` #9102
Hi, I wrote that comment in #6122, but after studying Though it's true that it could avoid using For the strided perplexity calculation (aka I see in b0c6ad7 that you've simply made it ask for all outputs, same as using If you do figure out which outputs to keep, know that the logits in the buffer returned by Hopefully this helps!
@compilade Thanks a lot for your suggestion! I will carefully consider your advice and propose changes related to In the current PR, I only plan to modify the
Changes
From #9037 (comment) and some related comments in `perplexity.cpp`, I noticed that `logits_all` seems to be deprecated? So I made the following changes:
- `llama_batch.logits`: avoid relying on `logits_all == true` when running in function `perplexity_v2`.
- `llama_batch.logits` to `llama_batch.output`
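The idea of requesting outputs per token, instead of relying on `logits_all == true`, can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `BatchSketch`, `make_batch`, `count_outputs`, and the `first_output` index are hypothetical names, and the sketch assumes that only positions from `first_output` onward need logits. In llama.cpp itself the corresponding field is `llama_batch.logits`, an `int8_t` flag per token.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-token output flags of llama_batch.
// It only models the fields needed for this illustration.
struct BatchSketch {
    std::vector<int32_t> token;   // token ids
    std::vector<int8_t>  logits;  // 1 = compute logits at this position, 0 = skip
};

// Build a batch in which only positions >= first_output request logits.
// Positions before first_output act purely as context (an assumption made
// for this sketch, not a statement about what perplexity_v2 scores).
static BatchSketch make_batch(const std::vector<int32_t> & tokens, std::size_t first_output) {
    assert(first_output <= tokens.size());
    BatchSketch batch;
    batch.token = tokens;
    batch.logits.assign(tokens.size(), 0);
    for (std::size_t i = first_output; i < tokens.size(); ++i) {
        batch.logits[i] = 1;
    }
    return batch;
}

// Count how many positions requested an output.
static std::size_t count_outputs(const BatchSketch & batch) {
    std::size_t n = 0;
    for (int8_t flag : batch.logits) {
        n += (flag != 0);
    }
    return n;
}
```

Setting every flag to 1 would be equivalent in effect to asking for all outputs; the point of per-token flags is that a caller can skip positions it does not score.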
Test Platform
Linux 6.5.0-41-generic #41~22.04.2-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 3 11:32:55 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Test Command
Test Results
Before
After
Others
If these changes are acceptable to the community, I'd like to modify other places that depend on `logits_all == true`, such as:
- llama.cpp/examples/perplexity/perplexity.cpp, lines 1797 to 1801 in 90db814
- llama.cpp/examples/imatrix/imatrix.cpp, lines 516 to 520 in 90db814