llama-cli CPU Control | Pin to Physical Cores #9996

Answered by bmtwl
Allan-Luu asked this question in Q&A

You're going to want to use numactl to give you fine-grained control over where the threads execute:
numactl -N0-7 -m0-7 /path/to/llama-cli -m /path/to/model.gguf -t 8 --numa numactl
You can check the layout of your system with numactl -H, so that you get the behaviour you want when assigning NUMA nodes.
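The command above binds at NUMA-node granularity. If you want one thread per physical core instead, a rough sketch is to pick one logical CPU per core and pass that list to numactl -C (--physcpubind). The helper name first_cpu_per_core and the sample topology below are made up for illustration; the sketch assumes input in the format printed by lscpu -p=CPU,CORE,SOCKET, and it assumes the chosen CPUs all belong to node 0. It only prints the resulting command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: read `lscpu -p=CPU,CORE,SOCKET` style lines on stdin
# and print one logical CPU per physical core as a comma-separated list.
first_cpu_per_core() {
    awk -F, '
        /^#/ { next }                 # skip the comment header lscpu emits
        !seen[$3 ":" $2]++ {          # first CPU seen for this socket:core pair
            list = list (list == "" ? "" : ",") $1
        }
        END { print list }
    '
}

# Made-up sample topology so the sketch runs without lscpu or numactl:
# one socket, two cores, two hardware threads per core.
sample='# CPU,Core,Socket
0,0,0
1,1,0
2,0,0
3,1,0'

cpus=$(printf '%s\n' "$sample" | first_cpu_per_core)

# Dry run: print the command instead of executing it.
# -t 2 matches the two CPUs selected from the sample; -m0 assumes they sit on node 0.
echo "numactl -C $cpus -m0 /path/to/llama-cli -m /path/to/model.gguf -t 2 --numa numactl"
```

On a real machine you would feed the helper live data with lscpu -p=CPU,CORE,SOCKET | first_cpu_per_core, then compare the result against numactl -H before setting -m and -t.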

Answer selected by Allan-Luu