This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
Runs out of ggml context's memory pool on larger models #115
Labels
issue:bug
Something isn't working
Running LLaMA 30B and 65B models:
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 1073742848, available 1073741824)
followed by a segmentation fault.
To reproduce, run the following (apologies for the silly prompt, mostly taken from the llama.cpp examples; it's close to what my current project "poop-gpt" uses):