Context splitting #4269

Closed
candre23 opened this issue Nov 30, 2023 · 3 comments
Labels
enhancement New feature or request stale

Comments

@candre23

Feature Description

Add the ability to split context between multiple GPUs, much as model layers can currently be split.

Motivation

Currently, with multi-GPU setups, llama.cpp (LCPP) only stores and processes context on the "first" GPU. This is fine for most models, which can only handle 4k context tokens natively (or double that with RoPE scaling). But as more and more large-context models are released, this limitation is becoming an issue. For example, the new Yi 34B 200K models are limited to however much context fits on the first GPU alone (64k in the case of a 24GB card), regardless of total VRAM available. If context could be split across multiple cards, a larger context window could be used.
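For a rough sense of scale, here is a back-of-the-envelope sketch of how the KV cache (the stored context) grows with context length. It uses the usual estimate of one key vector and one value vector per layer per token. The function name `kv_cache_bytes` is made up for illustration, and the model shape (60 layers, 8 KV heads, head dimension 128, f16 cache) is an assumption, not something taken from llama.cpp or confirmed in this thread. Actual usage will also include compute buffers and depends on the cache type.

```python
def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    Each token stores one K and one V vector (hence the factor 2) per layer,
    each of size n_kv_heads * head_dim elements.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx


# Assumed, illustrative model shape (not taken from the issue).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 8, 128

for n_ctx in (4096, 65536, 200_000):
    gib = kv_cache_bytes(n_ctx, N_LAYERS, N_KV_HEADS, HEAD_DIM) / 2**30
    print(f"{n_ctx:>7} tokens -> {gib:.2f} GiB of KV cache")
```

Under these assumptions the cache grows linearly with context length, so a cache that has to live entirely on one card caps the usable context no matter how much VRAM the other cards have.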

@candre23 candre23 added the enhancement New feature or request label Nov 30, 2023
@BarfingLemurs
Contributor

#3457

@NXTler

NXTler commented Dec 24, 2023

This problem becomes even more apparent with more, smaller GPUs. I, for instance, have two Tesla K80s, meaning that I have 4x 12GB of VRAM. When running a large model like Dolphin-mixtral-2.6 Q5, I can only utilize up to 36GB of the theoretical 48GB, because the first GPU runs out of memory to store context.

@github-actions github-actions bot added the stale label Mar 19, 2024
Contributor

github-actions bot commented Apr 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 3, 2024