improve load balancing at higher deployment query volume #933

Open
Theodus opened this issue Sep 5, 2024 · 0 comments
For high-volume deployments, the gateway tends to pick multiple indexers (often 3) to maximize performance. The primary issue with this behavior is that we run a higher risk of unnecessarily overloading indexers. Above some threshold, indexer-selection should instead load-balance requests between the indexers that would otherwise all be included in the selected set.

The first challenge is detecting high volume on a deployment. Here's a rough design:

The gateway should track query volume per subgraph deployment. This would likely be a parking_lot::RwLock<HashMap<DeploymentId, AtomicUsize>>. Inserting into the map should be relatively infrequent, and updating an existing entry only requires a read lock. The atomic counter is incremented by the number of indexers selected, and decremented once as each indexer request completes. A deployment is in a "high volume" state when this counter is above some threshold n, meaning there are approximately n or more indexer requests outstanding concurrently.
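Here is a minimal sketch of that counter, not the gateway's actual code. It assumes a `String` stand-in for `DeploymentId`, swaps in `std::sync::RwLock` for `parking_lot::RwLock` so it compiles without dependencies, and uses hypothetical names (`VolumeTracker`, `add_outstanding`, `complete_one`, `is_high_volume`).

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

// Assumption: stand-in for the gateway's real deployment ID type.
pub type DeploymentId = String;

/// Hypothetical tracker of outstanding indexer requests per deployment.
pub struct VolumeTracker {
    counters: RwLock<HashMap<DeploymentId, AtomicUsize>>,
    high_volume_threshold: usize,
}

impl VolumeTracker {
    pub fn new(high_volume_threshold: usize) -> Self {
        Self {
            counters: RwLock::new(HashMap::new()),
            high_volume_threshold,
        }
    }

    /// Record `selected` new indexer requests and return the new total.
    pub fn add_outstanding(&self, deployment: &DeploymentId, selected: usize) -> usize {
        {
            // Common case: the entry exists, so a read lock is enough.
            let map = self.counters.read().unwrap();
            if let Some(counter) = map.get(deployment) {
                return counter.fetch_add(selected, Ordering::Relaxed) + selected;
            }
        } // read lock released here
        // Rare case: first request for this deployment, take the write lock.
        let mut map = self.counters.write().unwrap();
        let counter = map
            .entry(deployment.clone())
            .or_insert_with(|| AtomicUsize::new(0));
        counter.fetch_add(selected, Ordering::Relaxed) + selected
    }

    /// Record that one indexer request finished (saturates at zero).
    pub fn complete_one(&self, deployment: &DeploymentId) {
        let map = self.counters.read().unwrap();
        if let Some(counter) = map.get(deployment) {
            let _ = counter.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |n| {
                Some(n.saturating_sub(1))
            });
        }
    }

    /// Current number of outstanding indexer requests for a deployment.
    pub fn outstanding(&self, deployment: &DeploymentId) -> usize {
        let map = self.counters.read().unwrap();
        map.get(deployment)
            .map(|c| c.load(Ordering::Relaxed))
            .unwrap_or(0)
    }

    /// "High volume" means strictly above the configured threshold.
    pub fn is_high_volume(&self, deployment: &DeploymentId) -> bool {
        self.outstanding(deployment) > self.high_volume_threshold
    }
}
```

Entries are never removed in this sketch, so a real implementation would probably also want to prune deployments whose counter has returned to zero.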

There are multiple potential approaches for what to do when we detect high volume on a deployment. Here's a list that increases in difficulty, and might be a reasonable order of iterations to work through until we hit "good enough for now":

  1. When the deployment is "high volume", call indexer_selection::select with a limit of 1 instead of 3.
  2. Add a parameter to indexer_selection::select that acts as a cost to including additional indexers in the selected set. This value should increase at higher volume.
  3. Use a proper load-balancing algorithm between the selected indexers, see this for inspiration: https://samwho.dev/load-balancing/.
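
To make options 1 and 3 concrete, here is a rough sketch under stated assumptions. It does not call indexer_selection::select, whose real signature is not shown here; `selection_limit`, `Candidate`, and `pick_least_loaded` are hypothetical names, and "least outstanding requests" is just one of several load-balancing strategies that could fill option 3.

```rust
/// Option 1 (sketch): choose the limit to pass to indexer selection.
/// The values 1 and 3 come from the list above; `threshold` is an assumption.
pub fn selection_limit(outstanding: usize, threshold: usize) -> usize {
    if outstanding > threshold {
        1
    } else {
        3
    }
}

/// Hypothetical view of an indexer that selection would otherwise include.
pub struct Candidate {
    pub id: &'static str,
    /// Requests currently in flight to this indexer.
    pub outstanding: usize,
}

/// Option 3 (sketch): send the request to the candidate with the fewest
/// requests in flight. On ties, the earliest candidate in the slice wins.
pub fn pick_least_loaded(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates.iter().min_by_key(|c| c.outstanding)
}
```

Option 2 sits between these two: rather than a hard switch from 3 to 1, the cost of adding each extra indexer would grow with the deployment's outstanding-request count.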