Experimental: Introduce a pool of query planners #4897
Conversation
CI performance tests
This change addresses contention we didn't see before the pool of planners was introduced. Borrow semantics and locks make for a surprising pattern where a lock is held a bit too long. This changeset addresses it, and we expect a performance boost at scale / under heavy load.
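As a hypothetical illustration of that pattern (not the router's actual code): a `MutexGuard` that lives longer than needed keeps other callers waiting, and scoping the borrow releases the lock before the expensive work runs.

```rust
use std::sync::Mutex;

// Holding the guard across the expensive call keeps other callers blocked.
fn plan_with_lock_held(planners: &Mutex<Vec<String>>, query: &str) -> usize {
    let guard = planners.lock().unwrap();
    // `guard` lives until the end of the function, so `expensive_planning`
    // runs while the lock is still held.
    expensive_planning(query) + guard.len()
}

// Scoping the borrow drops the guard before the expensive work starts.
fn plan_with_short_lock(planners: &Mutex<Vec<String>>, query: &str) -> usize {
    let count = {
        let guard = planners.lock().unwrap();
        guard.len()
    }; // guard dropped here
    expensive_planning(query) + count
}

// Stand-in for real query planning work.
fn expensive_planning(query: &str) -> usize {
    query.len()
}

fn main() {
    let planners = Mutex::new(vec!["planner-1".to_string()]);
    println!("{}", plan_with_short_lock(&planners, "{ me { id } }"));
}
```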
could you mention it in the docs?
I am not sure `auto` will be the best option here. Tokio already spawns threads according to the available capacity, so if at the same time there are as many planner threads as there are cores, then we risk not having any capacity left to handle requests, because the planners do blocking work.
This seems to be different from the approach of using web workers that you were discussing earlier in the week. Was there a problem with that approach? I'm asking because I'm curious, but I think this is a better approach anyway.
One other thing worth asking: instead of managing a queue using `async_channel`, maybe put some kind of adaptive, load-shedding queuing model in front of the pool? e.g. https://crates.io/crates/little-loadshedder, and expose the configuration so that users can express a queue wait time in configuration?
It turns out both approaches are equivalent in terms of runtime capabilities. Thankfully, a v8 runtime is initialized in a static :D The good news is we should be able to drop in any planner implementation in the future.
This could be worth considering as a followup. I'm fairly happy with the MPMC approach since workers decide to pick up new jobs as soon as they're ready to deal with them.
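For reference, a minimal sketch of that MPMC shape (hypothetical names, not the actual router code), using the `async-channel` and `tokio` crates: each worker clones the receiver and pulls the next job as soon as it is free, handing the CPU-bound planning off to a blocking thread.

```rust
use async_channel::{bounded, Receiver};

#[derive(Debug)]
struct PlanRequest {
    query: String,
}

async fn planner_worker(id: usize, rx: Receiver<PlanRequest>) {
    // `recv` hands each message to exactly one worker, so workers pull new
    // jobs as soon as they are ready for them.
    while let Ok(job) = rx.recv().await {
        // Planning is CPU-bound, so move it off the async runtime threads.
        tokio::task::spawn_blocking(move || {
            println!("worker {id} planning: {}", job.query);
        })
        .await
        .expect("planning task panicked");
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = bounded::<PlanRequest>(16);

    // Hypothetical pool size; the PR exposes this through configuration.
    let workers: Vec<_> = (0..4)
        .map(|id| tokio::spawn(planner_worker(id, rx.clone())))
        .collect();
    drop(rx); // only the workers keep receivers now

    for i in 0..8 {
        tx.send(PlanRequest { query: format!("query Q{i} {{ me {{ id }} }}") })
            .await
            .expect("pool is gone");
    }
    drop(tx); // closing the channel lets the workers exit their loops

    for w in workers {
        w.await.expect("worker panicked");
    }
}
```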
Needs a config metric.
Does anyone else see this warning:
garypen@Garys-MacBook-Pro router % cargo check
warning: /Users/garypen/dev/router/apollo-router/Cargo.toml: file `/Users/garypen/dev/router/apollo-router/benches/planner.rs` found to be present in multiple build targets:
* `example` target `planner`
* `bench` target `planner`
<etc...>
Maybe address this? I don't know if it really matters, but ...
@garypen this is odd, I don't see several targets in the
The warning is removed if you add the suggested key to the `[package]` section in `Cargo.toml`. I think you need to decide if this is the behaviour you want wrt other benches/examples etc...
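For illustration, a sketch of the kind of manifest change being suggested, assuming the fix is to stop Cargo auto-discovering `benches/planner.rs` as a second target (the exact key in the original comment may differ):

```toml
# Hypothetical fix: disable automatic bench discovery so benches/planner.rs
# is only built as the explicitly declared target.
[package]
autobenches = false
```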
Docs for query planner pool (#4897)
Experimental: Introduce a pool of query planners (PR #4897)
The router supports a new experimental feature: a pool of query planners to parallelize query planning.
You can configure query planner pools with the `supergraph.query_planner.experimental_available_parallelism` option. Its value is the number of query planners that run in parallel, and its default value is `1`. You can set it to the special value `auto` to automatically set it equal to the number of available CPUs.

You can discuss and comment about query planner pools in this GitHub discussion.
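For example, a minimal `router.yaml` sketch using the option path described above (the values shown are illustrative):

```yaml
# Run four query planners in parallel
supergraph:
  query_planner:
    experimental_available_parallelism: 4
    # or, to match the number of available CPUs:
    # experimental_available_parallelism: auto
```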
By @xuorig and @o0Ignition0o in #4897