-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ApolloGateway.stop() is not reliable #4428
Comments
It's also worth noting that our docs don't appear to tell you to run ApolloGateway.stop() anywhere, and it might make sense if ApolloServer.stop() did this for you! But that's a separate issue. |
glasser
added a commit
that referenced
this issue
Feb 8, 2021
Because AS is what invokes ApolloGateway.load, it should be its responsibility to invoke the matching stop method. apollographql/federation#452 (aimed at @apollo/gateway 0.23) will change ApolloGateway to no longer "unref" its polling timer: ie, it makes calling `stop()` actually part of its expected API instead of something you can ignore if you feel like it without affecting program shutdown. It has a bit of a hack to still unref the timer if it looks like you're using an old (pre-2.18) version of Apollo Server, but this PR (which will be released in v2.20.0) will make ApolloServer stop the gateway for you. Part of fixing #4428.
glasser
added a commit
that referenced
this issue
Feb 8, 2021
Because AS is what invokes ApolloGateway.load, it should be its responsibility to invoke the matching stop method. apollographql/federation#452 (aimed at @apollo/gateway 0.23) will change ApolloGateway to no longer "unref" its polling timer: ie, it makes calling `stop()` actually part of its expected API instead of something you can ignore if you feel like it without affecting program shutdown. It has a bit of a hack to still unref the timer if it looks like you're using an old (pre-2.18) version of Apollo Server, but this PR (which will be released in v2.20.0) will make ApolloServer stop the gateway for you. Part of fixing #4428. Also simplify typings of toDispose set. `() => ValueOrPromise<void>` is a weird type because `void` means "I'm not going to look at the return value" which is sorta incompatible with "but I need to see if it's a Promise or not". So just make these all async (changing a couple implementations by adding `async`).
glasser
added a commit
to apollographql/federation
that referenced
this issue
Feb 9, 2021
The (not particularly documented) ApolloGateway.stop() method didn't reliably stop the gateway from polling. All it did was cancel a timeout. But if it is called while pollServices is in the middle of running, it would do nothing, and then pollServices would carry on and set another timeout. Better semantics would be for stop() to reliably stop polling: allow the current poll to complete, ensure that there will be no more polls, and then (async) return. This PR implements those semantics, by implementing an explicit state machine for ApolloGateway's polling. One reason that these bugs were able to survive is that calling stop() is often unnecessary. In apollographql/apollo-server#3105 we chose to `unref` the polling timeout to allow the Node process to exit if it's the only thing left on the event loop, instead of encouraging users of `ApolloGateway` to be responsible and call `stop()` themselves. While this may be reasonable when the gateway's lifecycle is merely a schema polling timer, we may have future changes to the gateway where proper lifecycle handling is more important. So this PR also moves away from the world where it's fine to not bother to explicitly stop the gateway. That said, in the common case, we don't want users to have to write gateway stopping code. It makes more sense for stopping the `ApolloGateway` to be the responsibility of `ApolloServer.stop()`, just as `ApolloServer.stop()` can trigger events in plugins. So in the recently-released Apollo Server v2.20, `ApolloServer.stop()` calls `ApolloGateway.stop()`. This should mean that in most cases, the missing `unref` shouldn't keep the process running, as long as you've run `ApolloServer.stop()` (and if you don't, it's likely that other parts of the server are keeping your process running). What if you're still running an old version of Apollo Server? For a bit of backwards compatibility, `ApolloGateway` detects if it appears to be connected to Apollo Server older than v2.18. If so, it continues to do the old unref(). If you're using v2.18 or v2.19, the upgrade to v2.20 should be pretty painless (it's mostly just minor bugfixes). If you're using ApolloGateway on its own for some reason, and this change causes your processes to hang on shutdown, adding a `stop()` call should be pretty straightforward. (If we learn that this change really is devastating, we can always go back to an unconditional timer.unref() later.) Fixes apollographql/apollo-server#4428.
This was referenced Mar 9, 2021
Closed
This was referenced Mar 16, 2021
kodiakhq bot
added a commit
to ProjectXero/dbds
that referenced
this issue
Mar 17, 2021
Bumps apollo-datasource from 0.7.2 to 0.7.3. Changelog Sourced from apollo-datasource's changelog. CHANGELOG The version headers in this history reflect the versions of Apollo Server itself. Versions of other packages (e.g., those which are not actual HTTP integrations; packages not prefixed with "apollo-server", or just supporting packages) may use different versions. 🆕 Please Note!: 🆕 The @apollo/federation and @apollo/gateway packages now live in the apollographql/federation repository. @apollo/gateway @apollo/federation vNEXT The changes noted within this vNEXT section have not been released yet. New PRs and commits which introduce changes should include an entry in this vNEXT section as part of their development. With few exceptions, the format of the entry should follow convention (i.e., prefix with package name, use markdown backtick formatting for package names and code, suffix with a link to the change-set à la [PR #YYY](https://link/pull/YYY), etc.). When a release is being prepared, a new header will be (manually) created below and the appropriate changes within that release will be moved into the new section. apollo-server-core: The SIGINT and SIGTERM signal handlers installed by default (when not disabled by stopOnTerminationSignals: false) now stay active (preventing process termination) while the server shuts down, instead of letting a second signal terminate the process. The handlers still re-signal the process after this.stop() concludes. Also, if this.stop() throws, the signal handlers will now log and exit 1 instead of throwing an uncaught exception. [Issue #4931](apollographql/apollo-server#4931) apollo-server-lambda: (UPDATE THIS MESSAGE BEFORE RELEASE; we are not sure if this actually helps nodejs14 compatibility or if it's just a nice refactor.) Support the nodejs14 runtime by changing the handler to be an async handler. (For backwards compatibility, if the handler receives a callback, it still acts like a non-async handler.) [Issue #1989](apollographql/apollo-server#1989) [PR #5004](apollographql/apollo-server#5004) v2.21.1 apollo-server-lambda: The onHealthCheck option did not previously work. Additionally, health checks (with onHealthCheck or without) didn't work in all Lambda contexts, such as behind Custom Domains; the path check is now more flexible. [Issue #3999](apollographql/apollo-server#3999) [PR #4969](apollographql/apollo-server#4969) [Issue #4891](apollographql/apollo-server#4891) [PR #4892](apollographql/apollo-server#4892) The debug option to new ApolloServer (which adds stack traces to errors) now affects errors that come from requests executed with server.executeOperation (and its wrapper apollo-server-testing), instead of just errors that come from requests executed over HTTP. [Issue #4107](apollographql/apollo-server#4107) [PR #4948](apollographql/apollo-server#4948) Bump version of @apollographql/graphql-playground-html to v1.6.27 and @apollographql/graphql-playground-react to v1.7.39 to resolve incorrectly rendered CDN URL when Playground version was false-y. [PR #4932](apollographql/apollo-server#4932) [PR #4955](apollographql/apollo-server#4955) [Issue #4937](apollographql/apollo-server#4937) v2.21.0 Apollo Server can now be installed with graphql@15 without causing peer dependency errors or warnings. (Apollo Server has a file upload feature which was implemented as a wrapper around the graphql-upload package. We have been unable to upgrade our dependency on that package due to backwards-incompatible changes in later versions, and the version we were stuck on did not allow graphql@15 as a peer dependency. We have now switched to a fork of that old version called @apollographql/graphql-upload-8-fork that allows graphql@15.) Also bump the graphql-tools dependency from 4.0.0 to 4.0.8 for graphql@15 support. [Issue #4865](apollographql/apollo-server#4865) v2.20.0 apollo-server: Previously, ApolloServer.stop() functioned like net.Server.close() in that it did not close idle connections or close active connections after a grace period. This meant that trying to await ApolloServer.stop() could hang indefinitely if there are open connections. Now, this method closes idle connections, and closes active connections after 10 seconds. The grace period can be adjusted by passing the new stopGracePeriodMillis option to new ApolloServer, or disabled by passing Infinity (though it will still close idle connections). Note that this only applies to the "batteries-included" ApolloServer in the apollo-server package with its own built-in Express and HTTP servers. [PR #4908](apollographql/apollo-server#4908) [Issue #4097](apollographql/apollo-server#4097) apollo-server-core: When used with ApolloGateway, ApolloServer.stop now invokes ApolloGateway.stop. (This makes sense because ApolloServer already invokes ApolloGateway.load which is what starts the behavior stopped by ApolloGateway.stop.) Note that @apollo/gateway 0.23 will expect to be stopped in order for natural program shutdown to occur. [PR #4907](apollographql/apollo-server#4907) [Issue #4428](apollographql/apollo-server#4428) apollo-server-core: Avoid instrumenting schemas for the old graphql-extensions library unless extensions are provided. [PR #4893](apollographql/apollo-server#4893) [Issue #4889](apollographql/apollo-server#4889) apollo-server-plugin-response-cache@0.6.0: The shouldReadFromCache and shouldWriteToCache hooks were always documented as returning ValueOrPromise<boolean> (ie, that they could be either sync or async), but they actually only worked if they returned a bool. Now they can be either sync or async as intended. [PR #4890](apollographql/apollo-server#4890) [Issue #4886](apollographql/apollo-server#4886) apollo-datasource-rest@0.10.0: The RESTDataSource.trace method is now protected instead of private to allow more control over logging and metrics. [PR #3940](apollographql/apollo-server#3940) v2.19.2 apollo-server-express: types: Export ExpressContext from main module. [PR #4821](apollographql/apollo-server#4821) [Issue #3699](apollographql/apollo-server#3699) apollo-server-env: types: The first parameter to fetch is now marked as required, as intended and in accordance with the Fetch API specification. [PR #4822](apollographql/apollo-server#4822) [Issue #4741](apollographql/apollo-server#4741) apollo-server-core: Update graphql-tag package to latest, now with its graphql-js peerDependencies expanded to include ^15.0.0 [PR #4833](apollographql/apollo-server#4833) v2.19.1 apollo-server-core: The debugPrintReports option to ApolloServerPluginUsageReporting now prints traces as well. [PR #4805](apollographql/apollo-server#4805) v2.19.0 apollo-server-testing: types: Allow generic variables usage of query and mutate functions. [PR #4383](apollograpqh/apollo-server#4383) apollo-server-express: Export the GetMiddlewareOptions type. [PR #4599](apollograpqh/apollo-server#4599) apollo-server-lambda: Fix file uploads - ignore base64 decoding for multipart queries. [PR #4506](apollographql/apollo-server#4506) apollo-server-core: Do not send operation documents that cannot be executed to Apollo Studio. Instead, information about these operations will be combined into one "operation" for parse failures, one for validation failures, and one for unknown operation names. ... (truncated) Commits c212627 Release See full diff in compare view Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase. Dependabot commands and options You can trigger Dependabot actions by commenting on this PR: @dependabot rebase will rebase this PR @dependabot recreate will recreate this PR, overwriting any edits that have been made to it @dependabot merge will merge this PR after your CI passes on it @dependabot squash and merge will squash and merge this PR after your CI passes on it @dependabot cancel merge will cancel a previously requested merge and block automerging @dependabot reopen will reopen this PR if it is closed @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
This was referenced Mar 18, 2021
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
All that ApolloGateway.stop() does is clear a pollingTimer if it's currently set.
But what if
pollServices
is currently running (but awaiting some async operation inside updateComposition) when this is happening? Then there are two problems:pollServices
may just end up callingpollServices
again and schedule another call to itself, even though we thought we stopped itstop
returns while stuff is still happening, which is inappropriateI think the right fix is to have an explicit stopped flag, and also have a "pollServices is running now" promise. Something like
With a functional ApolloGateway.stop() invoked from ApolloServer.stop(), I think we can skip the unref too.
The text was updated successfully, but these errors were encountered: