Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AsyncLocalStorage based ContextManager #1210

Merged
merged 6 commits into from
Jun 26, 2020

Conversation

johanneswuerbach
Copy link
Contributor

Which problem is this PR solving?

Performance impact of using the current nodejs default context manager.

Short description of the changes

async_hooks especially with promise executing tracking enabled are known to have a significant performance impact nodejs/benchmarking#181

Luckily there is a new API in node AsyncLocalStorage (available since v13.10.0+ and v12.17.0+), which is supposed to be scoped more towards the ContextManager use-case and is more performant.AsyncLocalStorage recently received additional perf improvements nodejs/node#32891 and has more planned nodejs/diagnostics#376

As this is a fairly recent API, I'm sure this can't be accepted as is and should be seen more as a conversation starter.

Open question:

  • Was AsyncLocalStorage already tested and discarded (all test pass, but I might miss something)
  • Does it make sense to be added as a separate context manager?
  • Would it make sense to feature detected AsyncLocalStorage and use this as default?

@codecov
Copy link

codecov bot commented Jun 17, 2020

Codecov Report

Merging #1210 into master will increase coverage by 0.06%.
The diff coverage is 97.40%.

@@            Coverage Diff             @@
##           master    #1210      +/-   ##
==========================================
+ Coverage   93.23%   93.29%   +0.06%     
==========================================
  Files         122      124       +2     
  Lines        3550     3567      +17     
  Branches      714      715       +1     
==========================================
+ Hits         3310     3328      +18     
+ Misses        240      239       -1     
Impacted Files Coverage Δ
...sync-hooks/src/AbstractAsyncHooksContextManager.ts 96.49% <96.49%> (ø)
...ontext-async-hooks/src/AsyncHooksContextManager.ts 100.00% <100.00%> (+3.12%) ⬆️
...async-hooks/src/AsyncLocalStorageContextManager.ts 100.00% <100.00%> (ø)

@dyladan
Copy link
Member

dyladan commented Jun 17, 2020

Think @OlivierAlbertini should weigh in on this, but in general I am not against including a more performant context manager. As far as feature detection goes, I'm not sure how I feel about having the default behavior change based on which version of node you use, but I am open to being convinced otherwise.

@johanneswuerbach
Copy link
Contributor Author

As a datapoint from our mildly complex express app on node v14.4.0.

# Without opentelemetry tracer

Requests      [total, rate, throughput]         569, 9.48, 9.47
Duration      [total, attack, wait]             1m0s, 1m0s, 91.537ms
Latencies     [min, mean, 50, 90, 95, 99, max]  79.78ms, 105.634ms, 99.757ms, 116.015ms, 133.831ms, 208.806ms, 1.102s

# With current ContextManager

Requests      [total, rate, throughput]         446, 7.43, 7.41
Duration      [total, attack, wait]             1m0s, 1m0s, 127.091ms
Latencies     [min, mean, 50, 90, 95, 99, max]  108.509ms, 134.938ms, 130.591ms, 147.889ms, 160.71ms, 246.351ms, 555.653ms

# With AsyncLocalStorage based ContextManager

Requests      [total, rate, throughput]         507, 8.44, 8.42
Duration      [total, attack, wait]             1m0s, 1m0s, 111.535ms
Latencies     [min, mean, 50, 90, 95, 99, max]  90.519ms, 118.767ms, 113.332ms, 137.837ms, 151.618ms, 216.908ms, 299.603ms

@vmarchaud
Copy link
Member

@dyladan Think you meant to tag me since Olivier didn't work on this part ^^

Was AsyncLocalStorage already tested and discarded (all test pass, but I might miss something)

We didn't yet test it in OT, someone already brought the suggestion here however my main complain with it was simply that the API is still experimental (way more than async_hooks itself), at the time there were still some fixes made to the API.

Does it make sense to be added as a separate context manager?

This is mandatory for me since we support node 8, 10 and versions that doesn't have this API. Also it allows users to opt-in.

Would it make sense to feature detected AsyncLocalStorage and use this as default?

It may in the future but i would be against for now for reasons cited above. I totally agree that this is the way forward (for both performance and correctness reasons) but i would prefer to wait to have real world usage feedback.

async_hooks especially with promise executing tracking enabled are known to have a significant performance impact nodejs/benchmarking#181

Correct me if i'm wrong but AsyncLocalStorage does track promise too (see this) with the PromiseHook API ?

To sum everything i said, i'm totally fine having a AsyncLocalStorageContextManager implementation so user can try this new API out. However i would wait for @Qard to "finish" his works and real world usage to make it the default (when available ofc).

PS: On the benchmark i believe it would be better to make micro-benchark (see #1123) because other code (that get JIT'ed by V8) might influence the result.

@Qard
Copy link

Qard commented Jun 18, 2020

AsyncLocalStorage is as stable as async_hooks at this point, if not more so due to the reduced surface area.

It uses async_hooks internally, so everything async_hooks can handle it can handle too. Through async_hooks it uses promise hooks, however I made some changes to skip the additional garbage collection tracking to emit destroy events for promises if there is no destroy hook present. AsyncLocalStorage only uses the init hook and the executionAsyncResource function, so it has much simpler needs than the full async_hooks capabilities. It can function without much of the usual overhead of async_hooks.

Both async_hooks and AsyncLocalStorage have also had issues with thenables. I have a fix that has already landed in V8 and is just in the process of landing in Node.js now.

Beyond that, AsyncLocalStorage should be stable enough for regular use. I’m still working on some more changes to async_hooks, but those are largely internal-facing other than some performance improvements.

@johanneswuerbach
Copy link
Contributor Author

@vmarchaud introduced AsyncLocalStorageContextManager as an opt-in second context manager. I tried to share as much as code as possible between both context managers, wdyt?

@vmarchaud
Copy link
Member

@johanneswuerbach Looks great overall ! I will take a closer look tomorrow.

I'm wondering whats the best approach to maintain both implementation, should we make two different package ? The switch between asynchooks and asynclocalstorage will be made by the node-sdk anyway so it doesn't really matter.

Copy link
Member

@dyladan dyladan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very large diff makes it difficult to see if the tests have been modified beyond just running them with both context managers. Did you end up adding/removing/changing any tests?

Use AsyncLocalStorage in ContentManager to improve performance
@johanneswuerbach
Copy link
Contributor Author

@dyladan I didn't remove any tests, hiding whitespaces https://github.com/open-telemetry/opentelemetry-js/pull/1210/files?diff=split&w=1 should make this more easy to compare.

I tried to touch as least code as possible, should I maybe split this up into two commits:

a) adding the new base context manager
b) adding the AsyncLocalStorageContextManager

?

@vmarchaud
Copy link
Member

@johanneswuerbach Wow i didnt know that you could remove the whitespace diff from github, thanks ! Should be the default ahah

@dyladan
Copy link
Member

dyladan commented Jun 23, 2020

Cool. Please try to avoid changing history and force pushing during the review process. It breaks the github emails and links, and breaks the link between conversations and code.

Copy link
Member

@markwolff markwolff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@vmarchaud vmarchaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, i will report with some production feedback when this will get merged. I've had weird trace with the current impl and was wondering if it's linked to it or something downstream in the collectors that i run.

Copy link
Member

@mayurkale22 mayurkale22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mayurkale22 mayurkale22 added the Merge:LGTM This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Jun 25, 2020
@johanneswuerbach
Copy link
Contributor Author

Sorry I’m not sure about the process, should i squash my commits now?

Copy link
Member

@obecny obecny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned you did some performance tests, I wonder if you could create some performance test here (we have benchmark folder). It might be useful to somehow compare it in future whoever will decide to refactor this again and then check if it was improved or not. This doesn't need to be in this PR so can be a separate issue - not a blocker here., thx

@obecny
Copy link
Member

obecny commented Jun 25, 2020

Sorry I’m not sure about the process, should i squash my commits now?

https://github.com/open-telemetry/opentelemetry-js/blob/master/CONTRIBUTING.md#general-merge-requirements

don't merge, please wait for maintainer

@dyladan
Copy link
Member

dyladan commented Jun 25, 2020

Sorry I’m not sure about the process, should i squash my commits now?

Commits are squashed during the merge process anyway so it isn't that important unless you want the message to say something specific.

don't merge, please wait for maintainer

he doesn't even have permission to merge. Only approvers/maintainers do.

@dyladan dyladan merged commit bf2f617 into open-telemetry:master Jun 26, 2020
@vmarchaud
Copy link
Member

@obecny For the reference there is an issue for this #1123

@dyladan dyladan added the enhancement New feature or request label Jul 6, 2020
pichlermarc pushed a commit to dynatrace-oss-contrib/opentelemetry-js that referenced this pull request Dec 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Merge:LGTM This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants