Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node.js Collaboration Summit diagnostics discussion minutes #95

Closed
watson opened this issue May 4, 2017 · 7 comments
Closed

Node.js Collaboration Summit diagnostics discussion minutes #95

watson opened this issue May 4, 2017 · 7 comments

Comments

@watson
Copy link
Member

watson commented May 4, 2017

These are my raw notes taken at the Node.js Collaboration Summit in Berlin. Sorry for the mess 😅. I'll update when new stuff comes up. Feel free to add stuff I missed 😃

Day 1: 2017-05-04

Problem: How do we make it easier for APM vendors to instrument node-core and userland modules without having to monkey-patch everything (even after AsyncHooks lands)?

Suggestion: We need to make it easier for module owners to know how to structure and build their modules so they are easier to instrument. We can make a guide on https://nodejs.org/en/docs/guides/ that specifies how module owners should make their modules easy to instrument.

Topic ideas:

  • How to use the AsyncHooks Embedder API
  • How to send detailed tracing events to APM vendors using some API

Other notes:

  • This should be a living document
  • Discussion on this could be had on a PR that’s opened to the nodejs.org website creating this guide

Problem: Can we standardise the way tracing information is sent from an instrumented module (called module-x below) to an APM agent?

Suggestions:

  • Maybe we can use the V8 trace events API
    • Pros: No need for module-x to know about the APM agents
    • Cons:
      • Currently no way to bind the trace event to the current context
      • Might be expensive to cross the barrier into C-land
      • Might not be detailed enough
      • Overhead of running in production?
  • Alternatively invent a new API in JavaScript land
    • This could either be in Node core or a userland module
    • If it’s a userland module, then module-x needs to detect if it’s present before it can send events to it. The APM agents need to do the same.
      • What if there’s multiple versions of this module installed?
    • Research if other languages have something similar
@sam-github
Copy link

sam-github commented May 5, 2017

Hooks should simplify APM code, but there is no way to get context data (fs.open()'s path, HTTP path/method, etc.), so does not remove the need to monkey-patch the API surface for APM's that need such information (hooks alone are sufficient for CLS style code that only needs to track continuations).

The Embedder API now exists - it didn't at time of NINA Austin. It allows thirdparty modules to interact with the hooks to indicate "intent" and provides a uniform solution of the "connection pooling" aka "async queueing" problem (cf. https://docs.google.com/document/d/1tlQ0R6wQFGqCS5KeIw0ddoLbaSYx6aU7vyXOkv-wvlM/edit#). This is great, APMs look forward to this, it decreases the need for continuous adaption of our code bases to the changes in internals of thirdparty modules, because they can maintain their use of the Embedder API themselves.

APM builders present:

  • danielkhan, dynatrace
  • thomas watson, opbeat
  • sam roberts, strong-agent, concurix, appmetrics
  • jkrems, groupon (which has an internal "APMish" framework)

Promises are problems, interaction with utask queue is hard to track, and we need to patch v8 promises and the bluebird promises (at least)

Q: do/will hooks help with that?

Mongo has an APM API so monkey-patching is not necessary, and they maintain the API themselves. This is a great pattern, perhaps we should more clearly advocate it as something that npm packages that want to support APM reporting can do, perhaps we can even advocate a specific API/API pattern. cf. http://mongodb.github.io/node-mongodb-native/2.0/reference/management/apm/

elasticsearch may get an APM API soon (according to @watson)

Maybe we can get some common push behind an API for this, and put out a blog
with examples showing how to use the Embedder API and also solve the app-specific
info collection problem. The information on the app-specific could perhaps
be emitted as V8 trace events. We were all quite interested in publishing a guide that clearly reflects what APM vendors would like packages to do so that they would be easier to support reporting on by APM builders, and what the intended use of the Embedder API is.

We can could work on this by PRing a guide.

We should start talking about it in the diagnostics meetings.

V8: does it have js APIs? Does it have levels? Is the timing of it acceptable,
will it work us? Can trace events be used?

async hooks:

@matthewloring
Copy link

I wish I could have attended this! Just to speak quickly to:

Promises are problems, interaction with utask queue is hard to track, and we need to patch v8 promises > and the bluebird promises (at least)

Q: do/will hooks help with that?

I recently worked with the V8 team and @trevnorris to have Promise Hooks (interface, design doc) added to V8 so that async hooks can observe promise lifecycles. I have also prototyped an integration of these promise hooks into async hooks but we decided to wait to add this support until the existing async hooks PR lands as it is already quite complex. So, I guess the answer is hopefully async hooks should help once the initial implementation + promise hook integration follow-on have both landed.

@danielkhan
Copy link
Contributor

danielkhan commented May 9, 2017

It was daniel khan, dynatrace :)

I would prefer more of a standard over a blog. It can live under /guides on the Node.js website. I can think of a blog post that describes the problems APM vendors are facing today, though. This would give us something to pass on to customers to baseline their expectations when it comes to Node monitoring. I could sketch something vendor neutral out for you to review, ok?

@sam-github
Copy link

@danielkhan Sorry Daniel! Updated. I agree, a guide makes more sense (with a blog to point out its existence)

@mhdawson
Copy link
Member

mhdawson commented May 9, 2017

Guide + intro blog with pointer was what we settled on in the discussions.

@joshgav
Copy link
Contributor

joshgav commented May 11, 2017

As you all were discussing this topic last week (and I sadly couldn't join!) we were busy at Microsoft preparing https://github.com/Microsoft/node-diagnostic-channel for use with a couple of our APM-like products. Take a look at the README, particularly the Why section; I think the diagnostic-channel project may be a good start to addressing the problems discussed here. We're open to developing it further with your help.

Also, I think it would be a small step to integrate diagnostic-channel with trace_events, particularly with the help of @jasongin's JS Tracing API proposal. That would enable us to experiment with both a) sending diagnostic messages to diagnostic-channel directly, and b) sending them to trace_events and then to diagnostic-channel.

@mike-kaufman
Copy link
Contributor

Per discussion in Diag WG Meeting on 2017-12-13, we decided to open a seperate issue (#134) to track the latter issue here ("Can we standardise the way tracing information is sent from an instrumented module"), and we decided to close this one.

Plz re-activate If something else in this thread is lost & needs tracked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants