Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose gc event as inspector's runtime event #220

Closed
vmarchaud opened this issue Aug 9, 2018 · 12 comments
Closed

expose gc event as inspector's runtime event #220

vmarchaud opened this issue Aug 9, 2018 · 12 comments

Comments

@vmarchaud
Copy link
Contributor

Hi everyone,

I would like to discuss the best way to expose metrics about the GC life cycle to the JS land through the inspector module.
As lot of knows, we can give the --trace-gc and --trace-gc-verbose to V8, it will then print some data at each GC passage. I believe it would be really useful for users (and APM vendors) to be able to access those metrics to better understand diagnostics their applications.

Anyway i'm (and @keymetrics) willing to put some time on this problem but i'm not sure what is the right way to do it. I suppose modifying V8 in deps/v8/ isn't a really good idea so i would have done it by "hijacking" the protocol session to add our own method. Then at this point just implement it in C++ with the V8 C++ API.

What you guys think of this ? Maybe someone already started to work on something similar ?

@gireeshpunathil
Copy link
Member

  • direct invocation of v8 APIs looks to be a good idea.
  • but how do we expose this to APMs? through JS callbacks? what would be the relevance of inspector?
  • slightly related: how node-report extracts heap statistics (only snapshot)

@vmarchaud
Copy link
Contributor Author

I believe we could expose this as a function of the Runtime domain, so we could do something like that :

const inspector = require('inspector')
const session = new inspector.Session()
session.connect()

session.post('Runtime.enable')
session.on('Runtime.garbageCollectionStats', (stats) => {
   // we get the GC duration, type etc
})

The heap statistics is already exposed trough the v8 module so i believe we only need to get GC related metrics here

@ofrobots
Copy link
Contributor

ofrobots commented Aug 9, 2018

@vmarchaud Have you looked at the trace_events API? V8 exposes high level GC events by default and there some 'disabled-by-default' categories for lower level events that expose GC behaviour in excruciating detail. Please give it a shot, and leave feedback on how well (or not) it would work for your use-cases.

trace_events is still rough around the edges, but is a better API for this use-case IMO. If you're willing to contribute to help shape it up, that would be awesome too!

@ofrobots
Copy link
Contributor

ofrobots commented Aug 9, 2018

FYI, the trace event data is exposed through a file – or through the inspector protocol – albeit we are still working on bugs in the latter. Here's some sample code that you can use to connect to process and get Tracing data out: https://gist.github.com/ofrobots/ab4c4e68b0de4852197308cd25f3cc0e

@vmarchaud
Copy link
Contributor Author

@ofrobots I knew about the trace_events API, but i didn't know it allow to get those metrics, i will try tomorrow to see what i can get with it.
I followed the different PRs about the API being toggleable at runtime, do you know if its planned to backport it to 8.x or 10.x ?

@ofrobots
Copy link
Contributor

ofrobots commented Aug 9, 2018

For back-porting, it depends. My recommendation would be to request a back-port on the specific PR for the feature/fix you would like in back-ported.

@vmarchaud
Copy link
Contributor Author

After thinking about it i believe it would be easier for anyone wanting to retrieve those metrics (which are critical for every production deployment) without having to parse the trace_events stream.

Specially that from my understanding, you can't just get the stream of events continuously, you need to stop it, parse the data to get the metrics, and then re-start it to get new events.

If this is right, i believe my approach will be easier for people that just want to monitor the GC without having to understand how the tracing works IMO.

What do you think ?

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Aug 26, 2018

If this is right, i believe my approach will be easier for people that just want to monitor the GC without having to understand how the tracing works IMO.

The job of node.js itself, it not to make everything easy but to make everything feasible. I would prefer not to implement two ways of doing the same thing, for the sake of making things a bit easier. To take a page from "The Zen of Python".

There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.

I think a much more productive approach is to make trace_events more consumable. So if you have specific issues here, then we can try and solve those.

Specially that from my understanding, you can't just get the stream of events continuously, you need to stop it, parse the data to get the metrics, and then re-start it to get new events.

I'm not that familiar with the Inspector protocol for this, but for sure the file protocol is streaming. You can use https://github.com/nearform/node-trace-events-parser to consume. If the inspector protocol requires you to stop as you say, then I would definetly like to see that changed. Maybe simply implementing a .flush() method. I would suspect @ofrobots knows more about this.

@vmarchaud
Copy link
Contributor Author

vmarchaud commented Aug 26, 2018

The job of node.js itself, it not to make everything easy but to make everything feasible. I would prefer not to implement two ways of doing the same thing, for the sake of making things a bit easier. To take a page from "The Zen of Python".

Totally understandable, i would argue that we sometimes need to provide nicer API for the inspector protocol, for example a PR is going on to simplify coverage report

However i believe if it's possible that at some point to record continuously for GC metrics in the trace_events API, then the job of parsing the stream etc, would be better provided by a 3rd party module.

PS: In another note, i would also ask the question if the current tracing categories and their output are documented (specially V8 side) ?

@hashseed
Copy link
Member

I too am of the opinion that trace events are more suitable for this purpose. Fleshing out tracing output is still an on-going project, though.

@vmarchaud
Copy link
Contributor Author

@hashseed Is there any on going development somewhere about flushing the tracing data ?

@vmarchaud
Copy link
Contributor Author

Closing since we can receive GC event from the trace events that can be dynamically triggered at runtime

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants