Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: domain handling #11

Merged
merged 6 commits into from
Mar 8, 2017
Merged

Conversation

AVVS
Copy link
Contributor

@AVVS AVVS commented Mar 8, 2017

We’ve recently integrated spam-agent-node for our production nodejs app monitoring - its a restify 4.x based http webserver. After we’ve integrated monitoring we’ve noticed that our memory was leaking. Upon investigation of heap dumps we’ve discovered that all the requests and responses are retained in memory and our bound to it’s Domain handler, which in turn is retained by existing timer, which is bound to the domain during domain.add(req) call and is originally created when patching http(s)Server in the spa agent, this creates a circular reference and a memory leak and effectively makes usage of the product impossible as we now have to restart the servers every day in order for them to operate.

A few screenshots and investigation paths:

spm-agent-node (https://github.com/sematext/spm-agent-nodejs/blob/master/lib/httpServerAgent.js#L30)
                               ^ measure/Timer (https://github.com/felixge/node-measured/blob/master/lib/metrics/Timer.js#L10)
                               ^ measure/Meter (https://github.com/felixge/node-measured/blob/master/lib/metrics/Meter.js#L7)

screen shot 2017-03-07 at 10 00 39 am

screen shot 2017-03-07 at 10 00 07 am

This PR makes sure that we exit current domain. The whole thing is really buggy, but this gets the job done. I've been running the server for the past several hours and memory is not leaking anymore

@otisg
Copy link
Member

otisg commented Mar 8, 2017

Looks like this somehow broke the build, @AVVS. We'll need to have a look.

@AVVS
Copy link
Contributor Author

AVVS commented Mar 8, 2017

@otisg noticed a problem with stats being overwritten. If it's not too much hassle - I can add standard eslint instead of jshint as more modern approach

@otisg
Copy link
Member

otisg commented Mar 8, 2017

I can add standard eslint instead of jshint as more modern approach

What say you, @megastef ? Looks like we went through something like that in logagent-js already - sematext/logagent-js#60

@AVVS
Copy link
Contributor Author

AVVS commented Mar 8, 2017

Should all be good now, please review/comment/merge ;)

@AVVS
Copy link
Contributor Author

AVVS commented Mar 8, 2017

actually that interval that wasnt working before helped fix the leak. this doesnt really solve the issue yet, will work on it a bit more

@AVVS
Copy link
Contributor Author

AVVS commented Mar 8, 2017

seems like the ctx.stats = Measured.createCollection() is the root cause of this or smth

@megastef
Copy link
Contributor

megastef commented Mar 8, 2017

Thanks for the contribution! @AVVS as far I know domains are deprecated see nodejs/node#66 and https://nodejs.org/api/domain.html, and this PR would add domain handling - I just wonder if we would create here a problem for future node versions.

@megastef megastef merged commit 5fcb89f into sematext:master Mar 8, 2017
@megastef
Copy link
Contributor

megastef commented Mar 8, 2017

@AVVS
Copy link
Contributor Author

AVVS commented Mar 8, 2017

@megastef its up to you, for me it works much faster than general npm i --production and allows to pin all deps while making sure they wont update to something unexpected

@megastef
Copy link
Contributor

megastef commented Mar 8, 2017

Sure I see the value in it. We would need to maintain the yarn.lock file, I'm just afraid that we forget to update it - so we need a script to generate it before release.

@AVVS
Copy link
Contributor Author

AVVS commented Mar 8, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants