
gunicorn instrumentation via statsD and a new logger class #748

Merged 23 commits into benoitc:master on Jun 28, 2014

Conversation

@alq666 (Contributor) commented May 14, 2014

(This is another attempt at getting statsD instrumentation into gunicorn; the first attempt was #672.)

What is this pull request?

statsD is the de facto low-overhead instrumentation layer for many web stacks. I have come across a number of different ways to instrument gunicorn, none of which are part of the core; they all rely on external scripts run out-of-band to guess the number of requests processed or the amount of CPU consumed.

I am proposing that gunicorn optionally submit metrics to a local (or remote) statsD instance via the --logger-class gunicorn.instrument.statsd.Statsd option. Because statsD uses UDP as its transport, whether the receiving end is up or down does not affect the performance of the gunicorn application (unlike other stat collection schemes).
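
For illustration, a minimal way to switch this on could look like the following config-file sketch (the bind address, worker count and application module are placeholders, not anything mandated by this PR; logger_class is the config-file equivalent of the --logger-class command-line option):

    # gunicorn_conf.py -- illustrative sketch enabling the statsD-aware logger
    bind = "127.0.0.1:8000"
    workers = 4
    logger_class = "gunicorn.instrument.statsd.Statsd"

which would then be launched with something like gunicorn -c gunicorn_conf.py test:app.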

Here is an example of such a metric being graphed:

[screenshot: metric_explorer___datadog-2]

How does this work?

The arbiter and the base worker use a new logger that can take additional arguments to turn a log message into a metric. In addition, certain known logging levels are automatically turned into metrics.

In particular, the access method is overridden to capture the duration and count of each gunicorn request.

The new logger class inherits from gunicorn.glogging.Logger, so existing logging behavior is preserved. No external dependencies are introduced.
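
To make the mechanism concrete, here is a rough sketch (not the code in this PR) of how a gunicorn.glogging.Logger subclass can mirror log events and request timings to a statsD daemon over UDP; the metric names, the hard-coded address and the helper method are all assumptions for illustration:

    # sketch of a statsD-aware logger; metric names and address are placeholders
    import socket

    from gunicorn.glogging import Logger

    class StatsdLikeLogger(Logger):
        """Mirror selected log events and access timings to statsD over UDP."""

        def __init__(self, cfg):
            Logger.__init__(self, cfg)
            # UDP: a down or missing receiver never blocks the application
            self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            self.addr = ("127.0.0.1", 8125)  # placeholder statsD address

        def critical(self, msg, *args, **kwargs):
            Logger.critical(self, msg, *args, **kwargs)
            self._send("gunicorn.log.critical:1|c")  # count serious events

        def access(self, resp, req, environ, request_time):
            # keep the normal access log line, then emit timing and count metrics
            Logger.access(self, resp, req, environ, request_time)
            duration_ms = (request_time.seconds * 1000.0
                           + request_time.microseconds / 1000.0)
            self._send("gunicorn.request.duration:%f|ms" % duration_ms)
            self._send("gunicorn.requests:1|c")

        def _send(self, payload):
            try:
                self.sock.sendto(payload.encode("ascii"), self.addr)
            except Exception:
                # instrumentation must never take the worker down
                pass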

Who is behind this?

I use gunicorn at Datadog and have found statsD an indispensable aid in measuring the performance of high-traffic applications. There are no company-specific bits in the pull request, of course.

@tilgovi (Collaborator) commented May 14, 2014

This looks like a pretty beautiful solution. I had forgotten how easy it is to send data to statsd.

I wish we had this at my last company!

@alq666 (Contributor, Author) commented May 14, 2014

Thanks @tilgovi

@tilgovi (Collaborator) commented May 18, 2014

A few questions:

  • Can you imagine any value in having the extra keyword arguments available for log formatting?
  • If yes, should we use the extra keyword argument that log calls already take?
  • Should the worker count log in the arbiter be at debug level? It's expected not to change except through administrator action, and manage_workers is called every timeout cycle, so that seems like unnecessary log noise (though I understand info level is pretty high, so maybe it's okay).
  • Should the logger check the log level or should we assume that anyone using statsD wants the metrics collected no matter the log level?
  • What do you think about logging resp.status (the status code)?

@alq666 (Contributor, Author) commented May 19, 2014

@tilgovi

  • I did not know about extra; I can indeed stuff my keywords into extra and save a function call for each log entry (see the sketch just below this list).
  • With respect to logging level and instrumentation, you are correct that there is a coupling: INFO or greater has to be enabled to get some instrumentation going. My thinking behind this was two-fold:
    1. to provide instrumentation that makes sense out of the box, without having to set up anything special (save command-line arguments). As such, DEBUG statements don't send anything to statsD and ERROR/CRITICAL/WARNING only send limited metrics.
    2. to equate instrumentation with informational logging: instrumentation is most useful in production, where INFO is likely to be enabled (but not DEBUG).
  • resp.status is certainly worth capturing as a rate; I'll add that.
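
For example, a metric-carrying log call could then look roughly like this; the extra key names (metric, value, mtype) are illustrative, not necessarily the ones this change settles on:

    # hypothetical call site in the arbiter: the statsD logger reads these
    # fields off the log record and emits a gauge; other loggers ignore them
    self.log.info("{0} workers".format(num_workers),
                  extra={"metric": "gunicorn.workers",
                         "value": num_workers,
                         "mtype": "gauge"})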

@alq666 (Contributor, Author) commented May 31, 2014

@tilgovi I've made the changes we discussed (using extra and capturing resp.status as a rate).

I have also added a test to verify that even if statsD instrumentation fails, the logger continues to work.

    def __init__(self, status):
        self.status = status

    def test_statsd_fail():
Collaborator: These don't actually assert anything, do they?

Collaborator: That's fine, I guess. I see you've wrapped all the methods in the logger in except handlers.

Contributor (Author): Right, if the exception bubbles back up, which it should not, the test will fail.

Collaborator: Oh! Right :)
In tests/test_009-statsd.py:

        self.failp = failp
        self.msgs = []  # accumulate messages for later inspection

    def send(self, msg):
        if self.failp:
            raise TestException("Should not interrupt the logger")
        self.msgs.append(msg)

    def reset(self):
        self.msgs = []

class MockResponse(object):
    def __init__(self, status):
        self.status = status

def test_statsd_fail():

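As an aside, the shape of such a "must not raise" test is roughly the following; this is a hypothetical sketch, not the actual body of test_statsd_fail, and it assumes the logger keeps its UDP socket in a sock attribute:

    # hypothetical: drive the statsD logger with a socket that always raises;
    # the check passes as long as no TestException escapes the logger
    from gunicorn.config import Config
    from gunicorn.instrument.statsd import Statsd

    def exercise_failing_statsd():
        logger = Statsd(Config())
        logger.sock = MockSocket(failp=True)   # every send() raises TestException
        logger.info("No impact on logging")
        logger.warning("No impact on logging")
        logger.error("No impact on logging")
        logger.critical("No impact on logging")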

@tilgovi (Collaborator) commented May 31, 2014

@benoitc this looks alright to me. Thoughts?

@tilgovi (Collaborator) commented Jun 10, 2014

@benoitc any reason not to include this in the next release?

@benoitc (Owner) commented Jun 12, 2014

The patch looks fine to me, but I would change the way it is enabled before integrating it. Instead of passing the class to the logger-class option and adding a separate option to set the host, I would automatically enable the statsd logger when the --logs-statsd-to option is used. Also, since it would be a supported backend, we should be able to refer to it by a short name with the --logging-class option; while we are at it, I would rename that option to --logging-backend. Renaming --logs-statsd-to to --statsd-host also sounds better to me.

If it's done by tomorrow we can include it in R19.0. If not, it will wait for the next release (planned this month or in July).

Thoughts?

@benoitc (Owner) commented Jun 12, 2014

Let's not hold up 19.0 for this any longer; this feature will be added in 19.1 this month. We should put it in master ASAP.

@alq666 (Contributor, Author) commented Jun 12, 2014

@benoitc I can implement the suggested changes over the next week or so.

@benoitc (Owner) commented Jun 12, 2014

@alq666 cool! will merge once everything is OK :)

alq666 added 3 commits on June 18, 2014 (21:33), including:
"No need to specify a logger class. --statsd-to is enough to trigger the Statsd logger automatically"
@alq666 (Contributor, Author) commented Jun 19, 2014

@benoitc @tilgovi I changed the option to --statsd-host. That by itself is enough to trigger the statsD logger automatically, so there is no need to pass --logger-class to use it. I left the --logger-class option as-is to keep this simpler.
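
For reference, enabling the instrumentation then becomes a single extra flag on the usual command line, e.g. something along the lines of gunicorn --statsd-host=localhost:8125 --workers=4 test:app (host and port here being placeholders for a local statsD daemon).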

@tilgovi (Collaborator) commented Jun 19, 2014

Looking great.

The readme example needs to be updated to reflect the extras keyword use, yeah?

@alq666 (Contributor, Author) commented Jun 22, 2014

@tilgovi should be good now.

@matrixise (Collaborator) commented:

+1. Can we merge this branch?

In gunicorn/arbiter.py:

    @@ -482,6 +482,11 @@ def manage_workers(self):
                (pid, _) = workers.pop(0)
                self.kill_worker(pid, signal.SIGTERM)

            self.log.info("{0} workers".format(len(workers)),
Owner: I can't test it right now, but does this change work with the default logger?

Contributor (Author): It does work with the default logger indeed:

Jun 22 15:53:28 localhost gunicorn.test.app.error: [17090] Starting gunicorn 19.1.0
Jun 22 15:53:28 localhost gunicorn.test.app.error: [17090] Listening at: http://127.0.0.1:8000 (17090)
Jun 22 15:53:28 localhost gunicorn.test.app.error: [17090] Using worker: sync
Jun 22 15:53:28 localhost gunicorn.test.app.error: [17095] Booting worker with pid: 17095
Jun 22 15:53:29 localhost gunicorn.test.app.error: [17096] Booting worker with pid: 17096
Jun 22 15:53:29 localhost gunicorn.test.app.error: [17097] Booting worker with pid: 17097
Jun 22 15:53:29 localhost gunicorn.test.app.error: [17098] Booting worker with pid: 17098
Jun 22 15:53:29 localhost gunicorn.test.app.error: [17090] 4 workers

using gunicorn --log-syslog --log-level=info --workers=4 test:app as a test.

benoitc added a commit referencing this pull request on Jun 28, 2014: "gunicorn instrumentation via statsD and a new logger class"

@benoitc merged commit ceda9c5 into benoitc:master on Jun 28, 2014
@benoitc (Owner) commented Jun 28, 2014

merged thanks!

@srgshames commented:

Really cool feature!

I was wondering if there's a way to differentiate between multiple applications? Is there a way to assign a name/prefix to an instance of a gunicorn app to be reflected in the metric name?

@tilgovi (Collaborator) commented Sep 9, 2014

@srgshames good thought.
@alq666 do you think we should consider prepending the app name to the metrics? Or appending? It's easily available as self.cfg.proc_name from the logger.

@alq666 (Contributor, Author) commented Sep 9, 2014

I can certainly stick it in the middle in the normal case, just after gunicorn but before the actual metric name.

In the case where the statsd implementation supports explicit tags (like Datadog's), I will add the app name as a tag.
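
A sketch of what that could look like inside the logger follows; the name layout and the tag syntax are assumptions for illustration (Datadog-style statsd tags are appended as a |#key:value suffix):

    # hypothetical metric-name construction inside the statsD logger
    name = "gunicorn.%s.requests" % self.cfg.proc_name   # app name in the middle
    payload = "%s:1|c" % name
    # with a tag-aware backend (e.g. dogstatsd) the app name could be a tag instead:
    # payload = "gunicorn.requests:1|c|#app:%s" % self.cfg.proc_name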
