check_status fails in race condition when not using dogstatsd #1002

Closed
sethrosenblum opened this issue Jun 26, 2014 · 3 comments

@sethrosenblum
Contributor

https://datadog.desk.com/agent/case/11956

When dogstatsd is set to no, check_status occasionally fails and prevents the agent from starting.

This was happening in the Red Hat init script.

@sethrosenblum
Contributor Author

Here's what it looks like:

[vagrant@vagrant-centos-6 ~]$ sudo /etc/init.d/datadog-agent start
Python logging config is no longer supported and will be ignored.
            To configure logging, update the logging portion of 'datadog.conf' to match:
             'https://github.com/DataDog/dd-agent/blob/master/datadog.conf.example'.
             Starting Datadog Agent (using supervisord):
datadog-agent:collector          RUNNING    pid 20921, uptime 0:00:10
datadog-agent:dogstatsd          BACKOFF    Exited too quickly (process log may have details)
datadog-agent:forwarder          RUNNING    pid 20920, uptime 0:00:10
datadog-agent:pup                EXITED     Jun 26 04:29 PM
Datadog Agent (supervisor) is NOT running all child process[FAILED]
Stopping Datadog Agent (using killproc on supervisord):    [  OK  ]
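
One way to avoid the false failure, sketched under assumptions: this assumes the check inspects supervisor status lines like the ones above, which the thread does not confirm. The function names `parse_status` and `required_processes_running`, and the choice of which programs count as optional, are hypothetical and not taken from the init script.

```python
def parse_status(status_text):
    """Map each supervisor program name to its reported state."""
    states = {}
    for line in status_text.splitlines():
        parts = line.split()
        # Only keep lines shaped like "group:name STATE ...".
        if len(parts) >= 2 and ":" in parts[0]:
            name = parts[0].split(":", 1)[1]
            states[name] = parts[1]
    return states


def required_processes_running(status_text, optional=("dogstatsd", "pup")):
    """True when every non-optional program is RUNNING.

    Assumption: programs listed in `optional` may legitimately exit
    (for example dogstatsd when it is disabled), so their state is ignored.
    """
    states = parse_status(status_text)
    required = {n: s for n, s in states.items() if n not in optional}
    return bool(required) and all(s == "RUNNING" for s in required.values())


SAMPLE = """\
datadog-agent:collector          RUNNING    pid 20921, uptime 0:00:10
datadog-agent:dogstatsd          BACKOFF    Exited too quickly (process log may have details)
datadog-agent:forwarder          RUNNING    pid 20920, uptime 0:00:10
datadog-agent:pup                EXITED     Jun 26 04:29 PM
"""

print(required_processes_running(SAMPLE))  # True
```

With `optional=()` the same sample is reported as not running, which matches the failure shown above.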

@remh
Contributor

remh commented Jun 26, 2014

What does /var/log/datadog/dogstatsd.log look like?

@sethrosenblum
Contributor Author

2014-06-26 16:19:02 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:268) | Listening on host & port: ('localhost', 8125)
2014-06-26 16:19:02 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:102) | Reporting to localhost:17123 every 10s
2014-06-26 16:19:12 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #1: flushed 0 metrics and 0 events
2014-06-26 16:19:22 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #2: flushed 1 metric and 0 events
2014-06-26 16:19:32 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #3: flushed 1 metric and 0 events
2014-06-26 16:19:43 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #4: flushed 1 metric and 0 events
2014-06-26 16:19:53 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #5: flushed 1 metric and 0 events
2014-06-26 16:20:03 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #6: flushed 1 metric and 0 events
2014-06-26 16:20:14 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #7: flushed 1 metric and 0 events
2014-06-26 16:20:24 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #8: flushed 1 metric and 0 events
2014-06-26 16:20:34 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #9: flushed 1 metric and 0 events
2014-06-26 16:20:44 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:142) | Flush #10: flushed 1 metric and 0 events
2014-06-26 16:20:44 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:144) | First flushes done, 5 flushes will be logged every 70 flushes.
2014-06-26 16:29:26 BRT | WARNING | dd.dogstatsd | util(util.py:173) | Hostname: [removed] is not complying with RFC 1123
2014-06-26 16:29:32 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:97) | Stopping reporter
2014-06-26 16:29:32 BRT | ERROR | dd.dogstatsd | dogstatsd(dogstatsd.py:157) | Error flushing metrics
Traceback (most recent call last):
  File "/usr/share/datadog/agent/dogstatsd.py", line 131, in flush
    self.submit(metrics)
  File "/usr/share/datadog/agent/dogstatsd.py", line 178, in submit
    response = conn.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib64/python2.6/socket.py", line 433, in readline
    data = recv(1)
error: [Errno 104] Connection reset by peer
2014-06-26 16:29:32 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:340) | Dogstatsd is stopped
2014-06-26 16:29:36 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:29:41 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:22 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:27 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:34 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:41 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
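
The repeated "Dogstatsd is disabled. Exiting" lines suggest the process is being started again shortly after each exit. A small sketch to make the spacing explicit, assuming the timestamp format shown in the log above; `exit_gaps` is a hypothetical helper, not part of the agent.

```python
from datetime import datetime

MARKER = "Dogstatsd is disabled. Exiting"


def exit_gaps(log_text):
    """Return the seconds between consecutive 'disabled, exiting' log lines."""
    stamps = [
        # The first 19 characters hold "YYYY-MM-DD HH:MM:SS".
        datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        for line in log_text.splitlines()
        if MARKER in line
    ]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


LOG = """\
2014-06-26 16:29:36 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:29:41 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:22 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:27 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:34 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
2014-06-26 16:33:41 BRT | INFO | dd.dogstatsd | dogstatsd(dogstatsd.py:357) | Dogstatsd is disabled. Exiting
"""

print(exit_gaps(LOG))  # [5, 221, 5, 7, 7]
```

The short gaps are consistent with the BACKOFF state reported for dogstatsd in the start output, where the process exits too quickly and is retried.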
