Postfix queue sizes dd-agent check #610

Merged
merged 10 commits into from
Oct 7, 2013

Conversation

phantasm66
Contributor

A generic and configurable agent check for calculating individual Postfix queue sizes (number of messages). Example YAML config included. I tried to stick with the standard way of writing a Datadog agent check. Postfix queue permissions are strict, so I had to make accommodations for different methods of accessing the queues.
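To illustrate what "different methods of accessing the queues" could look like, here is a minimal sketch, not the code from this PR. It assumes that a root process can walk the queue directory directly and that a non-root process falls back to a passwordless `sudo find`. The function name `count_queue` and the `use_sudo` parameter are hypothetical.

```python
import os
import subprocess


def count_queue(queue_path, use_sudo=None):
    """Return the number of message files under queue_path.

    Hypothetical helper: use_sudo=None means "decide from the effective uid".
    """
    if use_sudo is None:
        # Postfix queue directories are normally readable only by privileged users.
        use_sudo = os.geteuid() != 0

    if not use_sudo:
        # Direct access: count regular files, including hashed subdirectories.
        return sum(len(files) for _, _, files in os.walk(queue_path))

    # Assumption: the agent user has a NOPASSWD sudo rule for find.
    # -n makes sudo fail instead of prompting for a password.
    output = subprocess.check_output(
        ['sudo', '-n', 'find', queue_path, '-type', 'f'])
    return len(output.splitlines())
```

Passing `use_sudo=False` forces the direct walk, which is convenient when pointing the helper at a directory the current user already owns.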

if not exists(queue_path):
    raise Exception("PostfixQueuesCheck: (%s) queue directory does not exist" % queue_path)

metric_name = '.'.join(['postfix.queues', queue])
Contributor

In general, are there only three queues? If so, I would recommend using tags instead of a metric.

postfix.queue.size{queue:active}
postfix.queue.size{queue:incoming}
postfix.queue.size{queue:deferred}

This makes it much easier to graph in Datadog. It's very simple to graph the size of all queues in one graph statement instead of one statement per queue. For example:

sum:postfix.queue.size{role:email-server} by {queue}

If there are thousands of different queues, tags are probably not a good fit, but if there is a known, fixed number of them, tags are great.
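A rough sketch of the difference on the check side, assuming the check reports values through a `gauge(metric, value, tags=...)` method. `RecordingCheck` and `report_queue_sizes` are hypothetical stand-ins so the example runs without the agent installed; the queue sizes are made-up inputs.

```python
class RecordingCheck(object):
    """Hypothetical stand-in for an agent check; it only records gauge calls."""

    def __init__(self):
        self.gauges = []

    def gauge(self, metric, value, tags=None):
        self.gauges.append((metric, value, tuple(tags or [])))


def report_queue_sizes(check, queue_sizes):
    """Emit a single metric name and distinguish the queues with a tag."""
    for queue, size in sorted(queue_sizes.items()):
        # One metric, tagged per queue, instead of 'postfix.queues.<queue>'.
        check.gauge('postfix.queue.size', size, tags=['queue:%s' % queue])


check = RecordingCheck()
report_queue_sizes(check, {'active': 3, 'incoming': 1, 'deferred': 7})
```

With this shape, every queue shares the metric name `postfix.queue.size`, so a single query grouped `by {queue}` covers all of them.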

Contributor Author

There are around 12 possible named queues in Postfix. This has not changed much since Postfix's inception. Generally speaking, the 3 I specify in the YAML config example are the only ones that Postfix daemons will actually use. So it sounds like tags are the way to go. Thanks!

Contributor

Cool. Thanks for looking.

@clutchski
Contributor

Nice! Thanks for doing this. I made a few small comments, but all in all, great stuff.

One other general comment ... is it possible to write a small unit test that sets up a fake postfix directory, counts the things in it, and makes sure we count the right things?
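A minimal sketch of such a test, assuming the check counts message files with a recursive directory walk. `count_messages` and `build_fake_spool` are hypothetical helpers written for this example, not functions from the agent.

```python
import os
import shutil
import tempfile


def count_messages(queue_path):
    """Count regular files under queue_path, including hashed subdirectories."""
    return sum(len(files) for _, _, files in os.walk(queue_path))


def build_fake_spool(root, sizes):
    """Create root/<queue>/ holding the requested number of empty message files."""
    for queue, size in sizes.items():
        queue_dir = os.path.join(root, queue)
        os.makedirs(queue_dir)
        for i in range(size):
            open(os.path.join(queue_dir, 'msg%d' % i), 'w').close()


def test_queue_counts():
    expected = {'active': 3, 'incoming': 0, 'deferred': 5}
    root = tempfile.mkdtemp()
    try:
        build_fake_spool(root, expected)
        counted = dict(
            (queue, count_messages(os.path.join(root, queue)))
            for queue in expected)
        assert counted == expected
    finally:
        # Always clean up the temporary spool, even if the assertion fails.
        shutil.rmtree(root)
```

Because the spool lives in a temporary directory owned by the test user, this kind of test does not need root access or a real Postfix install.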

@phantasm66
Contributor Author

Awesome, thanks for the feedback. I needed this for work and am more of a Rubyist, and my Python blows. I can tackle everything you said here this weekend. I did do some unit testing locally; I assume you'd like something included?

@@ -0,0 +1,51 @@
from os import walk, geteuid, popen
from os.path import exists, join
from collections import namedtuple
Contributor

namedtuple was introduced in Python 2.6.
For compatibility purposes, can you import it from our util module instead?

from util import namedtuple

@phantasm66
Contributor Author

finishing up the discussed changes tonight/tomorrow...

@ghost ghost assigned remh Aug 13, 2013
@phantasm66
Contributor Author

Hey guys, per @clutchski I changed the name of this check to just postfix (it was postfix_queue). I think I took care of everything here except the unit test. I asked a couple of questions in the most recent commit of postfix.py (re: logging an error instead of a warning, etc.). I did my own testing and the results of this check are correct, but I will work on a unit test in the coming days.

@clutchski clutchski mentioned this pull request Aug 22, 2013
@phantasm66
Contributor Author

Ughh.. no time for unit tests yet.. going to try to crack something out in the next couple of days.. apologies :(

@phantasm66
Contributor Author

This nose test is done. See commits for an official dd-agent/tests/test_postfix.py above (phantasm66@1ae3ac9).

The results of the nose test:

[jwebb@vagrant-centos64-x86_64 ~]$ nosetests --nocapture --tests=dd-agent/tests/test_postfix.py

hostname: Unknown host

User jwebb may run the following commands on this host:
(ALL) NOPASSWD: ALL
User jwebb may run the following commands on this host:
(ALL) NOPASSWD: ALL
User jwebb may run the following commands on this host:
(ALL) NOPASSWD: ALL
User jwebb may run the following commands on this host:
(ALL) NOPASSWD: ALL
User jwebb may run the following commands on this host:
(ALL) NOPASSWD: ALL

Test messges put into active = 2010
Test messges put into deferred = 1940
Test messges put into maildrop = 1992
Test messges put into incoming = 2072
Test messges put into bounce = 1985

Test messages counted by dd-agent for maildrop = 1992
Test messages counted by dd-agent for active = 2010
Test messages counted by dd-agent for deferred = 1940
Test messages counted by dd-agent for bounce = 1985
Test messages counted by dd-agent for incoming = 2072


Ran 1 test in 1.070s

OK

@phantasm66
Contributor Author

Bump*

Any update on a merge with this?

@jschneiderhan

This is awesome. I will definitely use it if it gets merged in.

@remh
Contributor

remh commented Oct 7, 2013

Thanks @phantasm66 ! Merging it in. It will be part of the next release.

remh added a commit that referenced this pull request Oct 7, 2013
Postfix queue sizes dd-agent check
@remh remh merged commit 988e0e1 into DataDog:master Oct 7, 2013
@phantasm66
Contributor Author

Awesome... Thanks!
