mysql.replication.seconds_behind_master remains 0 when slave is stopped or broken #425

Closed
dcrosta opened this issue Mar 27, 2013 · 4 comments

@dcrosta
Contributor

dcrosta commented Mar 27, 2013

If you stop a MySQL slave (or if its replication becomes broken due to corruption on the master or an un-replicate-able event in the binlog), the field Seconds_Behind_Master becomes NULL. According to metric explorer, the value remains "0" for this metric, even though that's (very likely) not true.

[image: replication] https://f.cloud.github.com/assets/35122/308319/30318d9c-96e5-11e2-845e-680d5d71f917.png

I would propose one of the following:

  1. The metric is set to something like -1 (an impossible value in normal circumstances), so that users can create metric alerts, or
  2. The (old-style) check is given an option to send events when replication is broken. Perhaps this is best done after #391 (Move mysql to checks.d), or
  3. The metric stops reporting, though I don't like this because you can't (yet?) set up metric alerts on a metric that has stopped reporting.

If you'd like to do option 1 or 3, I'm happy to do the work and send a pull request. Option 2 probably requires more time than I could commit to.
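
For illustration, option 1 could look roughly like the sketch below. It assumes the check already has the `SHOW SLAVE STATUS` row as a dict in which a SQL NULL arrives as `None`; the function name and the `BROKEN_REPLICATION_SENTINEL` constant are hypothetical and not taken from the agent's code.

```python
from typing import Mapping, Optional

# Hypothetical sentinel: real replication lag is never negative,
# so -1 can be used as an alertable "replication is not running" value.
BROKEN_REPLICATION_SENTINEL = -1


def seconds_behind_master_metric(slave_status: Mapping[str, Optional[int]]) -> int:
    """Return the value to report for mysql.replication.seconds_behind_master.

    `slave_status` is assumed to be one row of SHOW SLAVE STATUS as a dict,
    with SQL NULL represented as None.
    """
    lag = slave_status.get("Seconds_Behind_Master")
    if lag is None:
        # Slave stopped or replication broken: do not report a misleading 0.
        return BROKEN_REPLICATION_SENTINEL
    return int(lag)
```

With this, a metric alert on "value below 0" would fire when the slave is stopped, while healthy hosts keep reporting their real lag.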

@olidb2
Member

olidb2 commented Mar 28, 2013

Another possibility would be to have the metric count seconds after the
last reported value. Is that information we can get from the DB, or would
that require us to keep state in the agent?

@dcrosta
Contributor Author

dcrosta commented Mar 28, 2013

If replication is broken for any reason, it seems that Seconds_Behind_Master becomes NULL, so this is something the agent would have to compute itself.
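
A minimal sketch of what "keeping state in the agent" might mean, assuming the check calls one method per collection run with the current `Seconds_Behind_Master` value (`None` for NULL). The `ReplicationLagTracker` class and its `observe` method are hypothetical names for illustration, not part of the agent.

```python
import time
from typing import Callable, Optional


class ReplicationLagTracker:
    """Keeps the last non-NULL lag and extrapolates while the value is NULL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the logic can be tested
        self._last_seen: Optional[float] = None  # time of last non-NULL value
        self._last_lag = 0  # last non-NULL Seconds_Behind_Master

    def observe(self, seconds_behind_master: Optional[int]) -> Optional[int]:
        """Return the lag to report, or None if no value was ever seen."""
        now = self._clock()
        if seconds_behind_master is not None:
            # Replication is reporting normally: remember value and time.
            self._last_seen = now
            self._last_lag = int(seconds_behind_master)
            return self._last_lag
        if self._last_seen is None:
            # NULL from the very first run: nothing to extrapolate from.
            return None
        # NULL now: last known lag plus the seconds elapsed since then.
        return self._last_lag + int(now - self._last_seen)
```

The state lives only in the agent process, so under these assumptions a restart while replication is broken loses the baseline and `observe` returns `None` until a real value comes back.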

@alq666
Member

alq666 commented Dec 23, 2013

Released with 4.0.1

alq666 closed this as completed Dec 23, 2013

@alq666
Member

alq666 commented Dec 23, 2013

With the metric mysql.slave_running
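
The thread does not say how 4.0.1 computes `mysql.slave_running`, so the following is only a sketch of one way a 0/1 "replication is running" gauge could be derived. It assumes a `SHOW SLAVE STATUS` row as a dict and uses that statement's `Slave_IO_Running` and `Slave_SQL_Running` columns; the function name is hypothetical.

```python
from typing import Mapping, Optional


def slave_running_gauge(slave_status: Optional[Mapping[str, str]]) -> int:
    """Return 1 if both replication threads report "Yes", else 0.

    `slave_status` is assumed to be one row of SHOW SLAVE STATUS as a dict,
    or None / empty when the server is not configured as a slave.
    """
    if not slave_status:
        return 0
    io_ok = slave_status.get("Slave_IO_Running") == "Yes"
    sql_ok = slave_status.get("Slave_SQL_Running") == "Yes"
    return 1 if io_ok and sql_ok else 0
```

A gauge like this can be alerted on directly (for example, "value below 1"), which sidesteps the NULL problem with `Seconds_Behind_Master` described at the top of the issue.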
