pulsar_out_bytes_total and pulsar_out_messages_total metrics act as gauges instead of counters #15819
Comments
related PR: #6918 |
The corresponding stats from the CLI show correct values for msgOutCounter and msgInCounter,
|
The issue had no activity for 30 days, mark with Stale label. |
I'm not sure if this is exactly the same issue, but what I'm seeing is when I disconnect from a subscription, the message/byte count for that subscription goes back down to zero. |
@michaeljmarshall I mean when the consumer disconnects. So, for example ctrl-c using pulsar-client. Pulsar topic stats show the correct totals, so maybe #10644 fixed the issue for topic stats, but the Prometheus metrics still have the old problem? |
Fixes apache#15819 The existing code calculates the pulsar_out_bytes_total and pulsar_out_messages_total per subscription metrics by adding the values from the currently connected consumers. This produces incorrect values as soon as one or more of the consumers disconnects from the subscription. This changes these two metrics to directly use the subscription stats for these values, and match the output of `pulsar-admin topic stats`. Signed-off-by: Paul Gier <paul.gier@datastax.com>
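The difference between the two calculations can be illustrated with a small simulation. This is a minimal sketch in Python, not the broker's Java code: the `Consumer` and `Subscription` classes, their fields, and their methods are hypothetical stand-ins for the broker's internal stats objects, assumed here only to show why summing over connected consumers loses history.

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    """Hypothetical stand-in for the stats of one connected consumer."""
    name: str
    msg_out: int = 0


@dataclass
class Subscription:
    """Hypothetical stand-in for the stats of one subscription."""
    consumers: dict = field(default_factory=dict)
    msg_out_counter: int = 0  # subscription-level total; survives disconnects

    def connect(self, name):
        self.consumers[name] = Consumer(name)

    def disconnect(self, name):
        # The per-consumer stats disappear together with the connection.
        del self.consumers[name]

    def deliver(self, name, count):
        self.consumers[name].msg_out += count
        self.msg_out_counter += count

    def sum_of_connected_consumers(self):
        """Old calculation: only currently connected consumers contribute."""
        return sum(c.msg_out for c in self.consumers.values())

    def subscription_total(self):
        """New calculation: read the subscription-level counter directly."""
        return self.msg_out_counter


sub = Subscription()
sub.connect("a")
sub.connect("b")
sub.deliver("a", 100)
sub.deliver("b", 50)
before = (sub.sum_of_connected_consumers(), sub.subscription_total())

sub.disconnect("a")
after = (sub.sum_of_connected_consumers(), sub.subscription_total())

print(before)  # (150, 150)
print(after)   # (50, 150)
```

In this sketch the summed value drops from 150 to 50 when consumer `a` disconnects, while the subscription-level counter stays at 150, which is the behavior a counter metric needs.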
The issue seems to be that |
@pgier thanks for the RCA and for fixing this!
|
Fixes apache#15819 The existing code calculates the pulsar_out_bytes_total and pulsar_out_messages_total per subscription metrics by adding the values from the currently connected consumers. This produces incorrect values as soon as one or more of the consumers disconnects from the subscription. This changes these two metrics to directly use the subscription stats for these values, and match the output of `pulsar-admin topic stats`. It also changes these metrics to be defined as `counter` type. Signed-off-by: Paul Gier <paul.gier@datastax.com>
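For context on what defining a metric as `counter` type means on the wire, the following sketch renders a metric family in the Prometheus text exposition format, where the type is declared on a `# TYPE` line ahead of the samples. The `render_counter` helper and the label names and values are assumptions made for illustration. They are not taken from the broker's code, and label-value escaping is omitted for brevity.

```python
def render_counter(name, samples):
    """Render one counter family in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Label values are
    assumed not to need escaping in this sketch.
    """
    lines = [f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)


# Hypothetical topic and subscription names, for illustration only.
text = render_counter(
    "pulsar_out_messages_total",
    [({"topic": "persistent://public/default/t1", "subscription": "sub1"}, 150)],
)
print(text)
```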
…ion (#18451)

Signed-off-by: Paul Gier <paul.gier@datastax.com>

Fixes #15819

The existing code calculates the pulsar_out_bytes_total and pulsar_out_messages_total per subscription metrics by adding the values from the currently connected consumers. This produces incorrect values as soon as one or more of the consumers disconnects from the subscription. This changes these two metrics to directly use the subscription stats for these values, and match the output of `pulsar-admin topic stats`.

### Motivation

The prometheus metrics for pulsar_out_bytes_total and pulsar_out_messages_total should never decrease, and they should match the output seen when using pulsar-admin.

### Modifications

Changed the calculation of pulsar_out_bytes_total and pulsar_out_messages_total to directly use the subscription stats instead of calculating these values by summing the values of the currently connected consumers.

### Verifying this change

- [X] Make sure that the change passes the CI checks.

Added a unit test to cover this case.

### Does this pull request potentially affect one of the following parts:

*If the box was checked, please highlight the changes*

- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [ ] Anything that affects deployment

### Documentation

- [ ] `doc`
- [ ] `doc-required`
- [X] `doc-not-needed`
- [ ] `doc-complete`

### Matching PR in forked repository

PR in forked repository: pgier#2
…ion (#18451) (cherry picked from commit c03e33e)
…ion (apache#18451) (cherry picked from commit c03e33e) (cherry picked from commit 54dccf9)
Describe the bug
The pulsar_out_bytes_total and pulsar_out_messages_total metrics should be counters (always increasing), in the same way as the pulsar_in_bytes_total and pulsar_in_messages_total metrics.
However, they behave incorrectly: their values go up and down.
Furthermore, when the producer is idle, these two metrics are no longer reported when Pulsar is scraped.
To Reproduce
Steps to reproduce the behavior:
Create a topic, then produce to and consume from that topic for a long period of time.
Then check how these two counters evolve over time.
Expected behavior
These 2 counters should continuously increase.
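One way to check this expectation against scraped values is to look for any sample that is lower than the one before it. The sketch below is a minimal illustration in Python. The `find_decreases` helper and the sample values are made up for the example and are not taken from the attached dashboard.

```python
def find_decreases(samples):
    """Return (index, previous, current) for each sample lower than its predecessor."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(samples, samples[1:]), start=1)
        if cur < prev
    ]


# Hypothetical scraped values of a per-subscription metric over time.
scraped = [100, 150, 50, 80, 0, 20]
drops = find_decreases(scraped)
print(drops)  # [(2, 150, 50), (4, 80, 0)]
```

A well-behaved counter yields an empty list unless the process restarts. This matters in practice because Prometheus functions such as `rate()` and `increase()` interpret any decrease as a counter reset, so spurious drops distort the computed rates.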
Screenshots
See the attached Grafana dashboard.
Desktop (please complete the following information):
streamnative/sn-platform:2.9.2.17
Additional context
See