[Feature] feat: more TPS metrics #3147
node/consensus/src/lib.rs
Outdated
@@ -212,6 +212,11 @@ impl<N: Network> Consensus<N> {
impl<N: Network> Consensus<N> {
    /// Adds the given unconfirmed solution to the memory pool.
    pub async fn add_unconfirmed_solution(&self, solution: ProverSolution<N>) -> Result<()> {
        #[cfg(feature = "metrics")]
        {
            metrics::increment_gauge(metrics::consensus::SOLUTIONS, 1f64);
I know we had this conversation on the prior TPS PR, but now that we are adding so many TPS metrics, I'm wondering whether we should change the snarkVM metrics crate's increment_counter function to take a parameter for the amount to increase by, rather than always incrementing by a constant, since this should technically be a counter.
That's also possible, but then we have 2 linked PRs (snarkVM + snarkOS) which makes it more annoying to merge.
Yeah, that's why I avoided it on the first one. However, if we do end up adding more and more metrics, it makes sense to change it eventually. Could that be an issue for a later date in snarkVM? @vicsn
Sure, I personally don't see the need but make an issue for the future if you think the added function will make us more robust!
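For reference, the idea raised in this thread (an increment function that takes the amount to add) could look roughly like the sketch below. It is a minimal, self-contained illustration built on `std::sync::atomic`; `Counter`, `increment_by`, and `get` are hypothetical names and are not the snarkVM metrics crate's API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a metrics counter (not the snarkVM metrics API).
pub struct Counter {
    value: AtomicU64,
}

impl Counter {
    /// Creates a counter starting at zero.
    pub const fn new() -> Self {
        Self { value: AtomicU64::new(0) }
    }

    /// Increases the counter by `amount`.
    /// A counter only ever goes up, so there is deliberately no decrement.
    pub fn increment_by(&self, amount: u64) {
        self.value.fetch_add(amount, Ordering::Relaxed);
    }

    /// Convenience wrapper for the common "add one" case.
    pub fn increment(&self) {
        self.increment_by(1);
    }

    /// Returns the current value.
    pub fn get(&self) -> u64 {
        self.value.load(Ordering::Relaxed)
    }
}
```

With a shape like this, a caller that receives a batch of items could record the whole batch in one call instead of looping or falling back to a gauge.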
@@ -212,6 +212,11 @@ impl<N: Network> Consensus<N> {
impl<N: Network> Consensus<N> {
    /// Adds the given unconfirmed solution to the memory pool.
    pub async fn add_unconfirmed_solution(&self, solution: ProverSolution<N>) -> Result<()> {
        #[cfg(feature = "metrics")]
I think we should probably move this to after the unconfirmed solutions and transmissions are added, since there are a few points where we fail or return early.
Initially I did that, but @vicsn requested moving it to the top.
Yep, I saw that. I think if we want it at the top, we'd have to add metrics-flagged code to decrement the gauge on failure or early return, which probably isn't optimal.
I think the point is to see the difference between how many come in and how many actually make it into a block.
Fair, although in that case it would probably be better to track it from the delivering side, i.e. whatever we are sending. I think that in this case we may be emitting metrics one layer too deep in the code, since what we really want is to emit metrics whenever these POST endpoints are called. If we just want to track how many times those endpoints are hit, it might be cleaner to move it there. @vicsn what do you think?
Rationale for putting it in add_unconfirmed_{transaction, solution} is so we also count transactions received via the router.
I think measuring it on the delivering side (i.e. tx-cannon) would be even better, but that requires more coordination and time to figure out.
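To make the trade-off in this thread concrete, here is a minimal sketch of one way to see both numbers without decrementing anything on failure: count every attempt at the top of the function, and count separately once the item has actually been added. `Pool`, `received`, and `accepted` are hypothetical names chosen for illustration; this is a toy model, not the snarkOS `Consensus` code or the approach taken in the PR.

```rust
/// Toy memory pool that rejects empty items and ignores duplicates.
/// All names here are illustrative assumptions.
#[derive(Default)]
pub struct Pool {
    /// Every call to `add_unconfirmed`, including ones that fail or return early.
    pub received: u64,
    /// Only the calls that actually added an item to the pool.
    pub accepted: u64,
    items: Vec<String>,
}

impl Pool {
    pub fn add_unconfirmed(&mut self, item: &str) -> Result<(), String> {
        // Count at the top: this captures everything that comes in.
        self.received += 1;

        // Failure path: the item is invalid.
        if item.is_empty() {
            return Err("empty item".to_string());
        }
        // Early return: the item is already known, so nothing is added.
        if self.items.iter().any(|known| known == item) {
            return Ok(());
        }

        self.items.push(item.to_string());
        // Count at the bottom: this captures only what was really added.
        self.accepted += 1;
        Ok(())
    }
}
```

The gap between `received` and `accepted` then shows how many submissions failed or returned early, at the cost of maintaining a second metric.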
node/metrics/src/names.rs
Outdated
}

pub mod consensus {
    pub const CERTIFICATE_COMMIT_LATENCY: &str = "snarkos_consensus_certificate_commit_latency_secs";
    pub const COMMITTED_CERTIFICATES: &str = "snarkos_consensus_committed_certificates_total";
    pub const LAST_COMMITTED_ROUND: &str = "snarkos_consensus_last_committed_round";
    pub const BLOCK_LATENCY: &str = "snarkos_consensus_block_latency_secs";
    pub const TRANSACTIONS: &str = "snarkos_consensus_transactions_total";
Should we maybe add "unconfirmed" to the consensus txn metric names, just so it's clear?
Seems useful to me 👍
Sure!
I don't know how to validate the Grafana template changes; otherwise LGTM.
Motivation
https://github.com/AleoHQ/snarkOS/issues/3146