[Feature] feat: more TPS metrics #3147

Merged 2 commits on Mar 5, 2024

Conversation

joske
Contributor

@joske joske commented Mar 4, 2024

node/consensus/src/lib.rs (outdated)
@joske joske force-pushed the feat/tps_metrics branch from ef5c5fe to 2f3d112 Compare March 4, 2024 14:03
@@ -212,6 +212,11 @@ impl<N: Network> Consensus<N> {
impl<N: Network> Consensus<N> {
    /// Adds the given unconfirmed solution to the memory pool.
    pub async fn add_unconfirmed_solution(&self, solution: ProverSolution<N>) -> Result<()> {
        #[cfg(feature = "metrics")]
        {
            metrics::increment_gauge(metrics::consensus::SOLUTIONS, 1f64);
Contributor

I know we had this conversation on the prior TPS PR, but since we are adding so many TPS metrics, I wonder whether we should change the snarkVM metrics crate's increment_counter function to take the amount to increment by as a parameter, rather than always incrementing by a constant, since this should technically be a counter rather than a gauge.

Contributor Author

That's also possible, but then we have 2 linked PRs (snarkVM + snarkOS) which makes it more annoying to merge.

Contributor

Yeah, that's why I avoided it on the first one. However, if we do end up adding more and more metrics, it makes sense to change it eventually. Could that be an issue for a later date in snarkVM? @vicsn

Contributor

Sure, I personally don't see the need but make an issue for the future if you think the added function will make us more robust!
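
To make the proposal in this thread concrete, here is a minimal sketch of a counter helper that accepts an increment amount. It is an illustration only: `Counters`, `increment_counter_by`, and `get` are hypothetical names backed by a plain in-memory map, and nothing here reflects the actual snarkVM metrics crate API.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical in-memory registry; a stand-in, not the snarkVM metrics crate.
#[derive(Default)]
pub struct Counters {
    values: Mutex<HashMap<&'static str, u64>>,
}

impl Counters {
    /// Adds `amount` to the named counter. Counters only ever go up.
    pub fn increment_counter_by(&self, name: &'static str, amount: u64) {
        let mut values = self.values.lock().unwrap();
        let entry = values.entry(name).or_insert(0);
        *entry = entry.saturating_add(amount);
    }

    /// Constant-step variant: increments by exactly one.
    pub fn increment_counter(&self, name: &'static str) {
        self.increment_counter_by(name, 1);
    }

    /// Returns the current value, or 0 if the counter was never incremented.
    pub fn get(&self, name: &str) -> u64 {
        self.values.lock().unwrap().get(name).copied().unwrap_or(0)
    }
}
```

With a helper like this, a call site that handles a batch could record the whole batch size in one call instead of misusing a gauge or looping over single increments.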

@@ -212,6 +212,11 @@ impl<N: Network> Consensus<N> {
impl<N: Network> Consensus<N> {
    /// Adds the given unconfirmed solution to the memory pool.
    pub async fn add_unconfirmed_solution(&self, solution: ProverSolution<N>) -> Result<()> {
        #[cfg(feature = "metrics")]
Contributor

I think we should probably move this to after the unconfirmed solutions and transmissions are added, since there are a few points where we fail or return early.

Contributor Author

Initially I did that, but @vicsn requested moving it to the top.

Contributor

Yep, I saw that. I think if we want it at the top, we'd have to add metrics-flagged code to decrement the gauge on failure or early return, which probably isn't optimal.
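
For illustration, one way the decrement-on-failure idea could be expressed without touching every early return is a drop guard. This is a sketch under stated assumptions: `Gauge`, `GaugeGuard`, and `add_solution` are hypothetical names built on the standard library, not code from snarkOS or the metrics crate.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical gauge; a stand-in for a real metrics backend.
pub struct Gauge(AtomicI64);

impl Gauge {
    pub const fn new() -> Self {
        Gauge(AtomicI64::new(0))
    }
    pub fn add(&self, delta: i64) {
        self.0.fetch_add(delta, Ordering::Relaxed);
    }
    pub fn value(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Increments the gauge on creation and undoes that on drop unless `commit` was called.
pub struct GaugeGuard<'a> {
    gauge: &'a Gauge,
    committed: bool,
}

impl<'a> GaugeGuard<'a> {
    pub fn new(gauge: &'a Gauge) -> Self {
        gauge.add(1);
        GaugeGuard { gauge, committed: false }
    }
    /// Keeps the increment; the guard is consumed and its drop becomes a no-op.
    pub fn commit(mut self) {
        self.committed = true;
    }
}

impl Drop for GaugeGuard<'_> {
    fn drop(&mut self) {
        if !self.committed {
            self.gauge.add(-1);
        }
    }
}

/// Hypothetical handler: returns early when `valid` is false.
pub fn add_solution(gauge: &Gauge, valid: bool) -> Result<(), String> {
    let guard = GaugeGuard::new(gauge);
    if !valid {
        // The guard is dropped here, which rolls back the increment.
        return Err("rejected".to_string());
    }
    guard.commit();
    Ok(())
}
```

The guard keeps the rollback in one place, but it still adds metrics-only machinery to the hot path, which is the cost this comment is pointing at.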

Contributor Author

I think the point is to see the difference between how many come in and how many actually make it into a block.

Contributor

@miazn miazn Mar 4, 2024

Fair, although in that case it would probably be better to track it from the delivering side, i.e. whatever we are sending. I think in this case we may be emitting metrics one layer too deep in the code, since what we really want is to emit metrics whenever these POST endpoints are called. If we just want to track how many times those endpoints are hit, it might be cleaner to move it there. @vicsn, what do you think?

Contributor

The rationale for putting it in add_unconfirmed_{transaction, solution} is that we then also count transactions received via the router.

I think measuring it on the delivering side (i.e. tx-cannon) would be even better, but that requires more coordination and time to figure out.
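
The trade-off discussed in this thread can be sketched as two counters: one incremented at the top of the function for everything that arrives, and one incremented only after the checks pass. This is a simplified illustration; `TxMetrics`, `add_unconfirmed`, and the emptiness check are hypothetical, and "accepted" here only means "passed this function", not "included in a block".

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical pair of counters; not the snarkOS metrics module.
#[derive(Default)]
pub struct TxMetrics {
    received: AtomicU64,
    accepted: AtomicU64,
}

impl TxMetrics {
    /// Counts every incoming item first, then counts it again only if it passes the checks.
    pub fn add_unconfirmed(&self, tx: &str) -> Result<(), &'static str> {
        // Counted at the top, so items that fail below are still visible as "received".
        self.received.fetch_add(1, Ordering::Relaxed);
        if tx.is_empty() {
            return Err("empty transaction");
        }
        // Counted only after the early-return paths.
        self.accepted.fetch_add(1, Ordering::Relaxed);
        Ok(())
    }

    pub fn received(&self) -> u64 {
        self.received.load(Ordering::Relaxed)
    }

    pub fn accepted(&self) -> u64 {
        self.accepted.load(Ordering::Relaxed)
    }

    /// Items that came in but did not pass the checks.
    pub fn dropped(&self) -> u64 {
        self.received() - self.accepted()
    }
}
```

Because both counters live inside the shared function, they cover every caller, whether the item came from an HTTP endpoint or from the router.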

}

pub mod consensus {
    pub const CERTIFICATE_COMMIT_LATENCY: &str = "snarkos_consensus_certificate_commit_latency_secs";
    pub const COMMITTED_CERTIFICATES: &str = "snarkos_consensus_committed_certificates_total";
    pub const LAST_COMMITTED_ROUND: &str = "snarkos_consensus_last_committed_round";
    pub const BLOCK_LATENCY: &str = "snarkos_consensus_block_latency_secs";
    pub const TRANSACTIONS: &str = "snarkos_consensus_transactions_total";
Contributor

Should we maybe add "unconfirmed" to the consensus txn metric names, just so it's clear?

Contributor Author

Seems useful to me 👍

Contributor

Sure!

Contributor

@vicsn vicsn left a comment

I don't know how to validate the Grafana template changes; otherwise LGTM.

@howardwu howardwu merged commit 6aba25d into AleoNet:mainnet-staging Mar 5, 2024
23 of 24 checks passed
@howardwu howardwu changed the title feat: more TPS metrics [Feature] feat: more TPS metrics Mar 26, 2024
@joske joske deleted the feat/tps_metrics branch June 27, 2024 09:09
4 participants