Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: improve performance of rendering metrics to Prometheus string #549

Merged
merged 4 commits into from
Mar 12, 2023

Conversation

ngavalas
Copy link
Contributor

@ngavalas ngavalas commented Mar 8, 2023

Rendering histograms, particularly in getMetricAsPrometheusString, repeats most of the work of label rendering over and over and over and over again on the same strings. This is because we currently fully copy all labels out onto all buckets and then format them all separately as if they weren't the same labelset to begin with.

This PR takes a different approach:

  1. The labels we're passed in are stored separately from the ones automatically added by histogram (specifically, the le label, but this would work for other ones too).
  2. If we get asked for get, we merge them back in so the API is unchanged. No harm, no foul.
  3. The fast path returns the shared labels separately so that getMetricAsPrometheusString can encode and stringify the shared set of original labels exactly once.
  4. When rendering the rest of the labels to strings, we check and make sure they don't exist in the shared set first. This is because in the original code, the provided set of labels takes precedence over the "internal" le label. Deferring this check until serialization time is faster even if the code is slightly more complex.
  5. We append the shared label set onto the end of the internal label set and bam, it's like nothing ever happened.[1]

Here are the benchmark results:

### Results:

┌────────────────────────────────────────┬────────────────────┬────────────────────┐
│ registry                               │ current            │ prom-client@latest │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ getMetricsAsJSON#1 with 64             │ 3747.0556369320916 │ 2000.6075672702098 │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ getMetricsAsJSON#2 with 8              │ 3458.670353309119  │ 1674.8824925227625 │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ getMetricsAsJSON#2 with 4 and 2 with 2 │ 3660.534565081909  │ 1608.8334436915613 │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ getMetricsAsJSON#2 with 2 and 2 with 4 │ 3330.496324702882  │ 452.3735183797067  │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ getMetricsAsJSON#6 with 2              │ 3081.6146752509944 │ 1982.9467290709053 │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ metrics#1 with 64                      │ 3795.732389650059  │ 3175.098432694792  │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ metrics#2 with 8                       │ 4029.138075769385  │ 2882.770072766679  │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ metrics#2 with 4 and 2 with 2          │ 4371.559530861944  │ 2486.5945077703427 │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ metrics#2 with 2 and 2 with 4          │ 4090.871152384675  │ 2384.3272352940894 │
├────────────────────────────────────────┼────────────────────┼────────────────────┤
│ metrics#6 with 2                       │ 4247.01786082521   │ 2102.320963244116  │
└────────────────────────────────────────┴────────────────────┴────────────────────┘

### Summary:

✓ registry ➭ getMetricsAsJSON#1 with 64 is 87.30% faster.
✓ registry ➭ getMetricsAsJSON#2 with 8 is 106.5% faster.
✓ registry ➭ getMetricsAsJSON#2 with 4 and 2 with 2 is 127.5% faster.
✓ registry ➭ getMetricsAsJSON#2 with 2 and 2 with 4 is 636.2% faster.
✓ registry ➭ getMetricsAsJSON#6 with 2 is 55.41% faster.
✓ registry ➭ metrics#1 with 64 is 19.55% faster.
✓ registry ➭ metrics#2 with 8 is 39.77% faster.
✓ registry ➭ metrics#2 with 4 and 2 with 2 is 75.81% faster.
✓ registry ➭ metrics#2 with 2 and 2 with 4 is 71.57% faster.
✓ registry ➭ metrics#6 with 2 is 102.0% faster.

This should also help #543.

[1] Technically, the order of some labels on histograms changes now depending on where they came from. You can see this in the histogram test case, which I needed to tweak. Label order stability doesn't seem super important, but I can try to fix this if anyone feels really strongly. It'll be messy and JS objects aren't really ordered consistently across older node versions anyway if I remember correctly.

@ngavalas
Copy link
Contributor Author

ngavalas commented Mar 8, 2023

I see that we actually ban people from passing in le in histograms, so technically there should be no conflicts in that particular case, but it does seem safer to just keep the extra collision check in registry.js so we don't double apply labels ever.

@SimenB
Copy link
Collaborator

SimenB commented Mar 9, 2023

Thanks for sending this PR @ngavalas! Could you rebase now next has landed on master?

Really excited about the results of this PR 👍

@SimenB SimenB requested a review from zbjornson March 9, 2023 12:08
@ngavalas
Copy link
Contributor Author

ngavalas commented Mar 9, 2023

Yup will rebase this morning, thanks.

@ngavalas
Copy link
Contributor Author

ngavalas commented Mar 9, 2023

Rebased and tests + benchmark passed, but I'm going to read the merged code carefully to be triple sure.

Two things I can think of after reading this more:

  1. I am pretty sure label order doesn't strictly matter. In any case, this change only affects label order in a "breaking" way across the old version and the new version; within a single scrape it should be consistent.
  2. What happens if someone passes in a labels object and then mutates it outside of the library? This won't play nicely with my memoization (__flattened_internal) but I want to avoid making an unnecessary copy if things already get weird if people do that, which I suspect is the case

@SimenB
Copy link
Collaborator

SimenB commented Mar 9, 2023

  1. As long as the spec doesn't say anything specific, I think that's perfectly fine. And even if people do snapshot testing of the string (we do at work), this change will go out as a major release. So I don't think it's an issue (again, assuming it doesn't violate the spec).
  2. I'm inclined to call that a bug in the user code and not something we need to worry about. If your optimization breaks, we can do Object.freeze. If any mutation is just ignored, I think that's fine.

@ngavalas
Copy link
Contributor Author

ngavalas commented Mar 9, 2023

It doesn't break, it just ignores any mutation after you render it to the prom string one time. This is like an edge case of an edge case and I don't feel bad about it not working right in that case. As for (1), I am almost positive it doesn't violate the spec. Will work to confirm on my end further. The only time it swaps order is if the label moves from passed in to a default label on the same metric. I don't even know if this is possible. :)

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

One idea below, feel free to decline.

These benchmarks are so noisy 😞 hard to detect anything with less than ~15% improvement.

lib/registry.js Outdated

const formattedLabels = formatLabels(labels);
const flattened = formattedLabels.join(',');
Object.defineProperty(labels, '__flattened_internal', { value: flattened });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could use a WeakMap here instead of defining a property.

diff --git a/lib/registry.js b/lib/registry.js
index 2169b66..01b0f8e 100644
--- a/lib/registry.js
+++ b/lib/registry.js
@@ -219,14 +219,17 @@ function formatLabels(labels) {
 	);
 }
 
+const cache = new WeakMap();
+
 function flattenSharedLabels(labels) {
-	if (labels.__flattened_internal) {
-		return labels.__flattened_internal;
+	const cached = cache.get(labels);
+	if (cached) {
+		return cached;
 	}
 
 	const formattedLabels = formatLabels(labels);
 	const flattened = formattedLabels.join(',');
-	Object.defineProperty(labels, '__flattened_internal', { value: flattened });
+	cache.set(labels, flattened);
 	return flattened;
 }
 function escapeLabelValue(str) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's a good idea. Let me fix up the changelog and make that change tomorrow morning.

@SimenB SimenB changed the title [rfc] perf: improve performance of rendering metrics to Prometheus string perf: improve performance of rendering metrics to Prometheus string Mar 12, 2023
@SimenB SimenB merged commit a38aa2b into siimon:master Mar 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants