reduce cost of large variant matrix #5392

minrk · 2024-06-28T11:45:35Z

Description

When the variant matrix is large and mostly unused (as in conda-forge), the length of input_variants may be several thousand (13,824 in the case of petsc4py) when only a few are actually used.

This causes get_loop_vars and metadata.copy() to become very expensive and dominate render time.

This reduction cuts time spent in render_recipe for petsc4py from over 2 minutes to 40 seconds to produce 72 actual variants:

before:

after:

(result is unchanged)

Checklist - did you ...

Add a file to the news directory (using the template) for the next release's release notes?
Add / update necessary tests?
Add / update outdated documentation?

codspeed-hq · 2024-06-28T11:56:55Z

CodSpeed Performance Report

Merging #5392 will improve performances by ×2.6

Comparing minrk:reduce_variants (311e48b) with main (433f048)

Summary

⚡ 1 improvements
✅ 3 untouched benchmarks

Benchmarks breakdown

| | Benchmark | `main` | `minrk:reduce_variants` | Change |
|---|---|---|---|---|
| ⚡ | `test_render_recipe` | 64.7 s | 25.1 s | ×2.6 |

minrk · 2024-06-28T12:18:30Z

need to investigate why the conda-build tests produce different results from actual runs of conda-build, all of which seem to produce the right variants.

minrk · 2024-06-28T13:08:01Z

seems that the exclusion based on get_used_vars doesn't actually cover all used variables. Some tests also appear to expect matrix entries for explicitly unused variables (bzip2 in test_setting_condarc_vars_with_env_var_expansion), which makes me think that perhaps this should be a conda-smithy thing and not a conda-build thing.

beeankha · 2024-06-28T17:46:47Z

pre-commit.ci autofix

minrk · 2024-07-01T09:46:27Z

conda_build/metadata.py

@@ -2394,7 +2394,6 @@ def validate_features(self):
    def copy(self: Self) -> MetaData:
        new = copy.copy(self)
        new.config = self.config.copy()
-        new.config.variant = copy.deepcopy(self.config.variant)


config.copy on the line before already does exactly this, so there's no need to do it twice.
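
A minimal sketch of why the removed line was redundant. `ToyConfig` and `ToyMetaData` are hypothetical stand-ins, not the real conda-build classes; the only assumption carried over from the comment above is that the config's own `copy` already deep-copies `variant`.

```python
import copy


class ToyConfig:
    """Hypothetical stand-in for the build config (not conda-build's Config)."""

    def __init__(self, variant):
        self.variant = variant

    def copy(self):
        new = copy.copy(self)
        # Assumption taken from the review comment: copy() already
        # deep-copies the variant dict.
        new.variant = copy.deepcopy(self.variant)
        return new


class ToyMetaData:
    """Hypothetical stand-in for MetaData, reduced to the copy logic."""

    def __init__(self, config):
        self.config = config

    def copy(self):
        new = copy.copy(self)
        new.config = self.config.copy()
        # A second copy.deepcopy(self.config.variant) here would repeat
        # the work config.copy() has just done.
        return new


original = ToyMetaData(ToyConfig({"python": ["3.11"], "numpy": ["1.26"]}))
clone = original.copy()
clone.config.variant["python"].append("3.12")

# The original is untouched even without the second deepcopy.
print(original.config.variant["python"])
```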

minrk · 2024-07-01T09:47:23Z

conda_build/metadata.py

        used_vars = self.get_used_vars(
            force_top_level=force_top_level, force_global=force_global
        )
-        return set(loop_vars).intersection(used_vars)
+        return self.get_loop_vars(subset=used_vars)


get_loop_vars is far cheaper if we pass a subset to consider instead of computing the (usually quite small) intersection after looping over all variables across all variants.
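
To illustrate the point, here is a simplified sketch rather than the conda-build implementation: `loop_vars` is a hypothetical function that treats a key as a loop variable when its value differs between variants, and `subset` restricts which keys are compared at all.

```python
def loop_vars(variants, subset=None):
    """Return keys whose value differs between at least two variants.

    Hypothetical simplification of get_loop_vars. When `subset` is given,
    only those keys are compared, so the work scales with len(subset)
    instead of the total number of keys.
    """
    if not variants:
        return set()
    first = variants[0]
    keys = first.keys() if subset is None else [k for k in subset if k in first]
    return {
        key
        for key in keys
        if any(variant.get(key) != first[key] for variant in variants[1:])
    }


variants = [
    {"python": "3.11", "numpy": "1.26", "openssl": "3", "zlib": "1.3"},
    {"python": "3.12", "numpy": "1.26", "openssl": "3", "zlib": "1.2"},
]
used_vars = {"python", "numpy"}

# Old shape: compare every key, then intersect with the used variables.
old_result = loop_vars(variants) & used_vars
# New shape: compare only the used variables.
new_result = loop_vars(variants, subset=used_vars)

print(old_result, new_result)
```

Both calls give the same answer here; the second never looks at `openssl` or `zlib`.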

minrk · 2024-07-01T10:18:43Z

I've taken a different approach that doesn't modify the variants list at all, so shouldn't have any consequences besides performance. Instead of reducing the actual variants list, I've reduced the cost of the two dominant operations on the large variant list:

1. defer copying metadata.config.variants for each top_level variant until after it's been filtered, so far fewer variant dicts are copied (each variant should be copied exactly once, I believe, rather than once per top_level loop)
2. in get_used_loop_vars, compute used_vars first and pass it to get_loop_vars so only the used subset is considered, rather than comparing all keys across all variants every time.
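
The first change can be pictured with a toy model. This is only a sketch under assumptions: `matches`, `copy_then_filter`, and `filter_then_copy` are hypothetical names, and the real code distributes variants differently; the point is just how many dicts get copied in each ordering.

```python
import copy


def matches(variant, top):
    """True when `variant` agrees with every key of the top-level entry."""
    return all(variant[key] == value for key, value in top.items())


def copy_then_filter(top_level, variants):
    """Eager: deep-copy the whole list per top-level entry, then filter."""
    groups, copied = [], 0
    for top in top_level:
        duplicate = copy.deepcopy(variants)
        copied += len(duplicate)
        groups.append([v for v in duplicate if matches(v, top)])
    return groups, copied


def filter_then_copy(top_level, variants):
    """Deferred: filter first, then deep-copy only the surviving dicts."""
    groups, copied = [], 0
    for top in top_level:
        kept = [v for v in variants if matches(v, top)]
        copied += len(kept)
        groups.append(copy.deepcopy(kept))
    return groups, copied


variants = [
    {"python": py, "numpy": np_version}
    for py in ("3.10", "3.11", "3.12")
    for np_version in ("1.26", "2.0")
]
top_level = [{"python": py} for py in ("3.10", "3.11", "3.12")]

eager_groups, eager_copies = copy_then_filter(top_level, variants)
lazy_groups, lazy_copies = filter_then_copy(top_level, variants)

# Same groups either way; only the number of copied dicts differs.
print(eager_copies, lazy_copies)
```

With three top-level entries and six input variants, the eager version copies 18 dicts and the deferred version copies 6, each variant exactly once.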

The first changes the variants copy from O(top_level_variants * input_variants) to O(top_level_variants * per_top_level_variants), which is the same as O(input_variants). So the cost drops by a factor equal to the number of top-level variants; in the case of petsc4py on linux-64, that's 72 * 13,824 = 995,328 variant copies, reduced to 72 * 192 = 13,824, a reduction of 98.6%. In the case of conda-forge, the number of variants actually used is 72, so there's still a further 99.5% that the original proposal removed, but I don't know how to do that safely while maintaining all the unspecified guarantees in conda-build.
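
The arithmetic above can be checked directly. The variable names are illustrative; the numbers are the ones quoted for petsc4py on linux-64.

```python
input_variants = 13_824
top_level_variants = 72
per_top_level_variants = input_variants // top_level_variants
used_variants = 72  # variants that are actually rendered

copies_before = top_level_variants * input_variants
copies_after = top_level_variants * per_top_level_variants

# Fraction of copies removed by deferring the copy until after filtering.
copy_reduction = 1 - copies_after / copies_before
# Fraction of the input list that remains unused after this change.
unused_fraction = 1 - used_variants / input_variants

print(per_top_level_variants, copies_before, copies_after)
print(f"{copy_reduction:.1%} {unused_fraction:.1%}")
```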

The second changes the get_loop_vars call from O(variant keys * input variants) to O(used vars * input variants); in the case of petsc4py that's ~365 -> 14, a reduction of 96%.

The savings aren't quite what they are for actually reducing the variants list, because there are still some operations on the full list, but it still cuts render time in half.

minrk · 2024-07-01T13:28:01Z

I'm afraid I don't understand the mac failures or how they could be related to this PR. The same tests pass just fine on my mac. Hopefully something transient?

beckermr · 2024-07-02T12:02:37Z

Those mac failures do look unrelated. Let's try to rerun before we dig into them.

beckermr · 2024-07-02T17:29:53Z

@mbargull Can you take a look at this one? I don't see any obvious problems but this work gets a bit into the guts of the code.

minrk · 2024-09-13T17:38:46Z

Anyone have a chance to look at this?

beckermr · 2024-09-17T15:53:19Z

@isuruf @beeankha @kenodegard Can we merge this one for the 24.9 release?

minrk requested a review from a team as a code owner June 28, 2024 11:45

conda-bot added the cla-signed label Jun 28, 2024

discard unused variants before copying metadata

9f5ef06

when variant matrix is large and mostly unused (as in conda-forge), the length of input_variants may be several thousand when only a few are actually used. This causes `get_loop_vars` and `metadata.copy()` to become very expensive.

minrk force-pushed the reduce_variants branch from 9270257 to 9f5ef06 June 28, 2024 11:47

minrk mentioned this pull request Jun 28, 2024

reduce input variants to only _used_ input variants conda-forge/conda-smithy#1968 (open, 1 task)

try reducing with all used vars instead of loop vars

568aed3

should reduce less

minrk force-pushed the reduce_variants branch from aa3dbbb to 568aed3 June 28, 2024 17:04

minrk marked this pull request as draft June 28, 2024 18:55

perf: copy distributed variants list after subsetting

0797c42

vastly reduces the number of copies computed for large variant matrices

minrk changed the title ~~discard unused variants before copying metadata~~ reduce cost of large unused variant matrix Jul 1, 2024

minrk added 2 commits July 1, 2024 11:45

perf: pass used_vars subset to get_loop_vars

8bcbf09

rather than computing all loop vars and then intersecting, only consider relevant keys when computing loop vars reduces get_used_loop_vars from O(n_vars * n_variants) to O(n_used_vars * n_variants)

remove redundant deepcopy of config.variant

d1ba529

config.copy already copies this, no need to do it twice in metadata.copy

minrk force-pushed the reduce_variants branch from f287604 to d1ba529 July 1, 2024 09:46

minrk commented Jul 1, 2024

minrk marked this pull request as ready for review July 1, 2024 10:29

minrk changed the title ~~reduce cost of large unused variant matrix~~ reduce cost of large variant matrix Jul 1, 2024

Merge branch 'main' into reduce_variants

5007189

isuruf reviewed Sep 13, 2024

conda_build/render.py (outdated)

add config.copy_variants method

5f48708

to avoid calling pickle in too many places
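
The thread does not show the method itself. As a rough sketch, assuming it centralizes a pickle round trip used as a deep copy (which the commit message suggests but does not confirm), a standalone helper could look like this; the function below is hypothetical and is not the conda-build method.

```python
import pickle


def copy_variants(variants):
    """Deep-copy a list of variant dicts via a pickle round trip.

    Hypothetical helper: keeping the round trip in one place means
    callers never touch pickle directly.
    """
    return pickle.loads(pickle.dumps(variants, pickle.HIGHEST_PROTOCOL))


variants = [{"python": "3.12", "zip_keys": [["python", "numpy"]]}]
duplicate = copy_variants(variants)
duplicate[0]["zip_keys"][0].append("scipy")

# Nested lists in the original are unaffected by edits to the copy.
print(variants[0]["zip_keys"])
```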

beckermr previously approved these changes Sep 14, 2024

Merge branch 'main' into reduce_variants

0df82cc

beckermr requested a review from isuruf September 17, 2024 11:14

beeankha reviewed Sep 17, 2024

news/5392-variant-copy (outdated)

beeankha dismissed beckermr’s stale review via 8a1f9ca September 17, 2024 17:22

beeankha changed the base branch from main to 24.9.x September 17, 2024 17:45

beeankha previously approved these changes Sep 17, 2024

Update news/5392-variant-copy

644baaf

beeankha dismissed their stale review via 644baaf September 17, 2024 18:05

beeankha force-pushed the reduce_variants branch from 8a1f9ca to 644baaf September 17, 2024 18:05

beeankha previously approved these changes Sep 17, 2024

kenodegard mentioned this pull request Sep 17, 2024

Add benchmark test for render_recipe #5490 (merged, 3 tasks)

Add benchmark test for render_recipe (conda#5490)

311e48b

kenodegard dismissed beeankha’s stale review via 311e48b September 18, 2024 01:34

beeankha approved these changes Sep 18, 2024

beeankha merged commit 1ba5760 into conda:24.9.x Sep 18, 2024
28 checks passed

minrk deleted the reduce_variants branch September 18, 2024 15:01
