Filter testitems from AST before eval to reduce latency #170

Merged
merged 20 commits into main on Sep 9, 2024

Conversation

@nickrobinson251 nickrobinson251 commented Aug 7, 2024

The aim is to make interactive use of runtests() much snappier, specifically to improve time-to-first-test when filtering to specific testitems.

With this PR we now remove from the AST any @testitems that weren't requested by the user (because the user passed name and/or tags and/or their own shouldrun function), before evaluating the code and constructing TestItem objects. Filtering at the AST level should be faster, and so far it looks like it is indeed much faster:
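As a rough sketch of the approach (illustrative only -- filter_testitems_sketch is a hypothetical name, not the PR's actual code -- and assuming testitems appear as top-level @testitem macrocalls with literal-string names):

```julia
# Replace any top-level `@testitem` whose static name doesn't match the
# requested filter with `nothing`, so the testitem is never evaluated.
function filter_testitems_sketch(file_expr::Expr, name_filter::AbstractString)
    for (i, ex) in enumerate(file_expr.args)
        ex isa Expr || continue  # skip LineNumberNodes etc.
        if ex.head === :macrocall && ex.args[1] === Symbol("@testitem")
            # macrocall args are [macro name, line node, testitem name, body...]
            ti_name = ex.args[3]
            # only filterable at the AST level when the name is a literal String
            if ti_name isa String && ti_name != name_filter
                file_expr.args[i] = nothing
            end
        end
    end
    return file_expr
end

code = Meta.parseall("""
    @testitem "keep me" begin end
    @testitem "drop me" begin end
    """)
filtered = filter_testitems_sketch(code, "keep me")
```

Evaluating `nothing` is essentially free (see the eval timings further down this thread), which is what makes replacing, rather than deleting, the filtered-out expressions cheap.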

Here are timings on a synthetic benchmark that calls runtests(...) and filters out all the tests, for a large number of testitems across many test files:
Now

nfiles = 10, ntestitems = 10000
0.137534666 seconds

nfiles = 10, ntestitems = 100000
1.097515417 seconds

nfiles = 10000, ntestitems = 10000
0.868995333 seconds

nfiles = 10000, ntestitems = 100000
1.835276458 seconds

Before (v1.25):

nfiles = 10, ntestitems = 10000
3.724025084 seconds

nfiles = 10, ntestitems = 100000
37.919445459 seconds

nfiles = 10000, ntestitems = 10000
5.182262875 seconds

nfiles = 10000, ntestitems = 100000
46.858825541 seconds
benchmark code

Running $ julia --proj bench/filtering.jl with the file:

# bench/filtering.jl
const DIR = "_gen_"
function gen_test_files(ntestitems, nfiles)
    ntestitems_per_file, overflow = divrem(ntestitems, nfiles)
    rm(DIR; force=true, recursive=true)
    mkpath(DIR)
    filename = ""
    for i in 1:nfiles
        fn = lpad(i, length(string(nfiles)), "0")
        filename = joinpath(DIR, "file_$(fn)_tests.jl")
        open(filename, "a") do io
            for j in 1:ntestitems_per_file
                tn = (i - 1) * ntestitems_per_file + j
                # "$(join(["ajahdfjlkgklwrjg" for _ in 1:1000]))"
                print(io, """
                    @testitem "test number $(tn)" tags=[:x, :y] begin
                        @test 1 == 1
                    end
                    """
                )
            end
        end
    end
    open(filename, "a") do io
        for tn in (ntestitems-overflow+1):ntestitems
            print(io, """
                @testitem "test number $(tn)" tags=[:x, :y] begin
                    @test 1 == 1
                end
                """
            )
        end
    end
    return nothing
end

using BenchmarkTools
using ReTestItems
for nfiles in (10, 10_000)
    for ntestitems in (10_000, 100_000)
        println("nfiles = $nfiles, ntestitems = $ntestitems")
        gen_test_files(ntestitems, nfiles)
        times = ((@timed redirect_stdio(stdout=devnull, stderr=devnull) do
            runtests(DIR; name="foo", tags=:bar);
        end).time for _ in 1:3)
        println(minimum(times), " seconds")
        println()
    end
end
rm(DIR; force=true, recursive=true)
RAI use-case

For the RAI test suite... this shaves off 1 second (about 40% of the time) for all directories where filtering can be done exclusively on the AST:

  • Now: 1.517397 seconds
  • Before: 2.576764 seconds

(running runtests(dirs_with_unit_tests...; name="nopenope", tags=:nope) to filter out all testitems, i.e. run no tests, from these directories)

And for all directories (including those which may require evaling the files):

  • Now: 10.974462 seconds
  • Before: 16.653225 seconds

(running runtests(; name="nopenope", tags=:nope) to filter out all testitems, i.e. run no tests)

joint work with @Drvi

xref #41

Drvi commented Aug 8, 2024

FWIW, here is my very hacky implementation that skips eval, which does show a very significant perf improvement (though it cheats by skipping over @eval and @static blocks, and doesn't handle @test_rels properly). Even with these limitations, I think it proves that successfully avoiding eval while filtering can help a lot (1.8 s down to 0.3 s when I tried it again today).

@nickrobinson251
Collaborator Author

ooo very interesting -- i think the key is filtering the testitems out with nothing rather than an empty expression :()

julia> code1 = Union{Expr,Nothing}[:() for _ in 1:100_000];

julia> code2 = Union{Expr,Nothing}[nothing for _ in 1:100_000];

julia> @time for c in code1
           Core.eval(Main, c)
       end
  3.848924 seconds (3.60 M allocations: 164.779 MiB, 0.62% gc time)

julia> @time for c in code2
           Core.eval(Main, c)
       end
  0.015101 seconds (298.98 k allocations: 4.562 MiB)

i also like the ShouldRun clean-up on your branch!

@nickrobinson251 nickrobinson251 marked this pull request as ready for review August 14, 2024 17:24
@nickrobinson251 nickrobinson251 requested review from NHDaly and Drvi August 14, 2024 17:24
@nickrobinson251 nickrobinson251 changed the title Filter testitems from AST before eval Filter testitems from AST before eval to reduce latency Aug 14, 2024
Drvi commented Aug 22, 2024

I can't seem to replicate the 1-second speedup on RAICode :/

@btime runtests("test", name="aalkbjlhkbkhjvkhfcjgfca")
# PR:     1.533 s (6831030 allocations: 431.75 MiB)
# 1.24.0: 1.633 s (8192533 allocations: 497.06 MiB)

It seems that the remaining overhead is coming from eval of @test_rels; this is the same issue I hit with my prototype.
With these changes (diff), I was able to get quite nice performance:

# diff:  301.330 ms (3016961 allocations: 240.58 MiB)

What do you think? Is it worth special-casing test-rel like this? What would the alternative be?

# custom function that doesn't call `Core.eval(M, expr)` if `expr === nothing`, but
# using `Base.include` means we benefit from improvements made upstream rather than
# having to maintain our own version of that code.
function filter_testitem(f, expr)
Collaborator

I got confused a little because there are two points at which we filter test items -- here at the AST level, and then later at the FilteredVector level, when we filter on the eval'd TestItem. Let's maybe explain here which things need to be expanded, like:

@testitem "name with interpolation $(VERSION)" begin ... end

get filtered later -- and where? Because I think I'll forget how things work again :)

Collaborator Author

ooo, name with interpolation is interesting... i've never thought of us as (officially) supporting that.

i always thought we'd (one day) require that files only have @testitem and @testsetup calls at the top level, and that name and tags must be static.

But yes, you're quite right: things that need to be expanded before filtering (@test_rel, @eval ...) have to be filtered later in the FilteredVector, which is why (one day) i'd like us to not support those things at all and do all filtering on the AST. Will add more comments explaining this.
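To illustrate the distinction (is_static_name is a hypothetical helper, not part of ReTestItems): a literal name appears as a String in the macrocall AST, while an interpolated name parses to an Expr whose value is only known after evaluation, so it can't be matched at the AST level:

```julia
# Hypothetical helper: a testitem name is filterable at the AST level
# only when it is a literal String; interpolation parses to an Expr.
is_static_name(name_arg) = name_arg isa String

# macrocall args are [macro name, line node, testitem name, body]
ex1 = Meta.parse("""@testitem "plain name" begin end""")
ex2 = Meta.parse("""@testitem "name \$(VERSION)" begin end""")

is_static_name(ex1.args[3])  # true: the literal String "plain name"
is_static_name(ex2.args[3])  # false: an Expr(:string, ...) needing eval
```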

@nickrobinson251
Collaborator Author

Is it worth special-casing test-rel like this? What would the alternative be?

i've been reluctant to special-case @test_rel in ReTestItems.jl, since this is an open-source test framework, whereas @test_rel is an ugly RAI-internal hack...

BUT the compromise has instead been to allow any top-level macro call (so as to allow @test_rel) rather than restricting to just @testitem and @testsetup... and that has led to other unintended uses, like this complete mis-use of ReTestItems:

@eval begin
    num_groups = 32
    for group_id in 1:num_groups
        eval(quote
            @testitem "MapOperator all natives - group $($group_id)" begin
            end
        end)
    end
end

So, currently in this PR i stick with the (bad) status quo where we support arbitrary macrocalls at the top level (assuming they create TestItems), and so we still have to keep the FilteredVector, and we pay the latency cost of having to eval those macrocalls (even though the resulting TestItems might all be filtered out).

But we could instead disallow all macro calls except @testitem, @testsetup, and @test_rel (i.e. add a RAI-specific hack here), which as you say would allow us to write something that works on the @test_rel AST...
And we'd have to go edit the RAI codebase to rid it of the evil @eval code too.

idk what's best.

@Drvi Drvi left a comment


I think this is great! A bunch of folks should get a nice latency reduction out of this:)

For RAI though, I think the transition to this new version of ReTestItems will be slightly challenging, since we now disallow top-level @eval, and I think the macro nesting with @test_rel causes some issues for the filtering logic; e.g. now I get these errors from @test_rel:

`tags` keyword must be passed a collection of `Symbol`s, got: $(Expr(:escape, :([:ring2, :integration])))

(I've tweaked the error message a bit to see what we get for tags.)

Lastly, I think for RAI specifically a good chunk of the latency reduction is still on the table, since currently we eval all of the @test_rels. It would be great if we could also try to extract the name and tags from these (many of them do have names and tags, so we should be able to skip eval often), but that can wait for a later PR.

@nickrobinson251
Collaborator Author

we now disallow top-level @eval

i'll make a PR to update the codebase to remove @eval. I think the upfront pain is worth it to stop this mis-use before it spreads further 😛

It would be great if we could also try to extract the name and tags from these (many of them do have names and tags so we we should be able to skip eval often), but that can wait for a later PR.

agreed 👍 -- just want to land that in a follow-up PR as you say 😁

I think the macro nesting with @test_rel cause some issues for the filtering logic

this bit i didn't quite follow -- is it separate from the usage of @eval? How can i repro this?

@nickrobinson251
Collaborator Author

I think the macro nesting with @test_rel cause some issues for the filtering logic

found what was wrong -- thanks for the pointer!

@nickrobinson251 nickrobinson251 merged commit f9cf56c into main Sep 9, 2024
7 checks passed
@nickrobinson251 nickrobinson251 deleted the npr-filter-before-eval branch September 9, 2024 15:57
@nickrobinson251 nickrobinson251 added the perf About improving the performance of the package label Sep 11, 2024