Filter testitems from AST before eval to reduce latency #170

Merged
merged 20 commits into main on Sep 9, 2024

Conversation

@nickrobinson251 nickrobinson251 commented Aug 7, 2024

The aim is to make interactive use of runtests() much snappier, specifically to improve time-to-first-test when filtering to specific testitems.

With this PR we now remove from the AST any @testitems that weren't requested by the user (because the user passed name and/or tags and/or their own shouldrun function), before evaluating the code and constructing TestItem objects. Filtering at the AST level should be faster, and so far it looks like it is indeed much faster:
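As a rough sketch of the approach (illustrative only -- filter_testitems_sketch is a hypothetical name, not the PR's actual code -- and assuming testitems appear as top-level @testitem macrocalls with literal-string names):

```julia
# Replace any top-level `@testitem` whose static name doesn't match the
# requested filter with `nothing`, so the testitem is never evaluated.
function filter_testitems_sketch(file_expr::Expr, name_filter::AbstractString)
    for (i, ex) in enumerate(file_expr.args)
        ex isa Expr || continue  # skip LineNumberNodes etc.
        if ex.head === :macrocall && ex.args[1] === Symbol("@testitem")
            # macrocall args are [macro name, line node, testitem name, body...]
            ti_name = ex.args[3]
            # only filterable at the AST level when the name is a literal String
            if ti_name isa String && ti_name != name_filter
                file_expr.args[i] = nothing
            end
        end
    end
    return file_expr
end

code = Meta.parseall("""
    @testitem "keep me" begin end
    @testitem "drop me" begin end
    """)
filtered = filter_testitems_sketch(code, "keep me")
```

Evaluating `nothing` is essentially free (see the eval timings further down this thread), which is what makes replacing, rather than deleting, the filtered-out expressions cheap.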

Here are timings on a synthetic benchmark that calls runtests(...) and filters out all the tests, for a large number of testitems across many test files:
Now

nfiles = 10, ntestitems = 10000
0.137534666 seconds

nfiles = 10, ntestitems = 100000
1.097515417 seconds

nfiles = 10000, ntestitems = 10000
0.868995333 seconds

nfiles = 10000, ntestitems = 100000
1.835276458 seconds

Before (v1.25):

nfiles = 10, ntestitems = 10000
3.724025084 seconds

nfiles = 10, ntestitems = 100000
37.919445459 seconds

nfiles = 10000, ntestitems = 10000
5.182262875 seconds

nfiles = 10000, ntestitems = 100000
46.858825541 seconds
benchmark code

Running $ julia --proj bench/filtering.jl with the file:

# bench/filtering.jl
const DIR = "_gen_"
function gen_test_files(ntestitems, nfiles)
    ntestitems_per_file, overflow = divrem(ntestitems, nfiles)
    rm(DIR; force=true, recursive=true)
    mkpath(DIR)
    filename = ""
    for i in 1:nfiles
        fn = lpad(i, length(string(nfiles)), "0")
        filename = joinpath(DIR, "file_$(fn)_tests.jl")
        open(filename, "a") do io
            for j in 1:ntestitems_per_file
                tn = (i - 1) * ntestitems_per_file + j
                # "$(join(["ajahdfjlkgklwrjg" for _ in 1:1000]))"
                print(io, """
                    @testitem "test number $(tn)" tags=[:x, :y] begin
                        @test 1 == 1
                    end
                    """
                )
            end
        end
    end
    open(filename, "a") do io
        for tn in (ntestitems-overflow+1):ntestitems
            print(io, """
                @testitem "test number $(tn)" tags=[:x, :y] begin
                    @test 1 == 1
                end
                """
            )
        end
    end
    return nothing
end

using BenchmarkTools
using ReTestItems
for nfiles in (10, 10_000)
    for ntestitems in (10_000, 100_000)
        println("nfiles = $nfiles, ntestitems = $ntestitems")
        gen_test_files(ntestitems, nfiles)
        times = ((@timed redirect_stdio(stdout=devnull, stderr=devnull) do
            runtests(DIR; name="foo", tags=:bar);
        end).time for _ in 1:3)
        println(minimum(times), " seconds")
        println()
    end
end
rm(DIR; force=true, recursive=true)
RAI use-case

For the RAI test suite... this shaves off 1 second (about 40% of the time) for all directories where filtering can be done exclusively on the AST:

  • Now: 1.517397 seconds
  • Before: 2.576764 seconds

(running runtests(dirs_with_unit_tests...; name="nopenope", tags=:nope) to filter out all testitems, i.e. run no tests, from these directories)

And for all directories (including those which may require evaling the files):

  • Now: 10.974462 seconds
  • Before: 16.653225 seconds

(running runtests(; name="nopenope", tags=:nope) to filter out all testitems, i.e. run no tests)

joint work with @Drvi

xref #41

Drvi commented Aug 8, 2024

FWIW, here is my very hacky implementation that skips eval, which does show a very significant perf improvement (though it cheats by skipping over @eval and @static blocks, and doesn't handle @test_rels properly). Even with these limitations, I think it proves that successfully avoiding eval while filtering can help a lot (1.8 s down to 0.3 s when I tried it again today).

@nickrobinson251
Collaborator Author

ooo very interesting -- i think the key is filtering the testitems out with nothing rather than an empty expression :()

julia> code1 = Union{Expr,Nothing}[:() for _ in 1:100_000];

julia> code2 = Union{Expr,Nothing}[nothing for _ in 1:100_000];

julia> @time for c in code1
           Core.eval(Main, c)
       end
  3.848924 seconds (3.60 M allocations: 164.779 MiB, 0.62% gc time)

julia> @time for c in code2
           Core.eval(Main, c)
       end
  0.015101 seconds (298.98 k allocations: 4.562 MiB)

i also like the ShouldRun clean-up on your branch!

@nickrobinson251 nickrobinson251 marked this pull request as ready for review August 14, 2024 17:24
@nickrobinson251 nickrobinson251 requested review from NHDaly and Drvi August 14, 2024 17:24
@nickrobinson251 nickrobinson251 changed the title Filter testitems from AST before eval Filter testitems from AST before eval to reduce latency Aug 14, 2024
Drvi commented Aug 22, 2024

I can't seem to replicate the 1-second speedup on RAICode :/

@btime runtests("test", name="aalkbjlhkbkhjvkhfcjgfca")
# PR:     1.533 s (6831030 allocations: 431.75 MiB)
# 1.24.0: 1.633 s (8192533 allocations: 497.06 MiB)

It seems that the remaining overhead is coming from eval of @test_rels; this is the same issue I hit with my prototype.
With these changes (diff), I was able to get quite nice performance:

# diff:  301.330 ms (3016961 allocations: 240.58 MiB)

What do you think? Is it worth special-casing test-rel like this? What would the alternative be?

# custom function that doesn't call `Core.eval(M, expr)` if `expr === nothing`, but
# using `Base.include` means we benefit from improvements made upstream rather than
# having to maintain our own version of that code.
function filter_testitem(f, expr)
Collaborator

I got confused a little because there are two points at which we filter test items -- here at the AST level, and then later at the FilteredVector level, when we filter on the eval'd TestItem. Let's maybe explain here which things need to be expanded, like:

@testitem "name with interpolation $(VERSION)" begin ... end

get filtered later -- and where? Because I think I'll forget how things work again :)

Collaborator Author

ooo, name with interpolation is interesting... i've never thought of us as (officially) supporting that.

i always thought we'd (one day) require that files only have @testitem and @testsetup calls at the top level, and that name and tags must be static.

But yes, you're quite right: things that need to be expanded before filtering (@test_rel, @eval ...) have to be filtered later in the FilteredVector, which is why (one day) i'd like us to not support those things at all and do all filtering on the AST. Will add more comments explaining this.
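To illustrate the distinction (is_static_name is a hypothetical helper, not part of ReTestItems): a literal name appears as a String in the macrocall AST, while an interpolated name parses to an Expr whose value is only known after evaluation, so it can't be matched at the AST level:

```julia
# Hypothetical helper: a testitem name is filterable at the AST level
# only when it is a literal String; interpolation parses to an Expr.
is_static_name(name_arg) = name_arg isa String

# macrocall args are [macro name, line node, testitem name, body]
ex1 = Meta.parse("""@testitem "plain name" begin end""")
ex2 = Meta.parse("""@testitem "name \$(VERSION)" begin end""")

is_static_name(ex1.args[3])  # true: the literal String "plain name"
is_static_name(ex2.args[3])  # false: an Expr(:string, ...) needing eval
```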

@nickrobinson251
Collaborator Author

Is it worth special-casing test-rel like this? What would the alternative be?

i've been reluctant to special-case @test_rel in ReTestItems.jl, since this is an open-source test framework, whereas @test_rel is an ugly RAI-internal hack...

BUT the compromise has instead been to allow any top-level macro call (so as to allow @test_rel) rather than restricting to just @testitem and @testsetup... and that has led to other unintended uses, like this complete mis-use of ReTestItems:

@eval begin
    num_groups = 32
    for group_id in 1:num_groups
        eval(quote
            @testitem "MapOperator all natives - group $($group_id)" begin
            end
        end)
    end
end

So, currently in this PR i stick with the (bad) status quo where we support arbitrary macrocalls at the top level (assuming they create TestItems), and so we still have to keep the FilteredVector, and we pay the latency cost of having to eval those macrocalls (even though the resulting TestItems might all be filtered out).

But we could instead disallow all macro calls except @testitem, @testsetup, and @test_rel (i.e. add a RAI-specific hack here), which as you say would allow us to write something that works on the @test_rel AST...
And we'd have to go edit the RAI codebase to rid it of the evil @eval code too.

idk what's best.

@Drvi Drvi left a comment


I think this is great! A bunch of folks should get a nice latency reduction out of this:)

For RAI though, I think the transition to this new version of ReTestItems will be slightly challenging, since we now disallow top-level @eval, and I think the macro nesting with @test_rel causes some issues for the filtering logic; e.g. now I get these errors from @test_rel:

`tags` keyword must be passed a collection of `Symbol`s, got: $(Expr(:escape, :([:ring2, :integration])))

(I've tweaked the error message a bit to see what we get for tags.)

Lastly, I think for RAI specifically a good chunk of the latency reduction is still on the table, since currently we eval all of the @test_rels. It would be great if we could also try to extract the name and tags from these (many of them do have names and tags, so we should be able to skip eval often), but that can wait for a later PR.

@nickrobinson251
Collaborator Author

we now disallow top-level @eval

i'll make a PR to update the codebase to remove @eval. I think the upfront pain is worth it to stop this mis-use before it spreads further 😛

It would be great if we could also try to extract the name and tags from these (many of them do have names and tags so we we should be able to skip eval often), but that can wait for a later PR.

agreed 👍 -- just want to land that in a follow-up PR as you say 😁

I think the macro nesting with @test_rel cause some issues for the filtering logic

this bit i didn't quite follow -- is it separate from the usage of @eval? How can i repro this?

@nickrobinson251
Collaborator Author

I think the macro nesting with @test_rel cause some issues for the filtering logic

found what was wrong -- thanks for the pointer!

@nickrobinson251 nickrobinson251 merged commit f9cf56c into main Sep 9, 2024
7 checks passed
@nickrobinson251 nickrobinson251 deleted the npr-filter-before-eval branch September 9, 2024 15:57
@nickrobinson251 nickrobinson251 added the perf About improving the performance of the package label Sep 11, 2024