Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweak bounds check elimination #42692

Merged
merged 7 commits into from
Dec 18, 2021
Merged

Conversation

pchintalapudi
Copy link
Member

@pchintalapudi pchintalapudi commented Oct 18, 2021

This is a combination of two changes:

  1. Move the IRCE pass further down the pass pipeline and add an extra GVN pass
    1. This allows some additional functions to become vectorizable without an inbounds macro.
  2. Change size(Vector) to delegate to arraylen instead of arraysize
    1. This converts gep inbounds {}*, {}** instructions to instead use gep inbounds { i8*, i64, i16, i16, i32 }, { i8*, i64, i16, i16, i32 }* to get the array length.
    2. Without this, LLVM does not see that the two geps are equivalent, and therefore fails to eliminate one of the loads. This has implications on whether or not some functions vectorize as well as overall code size (a useless loop inStatistics._mean(identity, Int[], :) was completely eliminated after this change).

base/array.jl Outdated Show resolved Hide resolved
@vchuravy
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs=":master")

@pchintalapudi
Copy link
Member Author

Test case 1:

function f4!(a, xs)
                       s = a[1]
                      for i in 1:length(xs)
                           s2 = 0
                       for j in 1:length(xs[i])
                           s3 = 0
                           for k in 1:length(xs[i][j])
                               s3 += xs[i][j][k]
                           end
                           s2 += s3
                       end
                          s += s2
                      end
                       a[1] = s
                   end

The function was benchmarked with a Vector{100 x Vector{100 x Vector{100 x random Int}}} as second input.

master

Godbolt: https://godbolt.org/z/Kc8zM6616

julia> @benchmark f4!([0], $(c))
BenchmarkTools.Trial: 8743 samples with 1 evaluation.
 Range (min … max):  499.955 μs …  4.058 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     568.345 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   570.304 μs ± 62.607 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                     ▂ ▃▃  ▁▂▃▁  ▁█▂▂▄▃▃▅   ▁  ▁
  ▄▁▅▁▁▃▅▁▁▃█▁▁▃▄▁▅▁▁▇▆▆▄▅▇▆▅▁█▇▄▄▅▅▄█▆███▇████▇▇█████████▇███ █
  500 μs        Histogram: log(frequency) by time       585 μs <

 Memory estimate: 64 bytes, allocs estimate: 1.

PR

Godbolt: https://godbolt.org/z/vK73fevsM

julia> @benchmark f4!([0], $(c))
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  69.419 μs … 115.779 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     69.430 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   69.558 μs ±   1.019 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▁  ▁                                                        ▁
  ██▃▁█▁▁▁▃▁▁▃▁▁▄▁▁▃▁▁▁▃▃▄▁▁▁▁▁▁▁▁▅▄▁▁▁▄▄▄▄▅▄▄▆▄▄▄▄▄▆▆▆▅▅▄▅▄▄▄ █
  69.4 μs       Histogram: log(frequency) by time        74 μs <

 Memory estimate: 64 bytes, allocs estimate: 1.

base/array.jl Outdated Show resolved Hide resolved
@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: error when preparing/pushing to report repo: failed process: Process(setenv(`git push`; dir="/run/media/system/data/nanosoldier/workdir/NanosoldierReports"), ProcessExited(1)) [1]

Unfortunately, the logs could not be uploaded.

@vtjnash
Copy link
Member

vtjnash commented Oct 19, 2021

Seems a bit noisy, but it is unclear. You might want to re-run those that showed significant deltas: https://github.com/JuliaCI/NanosoldierReports/blob/master/benchmark/by_hash/c3f5381_vs_50fcb03/report.md

@pchintalapudi
Copy link
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

src/aotcompile.cpp Outdated Show resolved Hide resolved
@vchuravy
Copy link
Member

vchuravy commented Dec 5, 2021

Bump! I think we can rebase this now that the Statistics.jl changes are on master

@vchuravy
Copy link
Member

@pchintalapudi can you test if this fixes #43308?

@vchuravy
Copy link
Member

@nanosoldier runtests(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@vchuravy
Copy link
Member

@nanosoldier runtests(["AlphaStableDistributions", "BenchmarkTools", "CMAEvolutionStrategy", "CopEnt", "DiffEqOperators", "FaultDetectionTools", "GPUArrays", "ITensors", "IntervalTrees", "NODAL", "NumericalAlgorithms", "PartialSvdStoch", "Pitaya", "ProgressiveHedging", "QuadEig", "QuantumPropagators", "SLEEFPirates", "SimpleSDMLayers", "SnowyOwl", "StochasticRounding", "Wflow", "Zygote"], vs = ":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run tests: NanosoldierError: failed to upload test log: AWSException: InvalidAccessKeyId -- The AWS Access Key Id you provided does not exist in our records.
HTTP.ExceptionRequest.StatusError(403, "PUT", "/julialang-reports/nanosoldier/pkgeval/by_hash/f65ee36_vs_b3e4341/NumericalAlgorithms.1.8.0-DEV-e225ea776a0.log?x-amz-acl=public-read", HTTP.Messages.Response:
"""
HTTP/1.1 403 Forbidden
x-amz-request-id: NXY9SHDPHBSSBZ63
x-amz-id-2: Ox0mg6Med7cK17AdIZbQK0ghk2M/LsH1/aykBdneySzQ+EDpbYcyOj4rF3kpRFkIG1H2yq3YNP0=
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Wed, 15 Dec 2021 21:36:47 GMT
Server: AmazonS3
Connection: close

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>AKIA4WZGSTHCMSSHRWL6</AWSAccessKeyId><RequestId>NXY9SHDPHBSSBZ63</RequestId><HostId>Ox0mg6Med7cK17AdIZbQK0ghk2M/LsH1/aykBdneySzQ+EDpbYcyOj4rF3kpRFkIG1H2yq3YNP0=</HostId></Error>""")

Logs and partial data can be found here
cc @maleadt

@vchuravy
Copy link
Member

Ah great yet another AWS failure...

@pchintalapudi
Copy link
Member Author

pchintalapudi commented Dec 16, 2021

The difference between gp and gp_inbounds in #43308 should be fixed by e07c2bf, but that requires the addition of an additional DCE pass to eliminate a select undef statement. The finer difference between ge_inbounds and gp_inbounds is due to LLVM unrolling the loop by 16X for gp_inbounds but only 8X for ge_inbounds prior to SLP vectorization, which is something that is not addressed by this PR.

@vchuravy
Copy link
Member

@nanosoldier runtests(["AlphaStableDistributions", "BenchmarkTools", "CMAEvolutionStrategy", "CopEnt", "DiffEqOperators", "FaultDetectionTools", "GPUArrays", "ITensors", "IntervalTrees", "NODAL", "NumericalAlgorithms", "PartialSvdStoch", "Pitaya", "ProgressiveHedging", "QuadEig", "QuantumPropagators", "SLEEFPirates", "SimpleSDMLayers", "SnowyOwl", "StochasticRounding", "Wflow", "Zygote"], vs = ":master")

@nanosoldier
Copy link
Collaborator

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

@vchuravy vchuravy merged commit f0c46b3 into JuliaLang:master Dec 18, 2021
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
* Add GVN and move IRCE to allow more functions to vectorize
* Canonicalize arraysize(Vector, 1) to arraylen(Vector)

Co-authored-by: Prem Chintalapudi <premc@csail.mit.edu>
Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
@pchintalapudi pchintalapudi deleted the pc/arraysize branch March 6, 2022 21:58
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
* Add GVN and move IRCE to allow more functions to vectorize
* Canonicalize arraysize(Vector, 1) to arraylen(Vector)

Co-authored-by: Prem Chintalapudi <premc@csail.mit.edu>
Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants