[Perf] Windows/arm64: Regressions in IfStatements #77984

Closed
performanceautofiler bot opened this issue Nov 3, 2022 · 8 comments
Labels
arch-arm64, area-CodeGen-coreclr, os-windows, tenet-performance, tenet-performance-benchmarks

@performanceautofiler

performanceautofiler bot commented Nov 3, 2022

Run Information

| Architecture | arm64 |
| --- | --- |
| OS | Windows 10.0.19041 |
| Baseline | 7d5efbb9e10b6d8beb91c90cbdefd7360869cece |
| Compare | 0e24ea7c2a0436a8f2bf83e8f5981ec035518b99 |
| Diff | Diff |

Regressions in IfStatements.IfStatements

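| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AndAnd - Duration of single invocation | 50.85 μs | 60.02 μs | 1.18 | 0.00 | False | | | | | |
| AndAndAnd - Duration of single invocation | 48.76 μs | 51.73 μs | 1.06 | 0.00 | False | | | | | |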

Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'IfStatements.IfStatements*'

Histogram

Edge Detector Info

Collection Data

IfStatements.IfStatements.AndAnd


Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 60.016950830140495 > 53.430863129058444.
IsChangePoint: Marked as a change because one of 10/11/2022 1:45:10 PM, 11/1/2022 8:41:52 AM, 11/3/2022 4:18:41 AM falls between 10/25/2022 1:30:53 PM and 11/3/2022 4:18:41 AM.
IsRegressionStdDev: Marked as regression because -701.6727127857075 (T) = (0 -60020.57172055355) / Math.Sqrt((3457.9591975832996 / (39)) + (568.9918365919992 / (7))) is less than -2.0153675744421933 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (39) + (7) - 2, .025) and -0.17980682074412616 = (50873.21980618611 - 60020.57172055355) / 50873.21980618611 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
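
The percentage-based checks in the log above can be reproduced by hand. The following is a minimal sketch, not the autofiler's actual code: the function names are hypothetical, the 5% threshold is taken from the log text, and the t-test part of `IsRegressionStdDev` is deliberately not modelled. The input numbers are the AndAnd values printed in the log.

```python
# Hypothetical helper names; only the arithmetic mirrors the log output.

def relative_change(baseline: float, compare: float) -> float:
    """Return (baseline - compare) / baseline, the form printed in the log.

    A negative value means the compare build is slower than the baseline.
    """
    return (baseline - compare) / baseline


def is_regression_base(baseline: float, compare: float, threshold: float = 0.05) -> bool:
    """Flag a regression when compare is more than `threshold` slower."""
    return relative_change(baseline, compare) < -threshold


def is_regression_windowed(compare: float, window_limit: float) -> bool:
    """Flag a regression when the compare value exceeds the windowed limit."""
    return compare > window_limit


# AndAnd values copied from the log (nanoseconds for the means,
# microseconds for the windowed check).
baseline_ns = 50873.21980618611
compare_ns = 60020.57172055355

change = relative_change(baseline_ns, compare_ns)
print(f"relative change: {change:.4f}")
print("IsRegressionBase:", is_regression_base(baseline_ns, compare_ns))
print("IsRegressionWindowed:", is_regression_windowed(60.016950830140495, 53.430863129058444))
```

The sign convention follows the log: the change is computed relative to the baseline, so a slowdown shows up as a negative number compared against -0.05.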

IfStatements.IfStatements.AndAndAnd

Description of detection logic

IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small.
IsRegressionWindowed: Marked as regression because 51.733831401766004 > 51.209566442757016.
IsChangePoint: Marked as a change because one of 11/1/2022 8:41:52 AM, 11/3/2022 4:18:41 AM falls between 10/25/2022 1:30:53 PM and 11/3/2022 4:18:41 AM.
IsRegressionStdDev: Marked as regression because -578.8484547464244 (T) = (0 -51739.81218834986) / Math.Sqrt((497.56047658656183 / (39)) + (91.36335693339602 / (7))) is less than -2.0153675744421933 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (39) + (7) - 2, .025) and -0.06026238051797967 = (48799.06440052408 - 51739.81218834986) / 48799.06440052408 is less than -0.05.
IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.
IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so.

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@kunalspathak changed the title from "[Perf] Windows/arm64: 5 Regressions on 11/1/2022 1:15:18 PM" to "[Perf] Windows/arm64: Regressions in IfStatements" on Nov 7, 2022
@kunalspathak transferred this issue from dotnet/perf-autofiling-issues on Nov 7, 2022
@kunalspathak added the arch-arm64, os-windows, tenet-performance-benchmarks, tenet-build-performance, and tenet-performance labels and removed the tenet-build-performance label on Nov 7, 2022
@kunalspathak
Member

#73472 (comment)
@a74nh

@kunalspathak reopened this on Nov 7, 2022
@ghost added the untriaged label on Nov 7, 2022
@jeffschwMSFT added the area-CodeGen-coreclr label on Nov 8, 2022
@JulieLeeMSFT removed the untriaged label on Nov 8, 2022
@JulieLeeMSFT added this to the 8.0.0 milestone on Nov 8, 2022
@EgorBo
Member

EgorBo commented Nov 10, 2022

Same regression on Ampere arm64 Ubuntu - dotnet/perf-autofiling-issues#9668

@a74nh
Contributor

a74nh commented Nov 11, 2022

I'm seeing the same slowdown.

Without if conversion (≈50 μs):

        12000003          and     w3, w0, #1
        12000024          and     w4, w1, #1
        2A040063          orr     w3, w3, w4
        12000044          and     w4, w2, #1
        2A040063          orr     w3, w3, w4
        35000043          cbnz    w3, G_M12418_IG04
        528000A0          mov     w0, #5

With if conversion (≈60 μs):

        12000003          and     w3, w0, #1
        12000024          and     w4, w1, #1
        2A040063          orr     w3, w3, w4
        35000083          cbnz    w3, G_M12418_IG04
        528000A3          mov     w3, #5
        7200005F          tst     w2, #1
        1A800060          csel    w0, w3, w0, eq

But with #77728 (≈43 μs):

        12000003          and     w3, w0, #1
        12000024          and     w4, w1, #1
        2A040063          orr     w3, w3, w4
        12000044          and     w4, w2, #1
        2A040063          orr     w3, w3, w4
        528000A4          mov     w4, #5
        7100007F          cmp     w3, #0
        1A800080          csel    w0, w4, w0, eq

I think this is because with the new patch, the if conversion happens after optimise bools. I could pull that part out of 77728 and submit it separately, or we could just wait for 77728 to get merged?
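
To make the difference between the three listings easier to follow, here is a rough Python model of each sequence. It is a sketch under stated assumptions: the listings are truncated, so it assumes `G_M12418_IG04` is simply the join point after the assignment and that `w0` holds the result. The function names are made up for illustration. All three compute the same value; they differ only in how many conditional branches remain.

```python
def without_if_conversion(w0: int, w1: int, w2: int) -> int:
    """First listing: three tests merged, one conditional branch."""
    w3 = w0 & 1          # and  w3, w0, #1
    w4 = w1 & 1          # and  w4, w1, #1
    w3 |= w4             # orr  w3, w3, w4
    w4 = w2 & 1          # and  w4, w2, #1
    w3 |= w4             # orr  w3, w3, w4
    if w3 == 0:          # cbnz w3, IG04  (skips the mov when w3 != 0)
        w0 = 5           # mov  w0, #5
    return w0


def with_if_conversion(w0: int, w1: int, w2: int) -> int:
    """Second listing: one conditional branch plus a csel."""
    w3 = w0 & 1          # and  w3, w0, #1
    w4 = w1 & 1          # and  w4, w1, #1
    w3 |= w4             # orr  w3, w3, w4
    if w3 == 0:          # cbnz w3, IG04  (skips the csel when w3 != 0)
        w3 = 5           # mov  w3, #5
        if (w2 & 1) == 0:   # tst w2, #1 ; csel w0, w3, w0, eq
            w0 = w3
    return w0


def with_if_conversion_moved_later(w0: int, w1: int, w2: int) -> int:
    """Third listing: no conditional branch, a single csel."""
    w3 = w0 & 1          # and  w3, w0, #1
    w4 = w1 & 1          # and  w4, w1, #1
    w3 |= w4             # orr  w3, w3, w4
    w4 = w2 & 1          # and  w4, w2, #1
    w3 |= w4             # orr  w3, w3, w4
    w4 = 5               # mov  w4, #5
    return w4 if w3 == 0 else w0   # cmp w3, #0 ; csel w0, w4, w0, eq


# Quick equivalence check over a small input range.
for a in range(4):
    for b in range(4):
        for c in range(4):
            results = {
                without_if_conversion(a, b, c),
                with_if_conversion(a, b, c),
                with_if_conversion_moved_later(a, b, c),
            }
            assert len(results) == 1
```

Under these assumptions the second listing pays for both a branch and a select, which fits it being the slowest of the three measurements above.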

@a74nh
Contributor

a74nh commented Nov 11, 2022

> I think this is because with the new patch, the if conversion happens after optimise bools

... Yes, it is.

The IR is a chain of JTRUEs.

  • Without If conversion: Optimize bools turns the sequence into a single JTRUE statement.

  • With If conversion: If conversion converts the final JTRUE and ignores the others. Then optimize bools takes the remaining JTRUEs and combines them, leaving one branch plus a conditional select.

  • With the If conversion moved later: Optimize bools turns the sequence into a single JTRUE statement. Then If conversion converts this single block.
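
The three orderings above can be illustrated with a toy model. This is only a sketch and not the JIT's implementation: `ToyIR`, `optimize_bools` and `if_convert` are hypothetical stand-ins that treat the IR as a list of JTRUE conditions plus an optional conditional select.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyIR:
    """Toy stand-in for a method: a chain of JTRUE conditions and at most one select."""
    jtrues: List[str] = field(default_factory=list)
    select: Optional[str] = None


def optimize_bools(ir: ToyIR) -> ToyIR:
    """Merge a chain of JTRUE conditions into a single JTRUE."""
    if len(ir.jtrues) > 1:
        return ToyIR([" | ".join(ir.jtrues)], ir.select)
    return ToyIR(list(ir.jtrues), ir.select)


def if_convert(ir: ToyIR) -> ToyIR:
    """Turn only the final JTRUE into a conditional select, ignoring the others."""
    if not ir.jtrues or ir.select is not None:
        return ToyIR(list(ir.jtrues), ir.select)
    *rest, last = ir.jtrues
    return ToyIR(rest, last)


chain = ToyIR(["a", "b", "c"])

no_if_conversion = optimize_bools(chain)
if_conversion_first = optimize_bools(if_convert(chain))
if_conversion_later = if_convert(optimize_bools(chain))

for name, ir in [
    ("without if conversion", no_if_conversion),
    ("if conversion first", if_conversion_first),
    ("if conversion moved later", if_conversion_later),
]:
    print(f"{name}: branches={ir.jtrues} select={ir.select}")
```

In this model, running if conversion first leaves one merged branch and one select, while running it after optimize bools leaves no branch at all.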

@a74nh
Contributor

a74nh commented Nov 30, 2022

Hoping this should be fixed now that #77728 is merged

@a74nh
Contributor

a74nh commented Dec 5, 2022

Not sure how to link the latest graphs here. But they drop quite nicely with the latest HEAD:

AndAnd
50.85 μs -> 60.02 μs -> 41.39 μs

AndAndAnd
48.76 μs -> 51.73 μs -> 46.22 μs
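
For a quick sense of scale, the ratios behind those numbers can be computed as below. This is just arithmetic on the three measurements quoted above; the variable and function names are made up for the example.

```python
# (baseline, regressed, latest HEAD) in microseconds, copied from the comment above.
measurements_us = {
    "AndAnd": (50.85, 60.02, 41.39),
    "AndAndAnd": (48.76, 51.73, 46.22),
}


def ratios(baseline: float, regressed: float, latest: float) -> tuple:
    """Return (regressed / baseline, latest / baseline)."""
    return regressed / baseline, latest / baseline


for name, (baseline, regressed, latest) in measurements_us.items():
    regressed_ratio, latest_ratio = ratios(baseline, regressed, latest)
    print(f"{name}: regressed {regressed_ratio:.2f}x baseline, latest {latest_ratio:.2f}x baseline")
```

A ratio below 1.00 for the latest build means it is faster than the original baseline, not just back to parity.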

@kunalspathak - should be ok to close this now.

@jakobbotsch
Member

Closing since this was fixed by #77728.

@ghost locked as resolved and limited conversation to collaborators on Mar 3, 2023