Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: SSA performance regression on polygon code #15532

Open
nkovacs opened this issue May 3, 2016 · 4 comments
Open

cmd/compile: SSA performance regression on polygon code #15532

nkovacs opened this issue May 3, 2016 · 4 comments
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@nkovacs
Copy link

nkovacs commented May 3, 2016

  1. What version of Go are you using (go version)?
    go 1.6.2 and devel +15f7a66
  2. What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/nkovacs/progs/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"
  1. What did you do?

clone this repo: https://github.com/nkovacs/polygonperf
and run go test -cpu x -bench . with go 1.6.2 and devel +15f7a66

Results for BenchmarkContains (ns/op) on a Core 2 Q6600:

master 4 cpu 1.6.2 4 cpu master 1 cpu 1.6.2 1 cpu master 2 cpu 1.6.2 2 cpu
370 285 297 282 334 264
378 295 297 284 325 264
374 304 299 283 367 235
364 295 298 411 316 251
386 285 298 278 353 301
375 272 298 277 328 254
374 305 297 279 351 260
379 318 298 394 333 241
374 289 298 310 361 247
380 296 298 272 346 247
375.4 294.4 297.8 307 341.4 256.4

Results for BenchmarkStructContains (ns/op) on a Core 2 Q6600:

master 4 cpu 1.6.2 4 cpu master 1 cpu 1.6.2 1 cpu master 2 cpu 1.6.2 2 cpu
390 280 297 408 353 327
367 299 441 283 357 279
387 282 297 283 339 277
372 314 444 413 347 250
371 307 296 298 339 264
381 284 296 278 340 281
371 291 297 288 359 280
348 298 435 278 351 288
343 295 300 411 338 274
394 304 297 397 364 302
372.4 295.4 340 333.7 348.7 282.2

(last line is average)

I've seen a similar 30% increase in ns/op on an AMD Athlon II X2 270, but on that CPU the 1 cpu benchmark had the same result as the 2 cpu benchmark.

On the two more modern Intel CPUs I briefly tested, this simple polygon does not show a difference between master and 1.6.2. I added a second polygon (BenchmarkContains2 and BenchmarkStructContains2) that does show a difference, with 1.6.2 again being faster. On the Q6600, go 1.6.2 performs twice as fast in these benchmarks, on a Xeon server, go 1.6.2 is about 100-200 ns/op faster.

@josharian josharian changed the title performance regression cmd/compile: SSA performance regression on polygon code May 3, 2016
@randall77
Copy link
Contributor

Looks like we're making unnecessary copies as a leftover from inlining. Here's a small repro:

func f(a [2]float64) float64 {
    return a[0] + a[1]
}

func g(a [2]float64) float64 {
    return f(a)
}

f looks fine, but g has an extra copy operation:

        0x0004 00004 (/home/khr/go/tmp4.go:8)   MOVUPS  "".a+24(FP), X0
        0x0009 00009 (/home/khr/go/tmp4.go:8)   MOVUPS  X0, "".a(SP)
        0x000d 00013 (/home/khr/go/tmp4.go:8)   MOVSD   "".a(SP), X0
        0x0012 00018 (/home/khr/go/tmp4.go:8)   MOVSD   "".a+8(SP), X1
        0x0018 00024 (/home/khr/go/tmp4.go:8)   ADDSD   X1, X0
        0x001c 00028 (/home/khr/go/tmp4.go:8)   MOVSD   X0, "".~r1+40(FP)

This copy is not needed. The old compiler can get rid of the copy, but SSA doesn't.

I don't see any obvious fix. I will ponder.

@bradfitz bradfitz added this to the Go1.7Maybe milestone May 4, 2016
@randall77
Copy link
Contributor

This isn't going to get fixed for 1.7, too late.

As a workaround, you can replace

type PolygonCoordinates [][][2]float64

with

type PolygonCoordinates [][]struct{x,y float64}

SSA does better with structs than arrays.

We'll look again into fixing this for 1.8.

@randall77 randall77 modified the milestones: Go1.8, Go1.7Maybe May 9, 2016
@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 11, 2016
@rsc
Copy link
Contributor

rsc commented Oct 21, 2016

Basically the same issue as #15925.

@rsc rsc modified the milestones: Go1.9Early, Go1.8 Oct 21, 2016
@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.9Early May 3, 2017
@bradfitz bradfitz added early-in-cycle A change that should be done early in the 3 month dev cycle. and removed early-in-cycle A change that should be done early in the 3 month dev cycle. labels Jun 14, 2017
@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.11 Nov 28, 2017
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jun 30, 2018
@randall77
Copy link
Contributor

Punting to 1.13, too late for anything major in 1.12.

@randall77 randall77 modified the milestones: Go1.12, Go1.13 Dec 12, 2018
@randall77 randall77 modified the milestones: Go1.13, Go1.14 May 6, 2019
@rsc rsc removed this from the Go1.14 milestone Oct 9, 2019
@rsc rsc added this to the Backlog milestone Oct 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

6 participants