chore: preallocate slices #1842

estensen · 2022-10-30T11:56:04Z

When we know the final size of a slice, preallocate it so that append does not have to reallocate as the slice grows.
Assigning directly is slightly faster than using append.
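A minimal sketch of the kind of change this PR makes, for readers who have not used the prealloc linter. The function names `upperGrow` and `upperPrealloc` are hypothetical and do not come from the go-libp2p code base.

```go
package main

import (
	"fmt"
	"strings"
)

// upperGrow builds its result without a capacity hint, so append may have to
// reallocate and copy the backing array several times as the slice grows.
func upperGrow(in []string) []string {
	var out []string
	for _, s := range in {
		out = append(out, strings.ToUpper(s))
	}
	return out
}

// upperPrealloc reserves capacity for len(in) elements up front, so the
// appends below never need to grow the backing array.
func upperPrealloc(in []string) []string {
	out := make([]string, 0, len(in))
	for _, s := range in {
		out = append(out, strings.ToUpper(s))
	}
	return out
}

func main() {
	in := []string{"a", "b", "c", "d", "e"}
	g, p := upperGrow(in), upperPrealloc(in)
	// The capacity of g depends on append's growth policy; the capacity of p
	// is exactly the value passed to make.
	fmt.Println(len(g), cap(g))
	fmt.Println(len(p), cap(p))
}
```

Both functions return the same elements; only the number of intermediate allocations differs.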

MarcoPolo · 2022-10-30T12:04:55Z

Thanks! Did you use a tool to find these spots or something else?

estensen · 2022-10-30T12:19:37Z

yeah, repro by running:
golangci-lint run ./... -E prealloc --disable-all

MarcoPolo · 2022-11-03T23:18:14Z

p2p/protocol/circuitv2/util/pbconv.go

-	if len(pi.Addrs) > 0 {
-		addrs = make([][]byte, 0, len(pi.Addrs))
-	}
+	addrs := make([][]byte, len(pi.Addrs))


Let’s keep the old behaviour of having this be nil. In this case I don’t think this matters, but generally we can’t be sure if callers check this.
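To illustrate why the nil case can matter to callers, here is a small sketch. The helpers `addrsKeepNil` and `addrsAlwaysAlloc` are hypothetical stand-ins for the two versions of the conversion, not the actual pbconv.go code. A nil slice and an empty non-nil slice both have length zero, but they differ under a `== nil` check and when encoded with encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// addrsKeepNil mirrors the old behaviour: the result stays nil when there
// is nothing to convert.
func addrsKeepNil(in []string) [][]byte {
	var out [][]byte
	if len(in) > 0 {
		out = make([][]byte, 0, len(in))
	}
	for _, s := range in {
		out = append(out, []byte(s))
	}
	return out
}

// addrsAlwaysAlloc mirrors the proposed behaviour: make always returns a
// non-nil slice, even when len(in) is zero.
func addrsAlwaysAlloc(in []string) [][]byte {
	out := make([][]byte, 0, len(in))
	for _, s := range in {
		out = append(out, []byte(s))
	}
	return out
}

func main() {
	a, b := addrsKeepNil(nil), addrsAlwaysAlloc(nil)
	fmt.Println(a == nil, b == nil)

	// encoding/json encodes a nil slice as null and an empty slice as [].
	ja, _ := json.Marshal(a)
	jb, _ := json.Marshal(b)
	fmt.Println(string(ja), string(jb))
}
```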

marten-seemann

I don't understand why you're replacing appends with assignments to a specific index. append is the nicer way to do this, please don't change that.

core/peer/addrinfo.go

MarcoPolo · 2022-11-03T23:45:04Z

I don't understand why you're replacing appends with assignments to a specific index. append is the nicer way to do this, please don't change that.

I don’t have a preference, but assignments are technically faster. That said, I don’t think this matters here.

marten-seemann · 2022-11-05T09:28:59Z

I don't understand why you're replacing appends with assignments to a specific index. append is the nicer way to do this, please don't change that.

I don’t have a preference, but assignments are technically faster. That said, I don’t think this matters here.

I'd expect it to compile to exactly the same code. I ran a benchmark, and the two perform exactly the same. If at all, appending is slightly faster, but probably not statistically significant.

goos: darwin
goarch: arm64
pkg: github.com/libp2p/go-libp2p/p2p/protocol/holepunch
BenchmarkAppend-10      24834030                48.23 ns/op
BenchmarkAssign-10      24644346                48.32 ns/op

const l = 128

func BenchmarkAppend(b *testing.B) {
	a := make([]int, 0, l)
	for i := 0; i < b.N; i++ {
		a = a[:0]
		for j := 0; j < l; j++ {
			a = append(a, j)
		}
	}
}

func BenchmarkAssign(b *testing.B) {
	a := make([]int, l)
	for i := 0; i < b.N; i++ {
		for j := 0; j < l; j++ {
			a[j] = j
		}
	}
}

The reason people tend to prefer append is that it's safer: If you get your length calculation wrong, all that happens is that you do an additional alloc. If you get your length calculation wrong and do an assignment, your process will panic.
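The safety argument above can be sketched as follows. The helpers `fillAppend` and `fillAssign` are hypothetical, and the `hint` parameter stands for a length calculation that may be wrong. The recover in `fillAssign` is there only so the example can report the panic instead of crashing.

```go
package main

import "fmt"

// fillAppend tolerates a wrong hint: if hint is too small, append grows the
// backing array at the cost of an extra allocation.
func fillAppend(hint, n int) []int {
	out := make([]int, 0, hint)
	for i := 0; i < n; i++ {
		out = append(out, i)
	}
	return out
}

// fillAssign does not tolerate a wrong hint: writing past the length is an
// index-out-of-range panic, which is recovered here and returned as an error.
func fillAssign(hint, n int) (out []int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	out = make([]int, hint)
	for i := 0; i < n; i++ {
		out[i] = i
	}
	return out, nil
}

func main() {
	// The hint is one element short in both calls.
	fmt.Println(len(fillAppend(3, 4)))
	_, err := fillAssign(3, 4)
	fmt.Println(err != nil)
}
```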

estensen · 2022-11-06T10:21:00Z

I've always gotten better perf from assignment than appending. Same code on my machine:

goos: darwin
goarch: amd64
pkg: github.com/estensen/benchmarks/tmp
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkAppend-12    	15084253	        68.56 ns/op
BenchmarkAssign-12    	17883320	        66.12 ns/op
PASS
ok  	github.com/estensen/benchmarks/tmp	2.536s

But I do agree that it is safer to use append. I'll revert it.

p2p/net/connmgr/connmgr_test.go

p2p/net/swarm/swarm_listen.go

p2p/protocol/circuitv2/relay/relay.go

p2p/protocol/internal/circuitv1-deprecated/util.go

marten-seemann · 2022-11-06T12:48:31Z

Thank you!

MarcoPolo · 2022-11-06T16:56:47Z

Let me preface this by saying this doesn't matter at all :)

I'd expect it to compile to exactly the same code.

I went down a rabbit hole here because this doesn't make sense to me. Append has to do more work than an assignment (which should be a single instruction), so it should be slower. Sure enough, if we load the code into godbolt, we see that the append has one more compare and mov than the assignment should have. I say "should have" because this benchmark is actually flawed: the compiler optimizes the assignment away and makes it the same as a noop (you don't see a MOVQ in the asm output). To fix this you actually should do something like this. This optimization makes it a bit more surprising that append (an extra cmp and 2 movs) is as fast as a noop, but then again CPUs are pretty good at branch prediction, pipelining, and parallelizing, especially in this very simple case.
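The linked fix is not preserved in this thread. One common way to keep the compiler from discarding the stores, shown here as an assumption and not as the code MarcoPolo actually used, is to publish the slice to a package-level variable after the loop. The names `sink`, `BenchmarkAppendSink`, and `BenchmarkAssignSink` are hypothetical. Whether a given Go version elides the stores without the sink is not guaranteed, so it is worth checking the generated assembly.

```go
package main

import (
	"fmt"
	"testing"
)

const l = 128

// sink is a package-level variable. Storing the slice here makes the writes
// observable outside the function, so they cannot be treated as dead stores.
var sink []int

func BenchmarkAppendSink(b *testing.B) {
	a := make([]int, 0, l)
	for i := 0; i < b.N; i++ {
		a = a[:0]
		for j := 0; j < l; j++ {
			a = append(a, j)
		}
	}
	sink = a
}

func BenchmarkAssignSink(b *testing.B) {
	a := make([]int, l)
	for i := 0; i < b.N; i++ {
		for j := 0; j < l; j++ {
			a[j] = j
		}
	}
	sink = a
}

func main() {
	// testing.Benchmark lets the functions run outside of "go test".
	fmt.Println("append:", testing.Benchmark(BenchmarkAppendSink))
	fmt.Println("assign:", testing.Benchmark(BenchmarkAssignSink))
}
```

In a real repository these functions would live in a `_test.go` file and be run with `go test -bench .`; the `main` function is only there to keep the sketch self-contained.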

Also I'm impressed with how well the compiler optimized the append loop. It really keeps the hot path very small. If we run without compiler optimizations (gcflags=-N) we also see the expected speed diffs.

Results from my tests here: https://gist.github.com/MarcoPolo/18b39366c6f57270059d0ef03c29351b#file-results-md

Preallocate slices

367613b

MarcoPolo self-requested a review October 30, 2022 12:05

MarcoPolo reviewed Nov 3, 2022

marten-seemann requested changes Nov 3, 2022

estensen added 2 commits November 6, 2022 11:23

Move slice allocs to right before they're used

99a8148

Revert slice assignments to append

de2f46c

estensen requested a review from marten-seemann November 6, 2022 10:32

marten-seemann requested changes Nov 6, 2022

estensen added 2 commits November 6, 2022 11:52

Don't preallocate for tests or deprecated code

a654074

Don't preallocate too much

48b1b04

marten-seemann approved these changes Nov 6, 2022

marten-seemann changed the title ~~Preallocate slices~~ chore: preallocate slices Nov 6, 2022

marten-seemann merged commit 21dc42b into libp2p:master Nov 6, 2022

estensen deleted the prealloc branch November 6, 2022 12:53

marten-seemann mentioned this pull request Nov 8, 2022

Align struct fields to use less memory #1856

Closed
