Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lookup methods to linker.Symbols; improve performance/reduce allocations #286

Merged
merged 3 commits into from
Apr 22, 2024

Conversation

jhump
Copy link
Member

@jhump jhump commented Apr 22, 2024

This adds a lookup methods, which can be used to improve the efficiency of a resolver that is based on a linker.Files: instead of O(n) to query each file for an item, we can look the item up from the symbol table and directly determine the file it is in.

To further aid efficiency, this adds some benchmarks and then optimizes the Symbols.getPackage function so both symbol table construction and lookup are more efficient. In particular, instead of strings.Split, which must allocate a slice, we now use a new nameEnumerator type, which can be used to iterate through name components without allocation.

Benchmark improvements below.

before

BenchmarkSymbols-10    	 2,073,348	       568.7 ns/op	     240 B/op	       7 allocs/op

after

BenchmarkSymbols-10    	 3,570,336	       324.1 ns/op	       0 B/op	       0 allocs/op

linker/symbols.go Outdated Show resolved Hide resolved
linker/symbols.go Show resolved Hide resolved
@jhump
Copy link
Member Author

jhump commented Apr 22, 2024

Could the lookup be reversed? Is it needed to go from a -> a.b -> a.b.c and not just lookup a.b.c directly?

Maybe, but certainly not with the existing data structure: it is a trie, so you must traverse the path to find the node. There is no entry in the ancestor children maps for a.b.c; it only appears in its parent package, a.b.

So to do something like that, we'd need to change the structure and add some alternate index. I can look into it and see how that changes performance. It makes locking a little more complicated: the trie is thread-safe because the compiler is inherently multi-threaded. It's structured how it is to allow concurrent writers to the symbol table (as long as they are recording symbols for different packages). An alternate index would require separate synchronization.

@jhump jhump merged commit 792268a into main Apr 22, 2024
8 checks passed
@jhump jhump deleted the jh/symbol-lookup branch April 22, 2024 15:38
jhump added a commit that referenced this pull request Apr 22, 2024
After measuring the impact of #286, #287, and #290 and seeing it to be
too modest. I decided to use a memory profiler, and it found "the good
stuff".

These changes had the largest impact on allocations and performance.
When linking inputs that come from descriptor protos (as opposed to
inputs that are compiled from sources and have ASTs), this resulted in
a 23% reduction in latency and 70% reduction in allocations.

This change features the following improvements:
1. `ast.NoSourceNode` now has a pointer receiver, so wrapping one in an
`ast.Node` interface value doesn't incur an allocation to put the value
on the heap. This also updates `parser.ParseResult` to refer to a single
`*ast.NoSourceNode` when it has no AST, instead of allocating one in
each call to get a node value. The `NoSourceNode`'s underlying type is
now `ast.FileInfo` so that it can be allocation-free, even for the
`NodeInfo` method (which previously was allocating a new `FileInfo` each
time).
3. Don't allocate a slice to hold the set of checked files for each
element being resolved. Instead, we allocate a single slice up front,
and re-use that throughout.
4. Don't pro-actively allocate strings that only are used for error
messages; instead defer construction of the change to the construction
of the error.
kralicky pushed a commit to kralicky/protocompile that referenced this pull request May 19, 2024
…cations (bufbuild#286)

This adds a lookup methods, which can be used to improve the efficiency
of a resolver that is based on a `linker.Files`: instead of O(n) to
query each file for an item, we can look the item up from the symbol
table and directly determine the file it is in.

To further aid efficiency, this adds some benchmarks and then optimizes
the `Symbols.getPackage` function so both symbol table construction and
lookup are more efficient. In particular, instead of `strings.Split`,
which must allocate a slice, we now use a new `nameEnumerator` type,
which can be used to iterate through name components without allocation.

(cherry picked from commit 792268a)
kralicky pushed a commit to kralicky/protocompile that referenced this pull request May 19, 2024
After measuring the impact of bufbuild#286, bufbuild#287, and bufbuild#290 and seeing it to be
too modest. I decided to use a memory profiler, and it found "the good
stuff".

These changes had the largest impact on allocations and performance.
When linking inputs that come from descriptor protos (as opposed to
inputs that are compiled from sources and have ASTs), this resulted in
a 23% reduction in latency and 70% reduction in allocations.

This change features the following improvements:
1. `ast.NoSourceNode` now has a pointer receiver, so wrapping one in an
`ast.Node` interface value doesn't incur an allocation to put the value
on the heap. This also updates `parser.ParseResult` to refer to a single
`*ast.NoSourceNode` when it has no AST, instead of allocating one in
each call to get a node value. The `NoSourceNode`'s underlying type is
now `ast.FileInfo` so that it can be allocation-free, even for the
`NodeInfo` method (which previously was allocating a new `FileInfo` each
time).
3. Don't allocate a slice to hold the set of checked files for each
element being resolved. Instead, we allocate a single slice up front,
and re-use that throughout.
4. Don't pro-actively allocate strings that only are used for error
messages; instead defer construction of the change to the construction
of the error.

(cherry picked from commit 016b009)
kralicky pushed a commit to kralicky/protocompile that referenced this pull request Jun 8, 2024
…cations (bufbuild#286)

This adds a lookup methods, which can be used to improve the efficiency
of a resolver that is based on a `linker.Files`: instead of O(n) to
query each file for an item, we can look the item up from the symbol
table and directly determine the file it is in.

To further aid efficiency, this adds some benchmarks and then optimizes
the `Symbols.getPackage` function so both symbol table construction and
lookup are more efficient. In particular, instead of `strings.Split`,
which must allocate a slice, we now use a new `nameEnumerator` type,
which can be used to iterate through name components without allocation.

(cherry picked from commit 792268a)
kralicky pushed a commit to kralicky/protocompile that referenced this pull request Jun 8, 2024
After measuring the impact of bufbuild#286, bufbuild#287, and bufbuild#290 and seeing it to be
too modest. I decided to use a memory profiler, and it found "the good
stuff".

These changes had the largest impact on allocations and performance.
When linking inputs that come from descriptor protos (as opposed to
inputs that are compiled from sources and have ASTs), this resulted in
a 23% reduction in latency and 70% reduction in allocations.

This change features the following improvements:
1. `ast.NoSourceNode` now has a pointer receiver, so wrapping one in an
`ast.Node` interface value doesn't incur an allocation to put the value
on the heap. This also updates `parser.ParseResult` to refer to a single
`*ast.NoSourceNode` when it has no AST, instead of allocating one in
each call to get a node value. The `NoSourceNode`'s underlying type is
now `ast.FileInfo` so that it can be allocation-free, even for the
`NodeInfo` method (which previously was allocating a new `FileInfo` each
time).
3. Don't allocate a slice to hold the set of checked files for each
element being resolved. Instead, we allocate a single slice up front,
and re-use that throughout.
4. Don't pro-actively allocate strings that only are used for error
messages; instead defer construction of the change to the construction
of the error.

(cherry picked from commit 016b009)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants