Optimizations for large builds #37
Merged
Conversation
On a snapshot of editor build: --stop: 4.8s (5834.4MB alloc), --analyze: 9.9s (7261.7MB alloc).
…ing and name deduplication happens during --stop, not during --analyze. On a snapshot of Unity editor build: --stop: 4.8s 5834MB -> 9.5s 7260MB; --analyze: 9.9s 7262MB -> 4.5s 1270MB; data file: 1.01GB -> 268MB
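The name deduplication mentioned here can be pictured as string interning: each distinct name is stored once and events refer to it by a small integer index. A minimal sketch of the idea, where `NameTable` and `Intern` are illustrative names rather than the tool's actual types:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Minimal string-interning table: each distinct name is stored once
// and callers get back a small integer index to store instead of the
// full string, so duplicate names across thousands of trace files
// cost almost nothing.
struct NameTable {
    std::vector<std::string> names;                // index -> name
    std::unordered_map<std::string, int> toIndex;  // name -> index

    int Intern(const std::string& name) {
        auto it = toIndex.find(name);
        if (it != toIndex.end())
            return it->second;                     // seen before: reuse index
        int idx = (int)names.size();
        names.push_back(name);
        toIndex.emplace(name, idx);
        return idx;
    }
};
```

Later commits in this PR swap the map implementation (ska::bytell_hash_map) and the hash function, which is why the container choice shows up so prominently in the --stop timings.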
…of memory too: On a snapshot of Unity editor build: --stop: 9.5s 7260MB -> 9.5s 6968MB; --analyze: 4.5s 1270MB (unchanged); data file: 268MB (unchanged)
On a snapshot of Unity editor build: --stop: 9.5s 6968MB -> 9.3s 6975MB
…tell_hash_map (https://probablydance.com/2018/05/28/a-new-fast-hash-table-in-response-to-googles-new-fast-hash-table/) On a snapshot of Unity editor build: --stop: 9.3s 6975MB -> 8.2s 6984MB
On a snapshot of Unity editor build: --stop: 8.2s 6984MB -> 7.8s 6071MB
On a snapshot of Unity editor build: --stop: 7.8s 6071MB -> 6.5s 6071MB
On a snapshot of Unity editor build: --analyze: 4.5s 1270MB -> 3.4s 1519MB
On a snapshot of Unity editor build: --analyze: 3.4s 1519MB -> 3.3s 1437MB
ben-craig reviewed on Mar 1, 2020:
aras-p doesn't optimize runtime code.
aras-p doesn't optimize build throughput.
aras-p optimizes the build throughput profiler.
I'm now expecting performance patches for whatever perf visualizer you use.
…ry usage on large codebases is a problem, and is not thread safe either. On a snapshot of Unity editor build: --stop: 6.5s 6071MB -> 7.0s 493MB; --analyze: 3.3s 1437MB -> 4.7s 779MB
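The memory-use and thread-safety complaint here matches the "bump a pointer, never free" allocator described in the PR description. A minimal sketch of such an arena (with `BumpAllocator` as an illustrative name) shows both properties: allocation is a plain pointer increment, nothing is ever released until the whole arena dies, and the unsynchronized bump is unsafe across threads:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// "Bump a pointer, never free" arena: allocation is a pointer
// increment, but memory only ever grows -- nothing is returned until
// the whole arena is destroyed. The bare offset bump is also not
// thread safe without extra synchronization.
class BumpAllocator {
public:
    explicit BumpAllocator(size_t capacity) : buffer_(capacity), offset_(0) {}

    void* Allocate(size_t size, size_t align = alignof(std::max_align_t)) {
        size_t p = (offset_ + align - 1) & ~(align - 1);  // align up
        if (p + size > buffer_.size())
            return nullptr;                               // arena exhausted
        offset_ = p + size;                               // racy if shared!
        return buffer_.data() + p;
    }

    size_t BytesUsed() const { return offset_; }

private:
    std::vector<uint8_t> buffer_;
    size_t offset_;
};
```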
…hash function. On a snapshot of Unity editor build: --stop: 7.0s 493MB -> 5.4s 466MB; BuildEventsParser::NameToIndex 1.47s -> 960ms
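This commit swaps in a custom hash function for the name-to-index map. The actual function isn't shown in this excerpt, so the sketch below uses FNV-1a purely as an illustrative fast string hash, plugged in as the map's hasher template argument:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// FNV-1a: a simple, fast string hash, used here only to illustrate
// supplying a custom hasher -- the PR's actual hash function may differ.
struct Fnv1aHash {
    size_t operator()(const std::string& s) const {
        uint64_t h = 1469598103934665603ull;  // FNV offset basis
        for (unsigned char c : s) {
            h ^= c;
            h *= 1099511628211ull;            // FNV prime
        }
        return (size_t)h;
    }
};

// Same name->index map as before, with the custom hasher as the
// third template argument.
using NameToIndexMap = std::unordered_map<std::string, int, Fnv1aHash>;
```

Swapping only the hasher keeps the surrounding code untouched, which is what makes this kind of change such a cheap win when profiling shows hashing hot.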
…g analysis. On a snapshot of Unity editor build: --stop: 5.4s 466MB -> 5.4s 342MB; --analyze: 4.7s 779MB -> 4.1s 750MB
On a snapshot of Unity editor build: --stop: 5.4s 342MB -> 5.1s 477MB
On a snapshot of Unity editor build (MacBook Pro 2018, 6c/12t): --stop: 5.1s -> 2.4s
…dly loading wrong files
…alyze On a snapshot of Unity editor build: --stop: 2.4s -> 2.3s 368MB; --analyze: 4.4s 750MB -> 4.4s 673MB
On a snapshot of Unity editor build: --stop: 2.3s 368MB -> 2.4s 348MB; --analyze: 4.4s 673MB -> 3.6s 641MB
On a snapshot of Unity editor build: --stop: unchanged at 2.4s 348MB; --analyze: 3.6s 641MB -> 2.9s 578MB
…psedTemplateOpt), only actually process templated functions. This makes it faster (it skips over many more functions without attempting to collapse the template name), and also "fixes" output where function sets are actual sets and not random slow individual functions. On a snapshot of Unity editor build: --stop: unchanged at 2.4s 348MB; --analyze: 2.9s 578MB -> 2.7s 505MB
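The "only actually process templated functions" change amounts to a cheap pre-check: a function name containing no '<' cannot be a template instantiation, so the expensive collapsing step can be skipped outright. A hedged sketch, where `CollapseTemplateName` is a hypothetical stand-in for the analyzer's real collapsing code:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for the expensive "collapse template
// arguments into <...>" transformation the analyzer performs.
std::string CollapseTemplateName(std::string_view name) {
    size_t open = name.find('<');
    size_t close = name.rfind('>');
    if (open == std::string_view::npos || close == std::string_view::npos || close < open)
        return std::string(name);
    return std::string(name.substr(0, open)) + "<...>" + std::string(name.substr(close + 1));
}

// Fast path: names without '<' cannot be template instantiations, so
// skip them without attempting any collapsing work at all. In a large
// build this is the overwhelmingly common case.
std::string MaybeCollapse(std::string_view name) {
    if (name.find('<') == std::string_view::npos)
        return std::string(name);  // cheap skip
    return CollapseTemplateName(name);
}
```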
Redo how the data is gathered & stored to better scale for large builds (ref #10, #30).
TLDR: 3x faster, 15x lower memory usage, 4x smaller file size.
Previously, `--stop` was just smashing all input JSON files together, and `--analyze` was parsing that huge JSON file and performing the analysis. The memory allocator used was a simple "bump a pointer, never free" allocator. On a decent size project (e.g. Unity editor build), that was taking 4.8sec / 5.8GB for stop, and 9.9sec / 7.3GB for analyze (so total 14.7sec, and max 7.3GB memory usage). The resulting file was 1.01GB in size.

Now:
- `--stop` parses the input JSON files directly into the build events data structures (which includes string/name deduplication), and as a result just stores the binary data structure file.
- `--analyze` reads that binary file and does no JSON parsing at all now.
- The `--stop` part that parses all the JSON files is multi-threaded, using the enkiTS task scheduler.
- The sajson JSON parser was replaced with simdjson.

On the same Unity editor build: 2.4sec / 0.35GB for stop, and 2.7sec / 0.51GB for analyze (so total 5.1sec, and max 0.5GB memory usage). The resulting file is 280MB in size.
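The multi-threaded `--stop` step above parallelizes per-file parsing. The PR uses the enkiTS task scheduler; the sketch below uses `std::async` instead so it stays dependency-free, and `ParseOneFile`/`ParseResult` are toy stand-ins for the real parser and its build-event output:

```cpp
#include <cassert>
#include <future>
#include <string>
#include <vector>

// Toy stand-in for the per-file parse result; the real parser fills
// build event data structures rather than a count.
struct ParseResult {
    size_t eventCount = 0;
};

// Toy stand-in for parsing one trace JSON file: counts "name" keys.
ParseResult ParseOneFile(const std::string& json) {
    ParseResult r;
    for (size_t pos = 0; (pos = json.find("\"name\"", pos)) != std::string::npos; ++pos)
        ++r.eventCount;
    return r;
}

// Parse all files in parallel, one task per file (the PR uses enkiTS
// for this; std::async keeps the sketch self-contained). Results are
// merged on the calling thread, so the shared data structures never
// need to be thread safe.
size_t ParseAllFiles(const std::vector<std::string>& files) {
    std::vector<std::future<ParseResult>> tasks;
    tasks.reserve(files.size());
    for (const auto& f : files)
        tasks.push_back(std::async(std::launch::async, ParseOneFile, std::cref(f)));
    size_t total = 0;
    for (auto& t : tasks)
        total += t.get().eventCount;
    return total;
}
```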
However, the data files produced by `--stop` and used by `--analyze` are no longer compatible with previous versions (previously plain JSON, now a custom binary format).
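The actual binary layout isn't documented in this PR excerpt, but the general idea of storing the deduplicated name table as a length-prefixed block can be sketched as follows. This is an illustrative format only, not the tool's real on-disk layout; it shows why reading it back requires no JSON parsing:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Illustrative binary layout: [count][len,bytes][len,bytes]...
// (The tool's actual on-disk format differs.)
void WriteNames(std::ostream& out, const std::vector<std::string>& names) {
    uint32_t count = (uint32_t)names.size();
    out.write((const char*)&count, sizeof(count));
    for (const auto& n : names) {
        uint32_t len = (uint32_t)n.size();
        out.write((const char*)&len, sizeof(len));
        out.write(n.data(), len);
    }
}

std::vector<std::string> ReadNames(std::istream& in) {
    uint32_t count = 0;
    in.read((char*)&count, sizeof(count));
    std::vector<std::string> names(count);
    for (auto& n : names) {
        uint32_t len = 0;
        in.read((char*)&len, sizeof(len));
        n.resize(len);
        in.read(n.data(), len);  // bytes land straight in the string
    }
    return names;
}
```

Reading is a handful of `read` calls straight into the final data structures, which is the core of the `--analyze` speedup; the trade-off, as the PR notes, is losing compatibility with the old JSON files.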