Reduce memory and CPU use when scanning #222
Merged
Rework Git metadata calculation to reduce peak memory and wall-clock time when scanning. This involves many changes; their net effect is typically a 30% speedup and a 50% memory reduction when scanning Git repositories, and in pathological cases up to a 5x speedup and a 20x memory reduction.
Git metadata graph:
- `SmallVec` to reduce heap fragmentation and small heap allocations

`BStringTable`:
- `get_or_intern` to avoid heap-allocated temporaries when an entry already exists

Scanning:
- `Arc<CommitMetadata>` instead of `CommitMetadata` and `Arc<PathBuf>` instead of `PathBuf` within git blob provenance entries (allows sharing; sometimes reduces memory use of these object types 10,000x)
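
For readers unfamiliar with the `get_or_intern` idea, here is a minimal sketch of the pattern using only the standard library. The `StringTable` type, its fields, and the `resolve` helper are hypothetical and simplified for illustration; they are not the `BStringTable` implementation in this PR. The point is that the lookup takes a borrowed slice, so a hit allocates nothing, and an owned copy is made only on a miss.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified interning table (NOT the PR's `BStringTable`).
#[derive(Default)]
struct StringTable {
    /// Maps each interned byte string to its position in `entries`.
    index: HashMap<Vec<u8>, usize>,
    /// Interned byte strings, addressed by symbol.
    entries: Vec<Vec<u8>>,
}

impl StringTable {
    /// Returns the symbol for `s`, interning it only if it is not present yet.
    fn get_or_intern(&mut self, s: &[u8]) -> usize {
        // Hit path: `Vec<u8>: Borrow<[u8]>` lets us look up with a borrowed
        // slice, so no temporary `Vec` is allocated when the entry exists.
        if let Some(&sym) = self.index.get(s) {
            return sym;
        }
        // Miss path: allocate the owned copies exactly once.
        let sym = self.entries.len();
        self.entries.push(s.to_vec());
        self.index.insert(s.to_vec(), sym);
        sym
    }

    /// Looks up the bytes for a previously returned symbol.
    fn resolve(&self, sym: usize) -> Option<&[u8]> {
        self.entries.get(sym).map(|v| v.as_slice())
    }
}
```

Calling `get_or_intern` twice with the same bytes returns the same symbol, and only the first call allocates. This sketch stores each string twice for simplicity; a real table would likely avoid that.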
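
The sharing enabled by `Arc` in provenance entries can be illustrated roughly as follows. The struct definitions, field names, and the `entries_for_commit` helper are assumptions made for this sketch, not the types in this PR. It shows only the general mechanism: cloning an `Arc` bumps a reference count instead of copying the underlying metadata or path.

```rust
use std::path::PathBuf;
use std::sync::Arc;

/// Hypothetical stand-in for commit metadata; the real type's fields differ.
struct CommitMetadata {
    commit_id: String,
}

/// Hypothetical provenance entry: a blob seen in `commit` at `path`.
struct ProvenanceEntry {
    commit: Arc<CommitMetadata>,
    path: Arc<PathBuf>,
}

/// Builds one entry per path. Every entry shares the single `CommitMetadata`
/// allocation, and each path allocation is shared with the caller.
fn entries_for_commit(
    commit: &Arc<CommitMetadata>,
    paths: &[Arc<PathBuf>],
) -> Vec<ProvenanceEntry> {
    paths
        .iter()
        .map(|p| ProvenanceEntry {
            // `Arc::clone` increments a reference count; no data is copied.
            commit: Arc::clone(commit),
            path: Arc::clone(p),
        })
        .collect()
}
```

With plain `CommitMetadata` and `PathBuf` fields, each entry would own its own copy. With `Arc`, a path that appears in many commits, or a commit that introduces many blobs, is stored once and referenced many times.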