Rework parallelism in enumeration and scanning (#221)
- The `FilesystemEnumerator` type now _only_ enumerates directories and files, which makes it a relatively lightweight task.
- Heavyweight enumeration tasks, including walking through Git history and reading extensible enumerator output (i.e., from the `--enumerator=FILE` option), are now performed by scanner threads instead of filesystem enumerator threads.
- Git repositories are now opened once instead of twice.
- A new `ParallelBlobIterator` trait codifies things that can be enumerated to produce blobs with provenance metadata. For now this includes Git repositories, regular files, and extensible enumerators. It starts to put in place the pieces needed to enumerate tarfiles, compressed archives, etc.
- Large parts of the `scan` command logic are now extracted into functions.
- The overall parallelism approach is a bit easier to see now: parallel filesystem input enumeration ==channel==> parallel scanning ==channel==> sequential datastore persistence.
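The three-stage shape described in the last bullet can be sketched with only the standard library. This is a minimal illustration, not the code from this commit: `BlobSource` is a hypothetical stand-in for the `ParallelBlobIterator` idea (the real trait's signature is not shown in the commit message), `FileInput`, `RepoInput`, `Blob`, and `run_pipeline` are invented names, "scanning" is reduced to a byte-substring search, and the "datastore" is a plain `Vec`. A single enumerator thread is used for brevity where the commit describes parallel enumeration.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// A blob together with where it came from (hypothetical, simplified).
pub struct Blob {
    pub provenance: String,
    pub bytes: Vec<u8>,
}

/// Hypothetical stand-in for the `ParallelBlobIterator` idea: an input
/// that can be expanded into blobs with provenance metadata.
pub trait BlobSource: Send {
    fn blobs(self: Box<Self>) -> Vec<Blob>;
}

/// A regular file: yields exactly one blob.
pub struct FileInput {
    pub path: String,
    pub contents: Vec<u8>,
}

impl BlobSource for FileInput {
    fn blobs(self: Box<Self>) -> Vec<Blob> {
        let FileInput { path, contents } = *self;
        vec![Blob { provenance: format!("file:{}", path), bytes: contents }]
    }
}

/// A pretend Git repository: yields one blob per stored object.
pub struct RepoInput {
    pub path: String,
    pub objects: Vec<Vec<u8>>,
}

impl BlobSource for RepoInput {
    fn blobs(self: Box<Self>) -> Vec<Blob> {
        let RepoInput { path, objects } = *self;
        objects
            .into_iter()
            .enumerate()
            .map(|(i, bytes)| Blob { provenance: format!("git:{}#{}", path, i), bytes })
            .collect()
    }
}

/// Byte-substring search used as a toy "scanner".
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}

/// enumeration ==channel==> parallel scanning ==channel==> sequential persistence
pub fn run_pipeline(
    inputs: Vec<Box<dyn BlobSource>>,
    scanner_threads: usize,
    needle: &[u8],
) -> Vec<String> {
    let (input_tx, input_rx) = mpsc::channel::<Box<dyn BlobSource>>();
    let (match_tx, match_rx) = mpsc::channel::<String>();
    // `mpsc::Receiver` cannot be cloned, so scanners share it behind a mutex.
    let input_rx = Arc::new(Mutex::new(input_rx));

    // Stage 1: lightweight enumeration only hands inputs to the scanners.
    let enumerator = thread::spawn(move || {
        for input in inputs {
            if input_tx.send(input).is_err() {
                break;
            }
        }
        // `input_tx` is dropped here, which closes the first channel.
    });

    // Stage 2: scanner threads do the heavy expansion into blobs and scan them.
    let scanners: Vec<_> = (0..scanner_threads.max(1))
        .map(|_| {
            let rx = Arc::clone(&input_rx);
            let tx = match_tx.clone();
            let needle = needle.to_vec();
            thread::spawn(move || loop {
                // The lock guard is released at the end of this statement.
                let next = rx.lock().unwrap().recv();
                let input = match next {
                    Ok(input) => input,
                    Err(_) => break, // channel closed and drained
                };
                for blob in input.blobs() {
                    if contains(&blob.bytes, &needle) {
                        let _ = tx.send(blob.provenance);
                    }
                }
            })
        })
        .collect();
    drop(match_tx); // only the scanner clones keep the second channel open

    // Stage 3: a single consumer "persists" results sequentially.
    let mut stored: Vec<String> = match_rx.iter().collect();
    enumerator.join().unwrap();
    for scanner in scanners {
        scanner.join().unwrap();
    }
    stored.sort(); // arrival order depends on thread scheduling
    stored
}
```

Calling `run_pipeline` with a mix of `FileInput` and `RepoInput` values returns the provenance strings of the blobs that contain the needle. The point of the layout is that the enumerator never expands an input itself: the expensive `blobs()` call runs on whichever scanner thread picks the input up, which mirrors the division of work the commit message describes.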
1 parent 5b0bd98, commit 0518d80
Showing 6 changed files with 742 additions and 672 deletions.