Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Created by
brew bump
Created with
brew bump-formula-pr
.release notes
An experimental "extensible enumerator mechanism" has been added to the
scan
command (#220). This allows Nosey Parker to scan inputs produced by any program that can emit JSON objects to stdout, without having to first write the inputs to the filesystem. It is invoked with the new--enumerator=FILE
option, whereFILE
is a JSON Lines file. Each line of the enumerator file should be a JSON object with one of the following forms:Shell process substitution can make streaming invocation ergonomic, e.g.,
scan --enumerator=<(my-enumerator-program)
.Changes
Inputs are now enumerated incrementally as scanning proceeds rather than done in an initial batch step (#216). This reduces peak memory use and wall clock time 10-20%, particularly in environments with slow I/O. A consequence of this change is that the total amount of data to scan is not known until it has actually been scanned, and so the scanning progress bar no longer shows a completion percentage.
When cloning Git repositories while scanning, the progress bar for now includes the current repository URL (#212).
When scanning, automatically cloned Git repositories are now recorded with the path given on the command line instead of the canonicalized path (#212). This makes datastores slightly more portable across different environments, such as within a Docker container and on the host machine, as relative paths can now be recorded.
The deprecated
--rules=PATH
alias for--rules-path=PATH
has been removed from thescan
andrules
commands.The built-in support for enumerating and interacting with GitHub is now a compile time-selectable feature that is enabled by default (#213). This makes it possible to build a slimmer release for environments where GitHub functionality is unused.
A new rule has been added:
The default number of parallel scanner jobs is now higher on many systems (#222). This value is determined in part by the amount of system RAM; due to several memory use improvements, the required minim RAM per job has been reduced, allowing for more parallelism.
Fixes
The
Google OAuth Credentials
rule has been revised to avoid runtime errors about an empty capture group.The
AWS Secret Access Key
rule has been revised to avoid runtimeRegex failed to match
errors.The code that determines first-commit provenance information for blobs from Git repositories has been reworked to improve memory use (#222). In typical cases of scanning Git repositories, this reduces both peak memory use and wall clock time by 20-50%. In certain pathological cases, such as Homebrew or nixpkgs, the new implementation uses up to 20x less peak memory and up to 5x less wall clock time.
When determining blob provenance informatino from Git repositories, blobs that first appear multiple times within a single commit will now be reported with all names they appear with (#222). Previously, one of the pathnames would be arbitrarily selected.