Apply Uniq filter to remove duplicate issues #649
Merged
Hello. This simple patch applies a `uniq!` call right before the problems printer to remove duplicate issue objects from the array. This should mitigate the noise from duplicate issues self-stacking via the `.concat` calls. I set up a stub repo showing this off: https://github.com/uberfuzzy/proofer-dupe-test
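To illustrate the idea, here is a minimal sketch of the mitigation. `IssueReporter`, `Issue`, and the method names are illustrative stand-ins, not html-proofer's actual classes; the point is just the `uniq!` call before printing:

```ruby
class IssueReporter
  # Struct gives value-based ==/eql?/hash, so `uniq!` can spot duplicates
  Issue = Struct.new(:path, :desc)

  def initialize
    @failures = []
  end

  def add(issues)
    @failures.concat(issues) # same collection pattern the checkers use
  end

  def report
    @failures.uniq!          # the one-line mitigation: drop duplicate issues
    @failures.each { |i| puts "#{i.path}: #{i.desc}" }
    @failures
  end
end
```

Even if the same issue object has been concatenated into the collector several times, only one copy reaches the printer.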
The cause is the places where new issues are collected and added to the global collectors using `.concat`. Roughly, this is what is happening, from what I've been able to trace down with a lot of debug prints:
- local issues: `[A]` → global after concat: `[A]`
- local issues: `[A, B]` → global after concat: `[A, A, B]`
- local issues: `[A, B, C]` → global after concat: `[A, A, B, A, B, C]`
You can see this in the "before" output files in the test repo, compared to the patched output files.
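The sequence above can be reproduced in a few lines. This is a hedged sketch of the amplification, with illustrative variable names rather than the proofer's own: the per-check `issues` array is never reset, and `.concat` runs inside the loop, so every earlier issue is re-added on each iteration:

```ruby
failures = []   # stand-in for the global collector
issues = []     # stand-in for a check's internal array

%w[A B C].each do |problem|
  issues << problem         # local array keeps growing: [A], [A, B], [A, B, C]
  failures.concat(issues)   # global gets the whole local array every time
end

p failures  # => ["A", "A", "B", "A", "B", "C"]
```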
This is happening anywhere `.concat` is used to collect issues from one array into the larger array for later printing, both in the internal-page file-exists checks and in the local hash checker. In most places the individual check's internal array isn't being reset between runs, or the `.concat` is being run inside a loop rather than after it.
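For comparison, either of the fixes hinted at above avoids the duplication entirely. This is only an illustrative sketch, not the actual patch: here the single `.concat` is moved to after the loop:

```ruby
failures = []   # stand-in for the global collector
issues = []     # stand-in for a check's internal array

%w[A B C].each do |problem|
  issues << problem
end
failures.concat(issues)  # one concat after collection, not inside the loop

p failures  # => ["A", "B", "C"]
```

Resetting `issues` to `[]` at the top of each iteration would work just as well; both approaches are structural changes rather than a filter at print time.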
I will say that this is only a mitigation patch; it does not fix the core problem of the arrays being amplified onto themselves. I will leave that cleanup to someone more comfortable with the code making larger structural and logic changes. I did try my hand at it, but ended up breaking more tests than I was fixing, and settled on this quick win to at least dampen the noise.