Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change category uniqueness test #872

Merged
merged 3 commits into from
Jan 17, 2022
Merged

Conversation

aeisenberg
Copy link
Contributor

@aeisenberg aeisenberg commented Jan 12, 2022

Turboscan only allows a single combination of tool name and automation
details id for testing category uniqueness.

Previously, the check in the action was not entirely correct since it
only looked at the category and not the combination of the category
and the tool name.

It's even more precise now since it is looking at the actual, computed
value of the automation details id, rather than an inputted value of
the category.

This change also includes a refactoring where the action is now avoiding
multiple parsing/stringifying of the sarif files. Instead, sarif is
parsed once at the start of the process and stringified once, after
sarif processing is completely finished.

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.
  • Confirm the changelog has been updated if necessary.

Turboscan only allows a single combination of tool name and automation
details id for testing category uniqueness.

Previously, the check in the action was not entirely correct since it
only looked at the _category_ and not the combination of the category
and the tool name.

It's even more precise now since it is looking at the actual, computed
value of the automation details id, rather than an inputted value of
the category.

This change also includes a refactoring where the action is now avoiding
multiple parsing/stringifying of the sarif files. Instead, sarif is
parsed once at the start of the process and stringified once, after
sarif processing is completely finished.
@aeisenberg aeisenberg requested a review from a team as a code owner January 12, 2022 23:27
@aeisenberg
Copy link
Contributor Author

Interesting...quite a few failing tests, but all of these are legitimately trying to upload multiple times with the same tool/automation id. Here are the failing jobs and they are all failing since they are merging runs from multiple files into a single sarif.

  • __test-local-codeql.yml
  • __unset-environment.yml
  • __multi-language-autodetect.yml

I understand why this is happening, but I'm not sure if this is the best behaviour.

The default automation details id and tool are the same for all runs in a single job. So, we can't enforce uniqueness on them. However, since these really are analyses different languages, they really should have distinct categories. I guess it's too late now to change this without breaking existing users.

I think I need to change things so that identical automation ids/tools in different runs of the same sarif are tolerated.

@starcke
Copy link

starcke commented Jan 13, 2022

I dont think I a lot of knowledge about the action - so I have assigned @Daverlo instead as he did the initial category work here. Happy to sync more generally on the behavior. But since https://github.com/github/code-scanning/issues/1555#issuecomment-843367976 (which was after the initial category work) I think you might have more information on the wanted direction than me :)

@starcke starcke requested review from Daverlo and removed request for starcke January 13, 2022 09:37
Copy link
Contributor

@robertbrignull robertbrignull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you if you want to wait for further review, but I think this is good and implements the check correctly now.

The fact that the sanitization of the env var makes us throw the error is slightly more cases is fine I think. It's a corner case that you're using two categories that only differ in case or non-alphanumeric characters. I think we can just go with this for now and change it in the future if it ever becomes a problem.

}
core.exportVariable(sentinelEnvVar, sentinelEnvVar);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This closing brace is indented one space too many? Not sure why the formatter isn't complaining about it as my editor spots it 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That explains the failing linting PR check.

A single SARIF file should be allowed to have duplicated
categories.
@aeisenberg aeisenberg force-pushed the aeisenberg/category-with-tool branch from 43018ce to ab1f709 Compare January 13, 2022 18:35
@aeisenberg
Copy link
Contributor Author

I'll keep this open for another day or two to see if @Daverlo has any comments, and then I'll merge if there's nothing else.

@aeisenberg
Copy link
Contributor Author

Merging this based on prior approval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants