Do not use subprocess for the ScanPackage (scan_package) pipeline #798

Closed
tdruez opened this issue Jul 5, 2023 · 1 comment

@tdruez
Contributor

tdruez commented Jul 5, 2023

Using subprocess is problematic for multiple reasons:

  • Introduces security issues
  • Cannot log the progress
  • Cannot catch errors and exceptions while running
  • Cannot interact with the running process
  • Cannot use mock in the unit test context
  • Cannot use the status system to flag and exclude resources from scan
  • ...

From #556 (comment)

The underlying issue is that the ScanPackage pipeline depends on a subprocess call to the scancode command.
If scancode fails, there's no easy way to save this as a ProjectError and continue the pipeline run.

A better approach would be to replace the subprocess call with proper API calls, as we did for the scanners in the ScanCodebase pipeline (for example: get_copyrights, get_licenses, get_package_data).
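
To illustrate why in-process API calls help here, below is a minimal sketch of recording a scanner failure and continuing the run. It is not the ScanCode.io implementation: `PipelineRun`, `run_scanner`, and the two `fake_*` scanners are hypothetical names made up for this example, and a real pipeline would presumably save a ProjectError instead of appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineRun:
    """Hypothetical stand-in for a pipeline run that collects errors."""

    errors: list = field(default_factory=list)
    results: dict = field(default_factory=dict)

    def run_scanner(self, name, scanner, location):
        """Call `scanner` in-process; record any failure and keep going."""
        try:
            self.results[name] = scanner(location)
        except Exception as exc:
            # Broad on purpose: any scanner failure is recorded, not fatal.
            self.errors.append(
                {"scanner": name, "location": location, "message": str(exc)}
            )


def fake_get_copyrights(location):
    """Hypothetical scanner that always fails."""
    raise ValueError("cannot parse file")


def fake_get_licenses(location):
    """Hypothetical scanner that always succeeds."""
    return {"licenses": ["mit"]}


run = PipelineRun()
run.run_scanner("copyrights", fake_get_copyrights, "setup.py")
run.run_scanner("licenses", fake_get_licenses, "setup.py")
```

Because each scanner is a plain Python callable, the failure of one is captured as data and the next one still runs. With a single subprocess call, the only signals available are the exit code and the captured output.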


The two pieces that are still missing as callable APIs for the ScanPackage pipeline are:

  • --classify: FileClassifier(PostScanPlugin)
  • --summary: ScanSummary(PostScanPlugin)
tdruez added a commit that referenced this issue Aug 10, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Aug 11, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Aug 11, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
tdruez added a commit that referenced this issue Aug 11, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
@tdruez
Contributor Author

tdruez commented Aug 11, 2023

Following the merge of #855, we no longer rely on a subprocess call: we now call the scancode.cli.run_scan function directly.

This is better, but the following limitations remain:

  • Cannot catch errors and exceptions while running
  • Cannot interact with the running process
  • Cannot use the status system to flag and exclude resources from scan
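
For reference, a direct call to `run_scan` could look roughly like the sketch below. This is not the code merged in #855. The wrapper name `scan_package_in_process` and the injectable `run_scan` argument are hypothetical, and the sketch assumes that `scancode.cli.run_scan` accepts `input` and `return_results` keyword arguments and returns a `(success, results)` tuple. It only handles an exception raised by the call as a whole, so it does not address the limitations listed above.

```python
def scan_package_in_process(location, run_scan=None, **scan_options):
    """
    Scan `location` in the current process and return (results, errors).

    `run_scan` is injectable (hypothetical design choice) so this function
    can be tested without scancode-toolkit installed. By default the real
    scancode.cli.run_scan is used.
    """
    if run_scan is None:
        # Imported lazily: requires scancode-toolkit to be installed.
        from scancode.cli import run_scan

    try:
        # Assumption: run_scan returns a (success, results) tuple.
        success, results = run_scan(
            input=location,
            return_results=True,
            **scan_options,
        )
    except Exception as exc:
        # Only errors that escape the whole call are visible here.
        return None, [f"scan raised: {exc}"]

    errors = [] if success else ["scan completed with errors"]
    return results, errors
```

A call such as `scan_package_in_process("/path/to/package", classify=True, summary=True)` assumes that keyword options mirror the `--classify` and `--summary` CLI flags; check this against the installed scancode-toolkit version. In a unit test, a fake callable can be passed as `run_scan`, which was not possible with the subprocess approach.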

tdruez added a commit that referenced this issue Aug 11, 2023
Signed-off-by: Thomas Druez <tdruez@nexb.com>
@tdruez tdruez closed this as completed Aug 24, 2023