Feature/secureli-435: Implement custom PII scan #496

Conversation

kathleen-hogan-slalom
Contributor

secureli-435

This PR implements a new, custom PII scan. The scan runs separately from our existing scans, which use pre-commit hooks external to the secureli codebase.

Changes

  • Creates new PII Scanner and adds the scan to the existing Scan Action
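
For readers without access to the diff, here is a minimal sketch of what a regex-based PII scan over a set of files might look like. It is not the scanner added in this PR: the pattern list, the `PiiFinding` type, and both function names are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from pathlib import Path

# Hypothetical patterns; the PR's real pattern list is not shown in this conversation.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass(frozen=True)
class PiiFinding:
    """One suspected PII occurrence (hypothetical result type)."""

    file: str
    line_number: int
    pii_type: str


def scan_text_for_pii(text: str, file_name: str = "<memory>") -> list[PiiFinding]:
    """Check every line of text against every pattern and collect the matches."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for pii_type, pattern in PII_PATTERNS.items():
            if pattern.search(line):
                findings.append(PiiFinding(file_name, line_number, pii_type))
    return findings


def scan_files_for_pii(paths: list[Path]) -> list[PiiFinding]:
    """Scan each readable text file; unreadable or non-UTF-8 files are skipped."""
    findings = []
    for path in paths:
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue
        findings.extend(scan_text_for_pii(text, str(path)))
    return findings
```

Recording only the file, line number, and PII type (and not the matched text) keeps the scan output itself from repeating the sensitive value.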

Testing

  • Unit tests updated
  • Screenshot of manual test below:
[Image: Screenshot 2024-03-22 at 9 49 55 AM]

Clean Code Checklist

  • Meets acceptance criteria for issue
  • New logic is covered with automated tests
  • Appropriate exception handling added
  • Thoughtful logging included
  • Documentation is updated
  • Follow-up work is documented in TODOs
  • TODOs have a ticket associated with them
  • No commented-out code included

@@ -27,15 +27,15 @@ def __init__(
         echo: EchoAbstraction,
         language_analyzer: language_analyzer.LanguageAnalyzerService,
         language_support: language_support.LanguageSupportService,
-        scanner: ScannerService,
+        hooks_scanner: HooksScannerService,
Contributor Author

@kathleen-hogan-slalom kathleen-hogan-slalom Mar 22, 2024

To leave context from a prior conversation during standup: I renamed this service to make its purpose explicit and to distinguish it from any custom scans we implement (e.g., the PII scan). This scanner scans solely with external pre-commit hooks, and the generic name ScannerService didn't quite capture that nuance IMO.

I'd also like to rename this file in a follow-up PR to match this change. I didn't do this as part of this PR because it wouldn't be as clear which in-file changes I made.
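
To illustrate the distinction the rename draws, here is a rough sketch of an action that takes a hooks-based scanner and a custom PII scanner as separate dependencies and merges their results. The `Scanner` protocol, the simplified `ScanResult` fields, and `merge_results` are assumptions for the example; the real `ScanAction` constructor takes more dependencies than shown, and the real models are not visible in this conversation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class ScanResult:
    # Hypothetical shape; the real ScanResult model's fields are not shown here.
    successful: bool
    failures: list[str] = field(default_factory=list)


class Scanner(Protocol):
    """Anything that can scan a list of files (assumed interface)."""

    def scan(self, files: list[str]) -> ScanResult: ...


def merge_results(results: list[ScanResult]) -> ScanResult:
    """The combined scan passes only if every individual scan passed."""
    return ScanResult(
        successful=all(result.successful for result in results),
        failures=[failure for result in results for failure in result.failures],
    )


class ScanAction:
    """Simplified stand-in: runs the custom PII scan and the hook-based scan."""

    def __init__(self, hooks_scanner: Scanner, pii_scanner: Scanner) -> None:
        self.hooks_scanner = hooks_scanner
        self.pii_scanner = pii_scanner

    def scan(self, files: list[str]) -> ScanResult:
        pii_result = self.pii_scanner.scan(files)
        hooks_result = self.hooks_scanner.scan(files)
        return merge_results([pii_result, hooks_result])
```

With both scanners injected under explicit names, a test can swap either one for a stub without touching the other.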

@@ -18,15 +22,7 @@ class OutputParseErrors(str, Enum):
     REPO_NOT_FOUND = "repo-not-found"


-class ScanOuput(pydantic.BaseModel):
Contributor Author

I moved this to the models folder along with ScanFailure, ScanMode, and ScanResult. It felt more appropriate there.

@kathleen-hogan-slalom kathleen-hogan-slalom linked an issue Mar 22, 2024 that may be closed by this pull request
        :raises ValueError: The specified path does not exist or is not a git repo
        :return: The list of staged file names
        """
        self._confirm_is_git_repo(folder_path)
Contributor

I wrote similar logic for #436, where I added an abstraction over GitPython to get committed files. I'm wondering if we can use one implementation or the other, or combine them. Putting it behind an abstraction made unit testing easier.

Contributor Author

Good callout. I'm planning a follow-up PR to optimize some things. Would you be OK with me addressing this there? That way we can get the meat of the functionality in today while we're all still here, and the follow-up will be a smaller PR that's easier for new Secureli folks to review.

Contributor Author

Captured here: #497

Contributor

Yeah, definitely!
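
As a sketch of the idea discussed in this thread, the staged-file lookup could sit behind a small abstraction so that unit tests never touch a real repository. Everything named here (`GitRepoAbstraction`, `SubprocessGitRepo`, `FakeGitRepo`, `files_to_scan`) is hypothetical. The abstraction from #436 wraps GitPython; this example shells out to `git diff --cached --name-only` instead so that it depends only on the standard library.

```python
from __future__ import annotations

import subprocess
from abc import ABC, abstractmethod
from pathlib import Path


class GitRepoAbstraction(ABC):
    """Hypothetical seam between scan logic and git itself."""

    @abstractmethod
    def get_staged_files(self, folder_path: Path) -> list[str]:
        """Return the names of files staged for commit."""


class SubprocessGitRepo(GitRepoAbstraction):
    """Asks the git CLI for staged file names (assumes git is on PATH)."""

    def get_staged_files(self, folder_path: Path) -> list[str]:
        completed = subprocess.run(
            ["git", "diff", "--cached", "--name-only"],
            cwd=folder_path,
            capture_output=True,
            text=True,
            check=True,
        )
        return [line for line in completed.stdout.splitlines() if line]


class FakeGitRepo(GitRepoAbstraction):
    """Test double that returns a canned list of staged files."""

    def __init__(self, staged_files: list[str]) -> None:
        self._staged_files = list(staged_files)

    def get_staged_files(self, folder_path: Path) -> list[str]:
        return list(self._staged_files)


def files_to_scan(repo: GitRepoAbstraction, folder_path: Path) -> list[Path]:
    """Resolve staged file names against the repo folder for scanning."""
    return [folder_path / name for name in repo.get_staged_files(folder_path)]
```

A unit test can pass `FakeGitRepo(["a.py"])` to `files_to_scan` and assert on the returned paths without creating a repository on disk.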

        """
        git_path = folder_path / ".git"

        if not git_path.exists() or not git_path.is_dir():
Contributor

Same here. Can probably add a similar function for this in the GitRepo abstraction.
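
For reference, the check in the snippet above could be lifted into a standalone helper along these lines. This is a sketch under assumptions: the function name `confirm_is_git_repo` and the exception message are illustrative, and raising `ValueError` is inferred from the `:raises ValueError:` line in the docstring of the calling method shown earlier.

```python
from pathlib import Path


def confirm_is_git_repo(folder_path: Path) -> None:
    """Raise ValueError unless folder_path contains a .git directory."""
    git_path = folder_path / ".git"

    # Same condition as the snippet under review: .git must exist and be a directory.
    if not git_path.exists() or not git_path.is_dir():
        raise ValueError(f"{folder_path} does not exist or is not a git repo")
```

One caveat with this condition: in linked worktrees and submodules `.git` is a file, not a directory, so a check like this treats those checkouts as non-repos.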

@kathleen-hogan-slalom kathleen-hogan-slalom marked this pull request as ready for review March 22, 2024 19:14
@kathleen-hogan-slalom kathleen-hogan-slalom changed the title [DRAFT] Feature/secureli-435: Implement custom PII scan Feature/secureli-435: Implement custom PII scan Mar 22, 2024
Contributor

@isaac-heist-slalom isaac-heist-slalom left a comment

Great work!

@kathleen-hogan-slalom kathleen-hogan-slalom merged commit 9c43ac7 into refactor/secureli-000-modular-refactor Mar 22, 2024
3 checks passed
@kathleen-hogan-slalom kathleen-hogan-slalom deleted the feature/secureli-435-pii-from-scratch branch March 22, 2024 19:40
Development

Successfully merging this pull request may close these issues.

Increase security scan to check for PII
4 participants