Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add support for pyproject.toml and env.yml/env.yaml files #550

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

dylanpulver
Copy link
Collaborator

@dylanpulver dylanpulver commented Jul 9, 2024

Tested this manually by adding these files to the repo and conifrming they are picked up. This supports both versions: env.yml and env.yaml

image

Original issue: #505

Summary by CodeRabbit

  • New Features

    • Introduced support for scanning projects defined by pyproject.toml files.
    • Added functionality to read and parse dependencies from pyproject.toml.
    • Enhanced file processing to recognize and handle pyproject.toml files.
  • Improvements

    • Real-time feedback added during dependency scanning for better visibility.
    • Updated ecosystem handling to include pyproject.toml for improved dependency management.
  • Bug Fixes

    • Adjusted handling for specific file types to ensure correct processing.
    • Improved readability and organization within the code related to file handling.

@dylanpulver dylanpulver marked this pull request as ready for review July 9, 2024 15:20
@dylanpulver
Copy link
Collaborator Author

Please note that this PR in the safety_schemas repo is needed for this to work: https://github.com/pyupio/safety_schemas/pull/1

Copy link

coderabbitai bot commented Sep 16, 2024

Walkthrough

The changes introduce support for handling pyproject.toml files within the project. A new dependency, "insecure-package", is added to the build requirements. The codebase is updated to recognize and process pyproject.toml, including new functions and classes to read dependencies and handle file processing. Enhancements to visibility during scanning operations are also included, along with modifications to existing functions to accommodate the new file type.

Changes

File Change Summary
pyproject.toml Added "insecure-package" to the requires list under [build-system].
safety/scan/command.py Added PYPROJECT_TOML to ScannableEcosystems and enhanced the scan function with print statements.
safety/scan/ecosystems/python/dependencies.py Introduced read_pyproject_toml_dependencies function to parse dependencies and modified get_dependencies to handle FileType.PYPROJECT_TOML.
safety/scan/finder/file_finder.py Added PyProjectTomlHandler and modified process_directory to recognize pyproject.toml.
safety/scan/finder/handlers.py Introduced PyProjectTomlHandler class with methods for handling pyproject.toml files and updated ECOSYSTEM_HANDLER_MAPPING.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant FileFinder
    participant PyProjectTomlHandler
    participant DependencyParser

    User->>FileFinder: Scan directory
    FileFinder->>PyProjectTomlHandler: Process pyproject.toml
    PyProjectTomlHandler->>DependencyParser: Extract dependencies
    DependencyParser-->>PyProjectTomlHandler: Return dependencies
    PyProjectTomlHandler-->>FileFinder: Provide dependency info
    FileFinder-->>User: Display scan results
Loading

Poem

🐇 In the garden where code does bloom,
New dependencies chase away the gloom.
With pyproject.toml now in sight,
Scanning the files brings pure delight.
Hop along, let’s code and play,
For every change, a brighter day! 🌼


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 15e757c and 333421d.

Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • safety/scan/command.py (2 hunks)
  • safety/scan/ecosystems/python/dependencies.py (3 hunks)
  • safety/scan/finder/file_finder.py (3 hunks)
  • safety/scan/finder/handlers.py (3 hunks)
Additional context used
Ruff
safety/scan/finder/handlers.py

5-5: typing.Tuple imported but unused

Remove unused import: typing.Tuple

(F401)

safety/scan/finder/file_finder.py

11-11: .handlers.PyProjectTomlHandler imported but unused

Remove unused import: .handlers.PyProjectTomlHandler

(F401)


161-161: Use enumerate() for index variable level in for loop

(SIM113)

Additional comments not posted (9)
safety/scan/finder/handlers.py (4)

112-113: LGTM!

The empty implementation of the download_required_assets method is appropriate since there are no required assets to download for Safety project files, as indicated by the docstring.


116-162: LGTM!

The implementation of the PyProjectTomlHandler class looks good:

  • It correctly extends the FileHandler class and sets the ecosystem to Ecosystem.PYPROJECT_TOML.
  • The download_required_assets method fetches both the full and partial Safety databases, consistent with the PythonFileHandler class.
  • The can_handle method correctly checks for the pyproject.toml file name and returns the corresponding FileType.
  • The handle method reads the pyproject.toml file using the toml library and extracts dependencies from the relevant sections.
  • The extracted dependencies are returned as a set, which is an appropriate data structure for unique dependencies.

Great job!


177-177: LGTM!

Adding the PyProjectTomlHandler to the ECOSYSTEM_HANDLER_MAPPING dictionary is necessary and correctly implemented. This associates the handler with the corresponding Ecosystem.PYPROJECT_TOML ecosystem.


Line range hint 1-179: Overall, the changes in this file look great!

The introduction of the PyProjectTomlHandler class is a valuable addition to support handling pyproject.toml files. The implementation is consistent with the existing FileHandler classes and follows the necessary conventions. The class correctly extends the FileHandler class and implements the required methods such as download_required_assets, can_handle, and handle appropriately.

The integration of the PyProjectTomlHandler with the existing codebase is smooth, and the class is correctly added to the ECOSYSTEM_HANDLER_MAPPING dictionary to associate it with the corresponding ecosystem.

Apart from the minor issues mentioned in the previous comments, the overall changes in this file are well-structured and contribute positively to the project's functionality.

Great work!

Tools
Ruff

167-167: Redefinition of unused download_required_assets from line 121

(F811)

safety/scan/finder/file_finder.py (1)

153-160: LGTM!

The new code block correctly handles the special files pyproject.toml, env.yml, and env.yaml by adding them to the files dictionary under their respective file types. This enhances the functionality of the file finder to support these important configuration files.

safety/scan/ecosystems/python/dependencies.py (2)

273-308: LGTM!

The read_pyproject_toml_dependencies function correctly reads dependencies from a pyproject.toml file and yields PythonDependency objects. It handles both versioned and unversioned dependencies in the relevant sections of the TOML file.

The implementation integrates well with the existing PythonDependency model and the PythonSpecification class, enabling support for pyproject.toml files in the project.


330-331: LGTM!

The changes to the get_dependencies function correctly integrate the handling of FileType.PYPROJECT_TOML files by calling the newly introduced read_pyproject_toml_dependencies function.

This modification ensures that dependencies from pyproject.toml files are properly processed and returned as PythonDependency objects, seamlessly integrating with the existing dependency resolution logic.

safety/scan/command.py (2)

48-48: LGTM!

The new member PYPROJECT_TOML is added correctly to the ScannableEcosystems enumeration to support scanning projects defined by a pyproject.toml file.


Line range hint 1-1: Skipped reviewing system_scan function.

The system_scan function is not changed in the provided code, so there is nothing to review.

pyproject.toml Show resolved Hide resolved
safety/scan/finder/handlers.py Outdated Show resolved Hide resolved
safety/scan/finder/handlers.py Outdated Show resolved Hide resolved
safety/scan/finder/file_finder.py Show resolved Hide resolved
safety/scan/command.py Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
safety/scan/finder/handlers.py (3)

112-113: Consider providing a meaningful implementation or documentation.

In the SafetyProjectFileHandler, the download_required_assets method currently contains only a pass statement. If no assets are required, it's a good practice to document this explicitly.

Consider adding a docstring:

 def download_required_assets(self, session):
+    """
+    No required assets to download for Safety project files.
+    """
     pass

138-139: Remove debug print statement in can_handle method.

The print("recognized") statement is likely used for debugging purposes. Consider removing it or replacing it with appropriate logging.

Apply this diff to remove the print statement:

     def can_handle(self, root: str, file_name: str, include_files: Dict[FileType, List[Path]]) -> Optional[FileType]:
         if file_name == 'pyproject.toml':
-            print("recognized")
             return FileType.PYPROJECT_TOML
         return None

146-146: Remove debug print statement in handle method.

The print("printing data", data) statement may not be suitable for production code. Consider removing it or using a logging framework.

Apply this diff to remove the print statement:

         with open(file_path, 'r') as file:
             data = toml.load(file)
-            print("printing data", data)
             dependencies = set()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 958a39e and 3630e88.

📒 Files selected for processing (1)
  • safety/scan/finder/handlers.py (3 hunks)
🧰 Additional context used
🔇 Additional comments (2)
safety/scan/finder/handlers.py (2)

55-55: ⚠️ Potential issue

Correct the use of NotImplementedError in abstract method.

In the FileHandler class, the download_required_assets method should raise NotImplementedError instead of returning it. This ensures that subclasses are properly notified to implement this method.

Apply this diff to fix the issue:

 def download_required_assets(self, session):
     """
     Abstract method to download required assets for handling files. Should be implemented
     by subclasses.

     Args:
         session: The session object for making network requests.

-    Returns:
-        Dict[str, str]: A dictionary of downloaded assets.
     """
-    return NotImplementedError(NOT_IMPLEMENTED)
+    raise NotImplementedError(NOT_IMPLEMENTED)

Likely invalid or redundant comment.


5-5: Verify necessity of all imported types from typing module.

Ensure that all imported types (Dict, List, Optional, Set) are used in the code. Unused imports can be removed to keep the code clean.

Run the following script to check for unused imports:

# Mapping of ecosystems to their corresponding file handlers
ECOSYSTEM_HANDLER_MAPPING = MappingProxyType({
Ecosystem.PYTHON: PythonFileHandler,
Ecosystem.SAFETY_PROJECT: SafetyProjectFileHandler,
# Ecosystem.PYPROJECT_TOML: PyProjectTomlHandler,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Uncomment PyProjectTomlHandler in the ecosystem handler mapping.

The PyProjectTomlHandler is commented out in ECOSYSTEM_HANDLER_MAPPING, which means it won't be used. To enable support for pyproject.toml files, uncomment the handler.

Apply this diff:

 ECOSYSTEM_HANDLER_MAPPING = MappingProxyType({
     Ecosystem.PYTHON: PythonFileHandler,
     Ecosystem.SAFETY_PROJECT: SafetyProjectFileHandler,
-    # Ecosystem.PYPROJECT_TOML: PyProjectTomlHandler,
+    Ecosystem.PYPROJECT_TOML: PyProjectTomlHandler,
 })

Committable suggestion was skipped due to low confidence.

Comment on lines +116 to +135
class PyProjectTomlHandler(FileHandler):
def __init__(self) -> None:
super().__init__()
self.ecosystem = Ecosystem.PYTHON

def download_required_assets(self, session):
from safety.safety import fetch_database

SAFETY_DB_DIR = os.getenv("SAFETY_DB_DIR")

db = False if SAFETY_DB_DIR is None else SAFETY_DB_DIR


fetch_database(session=session, full=False, db=db, cached=True,
telemetry=True, ecosystem=Ecosystem.PYTHON,
from_cache=False)

fetch_database(session=session, full=True, db=db, cached=True,
telemetry=True, ecosystem=Ecosystem.PYTHON,
from_cache=False)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid code duplication in download_required_assets method.

The download_required_assets method in PyProjectTomlHandler duplicates the implementation from PythonFileHandler. To adhere to the DRY (Don't Repeat Yourself) principle, consider refactoring to reuse the common functionality.

One approach is to have PyProjectTomlHandler inherit from PythonFileHandler:

-class PyProjectTomlHandler(FileHandler):
+class PyProjectTomlHandler(PythonFileHandler):
     def __init__(self) -> None:
         super().__init__()
         self.ecosystem = Ecosystem.PYTHON

-    def download_required_assets(self, session):
-        from safety.safety import fetch_database
-
-        SAFETY_DB_DIR = os.getenv("SAFETY_DB_DIR")
-
-        db = False if SAFETY_DB_DIR is None else SAFETY_DB_DIR
-
-        fetch_database(session=session, full=False, db=db, cached=True,
-                       telemetry=True, ecosystem=Ecosystem.PYTHON,
-                       from_cache=False)
-
-        fetch_database(session=session, full=True, db=db, cached=True,
-                                telemetry=True, ecosystem=Ecosystem.PYTHON,
-                                from_cache=False)

This way, PyProjectTomlHandler will inherit the download_required_assets method from PythonFileHandler.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class PyProjectTomlHandler(FileHandler):
def __init__(self) -> None:
super().__init__()
self.ecosystem = Ecosystem.PYTHON
def download_required_assets(self, session):
from safety.safety import fetch_database
SAFETY_DB_DIR = os.getenv("SAFETY_DB_DIR")
db = False if SAFETY_DB_DIR is None else SAFETY_DB_DIR
fetch_database(session=session, full=False, db=db, cached=True,
telemetry=True, ecosystem=Ecosystem.PYTHON,
from_cache=False)
fetch_database(session=session, full=True, db=db, cached=True,
telemetry=True, ecosystem=Ecosystem.PYTHON,
from_cache=False)
class PyProjectTomlHandler(PythonFileHandler):
def __init__(self) -> None:
super().__init__()
self.ecosystem = Ecosystem.PYTHON

Comment on lines +158 to +161
if 'tool' in data and 'poetry' in data['tool'] and 'dependencies' in data['tool']['poetry']:
for dep, version in data['tool']['poetry']['dependencies'].items():
dependencies.add(f"{dep}=={version}" if isinstance(version, str) else dep)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve handling of complex version specifications in Poetry dependencies.

When processing tool.poetry.dependencies, the current implementation might not handle complex version specifications or cases where the version is a table or None. This could lead to incorrect dependency parsing.

Consider enhancing the version handling logic:

             for dep, version in data['tool']['poetry']['dependencies'].items():
-                dependencies.add(f"{dep}=={version}" if isinstance(version, str) else dep)
+                if isinstance(version, str):
+                    dependencies.add(f"{dep}{version}")
+                elif version is None:
+                    dependencies.add(dep)
+                elif isinstance(version, dict):
+                    # Handle complex version specifications
+                    # Example: {'version': '^1.2.3'}
+                    if 'version' in version:
+                        dependencies.add(f"{dep}{version['version']}")
+                else:
+                    dependencies.add(dep)

This will ensure that all forms of version specifications are correctly parsed.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if 'tool' in data and 'poetry' in data['tool'] and 'dependencies' in data['tool']['poetry']:
for dep, version in data['tool']['poetry']['dependencies'].items():
dependencies.add(f"{dep}=={version}" if isinstance(version, str) else dep)
if 'tool' in data and 'poetry' in data['tool'] and 'dependencies' in data['tool']['poetry']:
for dep, version in data['tool']['poetry']['dependencies'].items():
if isinstance(version, str):
dependencies.add(f"{dep}{version}")
elif version is None:
dependencies.add(dep)
elif isinstance(version, dict):
# Handle complex version specifications
# Example: {'version': '^1.2.3'}
if 'version' in version:
dependencies.add(f"{dep}{version['version']}")
else:
dependencies.add(dep)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants