Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write packages and vulnerabilities at the time of import #1280

Merged

Conversation

TG1999
Copy link
Contributor

@TG1999 TG1999 commented Aug 28, 2023

public instance of VCIO contains more than 35 million advisories ( the volume is so high, due to minor changes in every advisory maybe sometime or the changes in models), which takes a lot of time. If we improve the advisories just after the import this will drop volume hugely.

Copy link
Collaborator

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits for your consideration.

@@ -36,11 +37,18 @@ class ImproveRunner:
improver and parsing the returned Inferences into proper database fields
"""

def __init__(self, improver_class):
self.improver_class = improver_class
def __init__(self, improver_class=None, improver: Improver = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be keeping the class design and passing kwargs can work too?

Suggested change
def __init__(self, improver_class=None, improver: Improver = None):
def __init__(self, improver_class, **kwargs):
self.improver = improver_class(**kwargs)

f"Inconsistent summary for {vulnerability!r}. "
f"Existing: {vulnerability.summary}, provided: {summary}"
)
summary = summary.strip()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be best in another PR

@TG1999
Copy link
Contributor Author

TG1999 commented Aug 31, 2023

@DennisClark we need your input on this, we use advisories as a screenshot of the status of the advisory at a given point of time. A single advisory goes through many improver processes to draw inferences, what should be the best possible way to get a status on advisory model to know which improvers have already ran through a particular advisory ?

@DennisClark
Copy link

@TG1999 Here are a few thoughts about advisories:

  • every advisory should be associated with a "date_detected" (or similar), which can be the date+time that the advisory was originally posted to VC.
  • since the number and types of potential improvers can vary, we may want to create a field (perhaps "associated_improvers") that can contain one or more pairs of improver_name:improver_date values, perhaps in a json structure or similar, that is updated whenever an improver is processed on the advisory
  • alternatively, the new field would initially identify all the potential improver_name values with a null in the improver_date that is set when the improver is actually run for that advisory
  • all of those new elements should be visible in the VCIO UI.
  • the process(es) to initiate advisory improvements should select those where an appropriate improver has not yet been processed for that advisory.

I don't think we necessarily need a "status" on the advisory, since the list of improver_name:improver_date values ought to give us what would be really useful here (what has been run and when).

@TG1999
Copy link
Contributor Author

TG1999 commented Sep 1, 2023

@DennisClark thanks for these excellent suggestions.

@TG1999 TG1999 force-pushed the write_packages_and_vulns_while_importing branch 2 times, most recently from 0fc89ac to 6b4dc69 Compare September 5, 2023 17:41
@TG1999 TG1999 force-pushed the write_packages_and_vulns_while_importing branch 4 times, most recently from e92ab1f to 8234a49 Compare September 11, 2023 13:16
Copy link
Collaborator

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some nits for your consideration

vulnerabilities/import_runner.py Outdated Show resolved Hide resolved
vulnerabilities/import_runner.py Show resolved Hide resolved
vulnerabilities/import_runner.py Show resolved Hide resolved
vulnerabilities/importer.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a few nits... Either apply now or later!
LGTM otherwise!

logger.info(f"Finished import for {importer_name}. Imported {count} advisories.")

def do_import(self, advisories) -> None:
advisory_importer = AdvisoryBasedDefaultImprover(advisories=advisories)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are now using importer ... why still keeping the improver around in names of variables and objects?

)
except Exception as e:
logger.info(f"Failed to process advisory: {advisory!r} with error {e!r}")
logger.info("Finished improving using %s.", advisory_importer.__class__.qualified_name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.info("Finished improving using %s.", advisory_importer.__class__.qualified_name)
logger.info("Finished importing using %s.", advisory_importer.__class__.qualified_name)

Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@TG1999 TG1999 force-pushed the write_packages_and_vulns_while_importing branch from d74d9fa to 0ef58f2 Compare September 21, 2023 17:52
@TG1999 TG1999 merged commit 7ece023 into aboutcode-org:main Sep 21, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants