Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix trivy scan and enable caching of vulnerability DB #3192

Merged
merged 5 commits into from
Oct 31, 2024

Conversation

DaMandal0rian
Copy link
Contributor

@DaMandal0rian DaMandal0rian commented Oct 29, 2024

This PR fixes the trivy scanner by caching the vulnerability DB and downloading a new update every 24 hours. We are currently being rate-limited which is related to this issue

  • bumped trivy-action to v0.28.0

Code contributor checklist:

Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure this workflow will be reliable, can we tweak it slightly to avoid some documented issues?

.github/workflows/trivy-security-scan.yml Show resolved Hide resolved
.github/workflows/trivy-security-scan.yml Outdated Show resolved Hide resolved
.github/workflows/trivy-security-scan.yml Outdated Show resolved Hide resolved

- name: Download and extract the vulnerability DB
run: |
mkdir -p $GITHUB_WORKSPACE/.cache/trivy/db
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$GITHUB_WORKSPACE is already the default, why using it explicitly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my preference for readability but I have changed it.

key: cache-trivy-${{ steps.date.outputs.date }}

trivy_scan_image:
needs: update-trivy-db
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will result in update happening every single time, essentially defeating the purpose of caching, in fact caching just does more work here for no reason.

What should be done instead is a separate workflow like described in https://github.com/aquasecurity/trivy-action?tab=readme-ov-file#updating-caches-in-the-default-branch that updates cache periodically and then use it in this workflow.

Even better approach and something I'd do would be to only update the cache if it is missing and then if cache recovery (for current date) is successful, we'll skip downloading. I have implemented this logic for GTK4 building here, the only major difference here is that the cache key will include a date.

- name: Cache DBs
uses: actions/cache@v4
with:
path: ${{ github.workspace }}/.cache/trivy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is equivalent:

Suggested change
path: ${{ github.workspace }}/.cache/trivy
path: .cache/trivy

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will download cache at 3 times, which is slow and will cost more in case bandwidth is not free.

What do you think about combining check-trivy-db-cache, update-trivy-db and trivy_scan_image into a single job that tries to download the cache, downloads database if cache is missing, then does all the checks it needs with database already in place?

Same as Space Acres repo checks cache, builds GTK4 if cache was missing and then continues doing what it was supposed to.

Comment on lines 75 to 79
- name: Cache DBs
uses: actions/cache@v4
with:
path: .cache/trivy
key: cache-trivy-${{ needs.check-trivy-db-cache.outputs.date }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this download and override the cache you wanted to store instead?

@DaMandal0rian
Copy link
Contributor Author

check-trivy-db-cache, update-trivy-db and trivy_scan_image into a single job that tries to download the cache, downloads database if cache is missing, then does all the checks it needs with database already in place?

That makes sense, will make it into one job, that way will be more efficient.

Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good now, just one minor nitpick

.github/workflows/trivy-security-scan.yml Show resolved Hide resolved
teor2345

This comment was marked as duplicate.

Copy link
Member

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well, thanks!

@DaMandal0rian DaMandal0rian added this pull request to the merge queue Oct 31, 2024
Merged via the queue into main with commit 30d1504 Oct 31, 2024
8 checks passed
@DaMandal0rian DaMandal0rian deleted the hotfix/trivy-scan branch October 31, 2024 07:26
@nazar-pc
Copy link
Member

Well, looks like this didn't quite work, there are some errors, for example here: https://github.com/autonomys/subspace/actions/runs/11616240335/job/32348713553

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants