TRAP caching causes GitHub Actions Cache to churn #2030
Comments
Thanks for raising this issue and the suggestion. We are considering making some changes to the TRAP caching feature so that it only runs on the default branch. This will ensure that each repo caches no more than one set of TRAP files. TRAP caching only really makes sense when a repo runs CodeQL analysis on a schedule. Most repos only use scheduled runs on the default branch, so this should be workable for most repos. For your monorepo, is the issue that you are running analyses in PR branches when the code to be analyzed hasn't changed? If so, perhaps you can use path expressions to avoid running CodeQL in this case.
Almost all of our TRAP cache is from our default branch, which we merge to more than 60 times a day on average. Even if this only ran on the default branch, it would continue to exhaust the available space in the GHA cache because of the differing cache keys. If the desire is to continue using the GHA cache, and there's only a point in caching the most recent run, we'll have to choose an arbitrary cache key and manually invalidate the old cache when we want to update it. Otherwise, each CodeQL run against our default branch will add another 100 MB to our usage. (Edit: I'd expect race conditions to emerge if multiple CodeQL workflow runs attempt to manage the cache simultaneously, however.)
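To put rough numbers on the churn described above, here is a back-of-the-envelope sketch. It uses the figures quoted in this thread (about 100 MB per run, about 60 merges per day) and assumes a 10 GB per-repository Actions cache limit, which is not stated in this thread. The function name `days_until_eviction` is illustrative only.

```python
def days_until_eviction(limit_mb: float, per_run_mb: float, runs_per_day: float) -> float:
    """Days of default-branch runs before new entries start pushing out old ones."""
    return limit_mb / (per_run_mb * runs_per_day)


PER_RUN_MB = 100      # approximate TRAP cache size per run, from the thread
RUNS_PER_DAY = 60     # approximate merges to the default branch per day, from the thread
LIMIT_MB = 10_000     # ASSUMPTION: 10 GB per-repository cache limit

daily_mb = PER_RUN_MB * RUNS_PER_DAY
print(daily_mb)                                                          # 6000
print(round(days_until_eviction(LIMIT_MB, PER_RUN_MB, RUNS_PER_DAY), 2))  # 1.67
```

Under these assumptions the TRAP cache alone would turn over the whole allowance in under two days, before counting any other caches the repository uses.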
Another possible approach might be to save the TRAP cache as an artifact, then pull the most recent artifact from the workflow run on the default branch when restoring the cache, though I seem to recall that arbitrary artifact access might need a PAT, which would be something of a dealbreaker...
Our hope is that there will only ever be one cached item and it would contain the most recent TRAP on the default branch. Before uploading the new TRAP cache, the old TRAP cache would be removed. (There would be a few moments when there is no cache since the old one was deleted and the new one isn't uploaded yet, but hopefully that window will be small.)
Ah, then the idea is to keep the cache key consistent and use the API to remove the old cache so that the entry under that key can be updated? That should work.
Yes. That's right.
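The delete-then-upload idea agreed on above can be sketched as a small simulation. This is not the CodeQL Action's implementation: `FakeCacheStore`, `replace_trap_cache`, and the key names are hypothetical, and the store is an in-memory stand-in that assumes an existing key cannot be overwritten and must be deleted first.

```python
class FakeCacheStore:
    """In-memory stand-in for a cache service (hypothetical, not a real API)."""

    def __init__(self):
        self.entries = {}  # maps cache key -> size in MB

    def delete(self, key):
        """Remove an entry; return True if it existed."""
        return self.entries.pop(key, None) is not None

    def save(self, key, size_mb):
        """Store an entry; ASSUMPTION: an existing key cannot be overwritten."""
        if key in self.entries:
            raise KeyError(f"cache key already exists: {key}")
        self.entries[key] = size_mb

    def total_mb(self):
        return sum(self.entries.values())


def replace_trap_cache(store, key, size_mb):
    """Delete the old entry, then upload the new one under the same key."""
    store.delete(key)          # after this call there is briefly no cache
    store.save(key, size_mb)   # the new TRAP cache takes over the key


per_commit = FakeCacheStore()  # behaviour described in the issue: a new key per run
fixed_key = FakeCacheStore()   # proposed behaviour: one stable key
for run in range(60):
    per_commit.save(f"trap-{run}", 100)
    replace_trap_cache(fixed_key, "trap-default-branch", 100)

print(per_commit.total_mb())  # 6000
print(fixed_key.total_mb())   # 100
```

The sketch runs sequentially, so it does not model the race mentioned earlier in the thread, where two workflow runs delete and upload at the same time; a real implementation would need to tolerate a failed save or a missing entry.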
Any progress on this? This is causing us a lot of caching issues, given the very limited amount of cache available and the fact that you cannot increase it. We have already disabled CodeQL on all branches other than
I recommend that you disable TRAP caching. You can set the environment variable
I'd like to discuss alternative storage solutions for TRAP caching, as the current implementation causes our monorepo to exhaust our available cache. Each run appears to consume about 100 MB of space, and our current merge-to-default velocity means we run out of cache in about a day.
I am aware of the ability to disable trap_caching within the init action; however, I would also like to continue using TRAP caching at some point in the future.
I have successfully used GitHub Packages (i.e., GHCR) for caching Docker images and other build artifacts in the past. Would that be another possible path forward for this useful feature?