Improved "GetPublishedAtUtcAsync" method efficiency #436
Conversation
…lone from the cached json doc.
/azp run
Commenter does not have sufficient privileges for PR 436 in repo microsoft/OSSGadget
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Commenter does not have sufficient privileges for PR 436 in repo microsoft/OSSGadget
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
This is generally straightforward - it gets straight to the point, and should run very fast. I'm curious whether you have measured the speed of this change before and after for a package having many thousands of versions, like say https://www.npmjs.com/package/@graphql-codegen/cli?
As a side note, I would point out that the implementations of GetPackageMetadataAsync are left without any improvement with this approach. That's not necessarily a problem, but there is a certain amount of elegance in the approach taken by the prior implementation, which leaned on GetPackageMetadataAsync to be the one thing that parses package metadata (whereas this approach duplicates some of that parsing logic). That's not a dealbreaker - the approach in this PR simplifies things - but it's a tradeoff that might need to be revisited later.
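For illustration, here is a minimal sketch of the kind of targeted parse being discussed, assuming the registry document is available as a cached `JsonDocument`. The method shape and names are hypothetical, not the PR's actual code; the top-level `time` property keyed by version string is the standard npm registry metadata layout.

```csharp
using System;
using System.Text.Json;

public static class NpmPublishTime
{
    // Hypothetical sketch: read only the publish timestamp for one version
    // from the cached npm registry document, instead of materializing the
    // full PackageMetadata object for every version.
    public static DateTime? GetPublishedAtUtc(JsonDocument registryDoc, string version)
    {
        // The npm registry exposes per-version publish times under the
        // top-level "time" property, keyed by version string.
        if (registryDoc.RootElement.TryGetProperty("time", out JsonElement times) &&
            times.TryGetProperty(version, out JsonElement published) &&
            published.TryGetDateTime(out DateTime publishedAt))
        {
            return publishedAt.ToUniversalTime();
        }
        return null;
    }
}
```

Because this reads a single property rather than deserializing metadata for every version, it avoids the per-version work that made the original call slow.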
However, there are other more significant points that we must address when we start thinking about caching:
- The memory cache object employed by GetJsonCache has a size limit of about 8 MB, whereas the size of the JSON for graphql-codegen/cli is ~25 MB. So we likely need to increase the size limit of the cache, perhaps to say 100 MB. To go with that, we would likely need to bump the memory allocation for Terrapin containers up by an additional 50 to 100 MB.
- When adding entries to the cache, we do not set an expiration time. As a result, objects may be cached indefinitely. This does not matter all that much for OSS Gadget CLIs that terminate after a few moments, but it will matter a lot for Terrapin processes, which keep running for hours or days at a time. We should probably specify a sliding expiration of something like 30 minutes and an absolute expiration of something like 6 hours to ensure that data in Terrapin stays fresh. It would be ideal to do this for all places where we set values in the cache (a sketch of these settings follows below).
Those are what come to mind first - getting caching right can be tricky, so I'll let you know if I think of anything else later. Let me know if you have any questions on the above.
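To make the suggestion concrete, here is a rough sketch of the cache configuration described above, using Microsoft.Extensions.Caching.Memory. The numbers mirror the comment; the method and variable names are illustrative assumptions, not OSSGadget's actual code.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

// Illustrative: a larger size limit plus sliding/absolute expiration,
// per the review comment above.
var cache = new MemoryCache(new MemoryCacheOptions
{
    SizeLimit = 100 * 1024 * 1024, // ~100 MB, up from the ~8 MB limit mentioned above
});

void CacheRegistryJson(string packageName, string json)
{
    var options = new MemoryCacheEntryOptions
    {
        // Size is counted against SizeLimit; approximated here by string length.
        Size = json.Length,
        // Keep entries fresh in long-running (Terrapin) processes.
        SlidingExpiration = TimeSpan.FromMinutes(30),
        AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(6),
    };
    cache.Set(packageName, json, options);
}
```

Note that once a SizeLimit is set, every entry must declare a Size so the cache can enforce the limit.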
We also discussed offline how NPM does not set the …
Increased cache size. Added CacheInvalidation and CacheExpiration.
As discussed offline, increased the cache limit and added cache expiration.
I have changed this to get the Content-Length from the HTTP response.
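As a point of reference, a sketch of where the content length lives on an HttpClient response; the URL and variable names are illustrative, and presumably this value feeds the cache entry's Size accounting mentioned above.

```csharp
using System.Net.Http;

// Illustrative: Content-Length, when the server supplies it, is exposed
// on the response's content headers; it may be null (e.g. chunked responses).
using HttpClient client = new();
using HttpResponseMessage response = await client.GetAsync(
    "https://registry.npmjs.org/lodash", HttpCompletionOption.ResponseHeadersRead);
long? contentLength = response.Content.Headers.ContentLength;
```

If the header is absent, falling back to the length of the downloaded content is a reasonable approximation.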
Good progress, but a few things could still use tweaking.
Force-pushed from 1821588 to ee22d66
Force-pushed from ee22d66 to 2e7609a
Latest changes look great! Thanks Mounika.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
The current implementation of "GetPublishedAtUtcAsync" fetches all the metadata for a given package version, which causes significant delay. To improve latency, the "GetPublishedAtUtcAsync" method has been optimized to fetch the upload time alone.