feat: export rcmgr metrics to prometheus #8785
looks fine to me, but I don't know how this prometheus business works in ipfs.
@marten-seemann : just checking my understanding, but it looks like one sharness test is not passing: https://app.circleci.com/pipelines/github/ipfs/go-ipfs/6284/workflows/1dc038d2-dd08-442e-93e5-30512e10193d/jobs/68339 I assume:
Yes, that test checks all exported metrics against a list, and fails if any is missing / not expected. I was planning to fix this later. Note: once libp2p gets a coherent metrics story (see libp2p/go-libp2p#1356), this test should probably be modified to exclude libp2p metrics. I'll leave that decision to the IPFS stewards though.
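For readers unfamiliar with this kind of check, here is a minimal Go sketch of the comparison described above: exported metric names are compared against an expected list, and anything missing or unexpected is reported. The actual check is a sharness test and may work differently; `diffMetrics`, the `ignore` parameter, and all metric names below are hypothetical. The `ignore` prefixes illustrate the suggested option of excluding libp2p metrics.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// diffMetrics compares exported metric names against an expected list.
// Names starting with any prefix in ignore are skipped on both sides.
// (Hypothetical helper, not part of go-ipfs.)
func diffMetrics(expected, exported, ignore []string) (missing, unexpected []string) {
	skip := func(name string) bool {
		for _, p := range ignore {
			if strings.HasPrefix(name, p) {
				return true
			}
		}
		return false
	}

	want := make(map[string]bool)
	for _, n := range expected {
		if !skip(n) {
			want[n] = true
		}
	}

	got := make(map[string]bool)
	for _, n := range exported {
		if skip(n) || got[n] {
			continue // ignored prefix, or a name we already handled
		}
		got[n] = true
		if !want[n] {
			unexpected = append(unexpected, n)
		}
	}

	for n := range want {
		if !got[n] {
			missing = append(missing, n)
		}
	}

	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}

func main() {
	// All metric names here are made up for illustration.
	expected := []string{"example_a_total", "example_b_total"}
	exported := []string{"example_a_total", "libp2p_rcmgr_example", "example_c_total"}

	missing, unexpected := diffMetrics(expected, exported, nil)
	fmt.Println("missing:", missing, "unexpected:", unexpected)
	// missing: [example_b_total] unexpected: [example_c_total libp2p_rcmgr_example]

	// Excluding libp2p metrics from the comparison.
	missing, unexpected = diffMetrics(expected, exported, []string{"libp2p_"})
	fmt.Println("missing:", missing, "unexpected:", unexpected)
	// missing: [example_b_total] unexpected: [example_c_total]
}
```

With a strict list, every new `libp2p_rcmgr_*` metric has to be added to the expected list before the test passes; with a prefix exclusion, libp2p can change its metrics without touching the list.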
@marten-seemann: please comment/ping when the test is updated/passing. Also, I assume we need to update so that this PR only shows the incremental diff on top of #8680? I'll make sure we then get reviewer eyes to land this.
LGTM, thank you!
- Rebased this on top of feat: opt-in Swarm.ResourceMgr (go-libp2p v0.18) #8680
- Refactored a bit to ensure these metrics are created and exposed only when `Swarm.ResourceMgr.Enabled`
* update go-libp2p to v0.18.0
* initialize the resource manager
* add resource manager stats/limit commands
* load limit file when building resource manager
* log absent limit file
* write rcmgr to file when IPFS_DEBUG_RCMGR is set
* fix: mark swarm limit|stats as experimental
* feat(cfg): opt-in Swarm.ResourceMgr

  This ensures we can safely test the resource manager without impacting default behavior.

  - Resource manager is disabled by default
  - Default for Swarm.ResourceMgr.Enabled is false for now
  - Swarm.ResourceMgr.Limits allows user to tweak limits per specific scope in a way that is persisted across restarts
  - 'ipfs swarm limit system' outputs human-readable json
  - 'ipfs swarm limit system new-limits.json' sets new runtime limits (but does not change Swarm.ResourceMgr.Limits in the config)

  Conventions to make libp2p devs life easier:

  - 'IPFS_RCMGR=1 ipfs daemon' overrides the config and enables resource manager
  - 'limit.json' overrides implicit defaults from libp2p (if present)
* docs(config): small tweaks
* fix: skip libp2p.ResourceManager if disabled

  This ensures 'ipfs swarm limit|stats' work only when enabled.
* fix: use NullResourceManager when disabled

  This reverts commit b19f7c9. after clarification feedback from #8680 (comment)
* style: rename IPFS_RCMGR to LIBP2P_RCMGR

  preexisting libp2p toggles use LIBP2P_ prefix
* test: Swarm.ResourceMgr
* fix: location of opt-in limit.json and rcmgr.json.gz

  Places these files inside of IPFS_PATH
* Update docs/config.md
* feat: expose rcmgr metrics when enabled (#8785)
  * add metrics for the resource manager
  * export protocol and service name in Prometheus metrics
  * fix: expose rcmgr metrics only when enabled

  Co-authored-by: Marcin Rataj <lidel@lidel.org>
* refactor: rcmgr_metrics.go
* refactor: rcmgr_defaults.go

  This file defines implicit limit defaults used when Swarm.ResourceMgr.Enabled

  We keep vendored copy to ensure go-ipfs is not impacted when go-libp2p decides to change defaults in any of the future releases.
* refactor: adjustedDefaultLimits

  Cleans up the way we initialize defaults and adds a fix for case when connection manager runs with high limits.

  It also hides `Swarm.ResourceMgr.Limits` until we have a better understanding what syntax makes sense.
* chore: cleanup after a review
* fix: restore go-ipld-prime v0.14.2
* fix: restore go-ds-flatfs v0.5.1

Co-authored-by: Lucas Molas <schomatis@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
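
The opt-in convention from the commit message (the resource manager is off by default, and `LIBP2P_RCMGR=1` overrides the config and enables it) can be sketched roughly as follows. This is an illustration under assumptions, not the go-ipfs implementation: `rcmgrEnabled` is a hypothetical helper, `cfgEnabled` stands in for `Swarm.ResourceMgr.Enabled`, and only the value `1` is handled because that is the only one the commit message mentions.

```go
package main

import (
	"fmt"
	"os"
)

// rcmgrEnabled reports whether the resource manager should be turned on.
// cfgEnabled stands in for Swarm.ResourceMgr.Enabled (false by default).
// getenv is injected so the decision can be tested without touching the
// real process environment. (Hypothetical helper, not part of go-ipfs.)
func rcmgrEnabled(cfgEnabled bool, getenv func(string) string) bool {
	// The environment variable overrides the config and enables the manager.
	if getenv("LIBP2P_RCMGR") == "1" {
		return true
	}
	return cfgEnabled
}

func main() {
	// With the default config (disabled), the result depends only on
	// whether LIBP2P_RCMGR=1 is set in the environment.
	fmt.Println("resource manager enabled:", rcmgrEnabled(false, os.Getenv))
}
```

When this decision comes out false, the commit message says a NullResourceManager is used instead, and the `libp2p_rcmgr_*` metrics are not created or exposed.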
Part of #8761
Adds basic metrics under `libp2p_rcmgr_*`
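
To see which metrics a node exposes under that prefix, one option is to filter the Prometheus text output by name. The sketch below assumes the standard Prometheus text exposition format; `metricNames` is a hypothetical helper, and the metric names in the sample payload are invented for illustration, not the names this PR adds.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// metricNames returns the sorted, de-duplicated metric names found in a
// Prometheus text-format payload that start with prefix.
// (Hypothetical helper, not part of go-ipfs.)
func metricNames(payload, prefix string) []string {
	seen := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(payload))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		// The name ends at the first '{' (labels) or space (value).
		end := strings.IndexAny(line, "{ ")
		if end < 0 {
			end = len(line)
		}
		if name := line[:end]; strings.HasPrefix(name, prefix) {
			seen[name] = true
		}
	}

	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Invented sample payload; real metric names will differ.
	payload := `# HELP libp2p_rcmgr_example_conns hypothetical gauge
# TYPE libp2p_rcmgr_example_conns gauge
libp2p_rcmgr_example_conns{dir="inbound"} 3
libp2p_rcmgr_example_conns{dir="outbound"} 5
go_goroutines 42
`
	fmt.Println(metricNames(payload, "libp2p_rcmgr_"))
	// [libp2p_rcmgr_example_conns]
}
```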
Demo sample