Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak in go routine unmarshalling secrets #385

Closed
ckadner opened this issue May 31, 2023 · 1 comment · Fixed by #397
Closed

Memory leak in go routine unmarshalling secrets #385

ckadner opened this issue May 31, 2023 · 1 comment · Fixed by #397
Assignees
Labels
bug Something isn't working
Milestone

Comments

@ckadner
Copy link
Member

ckadner commented May 31, 2023

Describe the bug

Memory leak in go routine that unmarshalls secrets. Repprted by @Jooho after a load test with 1000 simulated users.

https://github.com/kserve/modelmesh-serving/blob/main/pkg/mmesh/etcdrangewatcher.go#L97-L196

Analysis

From @tjohnson31415:

The EtcdRangeWatcher does not have a way to exit the go-routine: there is no break out of the "main" refresh_loop.
The next question is if the number of watches should be increasing. I'm looking at how these are created now.
The watchers are created in refreshWatches
Which is called from UpdateWatchedService
which has if checks in both cases that it calls refreshWatches:

  1. if etcdSecretName != mes.secretName
  2. else if serviceName != nw.watchedServiceName
    The second check with serviceName might be the problem (the etcd secret shouldn't be changing much). It looks like the nw objects are created without a watchedServiceName being set:
		nw = &namespaceWatch{}

REF
So nw.watchedServiceName might always be an empty string 🤔

Screenshots

image

Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]

Additional context

TODO

@ckadner ckadner added the bug Something isn't working label May 31, 2023
@ckadner ckadner modified the milestone: v0.11.0 Jun 2, 2023
@ckadner
Copy link
Member Author

ckadner commented Jun 13, 2023

TODO: break out separate issue for Secret unmarshalling

@heyselbi heyselbi moved this from Backlog to Groomed Issues and Trackers in ODH Model Serving Planning Jul 11, 2023
@ckadner ckadner added this to the v0.11.0 milestone Jul 11, 2023
@taneem-ibrahim taneem-ibrahim moved this from Groomed Issues to In Progress in ODH Model Serving Planning Jul 13, 2023
@taneem-ibrahim taneem-ibrahim moved this from In Progress to Under Review in ODH Model Serving Planning Jul 13, 2023
@ckadner ckadner linked a pull request Aug 15, 2023 that will close this issue
@ckadner ckadner modified the milestones: v0.11.0, v0.11.1 Aug 29, 2023
ckadner pushed a commit that referenced this issue Sep 19, 2023
Stop goroutines when the label `modelmesh-enabled` is
removed from a namespace or the namespace is deleted.
If the request namespace does not exist, do not retry to
get the information again.

Closes #385

---------

Signed-off-by: jooho <jlee@redhat.com>
@github-project-automation github-project-automation bot moved this from Under Review to Done in ODH Model Serving Planning Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants