kopf does not fire handlers after re-authentication #1036

Closed
lkoniecz opened this issue Jul 11, 2023 · 3 comments · Fixed by #1109
Labels
bug Something isn't working

Comments

@lkoniecz

Long story short

The process had been running for a while.
After re-authentication ran, kopf no longer reacts to any resources being created.

Kopf version

1.36.1

Kubernetes version

1.24

Python version

3.9

Code

default @kopf.on.startup()

Logs

[2023-07-11 14:04:20,360] kopf.objects         [DEBUG   ] [lukasz-operator/health] Handling cycle is finished, waiting for new changes.
[2023-07-11 14:39:56,795] kopf.objects         [DEBUG   ] [lukasz-operator/hello-world-1] Adding the finalizer, thus preventing the actual deletion.
[2023-07-11 14:39:56,796] kopf.objects         [DEBUG   ] [lukasz-operator/hello-world-1] Patching with: {'metadata': {'finalizers': ['kopf.zalando.org/KopfFinalizerMarker']}}
[2023-07-11 14:39:57,265] kopf._core.engines.a [INFO    ] Re-authentication has been initiated.
[2023-07-11 14:39:57,266] kopf.activities.auth [DEBUG   ] Activity 'login_via_client' is invoked.
[2023-07-11 14:39:58,376] kopf.activities.auth [DEBUG   ] Client is configured via kubeconfig file.
[2023-07-11 14:39:58,377] kopf.activities.auth [INFO    ] Activity 'login_via_client' succeeded.
[2023-07-11 14:39:58,377] kopf._core.engines.a [INFO    ] Re-authentication has finished.

Additional information

No response

@lkoniecz lkoniecz added the bug Something isn't working label Jul 11, 2023
@portswigger-tim

I suspect the issue I have has the same root cause (whatever it might be) in 1.36.1.

I have an aiocron task that runs regularly to refresh credentials from a 3rd party API in a Memo.

Works on 1.36.0, not on 1.36.1 🤔

@asteven
Contributor

asteven commented Jul 22, 2023

I'm having the same problem.

After several hours of debugging I think I found the reason.

kopf initiates re-authentication when no usable ConnectionInfo objects exist in its so-called Vault.

This can happen either when kopf gets an 'unauthorized' error from the API server or when the ConnectionInfo you are using has expired. kopf removes the no-longer-working ConnectionInfo objects from its Vault and, while doing so, also calls .close() on the underlying aiohttp.ClientSession. The next task that needs a ConnectionInfo then triggers the 're-authentication'.

The problem is that the old aiohttp.ClientSession instance is now closed, but there may still be open aiohttp.ClientResponse objects in the system that depend on that session. The tasks reading from them hang until they are interrupted in some way, e.g. by a connection reset or a timeout. Depending on your settings, this can take a long time or forever.

You can work around this by setting a low client_timeout, e.g.

settings.watching.client_timeout = 60

With this setting your operator should recover after that timeout has expired.

I have a patch that fixes the problem by keeping track of all unclosed response objects so that they can be properly closed before the session is closed. With this patch, I can no longer reproduce the deadlock.
I'll test it some more and then submit a PR.
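The idea can be illustrated with a toy model. This is not the actual patch and does not use aiohttp: `FakeResponse` and `TrackingSession` are hypothetical stand-ins built on plain asyncio, only meant to show why closing the tracked responses before the session releases a blocked reader.

```python
import asyncio
import weakref


class FakeResponse:
    """Hypothetical stand-in for an open streaming response (not aiohttp)."""

    def __init__(self) -> None:
        self._closed = asyncio.Event()

    async def read_forever(self) -> str:
        # Simulates a watch stream with no incoming data:
        # it returns only once the response has been closed.
        await self._closed.wait()
        return "closed"

    def close(self) -> None:
        self._closed.set()


class TrackingSession:
    """Hypothetical session that remembers the responses it handed out."""

    def __init__(self) -> None:
        # Weak references, so finished responses are not kept alive.
        self._responses: "weakref.WeakSet[FakeResponse]" = weakref.WeakSet()
        self.closed = False

    def request(self) -> FakeResponse:
        response = FakeResponse()
        self._responses.add(response)
        return response

    async def close(self) -> None:
        # Close every still-open response first so blocked readers wake up,
        # then mark the session itself as closed.
        for response in list(self._responses):
            response.close()
        self.closed = True


async def main() -> str:
    session = TrackingSession()
    response = session.request()
    reader = asyncio.create_task(response.read_forever())
    await asyncio.sleep(0)  # let the reader start and block
    await session.close()
    # Without the loop in close(), this wait_for would hit its timeout.
    return await asyncio.wait_for(reader, timeout=1)


result = asyncio.run(main())
print(result)
```

If the loop over `_responses` is removed from `close()`, the reader task stays blocked and only the `wait_for` timeout ends it, which mirrors the hang described above.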
