Segfault from ECDS update #14930
Is there any more information you can provide? A repro would be ideal, but the config used for Envoy, the config pushed via ECDS, the Envoy version, etc. would all be helpful in debugging this.
This is from a running Istio cluster. I am not yet able to reproduce it. I am trying to get a core dump in case this happens again.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
I can easily reproduce this now by deleting and recreating a filter which uses ECDS. I added some log instrumentation and found that the crash comes from this line at
I enabled debug logging and found that there is a simultaneous listener update when the ECDS update happens. @lambdai has a theory that the crash occurs because the factory context here is from the old listener. cc @kyessenov
The factory context looks like the server factory context, right? So it's the same across listeners.
@kyessenov I am not sure. You and @lambdai know this better than me : ) Could someone please reopen this issue? Maybe @snowp?
OK, feel free to assign it to me or @lambdai.
Asan logs:
It seems the subscription holds a raw pointer to the provider and uses it from an asynchronous cross-thread post, while the provider, which is owned by a listener, has already been deleted on the main thread. This will need restructuring to use shared and weak pointers.
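A minimal sketch of the shared/weak pointer restructuring described above. This is not Envoy's actual code: `FakeDispatcher`, `ConfigProvider`, `Subscription`, and `runDemo` are hypothetical names, and the dispatcher is a single-threaded queue that only imitates a deferred cross-thread post. The point is that the posted callback locks a `std::weak_ptr` before touching the provider, so an update that arrives after the owning listener is gone is skipped instead of dereferencing freed memory.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a dispatcher: posted callbacks run later, when
// drain() is called, imitating an asynchronous post.
class FakeDispatcher {
public:
  void post(std::function<void()> cb) { callbacks_.push(std::move(cb)); }
  void drain() {
    while (!callbacks_.empty()) {
      auto cb = std::move(callbacks_.front());
      callbacks_.pop();
      cb();
    }
  }

private:
  std::queue<std::function<void()>> callbacks_;
};

// Hypothetical provider of dynamic filter config, owned by a listener.
class ConfigProvider {
public:
  void onConfigUpdate(const std::string& config) {
    current_ = config;
    ++updates_;
  }
  const std::string& current() const { return current_; }
  int updates() const { return updates_; }

private:
  std::string current_;
  int updates_{0};
};

// Hypothetical subscription: it keeps a weak_ptr instead of a raw pointer.
class Subscription {
public:
  Subscription(FakeDispatcher& dispatcher, std::weak_ptr<ConfigProvider> provider)
      : dispatcher_(dispatcher), provider_(std::move(provider)),
        skipped_(std::make_shared<int>(0)) {}

  void deliver(const std::string& config) {
    // Capture by value so the callback does not depend on `this`.
    dispatcher_.post([weak = provider_, skipped = skipped_, config]() {
      if (auto provider = weak.lock()) {
        provider->onConfigUpdate(config); // provider is kept alive for this call
      } else {
        ++*skipped; // owner already destroyed the provider: drop the update
      }
    });
  }
  int skipped() const { return *skipped_; }

private:
  FakeDispatcher& dispatcher_;
  std::weak_ptr<ConfigProvider> provider_;
  std::shared_ptr<int> skipped_;
};

struct DemoResult {
  int applied;
  int skipped;
};

DemoResult runDemo() {
  FakeDispatcher dispatcher;
  auto provider = std::make_shared<ConfigProvider>(); // "listener" ownership
  Subscription subscription(dispatcher, provider);

  subscription.deliver("v1");
  dispatcher.drain(); // provider alive: the update is applied
  assert(provider->current() == "v1");
  const int applied = provider->updates();

  subscription.deliver("v2");
  provider.reset();   // listener removed before the callback runs
  dispatcher.drain(); // weak_ptr has expired: the update is skipped
  return {applied, subscription.skipped()};
}
```

With a raw pointer in place of the `weak_ptr`, the second `drain()` would touch a destroyed object, which is the kind of use-after-free that ASan reports.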
/reopen
The problem is still reproducible with the 1.21.2-dev release.
@rahulanand16nov do you have a stack trace or a reproducer?
@kyessenov Sorry for the noise, but for some reason it's working now, even though it was reproducible when I recreated the whole cluster previously. Very weird.
2022-06-29T08:40:44.324552Z critical envoy assert assert failure: Thread::MainThread::isMainThread().
2022-06-29T08:40:45.308881Z critical envoy backtrace #14: std::__shared_count<>::
2022-06-29T08:40:46.607164Z critical envoy backtrace #34: std::__shared_ptr<>::operator=() [0x563ae820b953]
It seems it happened again; the stack trace is above. Could the lambda update be causing the ECDS segmentation fault?
Can you confirm this is on the latest Envoy? The error is about removal of a filter from a non-main worker, which I think we handled.
The Envoy version we used is and we cherry-picked this PR
I don't seem to have SHA
@kyessenov I use Istio 1.13.7 and Envoy version e0c6f64173cd0db9370e7ad8b1fbcee370a9f0f3/1.21.5/Clean/RELEASE/BoringSSL. A segmentation fault occurred when I modified a Wasm filter configuration.
Istio 1.13 doesn't have this fix.
Got a crash when updating filter config with ECDS.