-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix data leak between mux.cool connections #3718
Conversation
Fix XTLS#116 This issue does not appear at all if mux.cool is disabled. Maybe the issue can be avoided if routing is disabled too, but it's possible that the same issue appears elsewhere. Tests should be added. ***The mux server has no tests at all!*** Step-by-step: 1. The mux server is calling handleStatusNew with the same context multiple times. That function then calls w.dispatcher.Dispatch 2. Dispatch assigns some values specific to the connection onto the outbound here: https://github.com/XTLS/Xray-core/blob/83eef6bc1f554be84aeb799417688a070cd32ab8/app/dispatcher/default.go#L241-L242 3. Because the outbounds array on the connection object is the same across multiple connections (it's the same context), the value of outbound.Target flips back and forth between values such as "1.1.1.1" or "chat.facebook.com" (IPAddress vs DomainAddress) 4. Many layers deeper in routing, as part of this Dispatch call, this function is called: https://github.com/XTLS/Xray-core/blob/83eef6bc1f554be84aeb799417688a070cd32ab8/features/routing/session/context.go#L53-L55 It does two things, in order: 1. Checks if Target is an IPAddress (IsIP) 2. If it's an IP, call IP() Because Target keeps mutating back and forth between multiple kinds of address, it sometimes is an IP for Step 1, and a DomainAddress for Step 2 5. Calling IP() on a DomainAddress panics: panic: Calling IP() on a DomainAddress. goroutine 123018 [running]: github.com/xtls/xray-core/common/net.domainAddress.IP(...) github.com/xtls/xray-core/common/net/address.go:172 github.com/xtls/xray-core/features/routing/session.(*Context).GetTargetIPs(0xc004175428) github.com/xtls/xray-core/features/routing/session/context.go:54 +0x5f github.com/xtls/xray-core/app/router.(*MultiGeoIPMatcher).Apply(0xc0003a63e0, {0x156c770?, 0xc004175428?}) github.com/xtls/xray-core/app/router/condition.go:143 +0x3e github.com/xtls/xray-core/app/router.(*ConditionChan).Apply(0x495c9d?, {0x156c770, 0xc004175428}) github.com/xtls/xray-core/app/router/condition.go:32 +0x5c github.com/xtls/xray-core/app/router.(*Rule).Apply(...) github.com/xtls/xray-core/app/router/config.go:30 github.com/xtls/xray-core/app/router.(*Router).pickRouteInternal(0xc000367a40, {0x156c770, 0xc004175428}) github.com/xtls/xray-core/app/router/router.go:196 +0x176 github.com/xtls/xray-core/app/router.(*Router).PickRoute(0x1562810?, {0x156c770?, 0xc004175428?}) github.com/xtls/xray-core/app/router/router.go:84 +0x25 github.com/xtls/xray-core/app/dispatcher.(*DefaultDispatcher).routedDispatch(0xc000398fc0, {0x1562810, 0xc0050cd350}, 0xc00418dea0, {{0x15625a8, 0xc005974c10}, 0x1bb, 0x2}) github.com/xtls/xray-core/app/dispatcher/default.go:420 +0x307 created by github.com/xtls/xray-core/app/dispatcher.(*DefaultDispatcher).Dispatch in goroutine 123015 github.com/xtls/xray-core/app/dispatcher/default.go:252 +0x56c
Without mux, the outbounds array is reset entirely for every connection: https://github.com/mmmray/Xray-core/blob/7a80df07b2aa83685d41d72885d201ab47ef9c05/app/proxyman/inbound/worker.go#L83 maybe it's a more appropriate fix? |
@yuhan6665 or @RPRX please review, if you have ideas on how to test, tell me as well. |
Thanks for your great investigation @mmmray! |
Going even further, I don't know why the context is used to carry Target at all. I suppose that somewhere, it was too annoying to pass as function argument, but I did not notice it. But i'm also new to the code. |
In general, context should contain (as much as) session level information. I think inbound, outbounds so on and so forth info is ok. XTLS also use it pass raw conn object. A bit strange but I think it is also ok. |
thanks for investigating and fixing , please build new version of core soon as possible |
Indeed I think a new version should be released soon. There are a few bugfixes, and this new fragment PR sitting around in master doesn't help with its lifetime. I was hoping to wait for #3711 though. Let's see. I think the next version was planned to be related to splithttp mux settings, but... it's ready when it's ready IMO and shouldn't hold up critical bugfixes. |
The new fragment pr has no problem with ws or httpupgrade any operator in I r a n i used now, please don't make a cover for it 😆 and splithttp almost blocked in iran , beacuse udp blocked in many isp |
@sssagsag nope , thats not true splithttp isnt based on quic , udp mode - quic - h3 is optional |
yes iknow , but i cant used without h3 , with many isp |
maybe this pr helped #3677 |
Why not just func (ctx *Context) GetTargetIPs() []net.IP {
if ctx.Outbound == nil {
return nil
}
target := ctx.Outbound.Target
if !target.IsValid() {
return nil
}
if target.Address.Family().IsIP() {
return []net.IP{target.Address.IP()}
}
return nil
} to prevent And I think this is introduced in 017f53b, not a bug report from 2020. |
To me the problem is the racing write in Dispatch, not the racing read in GetTargetIPs. If I only fix GetTargetIPs as you said, the following sequence is still possible. (Sub-)Connection A and B are part of the same physical mux.cool connection:
So now, xray may route connection A based on information found in connection B! And beside GetTargetIPs, there are many other reads for multiple routing rules. Really, the most problematic part of this bug is not the crash, but the potential mis-routing as a result of the race condition. |
Fix #116
This issue does not appear at all if mux.cool is disabled. Maybe the issue can be avoided if routing is disabled too, but it's possible that the same issue appears elsewhere.
I have debugged this issue in prod with a lot of print statements, and can confirm that this patch fixes it.
Tests should be added. The mux server has no tests at all!Step-by-step:
The mux server is calling handleStatusNew with the same context multiple times. That function then calls w.dispatcher.Dispatch
Dispatch assigns some values specific to the connection onto the outbound here:
Xray-core/app/dispatcher/default.go
Lines 241 to 242 in 83eef6b
Because the outbounds array on the connection object is the same across multiple connections (it's the same context), the value of outbound.Target flips back and forth between values such as "1.1.1.1" or "chat.facebook.com" (IPAddress vs DomainAddress)
Many layers deeper in routing, as part of this Dispatch call, this function is called:
Xray-core/features/routing/session/context.go
Lines 53 to 55 in 83eef6b
It does two things, in order:
Because Target keeps mutating back and forth between multiple kinds
of address, it sometimes is an IP for Step 1, and a DomainAddress for
Step 2
Calling IP() on a DomainAddress panics: