-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace unsafe.Slice with memory copying to avoid potential fault memory issue #6664
Conversation
602534b
to
eab0b6c
Compare
eab0b6c
to
cdbe6e0
Compare
/test-windows-all |
cdbe6e0
to
c531e9c
Compare
/test-windows-containerd-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please break down the PR into multiple ones as it is doing essentially 3 unrelated changes. You can then backport specifically what's needed using the normal backporting process, instead of opening an independent / "duplicate" PR (#6665).
return unsafe.Slice(&table.Table[0], table.NumEntries), nil | ||
rows := make([]MibIPForwardRow, table.NumEntries, table.NumEntries) | ||
|
||
pFirstRow := uintptr(unsafe.Pointer(&table.Table[0])) | ||
rowSize := unsafe.Sizeof(table.Table[0]) | ||
|
||
for i := uint32(0); i < table.NumEntries; i++ { | ||
row := *(*MibIPForwardRow)(unsafe.Pointer(pFirstRow + rowSize*uintptr(i))) | ||
rows[i] = row | ||
} | ||
return rows, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this fix a specific issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We received a report indicating that when host memory is less than 8GB, using unsafe.Slice can lead to memory faults. It’s possible that this function didn’t perform a deep copy, so after calling freeMibTable
, we may encounter invalid memory addresses when access rows again. We have already verified that this deep copy method resolves this issue in the same environment.
@antoninbas Can I merge the agent crash fix and windows CI fix commits into one PR? Otherwise, the Windows CI pipeline will still fail due to rate limit issue. |
c531e9c
to
10d9351
Compare
10d9351
to
f003d82
Compare
Yes, that's ok given that |
@@ -368,7 +399,16 @@ func (n *netIO) ListIPForwardRows(family uint16) ([]MibIPForwardRow, error) { | |||
if err != nil { | |||
return nil, os.NewSyscallError("iphlpapi.GetIpForwardTable", err) | |||
} | |||
return unsafe.Slice(&table.Table[0], table.NumEntries), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know the only issue with the previous code is that we were not making a copy of the slice prior to returning, while at the same time freeing the underlying memory by calling n.freeMibTable(unsafe.Pointer(table))
.
The Wireguard Windows code has a simpler implementation of this, which still uses unsafe.Slice
. It achieves correctness by making a proper copy of the slice: https://github.com/WireGuard/wireguard-windows/blob/e70799b1440690e7d4140bffc7c73baf903c7b54/tunnel/winipcfg/winipcfg.go#L151
That being said, if this new version has already been tested thoroughly, there is probably no string reason to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antoninbas , we used to think about the solution as what Wiregurad is taking, a concern is some fields in MibIPForwardRow are using type []byte
, which still exists the risk with a shadow copy if the memory resource is limited (e.g., Windows OS re-allocate the memory to other objects after the MibIPForwardTable
is free). So we finally choose the solution to deep copy the objects before free the memory. The latest code is verified on the problematic cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean [N]byte
types like the [26]byte
field in RawSockAddrInet
([]byte
would not be possible here AFAIK)? These should be no different than integer types, they should be inlined in the struct with no indirection.
The code before this change was wrong for obvious reasons. I don't think the Wireguard code is wrong. But if you feel like manual copy is the best approach and this has been well-tested, as I wrote before we can merge it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fields in MibIPForwardRow is array not slice. It will be deepcopied automatically.
And the current code is inconsistent: it makes a manual deepcopy for RawSockAddrInet but not AddressPrefix which also has a RawSockAddrInet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example: https://play.golang.com/p/z6Gc0YhEQSV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
f003d82
to
6721cc6
Compare
/test-windows-all |
Hi @antoninbas @tnqn could you help to double check if we can more forward? this issue is a blocker for downstream customers. Thanks. |
/test-windows-containerd-e2e |
@@ -368,7 +399,16 @@ func (n *netIO) ListIPForwardRows(family uint16) ([]MibIPForwardRow, error) { | |||
if err != nil { | |||
return nil, os.NewSyscallError("iphlpapi.GetIpForwardTable", err) | |||
} | |||
return unsafe.Slice(&table.Table[0], table.NumEntries), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean [N]byte
types like the [26]byte
field in RawSockAddrInet
([]byte
would not be possible here AFAIK)? These should be no different than integer types, they should be inlined in the struct with no indirection.
The code before this change was wrong for obvious reasons. I don't think the Wireguard code is wrong. But if you feel like manual copy is the best approach and this has been well-tested, as I wrote before we can merge it as is.
sockData := [26]byte{} | ||
copy(a.data[:], sockData[:]) | ||
return RawSockAddrInet{Family: a.Family, data: sockData} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return *a
would make a copy given that data
is an array, not a slice
am I missing something @tnqn ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. The whole function is what struct copy does exactly.
…ory issue * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
10ba34b
6721cc6
to
10ba34b
Compare
Thanks for all comments @antoninbas @tnqn , I have updated the code based on the current Windows network management API implementation(https://github.com/tailscale/winipcfg-go/blob/84d0a1a3b210638dfe24d6169581a7af315c13f0/wt_mib_ipforward_row2.go#L51C1-L78C2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please let me know once the verification is done |
/test-windows-all |
Both are correct, The code in wireguard windows is more concise. |
@tnqn Windows CI passed. |
/skip-all |
1 similar comment
/skip-all |
…ory issue (antrea-io#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
…ory issue (antrea-io#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
…ory issue (antrea-io#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
…ory issue (#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
…ory issue (#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
…ory issue (#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
…ory issue (antrea-io#6664) * Refactored ListIPForwardRows to copy IP forwarding table rows. * Removed unsafe.Slice and replaced with manual pointer dereferencing and copying. This change addresses a potential fault memory issue when iterating through the IP forwarding table, caused by the use of slices after corresponding memory has been freed, leading to access failure. Signed-off-by: Shuyang Xin <gavinx@vmware.com> Signed-off-by: Wenying Dong <wenyingd@vmware.com>
This change addresses a potential fault memory issue when iterating through the IP forwarding table,
caused by the use of slices after corresponding memory has been freed, leading to access failure.