-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't assume memory layout of std::net::SocketAddr #106
Conversation
c1fd439
to
6d08187
Compare
All of these PRs look a bit different. Due to how the libraries are structured, but also because they have different minimum supported Rust versions etc. I'm assuming we have to keep the compatibility with Rust 1.21.0 for this fix. We want as many people as possible to be able to upgrade as smoothly as possible to get rid of the possible unsoundness this code had. |
Ok. Done. Ready for review I believe. Ping @pfmooney ? For some context, see this PR: rust-lang/rust#78802. And possibly this internals thread: https://internals.rust-lang.org/t/why-are-socketaddrv4-socketaddrv6-based-on-low-level-sockaddr-in-6/13321 |
da298a9
to
4eda40f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same changes were applied to socket2 and Mio.
I realize this PR still has a lot of transmutes that I aimed to get rid of. The other PRs have better solutions to this. I will fix this later today. |
I was looking at that but the |
Oh yeah. That's why I moved over to I don't want any byte swapping. I just want to convert |
I have verified that both @pfmooney Would it be possible to have this reviewed (and published) 🙈? This crate is at the bottom of a pretty large dependency chain doing this invalid memory assumption. So things kind of have to start here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together, and please pardon the review delay.
src/socket.rs
Outdated
#[cfg(unix)] | ||
let sin6_addr = { | ||
let mut sin6_addr = unsafe { mem::zeroed::<c::in6_addr>() }; | ||
sin6_addr.s6_addr = addr.ip().octets(); | ||
sin6_addr | ||
}; | ||
#[cfg(windows)] | ||
let sin6_addr = unsafe { | ||
let mut u = mem::zeroed::<c::in6_addr_u>(); | ||
*u.Byte_mut() = addr.ip().octets(); | ||
c::IN6_ADDR { u } | ||
}; | ||
#[cfg(windows)] | ||
let u = unsafe { | ||
let mut u = mem::zeroed::<c::SOCKADDR_IN6_LH_u>(); | ||
*u.sin6_scope_id_mut() = addr.scope_id(); | ||
u | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are your thoughts about splitting addr2raw
into separate cfg(windows)
and cfg(unix)
variants, rather than interleaving like this? It might be a little easier to follow, event though the unix
version will still include the various specialization bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I also splitted the v4 and v6 versions since the function was so long. I think this makes sense and puts most of the code at way lower indentation.
src/socket.rs
Outdated
// So just transmuting the octects to the `u32` representation works. | ||
#[cfg(unix)] | ||
let sin_addr = c::in_addr { | ||
s_addr: unsafe { mem::transmute::<_, u32>(addr.ip().octets()) }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the alignment requirements of u32
be fulfilled by the array emitted from ip().octets()
? If that isn't guaranteed, is it really that expensive to do with safe rust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure if this is true for all platforms, but I think it should be. The alignment of Ipv4Addr
is 4, which is the same as u32
. So the alignment of the [u8; 4]
should be correct unless I'm mistaken.
We could solve it safely by using the built in Into<u32>
etc. And the compiler might be able to optimize it away, but if not it will do a byte swap twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think avoiding the unsafe
is worth the minuscule cost of a double byte-swap (especially considering it might be optimized away).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correctness is of course way more important than performance. So if we can't say for sure that this is correct I agree we should change to a safer solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Less unsafe
is always nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. I realize Ipv4Addr::octets
of course return an owned [u8; 4]
and not a reference. As such, the IP data is copied onto the stack and the alignment is just 1, not 4. Good we changed it.
4eda40f
to
e7768a2
Compare
7de07c5
to
06ab86a
Compare
@pfmooney There were a lot of pushes in a row there. But now I have addressed all the feedback and I have tested what's currently in this PR under |
Great. I'll run it through a few smoke tests of my own while waiting for @Thomasdezeeuw to approve of the updated change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @faern I'm amusing you've tried this branch with Mio and all tests passes?
Yes. On Windows 10 and Linux only though. But all tests pass. Both on a standard build of Rust and on the custom build where I change the memory layout of |
@pfmooney ping? |
Merged as 49b43f2. Thanks |
...and published as 0.2.36. |
Yay! Thank you 🎉 |
…lett Implement network primitives with ideal Rust layout, not C system layout This PR is the result of this internals forum thread: https://internals.rust-lang.org/t/why-are-socketaddrv4-socketaddrv6-based-on-low-level-sockaddr-in-6/13321. Instead of basing `std:::net::{Ipv4Addr, Ipv6Addr, SocketAddrV4, SocketAddrV6}` on system (C) structs, they are encoded in a more optimal and idiomatic Rust way. This changes the public API of std by introducing structural equality impls for all four types here, which means that `match ipv4addr { SOME_CONSTANT => ... }` will now compile, whereas previously this was an error. No other intentional changes are introduced to public API. It's possible to observe the current layout of these types (e.g., by pointer casting); most but not all libraries which were found by Crater to do this have had updates issued and affected versions yanked. See report below. ### Benefits of this change - It will become possible to move these fundamental network types from `std` into `core` ([RFC](rust-lang/rfcs#2832)). - Some methods that can't be made `const fn`s today can be made `const fn`s with this change. - `SocketAddrV4` only occupies 6 bytes instead of 16 bytes. - These simple primitives become easier to read and uses less `unsafe`. - Makes these types support structural equality, which means you can now (for instance) match an `Ipv4Addr` against a constant ### ~Remaining~ Previous problems This change obviously changes the memory layout of the types. And it turns out some libraries invalidly assumes the memory layout and does very dangerous pointer casts to convert them. These libraries will have undefined behaviour and perform invalid memory access until patched. - [x] - `mio` - Issue: tokio-rs/mio#1386. - [x] `0.7` branch tokio-rs/mio#1388 - [x] `0.7.6` published tokio-rs/mio#1398 - [x] Yank all `0.7` versions older than `0.7.6` - [x] Report `<0.7.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0081.html - [x] - `socket2` - Issue: rust-lang/socket2#119. - [x] `0.3.x` branch rust-lang/socket2#120 - [x] `0.3.16` published - [x] `master` branch rust-lang/socket2#122 - [x] Yank all `0.3` versions older than `0.3.16` - [x] Report `<0.3.16` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0079.html - [x] - `net2` - Issue: deprecrated/net2-rs#105 - [x] deprecrated/net2-rs#106 - [x] `0.2.36` published - [x] Yank all `0.2` versions older than `0.2.36` - [x] Report `<0.2.36` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0078.html - [x] - `miow` - Issue: yoshuawuyts/miow#38 - [x] `0.3.x` - yoshuawuyts/miow#39 - [x] `0.3.6` published - [x] `0.2.x` - yoshuawuyts/miow#40 - [x] `0.2.2` published - [x] Yanked all `0.2` versions older than `0.2.2` - [x] Yanked all `0.3` versions older than `0.3.6` - [x] Report `<0.2.2` and `<0.3.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0080.html - [x] - `quinn master` (aka what became 0.7) - quinn-rs/quinn#968 quinn-rs/quinn#987 - [x] - `quinn 0.6` - quinn-rs/quinn#1045 - [x] - `quinn 0.5` - quinn-rs/quinn#1046 - [x] - Release `0.7.0`, `0.6.2` and `0.5.4` - [x] - `nb-connect` - smol-rs/nb-connect#1 - [x] - Release `1.0.3` - [x] - Yank all versions older than `1.0.3` - [x] - `shadowsocks-rust` - shadowsocks/shadowsocks-rust#462 - [ ] - `rio` - spacejam/rio#44 - [ ] - `seaslug` - spacejam/seaslug#1 #### Fixed crate versions All crates I have found that assumed the memory layout have been fixed and published. The crates and versions that will continue working even as/if this PR is merged is (please upgrade these to help unblock this PR): * `net2 0.2.36` * `socket2 0.3.16` * `miow 0.2.2` * `miow 0.3.6` * `mio 0.7.6` * `mio 0.6.23` - Never had the invalid assumption itself, but has now been bumped to only allow fixed dependencies (`net2` + `miow`) * `nb-connect 1.0.3` * `quinn 0.5.4` * `quinn 0.6.2` ### Release notes draft This release changes the memory layout of `Ipv4Addr`, `Ipv6Addr`, `SocketAddrV4` and `SocketAddrV6`. The standard library no longer implements these as the corresponding `libc` structs (`sockaddr_in`, `sockaddr_in6` etc.). This internal representation was never exposed, but some crates relied on it anyway by unsafely transmuting. This change will cause those crates to make invalid memory accesses. Notably `net2 <0.2.36`, `socket2 <0.3.16`, `mio <0.7.6`, `miow <0.3.6` and a few other crates are affected. All known affected crates have been patched and have had fixed versions published over a year ago. If any affected crate is still in your dependency tree, you need to upgrade them before using this version of Rust.
Implement network primitives with ideal Rust layout, not C system layout This PR is the result of this internals forum thread: https://internals.rust-lang.org/t/why-are-socketaddrv4-socketaddrv6-based-on-low-level-sockaddr-in-6/13321. Instead of basing `std:::net::{Ipv4Addr, Ipv6Addr, SocketAddrV4, SocketAddrV6}` on system (C) structs, they are encoded in a more optimal and idiomatic Rust way. This changes the public API of std by introducing structural equality impls for all four types here, which means that `match ipv4addr { SOME_CONSTANT => ... }` will now compile, whereas previously this was an error. No other intentional changes are introduced to public API. It's possible to observe the current layout of these types (e.g., by pointer casting); most but not all libraries which were found by Crater to do this have had updates issued and affected versions yanked. See report below. ### Benefits of this change - It will become possible to move these fundamental network types from `std` into `core` ([RFC](rust-lang/rfcs#2832)). - Some methods that can't be made `const fn`s today can be made `const fn`s with this change. - `SocketAddrV4` only occupies 6 bytes instead of 16 bytes. - These simple primitives become easier to read and uses less `unsafe`. - Makes these types support structural equality, which means you can now (for instance) match an `Ipv4Addr` against a constant ### ~Remaining~ Previous problems This change obviously changes the memory layout of the types. And it turns out some libraries invalidly assumes the memory layout and does very dangerous pointer casts to convert them. These libraries will have undefined behaviour and perform invalid memory access until patched. - [x] - `mio` - Issue: tokio-rs/mio#1386. - [x] `0.7` branch tokio-rs/mio#1388 - [x] `0.7.6` published tokio-rs/mio#1398 - [x] Yank all `0.7` versions older than `0.7.6` - [x] Report `<0.7.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0081.html - [x] - `socket2` - Issue: rust-lang/socket2#119. - [x] `0.3.x` branch rust-lang/socket2#120 - [x] `0.3.16` published - [x] `master` branch rust-lang/socket2#122 - [x] Yank all `0.3` versions older than `0.3.16` - [x] Report `<0.3.16` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0079.html - [x] - `net2` - Issue: deprecrated/net2-rs#105 - [x] deprecrated/net2-rs#106 - [x] `0.2.36` published - [x] Yank all `0.2` versions older than `0.2.36` - [x] Report `<0.2.36` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0078.html - [x] - `miow` - Issue: yoshuawuyts/miow#38 - [x] `0.3.x` - yoshuawuyts/miow#39 - [x] `0.3.6` published - [x] `0.2.x` - yoshuawuyts/miow#40 - [x] `0.2.2` published - [x] Yanked all `0.2` versions older than `0.2.2` - [x] Yanked all `0.3` versions older than `0.3.6` - [x] Report `<0.2.2` and `<0.3.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0080.html - [x] - `quinn master` (aka what became 0.7) - quinn-rs/quinn#968 quinn-rs/quinn#987 - [x] - `quinn 0.6` - quinn-rs/quinn#1045 - [x] - `quinn 0.5` - quinn-rs/quinn#1046 - [x] - Release `0.7.0`, `0.6.2` and `0.5.4` - [x] - `nb-connect` - smol-rs/nb-connect#1 - [x] - Release `1.0.3` - [x] - Yank all versions older than `1.0.3` - [x] - `shadowsocks-rust` - shadowsocks/shadowsocks-rust#462 - [ ] - `rio` - spacejam/rio#44 - [ ] - `seaslug` - spacejam/seaslug#1 #### Fixed crate versions All crates I have found that assumed the memory layout have been fixed and published. The crates and versions that will continue working even as/if this PR is merged is (please upgrade these to help unblock this PR): * `net2 0.2.36` * `socket2 0.3.16` * `miow 0.2.2` * `miow 0.3.6` * `mio 0.7.6` * `mio 0.6.23` - Never had the invalid assumption itself, but has now been bumped to only allow fixed dependencies (`net2` + `miow`) * `nb-connect 1.0.3` * `quinn 0.5.4` * `quinn 0.6.2` ### Release notes draft This release changes the memory layout of `Ipv4Addr`, `Ipv6Addr`, `SocketAddrV4` and `SocketAddrV6`. The standard library no longer implements these as the corresponding `libc` structs (`sockaddr_in`, `sockaddr_in6` etc.). This internal representation was never exposed, but some crates relied on it anyway by unsafely transmuting. This change will cause those crates to make invalid memory accesses. Notably `net2 <0.2.36`, `socket2 <0.3.16`, `mio <0.7.6`, `miow <0.3.6` and a few other crates are affected. All known affected crates have been patched and have had fixed versions published over a year ago. If any affected crate is still in your dependency tree, you need to upgrade them before using this version of Rust.
Fixes #105
This crate assumes the memory layout of
std::net::SocketAddr
, something it should not do. The layout might be correct now, but it's not part of the contractstd
makes to the outside world. Having commonly used crates like this one depend on unspecified things like that makes it harder forstd
to evolve.This PR changes that by correctly converting between the
std
representation and the system representation instead of just doing pointer casting.Same kind of fix as was recently done in rust-lang/socket2#120 and tokio-rs/mio#1388