Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid assumption about SocketAddrV{4,6} layout #968

Closed
djc opened this issue Jan 4, 2021 · 32 comments · Fixed by #987
Closed

Invalid assumption about SocketAddrV{4,6} layout #968

djc opened this issue Jan 4, 2021 · 32 comments · Fixed by #987
Labels
bug Something isn't working

Comments

@djc
Copy link
Member

djc commented Jan 4, 2021

It looks like we're assuming that std::net::Ipv4 matches the libc sockaddr_in (same for IPv6), for example in https://github.com/quinn-rs/quinn/blob/main/quinn/src/platform/unix.rs#L357. However, the standard library never guaranteed this, and in fact is looking at changing this in rust-lang/rust#78802. We should fix our usage.

@est31
Copy link
Contributor

est31 commented Jan 5, 2021

Are older versions affected too? Would it be possible to release point releases for them? Even the very old ones? That would be really nice.

@djc
Copy link
Member Author

djc commented Jan 5, 2021

Yeah, I went through older releases down to 0.3, which also look affected. I'm not sure it really makes sense to publish bugfix releases for anything older than 0.6, do you really those still get significant usage?

(On the other hand, if you'd be able to lend a hand in fixing/backporting, I wouldn't mind publishing some extra stuff on crates.io.)

@est31
Copy link
Contributor

est31 commented Jan 5, 2021

It would be cool if my older mimas releases still compiled. IDK how many changes a fix requires and how much the piece of code has changed between quinn releases.

@est31
Copy link
Contributor

est31 commented Jan 5, 2021

And yes, I'm willing to help.

@djc
Copy link
Member Author

djc commented Jan 5, 2021

I'm guessing they would still compile, just get issues at run-time.

Pretty sure the relevant code hasn't changed all that much between releases.

@Matthias247
Copy link
Contributor

FWIW, there seems to be an issue even today - without the Rust changes. we could observe some issues with IPv6 support where an incoming packet lead the endpoint to shut down due to an IO error.

After removing more MaybeUninits from socket layer I figured out that removing the one for socket addresses will break the receive path - decoded socket addresses get malformed. I uploaded a repro in #976

So far I don't understand why this happens, and whether there's some rust-UB issues to do having pointers of sockaddr_storage and sockaddr_in[6].

When I changed the socket address conversion to make use of https://docs.rs/socket2/0.3.19/socket2/struct.SockAddr.html#method.from_raw_parts and https://docs.rs/socket2/0.3.19/socket2/struct.SockAddr.html#method.as_std things seemed ok.

@djc
Copy link
Member Author

djc commented Jan 14, 2021

I wonder if we could run some simple Quinn tests through Miri.

@Matthias247
Copy link
Contributor

I think it doesn't do any threading. And even if it would, it would likely be too slow and not a great indicator if something is definitely wrong or if the Miri model just doesn't capture it yet.

@Ralith
Copy link
Collaborator

Ralith commented Jan 14, 2021

MIRI should only be needed for the unsafe platform API stuff, which we definitely could write a standalone test for. In any case, we should just replace this particular code ASAP.

@Matthias247
Copy link
Contributor

A question for the fix will be whether to always convert between std and libc representations on IO calls, or whether to switch to different address types internally which have guarantees about being platform lib compatible (I guess socket2 does). public accessors (e.g. for peer addresses) could still perform a conversion to std.

Converting the representation is definitely not costly. But given that quic library does this thousands of times per second (compared to a TCP stack which does a handful conversions at connection establishment) it might be worthwhile to avoid it.

@djc
Copy link
Member Author

djc commented Jan 14, 2021

I agree with what appears to be your assessment (that we should try to avoid the conversion on every send/recv).

@Matthias247
Copy link
Contributor

Also seems like socket2 only has SocketAddr types, not IpAddr ones which make guarantees about wrapping in_addr. But maybe there is no issue, since all are plain byte arrays?

@djc
Copy link
Member Author

djc commented Jan 14, 2021

Or we could ask about adding them to socket2.

@faern
Copy link

faern commented Jan 15, 2021

I don't think you have to worry too much about the performance of the conversion. The conversion takes very little time. If part of any operation that also contains a syscall it's dwarfed by the syscall.

You can see the benchmark results from my initial look at this before I created the PR that will change the std representation. Even having UdpSocket::local_addr in a tight loop suffers no measurable performance difference from having to do the conversion. https://internals.rust-lang.org/t/why-are-socketaddrv4-socketaddrv6-based-on-low-level-sockaddr-in-6/13321/15

You can find my benchmarks here if you are interested in verifying their correctness or apply them to this crate and do your own measurements: https://github.com/faern/socketaddr-benchmark

@Matthias247
Copy link
Contributor

@faern The thing with Quic is that it won't be just 1 conversion per syscall. With sendmmsg/recvmmsg calls and cmsgs (which contain additional addresses) it could easily become 100 conversions per syscall. The cost might indeed still be dwarfed by the one of other operations. But if this library e.g. aims for reaching 10Gbit/s per core (to get on parity with TLS+TCP), the per-packet cost of all operations needs to be minimized, and one has to start somewhere :-)

@faern
Copy link

faern commented Jan 16, 2021

I'd always prefer correctness over speed. If I were you I would fix the SocketAddr casts first and optimize them later (if benchmarks shows they are a bottleneck). I still think you overestimate how much time the conversion take.

@djc djc mentioned this issue Jan 25, 2021
@est31 est31 mentioned this issue Jan 25, 2021
@est31
Copy link
Contributor

est31 commented Jan 25, 2021

Made a PR for the latest release: #987

Once that one's merged and tested, one can think about backporting it to older releases.

@djc djc closed this as completed in #987 Jan 28, 2021
@faern
Copy link

faern commented Feb 15, 2021

Will there be patch releases to existing 0.x version of quinn? How far back was this invalid assumption made and are there still users of those versions?

@Matthias247
Copy link
Contributor

I don’t think it’s worthwhile to think a lot about old releases. Since the last release there had been a ton of bugfixes and perf improvements that people would miss out on anyway.

@est31
Copy link
Contributor

est31 commented Feb 16, 2021

It's definitely my plan to get this backported to older quinn versions. How far back, no idea.

@faern
Copy link

faern commented Feb 16, 2021

If it's not backported and there are actual users, this would either 1) block rust-lang/rust#78802 from progressing, or 2) cause UB for your users once that lands. But it of course depends on how many users you have. All other affected crates have not backported it all the way to 0.0.0 but that was mostly because the very early releases were very old with virtually no users, and those crates were being yanked anyway when this fix landed.

@djc
Copy link
Member Author

djc commented Feb 16, 2021

According to crates.io downloads, everything but 0.6.1 is practically unused. In the chart, the next most popular point is Other, which contains everything pre-0.4.0.

@est31
Copy link
Contributor

est31 commented Feb 16, 2021

Can we get a 0.6.x branch? I can then prepare PRs for 0.5.x and 0.6.x.

@djc
Copy link
Member Author

djc commented Feb 16, 2021

Done.

@est31
Copy link
Contributor

est31 commented Mar 2, 2021

So apparently f3d82d7 does not need backporting for 0.6.x and before, because the code it changed was added in 1ca7149, which wasn't present in 0.6.x.

Only 29d37aa seems to need backporting...

@Ralith
Copy link
Collaborator

Ralith commented Mar 2, 2021

@faern

rust-lang/rust#78802 states:

SocketAddrV4 only occupies 6 bytes instead of 16 bytes.

Our older versions run

        assert_eq!(
            mem::size_of::<SocketAddrV4>(),
            mem::size_of::<libc::sockaddr_in>()
        );

before invoking any code that relies on the invalid assumption, so if SocketAddrV4's size is changing then I think the worst-case scenario is reasonably traceable panics on startup.

@faern
Copy link

faern commented Mar 3, 2021

It's good that the size is asserted at least. But if the size remains the same and the memory layout changes you will still have a bad time. However, if my PR passes it's very likely that the size will change, yes.

@djc
Copy link
Member Author

djc commented Mar 4, 2021

Thanks to @est31, I've just published 0.5.4 and 0.6.2 releases that contain the relevant fixes.

@faern
Copy link

faern commented Mar 4, 2021

That's awesome! Thank you. I can now try to move the PR on std as the last known offenders have patches out.

@djc
Copy link
Member Author

djc commented Mar 4, 2021

FWIW, I also brought up the idea of yanking our buggy versions on Matrix, so we'll likely do that too (though we may take a few days before yanking 0.6.1). Do you think we should also put out a RustSec advisory for quinn, as you've done for the others?

@faern
Copy link

faern commented Mar 4, 2021

Yes. I think adding a RustSec advisory is recommended IMO! It does not hurt, is pretty easy to do and only has upsides. You can just copy what I did for the other crates basically.

Yanking + security advisory is kind of all we can do to try to encourage upgrades. I just don't want there to ever be a situation where anyone argues "There is no reported vulnerability on this version, so I'll keep it".

est31 added a commit to est31/mimas that referenced this issue Mar 4, 2021
This fixes an issue about quinn assuming
a specific layout for socket addresses:
quinn-rs/quinn#968

This might cause bugs on newer compilers.
@djc
Copy link
Member Author

djc commented Mar 4, 2021

Advisory PR submitted: rustsec/advisory-db#804.

bors added a commit to rust-lang-ci/rust that referenced this issue Jul 31, 2022
…lett

Implement network primitives with ideal Rust layout, not C system layout

This PR is the result of this internals forum thread: https://internals.rust-lang.org/t/why-are-socketaddrv4-socketaddrv6-based-on-low-level-sockaddr-in-6/13321.

Instead of basing `std:::net::{Ipv4Addr, Ipv6Addr, SocketAddrV4, SocketAddrV6}` on system (C) structs, they are encoded in a more optimal and idiomatic Rust way.

This changes the public API of std by introducing structural equality impls for all four types here, which means that `match ipv4addr { SOME_CONSTANT => ... }` will now compile, whereas previously this was an error. No other intentional changes are introduced to public API.

It's possible to observe the current layout of these types (e.g., by pointer casting); most but not all libraries which were found by Crater to do this have had updates issued and affected versions yanked. See report below.

### Benefits of this change

- It will become possible to move these fundamental network types from `std` into `core` ([RFC](rust-lang/rfcs#2832)).
- Some methods that can't be made `const fn`s today can be made `const fn`s with this change.
- `SocketAddrV4` only occupies 6 bytes instead of 16 bytes.
- These simple primitives become easier to read and uses less `unsafe`.
- Makes these types support structural equality, which means you can now (for instance) match an `Ipv4Addr` against a constant

### ~Remaining~ Previous problems

This change obviously changes the memory layout of the types. And it turns out some libraries invalidly assumes the memory layout and does very dangerous pointer casts to convert them. These libraries will have undefined behaviour and perform invalid memory access until patched.

- [x] - `mio` - Issue: tokio-rs/mio#1386.
  - [x] `0.7` branch tokio-rs/mio#1388
  - [x] `0.7.6` published tokio-rs/mio#1398
  - [x] Yank all `0.7` versions older than `0.7.6`
  - [x] Report `<0.7.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0081.html
- [x] - `socket2` - Issue: rust-lang/socket2#119.
  - [x] `0.3.x` branch rust-lang/socket2#120
  - [x] `0.3.16` published
  - [x] `master` branch rust-lang/socket2#122
  - [x] Yank all `0.3` versions older than `0.3.16`
  - [x] Report `<0.3.16` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0079.html
- [x] - `net2` - Issue: deprecrated/net2-rs#105
  - [x] deprecrated/net2-rs#106
  - [x] `0.2.36` published
  - [x] Yank all `0.2` versions older than `0.2.36`
  - [x] Report `<0.2.36` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0078.html
- [x] - `miow` - Issue: yoshuawuyts/miow#38
  - [x] `0.3.x` - yoshuawuyts/miow#39
  - [x] `0.3.6` published
  - [x] `0.2.x` - yoshuawuyts/miow#40
  - [x] `0.2.2` published
  - [x] Yanked all `0.2` versions older than `0.2.2`
  - [x] Yanked all `0.3` versions older than `0.3.6`
  - [x] Report `<0.2.2` and `<0.3.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0080.html
- [x] - `quinn master` (aka what became 0.7) - quinn-rs/quinn#968 quinn-rs/quinn#987
  - [x] - `quinn 0.6` - quinn-rs/quinn#1045
  - [x] - `quinn 0.5` - quinn-rs/quinn#1046
  - [x] - Release `0.7.0`, `0.6.2` and `0.5.4`
- [x] - `nb-connect` - smol-rs/nb-connect#1
  - [x] - Release `1.0.3`
  - [x] - Yank all versions older than `1.0.3`
- [x] - `shadowsocks-rust` - shadowsocks/shadowsocks-rust#462
- [ ] - `rio` - spacejam/rio#44
- [ ] - `seaslug` - spacejam/seaslug#1

#### Fixed crate versions

All crates I have found that assumed the memory layout have been fixed and published. The crates and versions that will continue working even as/if this PR is merged is (please upgrade these to help unblock this PR):

* `net2 0.2.36`
* `socket2 0.3.16`
* `miow 0.2.2`
* `miow 0.3.6`
* `mio 0.7.6`
* `mio 0.6.23` - Never had the invalid assumption itself, but has now been bumped to only allow fixed dependencies (`net2` + `miow`)
* `nb-connect 1.0.3`
* `quinn 0.5.4`
* `quinn 0.6.2`

### Release notes draft

This release changes the memory layout of `Ipv4Addr`, `Ipv6Addr`, `SocketAddrV4` and `SocketAddrV6`. The standard library no longer implements these as the corresponding `libc` structs (`sockaddr_in`, `sockaddr_in6` etc.). This internal representation was never exposed, but some crates relied on it anyway by unsafely transmuting. This change will cause those crates to make invalid memory accesses. Notably `net2 <0.2.36`, `socket2 <0.3.16`, `mio <0.7.6`, `miow <0.3.6` and a few other crates are affected. All known affected crates have been patched and have had fixed versions published over a year ago. If any affected crate is still in your dependency tree, you need to upgrade them before using this version of Rust.
workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
Implement network primitives with ideal Rust layout, not C system layout

This PR is the result of this internals forum thread: https://internals.rust-lang.org/t/why-are-socketaddrv4-socketaddrv6-based-on-low-level-sockaddr-in-6/13321.

Instead of basing `std:::net::{Ipv4Addr, Ipv6Addr, SocketAddrV4, SocketAddrV6}` on system (C) structs, they are encoded in a more optimal and idiomatic Rust way.

This changes the public API of std by introducing structural equality impls for all four types here, which means that `match ipv4addr { SOME_CONSTANT => ... }` will now compile, whereas previously this was an error. No other intentional changes are introduced to public API.

It's possible to observe the current layout of these types (e.g., by pointer casting); most but not all libraries which were found by Crater to do this have had updates issued and affected versions yanked. See report below.

### Benefits of this change

- It will become possible to move these fundamental network types from `std` into `core` ([RFC](rust-lang/rfcs#2832)).
- Some methods that can't be made `const fn`s today can be made `const fn`s with this change.
- `SocketAddrV4` only occupies 6 bytes instead of 16 bytes.
- These simple primitives become easier to read and uses less `unsafe`.
- Makes these types support structural equality, which means you can now (for instance) match an `Ipv4Addr` against a constant

### ~Remaining~ Previous problems

This change obviously changes the memory layout of the types. And it turns out some libraries invalidly assumes the memory layout and does very dangerous pointer casts to convert them. These libraries will have undefined behaviour and perform invalid memory access until patched.

- [x] - `mio` - Issue: tokio-rs/mio#1386.
  - [x] `0.7` branch tokio-rs/mio#1388
  - [x] `0.7.6` published tokio-rs/mio#1398
  - [x] Yank all `0.7` versions older than `0.7.6`
  - [x] Report `<0.7.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0081.html
- [x] - `socket2` - Issue: rust-lang/socket2#119.
  - [x] `0.3.x` branch rust-lang/socket2#120
  - [x] `0.3.16` published
  - [x] `master` branch rust-lang/socket2#122
  - [x] Yank all `0.3` versions older than `0.3.16`
  - [x] Report `<0.3.16` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0079.html
- [x] - `net2` - Issue: deprecrated/net2-rs#105
  - [x] deprecrated/net2-rs#106
  - [x] `0.2.36` published
  - [x] Yank all `0.2` versions older than `0.2.36`
  - [x] Report `<0.2.36` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0078.html
- [x] - `miow` - Issue: yoshuawuyts/miow#38
  - [x] `0.3.x` - yoshuawuyts/miow#39
  - [x] `0.3.6` published
  - [x] `0.2.x` - yoshuawuyts/miow#40
  - [x] `0.2.2` published
  - [x] Yanked all `0.2` versions older than `0.2.2`
  - [x] Yanked all `0.3` versions older than `0.3.6`
  - [x] Report `<0.2.2` and `<0.3.6` to RustSec Advisory Database https://rustsec.org/advisories/RUSTSEC-2020-0080.html
- [x] - `quinn master` (aka what became 0.7) - quinn-rs/quinn#968 quinn-rs/quinn#987
  - [x] - `quinn 0.6` - quinn-rs/quinn#1045
  - [x] - `quinn 0.5` - quinn-rs/quinn#1046
  - [x] - Release `0.7.0`, `0.6.2` and `0.5.4`
- [x] - `nb-connect` - smol-rs/nb-connect#1
  - [x] - Release `1.0.3`
  - [x] - Yank all versions older than `1.0.3`
- [x] - `shadowsocks-rust` - shadowsocks/shadowsocks-rust#462
- [ ] - `rio` - spacejam/rio#44
- [ ] - `seaslug` - spacejam/seaslug#1

#### Fixed crate versions

All crates I have found that assumed the memory layout have been fixed and published. The crates and versions that will continue working even as/if this PR is merged is (please upgrade these to help unblock this PR):

* `net2 0.2.36`
* `socket2 0.3.16`
* `miow 0.2.2`
* `miow 0.3.6`
* `mio 0.7.6`
* `mio 0.6.23` - Never had the invalid assumption itself, but has now been bumped to only allow fixed dependencies (`net2` + `miow`)
* `nb-connect 1.0.3`
* `quinn 0.5.4`
* `quinn 0.6.2`

### Release notes draft

This release changes the memory layout of `Ipv4Addr`, `Ipv6Addr`, `SocketAddrV4` and `SocketAddrV6`. The standard library no longer implements these as the corresponding `libc` structs (`sockaddr_in`, `sockaddr_in6` etc.). This internal representation was never exposed, but some crates relied on it anyway by unsafely transmuting. This change will cause those crates to make invalid memory accesses. Notably `net2 <0.2.36`, `socket2 <0.3.16`, `mio <0.7.6`, `miow <0.3.6` and a few other crates are affected. All known affected crates have been patched and have had fixed versions published over a year ago. If any affected crate is still in your dependency tree, you need to upgrade them before using this version of Rust.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants