Fix Rust UB problems #6393
Memory copying in each struct accessor is a bit worrying, although I understand the motivation. We could end up in the same situation as in Java, where we have a performance trap with strings.
I am pretty surprised Rust doesn't have a built-in way to have aligned buffers of There seems to be a lot in Rust that favors pedantry over pragmatism, which is sad.
Rust by default uses jemalloc and it does actually take advantage of lower alignment requirements to reduce fragmentation. This is why we have users observing alignment errors (#6359).
I mean you can get aligned raw pointers via There'll probably be a way to parameterize https://github.com/rust-lang/wg-allocators/issues The biggest issue is
Ok, looks like all tests pass, can I get LGTMs or 👍s?
There's some performance regression on my machine, but it's not too significant, within the "+/- interval." Master:
This branch:
Maybe, but I wouldn't look into it. The regression is very tolerable, given we're eliminating UB. Anything to address before merging?
Looks good.
Nice!
@CasperN WDYT about us running Miri for all tests sometimes?
It might be a good idea, but I don't know how long it takes. Miri seems to be around 5 orders of magnitude slower than normal Rust.
#[inline]
pub fn emplace_scalar<T: EndianScalar>(s: &mut [u8], x: T) {
    let sz = size_of::<T>();
    let mut_ptr = (&mut s[..sz]).as_mut_ptr() as *mut T;
    let val = x.to_little_endian();
    let x_le = x.to_little_endian();
    unsafe {
        *mut_ptr = val;
        core::ptr::copy_nonoverlapping(
            &x_le as *const T as *const u8,
            s.as_mut_ptr() as *mut u8,
            size_of::<T>()
        );
    }
}
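For comparison, a write that needs no `unsafe` at all can go through a byte array. The following is only a minimal sketch: `emplace_u32_checked` is a hypothetical helper specialised to `u32`, not a function of the flatbuffers crate, and it assumes that reporting a too-short slice as `None` is acceptable.

```rust
/// Hypothetical helper (not part of the flatbuffers crate): writes `x` into
/// the first four bytes of `s` in little-endian order.
/// Returns `None`, leaving `s` untouched, when `s` is shorter than four bytes.
pub fn emplace_u32_checked(s: &mut [u8], x: u32) -> Option<()> {
    // `get_mut` performs the bounds check; it yields `None` for a short slice.
    let dst = s.get_mut(..core::mem::size_of::<u32>())?;
    // `to_le_bytes` returns a `[u8; 4]`, which has alignment 1, so the
    // alignment of `s` never matters.
    dst.copy_from_slice(&x.to_le_bytes());
    Some(())
}
```

Because the destination is only ever treated as bytes, this works at any offset into a buffer; the cost is that it is written per scalar type rather than generically over `EndianScalar`.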
This is missing a check of the length of `s` and still allows reading uninitialized padding bytes of structs.
What do you mean by "still allows reading uninitialized padding bytes of structs"?
struct Struct {
    x: u8,
    y: u16,
}
This struct has one padding byte, which is uninitialized. Implementing `EndianScalar` for that struct would allow reading that uninitialized byte.
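The padding byte can be made visible with `size_of`. This is a small sketch, not code from the PR: `#[repr(C)]` is added here as an assumption so that the layout is fully specified, and `PADDING_BYTES` is a hypothetical name.

```rust
use core::mem::{align_of, size_of};

/// Same fields as the struct in the comment above. `repr(C)` is an
/// assumption added for illustration so the field order and layout are fixed.
#[repr(C)]
pub struct Struct {
    pub x: u8,
    pub y: u16,
}

/// Alignment of the whole struct, inherited from its `u16` field.
pub const STRUCT_ALIGN: usize = align_of::<Struct>();

/// Bytes of `Struct` that belong to no field. Copying the struct out as raw
/// bytes would read these, and their contents are uninitialized.
pub const PADDING_BYTES: usize =
    size_of::<Struct>() - size_of::<u8>() - size_of::<u16>();
```

On common targets, where `u16` has alignment 2, `y` must start at offset 2, so one byte after `x` is padding and the struct occupies 4 bytes rather than 3.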
#[inline]
pub fn read_scalar<T: EndianScalar>(s: &[u8]) -> T {
    let sz = size_of::<T>();

    let p = (&s[..sz]).as_ptr() as *const T;
    let x = unsafe { *p };

    let mut mem = core::mem::MaybeUninit::<T>::uninit();
    // Since [u8] has alignment 1, we copy it into T which may have higher alignment.
    let x = unsafe {
        core::ptr::copy_nonoverlapping(
            s.as_ptr(),
            mem.as_mut_ptr() as *mut u8,
            size_of::<T>()
        );
        mem.assume_init()
    };
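A read can likewise be written without `unsafe` by copying into a byte array first. This is a sketch under assumptions: `read_u32_checked` is a hypothetical, `u32`-only helper, not the crate's API, and it signals a short slice with `None`.

```rust
/// Hypothetical helper (not part of the flatbuffers crate): reads a
/// little-endian `u32` from the first four bytes of `s`.
/// Returns `None` when `s` is shorter than four bytes.
pub fn read_u32_checked(s: &[u8]) -> Option<u32> {
    // Bounds-checked view of the first four bytes.
    let src = s.get(..core::mem::size_of::<u32>())?;
    // Copy into a local array; the source may sit at any address because
    // `[u8]` only requires alignment 1.
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(src);
    Some(u32::from_le_bytes(bytes))
}
```

Every bit pattern is a valid `u32`, so no further validation is needed here; that would not hold for types such as `bool` or a Rust `enum`.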
This is missing a check of the length of `s` and allows reading invalid representations of `T` (e.g., when `T` is an enum).
For length checks, we first verify the flatbuffer before accessing fields -- see #6161 for implementation. Granted, if you directly use the flatbuffers library, you can still misuse this function, but it should only be used by our generated code.
As for enums, I changed the representation in #6098 so Flatbuffers enums won't be represented by Rust enums, but unit structs of i8/u8/.../i64/u64 so all bit representations are valid.
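The idea of representing an enum as a wrapper around an integer can be sketched as follows. The type `Color`, its constants, and `variant_name` are hypothetical names chosen for illustration; the code the FlatBuffers compiler actually generates may differ in naming and detail.

```rust
/// Hypothetical stand-in for a generated FlatBuffers enum. Because it wraps a
/// plain `u8`, every one of the 256 bit patterns is a valid `Color`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Color(pub u8);

impl Color {
    pub const RED: Color = Color(0);
    pub const GREEN: Color = Color(1);
    pub const BLUE: Color = Color(2);

    /// Name of the variant, or `None` for a value outside the known set.
    pub fn variant_name(self) -> Option<&'static str> {
        match self {
            Color::RED => Some("Red"),
            Color::GREEN => Some("Green"),
            Color::BLUE => Some("Blue"),
            _ => None,
        }
    }
}
```

An unknown value such as `Color(200)` is then an ordinary value that the caller can detect, rather than an invalid representation of a Rust `enum`.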
> Granted, if you directly use the flatbuffers library, you can still misuse this function, but it should only be used by our generated code.
Isn't the purpose of Rust to avoid UB even when (safe) functions are used incorrectly?
> As for enums, I changed the representation in #6098 so Flatbuffers enums won't be represented by Rust enums, but unit structs of i8/u8/.../i64/u64 so all bit representations are valid.
There's still `bool`.
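The concern with `bool` is that Rust only permits the bit patterns 0 and 1 for it, so producing a `bool` from any other byte is undefined behavior. A validating read could look like the sketch below; `read_bool_checked` is a hypothetical helper, not part of the flatbuffers crate.

```rust
/// Hypothetical helper: reads a `bool` from the first byte of `s`.
/// Returns `None` for an empty slice or for any byte other than 0 or 1,
/// so an invalid `bool` is never constructed.
pub fn read_bool_checked(s: &[u8]) -> Option<bool> {
    match *s.first()? {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}
```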
@CasperN WDYT about us marking generated functions as unsafe, if they use unsafe? To @eduardosm's point, it seems like users could potentially think that it's okay to call those functions themselves, when in general we don't want them to. Edit: or traits as unsafe.
Yea, that seems like a simple solution to this.
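What marking such a function `unsafe` could look like is sketched below, paired with a safe wrapper that establishes the precondition. Both `read_u16_unchecked` and `read_u16` are hypothetical names for illustration, not the crate's API.

```rust
use core::mem::{size_of, MaybeUninit};
use core::ptr::copy_nonoverlapping;

/// Hypothetical unchecked reader for a little-endian `u16`.
///
/// # Safety
/// `s` must contain at least `size_of::<u16>()` bytes.
pub unsafe fn read_u16_unchecked(s: &[u8]) -> u16 {
    let mut mem = MaybeUninit::<u16>::uninit();
    // SAFETY: the caller guarantees two readable bytes in `s`; `mem` is a
    // separate local, so the regions cannot overlap; the byte-wise copy fully
    // initializes `mem`, and every bit pattern is a valid `u16`.
    unsafe {
        copy_nonoverlapping(s.as_ptr(), mem.as_mut_ptr() as *mut u8, size_of::<u16>());
        u16::from_le(mem.assume_init())
    }
}

/// Safe wrapper: checks the length, then delegates to the unchecked reader.
pub fn read_u16(s: &[u8]) -> Option<u16> {
    if s.len() < size_of::<u16>() {
        return None;
    }
    // SAFETY: the length was checked just above.
    Some(unsafe { read_u16_unchecked(s) })
}
```

With this split, a caller who bypasses verification has to write `unsafe` explicitly and thereby takes responsibility for the documented precondition.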
… problems fixed The major change of [flatbuffers 0.8.1](https://docs.rs/flatbuffers/0.8.1/flatbuffers/index.html) since 0.8.0 is google/flatbuffers#6393, which fixed some possible memory alignment issues. In this PR, the ipc/gen/*.rs files are generated by `regen.sh` as before, without any manual change. Closes #9176 from mqy/flatbuffers-0.8.1 Authored-by: mqy <meng.qingyou@gmail.com> Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
Hi @rw, @aardappel, @krojew
This should fix #6359, #6203, #5854, #5825, and the last issue mentioned in #4916.
Motivation
Miri is a Rust interpreter that detects bad memory activity, i.e. UB. I've fixed the detected problems, which turned out to be alignment issues, and added it to `RustTest.sh`. With the verifier and UB issues sorted (so far as a core Rust tool can detect), I think the library is now acceptably safe according to the common standards of the Rust community.
Details
I didn't run all tests through Miri. It's incredibly slow, so I've disabled it for fuzzing and other expensive tests. `verifier_one_byte_errors_do_not_crash` is disabled, but I did run it manually since it's quite important. That took about an hour on my weak little MacBook Air. Miri tests only run on Linux, since Miri tests an abstract memory model and there shouldn't be any difference on other platforms.
Since Rust's `Vec<u8>` and `[u8]` only promise alignment 1, we have to copy out of the buffer and onto the stack to get aligned access. This is a very minor performance hit that's incurred when scalars are accessed. The alternative would be to introduce a nonstandard type such as `Vec<u128>` or even a custom vector that promises 16-byte alignment. That's pretty inconvenient for us and users.
I had to rework the implementation of structs in Rust to use an array of bytes rather than `repr(C, align=4)`. This is because we return references to structs from fb-vectors, and these references are alignment 1. Please take a closer look at the new implementation.
Alignment checks from the verifier are now with respect to `buffer[0]` as per the Flatbuffers specification. They may not be aligned w.r.t. the system because of `[u8]`'s alignment. I kept this check in case a verified buffer in Rust is assumed aligned when passed to another language.
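The byte-array representation of structs described above can be sketched roughly as follows. `Vec2` and its accessors are hypothetical and written by hand for illustration; the code generated by this PR may differ in detail.

```rust
/// Hypothetical stand-in for a generated FlatBuffers struct with two `f32`
/// fields. Storing the fields as raw bytes gives the type alignment 1, so a
/// reference to it may point at any offset inside a `[u8]` buffer.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec2(pub [u8; 8]);

impl Vec2 {
    /// Serializes both fields in little-endian order.
    pub fn new(x: f32, y: f32) -> Self {
        let mut bytes = [0u8; 8];
        bytes[..4].copy_from_slice(&x.to_le_bytes());
        bytes[4..].copy_from_slice(&y.to_le_bytes());
        Vec2(bytes)
    }

    /// Copies the four bytes of `x` onto the stack before decoding them.
    pub fn x(&self) -> f32 {
        let mut b = [0u8; 4];
        b.copy_from_slice(&self.0[..4]);
        f32::from_le_bytes(b)
    }

    /// Copies the four bytes of `y` onto the stack before decoding them.
    pub fn y(&self) -> f32 {
        let mut b = [0u8; 4];
        b.copy_from_slice(&self.0[4..]);
        f32::from_le_bytes(b)
    }
}
```

Each accessor pays for a small copy, which is the per-access cost discussed in the conversation, in exchange for never dereferencing a misaligned `f32`.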