Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Rust UB problems #6393

Merged
merged 11 commits into from
Jan 11, 2021
Merged

Fix Rust UB problems #6393

merged 11 commits into from
Jan 11, 2021

Conversation

CasperN
Copy link
Collaborator

@CasperN CasperN commented Jan 8, 2021

Hi @rw, @aardappel, @krojew

This should fix #6359, #6203, #5854, #5825, and the last issue mentioned in #4916.

Motivation

Miri is a rust interpreter that detects bad memory activity, i.e. UB. I've fixed the detected problems, which turned out to be alignment issues, and added it to RustTest.sh. With the verifier and UB issues sorted (so far as a core Rust tool can detect) I think the library is now acceptably safe according to the common standards of the Rust community.

Details

I didn't run all tests through Miri. Its incredibly slow so I've disabled it for fuzzing and other expensive tests.
verifier_one_byte_errors_do_not_crash is disabled but I did run it manually since its quite important. That took like an hour on my weak little macbook air. Miri tests are only running on linux since it tests an abstract memory model and there shouldn't be any difference on other platforms.

Since Rust's Vec<u8> and [u8] only promises alignment 1, we have to copy out of the buffer and onto the stack to get aligned access. This is a very minor performance hit that's incurred when scalars are accessed. The alternative would be to introduce a nonstandard type such as Vec<u128> or even a custom vector that promises 16byte alignment. That's pretty inconvenient for us and users.

I had to rework the implementation of structs in rust to use an array of bytes rather than repr(C, align=4). This is because we return references to structs from fb-vectors, and these references are alignment 1. Please take a closer look at the new implementation.

Alignment checks from the verifier are now with respect to buffer[0] as per the Flatbuffers specification. They may not be aligned w.r.t. the system because of [u8]'s alignment. I kept this check in case a verified buffer in Rust is assumed aligned when passed to another language.

@github-actions github-actions bot added c++ codegen Involving generating code from schema rust labels Jan 8, 2021
@krojew
Copy link
Contributor

krojew commented Jan 8, 2021

Memory copying in each struct accessor is a bit worrying, although I understand the motivation. We could end up in the same situation as in Java, where we have a performance trap with strings.

@aardappel
Copy link
Collaborator

I am pretty surprised Rust doesn't have a built-in way to have aligned buffers of u8. That is pretty fundamental for any serialized data, file formats, memory mapping etc. No Rust allocator will align at 1 byte granularity so it isn't even a practical issue.

There seems to be a lot in Rust that favors being pedantic over pragmatism, that is sad.

@CasperN
Copy link
Collaborator Author

CasperN commented Jan 8, 2021

No Rust allocator will align at 1 byte granularity so it isn't even a practical issue.

Rust by default uses jemalloc and it does actually take advantage of lower alignment requirements to reduce fragmentation. This is why we have users observing alignment errors (#6359).

I am pretty surprised Rust doesn't have a built-in way to have aligned buffers of u8

I mean you can get aligned raw pointers via std::alloc but its lower level than what most people prefer to use.

There'll probably be a way to parameterize Vec<u8> with an allocator that guarantees minimum 16byte alignment, and eventually maybe even a way for us to restrict allowable allocators via the type system. That doesn't exist yet but there's a working group talking about it.

https://github.com/rust-lang/wg-allocators/issues

The biggest issue is &[u8]. That's a very fundamental and widely used type and I don't see a good way for &[u8] promise a minimum alignment since you can take arbitrary subslices.

@CasperN
Copy link
Collaborator Author

CasperN commented Jan 10, 2021

Ok, looks like all tests pass, can I get LGTMs or 👍s?

@CasperN CasperN requested a review from rw January 10, 2021 19:46
@rw
Copy link
Collaborator

rw commented Jan 10, 2021

@CasperN Nice work!

What are the benchmarks like with these changes? I wonder if representing structs as arrays will cause the compiler to use the stack more than it should (instead of registers). And I'm curious about the effects of the copying that @krojew mentioned.

@CasperN
Copy link
Collaborator Author

CasperN commented Jan 10, 2021

@rw

What are the benchmarks like with these changes?

There's some performance regression on my machine, but its not too significant, within the "+/- interval."

Master:

running 16 tests
test create_byte_vector_100_naive       ... bench:         856 ns/iter (+/- 93) = 116 MB/s
test create_byte_vector_100_optimal     ... bench:          28 ns/iter (+/- 4) = 3571 MB/s
test create_canonical_buffer_then_reset ... bench:       1,096 ns/iter (+/- 170) = 277 MB/s
test create_string_10                   ... bench:          35 ns/iter (+/- 25) = 285 MB/s
test create_string_100                  ... bench:          35 ns/iter (+/- 7) = 2828 MB/s
test traverse_canonical_buffer          ... bench:         128 ns/iter (+/- 35) = 2375 MB/s

This branch:

running 16 tests
test create_byte_vector_100_naive       ... bench:         907 ns/iter (+/- 181) = 110 MB/s
test create_byte_vector_100_optimal     ... bench:          29 ns/iter (+/- 4) = 3448 MB/s
test create_canonical_buffer_then_reset ... bench:       1,120 ns/iter (+/- 173) = 267 MB/s
test create_string_10                   ... bench:          30 ns/iter (+/- 6) = 333 MB/s
test create_string_100                  ... bench:          36 ns/iter (+/- 6) = 2750 MB/s
test traverse_canonical_buffer          ... bench:         127 ns/iter (+/- 25) = 2362 MB/s

I wonder if representing structs as arrays will cause the compiler to use the stack more than it should (instead of registers).

Maybe, but I wouldn't look into it. The regression is very tolerable, given we're eliminating UB.

Anything to address before merging?

@krojew
Copy link
Contributor

krojew commented Jan 11, 2021

Looks good.

@CasperN CasperN merged commit 408cf58 into google:master Jan 11, 2021
@rw
Copy link
Collaborator

rw commented Jan 11, 2021

Nice!

@rw
Copy link
Collaborator

rw commented Jan 11, 2021

@CasperN WDYT about us running miri for all tests sometimes?

@CasperN
Copy link
Collaborator Author

CasperN commented Jan 12, 2021

WDYT about us running miri for all tests sometimes?

It might be good idea but I don't know how long it takes, Miri seems around 5 orders of magnitude slower than normal rust.

Comment on lines 151 to 161
#[inline]
pub fn emplace_scalar<T: EndianScalar>(s: &mut [u8], x: T) {
let sz = size_of::<T>();
let mut_ptr = (&mut s[..sz]).as_mut_ptr() as *mut T;
let val = x.to_little_endian();
let x_le = x.to_little_endian();
unsafe {
*mut_ptr = val;
core::ptr::copy_nonoverlapping(
&x_le as *const T as *const u8,
s.as_mut_ptr() as *mut u8,
size_of::<T>()
);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing checking the length of s and still allows reading uninitialized padding bytes of structs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "still allows reading uninitialized padding bytes of structs"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct Struct {
    x: u8,
    y: u16,
}

This struct has one padding byte, which is uninitialized. Implementing EndianScalar for that struct would allow reading such uninitialized byte.

Comment on lines 172 to +183
#[inline]
pub fn read_scalar<T: EndianScalar>(s: &[u8]) -> T {
let sz = size_of::<T>();

let p = (&s[..sz]).as_ptr() as *const T;
let x = unsafe { *p };

let mut mem = core::mem::MaybeUninit::<T>::uninit();
// Since [u8] has alignment 1, we copy it into T which may have higher alignment.
let x = unsafe {
core::ptr::copy_nonoverlapping(
s.as_ptr(),
mem.as_mut_ptr() as *mut u8,
size_of::<T>()
);
mem.assume_init()
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing checking the length of s and allows reading invalid representations of T (e.g., when T is an enum).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For length checks, we first verify the flatbuffer before accessing fields -- see #6161 for implementation. Granted, if you directly use the flatbuffers library, you can still misuse this function, but it should only be used by our generated code.

As for enums, I changed the representation in #6098 so Flatbuffers enums won't be represented by Rust enums, but unit structs of i8/u8/.../i64/u64 so all bit representations are valid.

Copy link

@eduardosm eduardosm Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Granted, if you directly use the flatbuffers library, you can still misuse this function, but it should only be used by our generated code.

Isn't the purpose of Rust to avoid UB even when (safe) functions are use incorrectly?

As for enums, I changed the representation in #6098 so Flatbuffers enums won't be represented by Rust enums, but unit structs of i8/u8/.../i64/u64 so all bit representations are valid.

There's still bool.

@rw
Copy link
Collaborator

rw commented Jan 15, 2021

@CasperN WDYT about us marking generated functions as unsafe, if they use unsafe? To @eduardosm 's point, it seems like users could potentially think that it's okay to call those functions themselves, when in general we don't want them to.

Edit: or traits as unsafe.

@CasperN
Copy link
Collaborator Author

CasperN commented Jan 15, 2021

WDYT about us marking generated functions as unsafe, if they use unsafe? To @eduardosm 's point, it seems like users could potentially think that it's okay to call those functions themselves, when in general we don't want them to.

Yea that seems like a simple solution to this

alamb pushed a commit to apache/arrow that referenced this pull request Jan 19, 2021
… problems fixed

The major change of [flatbuffers 0.8.1](https://docs.rs/flatbuffers/0.8.1/flatbuffers/index.html) since 0.8.0 is google/flatbuffers#6393, which fixed some possible memory alignment issues.

In this PR, the ipc/gen/*.rs files are generated by `regen.sh` as before, without any manual change.

Closes #9176 from mqy/flatbuffers-0.8.1

Authored-by: mqy <meng.qingyou@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
kszucs pushed a commit to apache/arrow that referenced this pull request Jan 25, 2021
… problems fixed

The major change of [flatbuffers 0.8.1](https://docs.rs/flatbuffers/0.8.1/flatbuffers/index.html) since 0.8.0 is google/flatbuffers#6393, which fixed some possible memory alignment issues.

In this PR, the ipc/gen/*.rs files are generated by `regen.sh` as before, without any manual change.

Closes #9176 from mqy/flatbuffers-0.8.1

Authored-by: mqy <meng.qingyou@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
alamb pushed a commit to apache/arrow-rs that referenced this pull request Apr 20, 2021
… problems fixed

The major change of [flatbuffers 0.8.1](https://docs.rs/flatbuffers/0.8.1/flatbuffers/index.html) since 0.8.0 is google/flatbuffers#6393, which fixed some possible memory alignment issues.

In this PR, the ipc/gen/*.rs files are generated by `regen.sh` as before, without any manual change.

Closes #9176 from mqy/flatbuffers-0.8.1

Authored-by: mqy <meng.qingyou@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
… problems fixed

The major change of [flatbuffers 0.8.1](https://docs.rs/flatbuffers/0.8.1/flatbuffers/index.html) since 0.8.0 is google/flatbuffers#6393, which fixed some possible memory alignment issues.

In this PR, the ipc/gen/*.rs files are generated by `regen.sh` as before, without any manual change.

Closes apache#9176 from mqy/flatbuffers-0.8.1

Authored-by: mqy <meng.qingyou@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 13, 2021
… problems fixed

The major change of [flatbuffers 0.8.1](https://docs.rs/flatbuffers/0.8.1/flatbuffers/index.html) since 0.8.0 is google/flatbuffers#6393, which fixed some possible memory alignment issues.

In this PR, the ipc/gen/*.rs files are generated by `regen.sh` as before, without any manual change.

Closes apache#9176 from mqy/flatbuffers-0.8.1

Authored-by: mqy <meng.qingyou@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
@CasperN CasperN deleted the miri2 branch August 24, 2021 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ codegen Involving generating code from schema rust
Projects
None yet
5 participants