Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Undefined behavior during decoding #75

Closed
StarStarJ opened this issue Nov 27, 2024 · 14 comments · Fixed by #76
Closed

Undefined behavior during decoding #75

StarStarJ opened this issue Nov 27, 2024 · 14 comments · Fixed by #76

Comments

@StarStarJ
Copy link

Tried to use this crate as an alternative to the zstd crate, to allow tests running with miri, but it crashed due to UB:

test test::my_test ... error: Undefined Behavior: using uninitialized data, but this operation requires initialized memory
   --> /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/ringbuffer.rs:462:30
    |
462 |             .write_unaligned(src.0.cast::<CopyType>().read_unaligned())
    |                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ using uninitialized data, but this operation requires initialized memory
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE on thread `test::my_test`:
    = note: inside `ruzstd::decoding::ringbuffer::copy_bytes_overshooting` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/ringbuffer.rs:462:30: 462:71
    = note: inside `ruzstd::decoding::ringbuffer::RingBuffer::extend_from_within_unchecked` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/ringbuffer.rs:307:17: 307:62
    = note: inside `ruzstd::decoding::decodebuffer::DecodeBuffer::repeat_in_chunks` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/decodebuffer.rs:153:17: 154:72
    = note: inside `ruzstd::decoding::decodebuffer::DecodeBuffer::repeat` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/decodebuffer.rs:108:17: 108:71
    = note: inside `ruzstd::decoding::sequence_execution::execute_sequences` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/sequence_execution.rs:75:13: 77:65
    = note: inside `ruzstd::decoding::block_decoder::BlockDecoder::decompress_block::<&mut &mut &[u8]>` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/block_decoder.rs:448:13: 448:41
    = note: inside `ruzstd::decoding::block_decoder::BlockDecoder::decode_block_content::<&mut &mut &[u8]>` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/decoding/block_decoder.rs:354:17: 354:65
    = note: inside `ruzstd::frame_decoder::FrameDecoder::decode_blocks::<&mut &[u8]>` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/frame_decoder.rs:416:44: 417:94
    = note: inside `<ruzstd::streaming_decoder::StreamingDecoder<&[u8], ruzstd::frame_decoder::FrameDecoder> as std::io::Read>::read` at /home/me/.cargo/registry/src/index.crates.io-6f17d22bba15001f/ruzstd-0.7.2/src/streaming_decoder.rs:116:19: 119:14
    = note: inside `std::io::default_read_to_end::small_probe_read::<ruzstd::streaming_decoder::StreamingDecoder<&[u8], ruzstd::frame_decoder::FrameDecoder>>` at /home/me/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:424:19: 424:37
    = note: inside `std::io::default_read_to_end::<ruzstd::streaming_decoder::StreamingDecoder<&[u8], ruzstd::frame_decoder::FrameDecoder>>` at /home/me/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:439:20: 439:44
    = note: inside `<ruzstd::streaming_decoder::StreamingDecoder<&[u8], ruzstd::frame_decoder::FrameDecoder> as std::io::Read>::read_to_end` at /home/me/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:876:9: 876:45

The invoking code is not really special:

let mut uncompressed_file = <Vec<u8>>::new();
#[cfg(feature = "pure_rust_zstd")]
{
    let mut decoder =
        ruzstd::StreamingDecoder::new(&file[offset..offset + size])?;
    decoder.read_to_end(&mut uncompressed_file)?;
}
#[cfg(not(feature = "pure_rust_zstd"))]
{
    let mut dec = zstd::Decoder::new(&file[offset..offset + size])?;
    dec.read_to_end(&mut uncompressed_file)?;
    dec.finish();
}

where offset might be an arbitrary offset within a bigger buffer

@paolobarbolini
Copy link
Contributor

paolobarbolini commented Nov 27, 2024

I was able to reproduce this by running tests with miri (cargo miri test -- short_writer). I'll try fixing it if nobody else gets to it faster.

@paolobarbolini
Copy link
Contributor

paolobarbolini commented Nov 27, 2024

EDIT: the first point ends up being wrong, but fixing it allowed miri to continue and expose the real problem in point 2.

Here's what I found for now:

  1. The memory copy at the end has to be done with MaybeUninit if some of the memory may be uninitialized.
  2. Fixing the first issue breaks another thing: we try to read past the heap allocation, which provenance does not allow. I'm still not sure how to fix this part. See comments in the diff.
diff --git a/src/decoding/ringbuffer.rs b/src/decoding/ringbuffer.rs
index e364d90..92f2987 100644
--- a/src/decoding/ringbuffer.rs
+++ b/src/decoding/ringbuffer.rs
@@ -1,5 +1,5 @@
 use alloc::alloc::{alloc, dealloc};
-use core::{alloc::Layout, ptr::NonNull, slice};
+use core::{alloc::Layout, mem::MaybeUninit, ptr::NonNull, slice};
 
 pub struct RingBuffer {
     // Safety invariants:
@@ -295,6 +295,18 @@ impl RingBuffer {
     /// 2. More then len reserved space so we do not write out-of-bounds
     #[warn(unsafe_op_in_unsafe_fn)]
     pub unsafe fn extend_from_within_unchecked(&mut self, start: usize, len: usize) {
+        std::println!("-- extend_from_within_unchecked --");
+        std::println!("buf ptr: {}..{}", self.buf.as_ptr() as usize, unsafe {
+            self.buf.as_ptr().add(self.cap)
+        }
+            as usize);
+        std::println!("len: {}", self.len());
+        std::println!("capacity: {}", self.cap);
+        std::println!("head: {}", self.head);
+        std::println!("tail: {}", self.tail);
+        std::println!("copy start: {}", start);
+        std::println!("copy len: {}", len);
+
         if self.head < self.tail {
             // continous data slice  |____HDDDDDDDT_____|
             let after_tail = usize::min(len, self.cap - self.tail);
@@ -307,7 +319,22 @@ impl RingBuffer {
                 copy_bytes_overshooting(src, dst, after_tail);
 
                 if after_tail < len {
-                    let src = (src.0.add(after_tail), src.1 - after_tail);
+                    let src = (
+                        src.0.add(after_tail),
+                        if (src.1 - after_tail) == 93 {
+                            // The next issue that comes up is that one of the calls
+                            // to `copy_bytes_overshooting` gives a src len of 93,
+                            // which is incorrect because we have just 10 bytes
+                            // before the end of the allocation. Giving it a correct
+                            // value makes it happy.
+                            // Maybe it's ok to give it 93 but then something is wrong
+                            // when handling the wrapping around?
+                            std::println!("INCORRECTLY CALCULATED");
+                            10
+                        } else {
+                            src.1 - after_tail
+                        },
+                    );
                     let dst = (self.buf.as_ptr(), self.head);
                     copy_bytes_overshooting(src, dst, len - after_tail);
                 }
@@ -455,11 +482,23 @@ unsafe fn copy_bytes_overshooting(
     const COPY_AT_ONCE_SIZE: usize = core::mem::size_of::<CopyType>();
     let min_buffer_size = usize::min(src.1, dst.1);
 
+    std::println!("src ptr: {}", src.0 as usize);
+    std::println!("src len: {}", src.1);
+    std::println!("dst ptr: {}", dst.0 as usize);
+    std::println!("dst len: {}", dst.1);
+
     // Can copy in just one read+write, very common case
     if min_buffer_size >= COPY_AT_ONCE_SIZE && copy_at_least <= COPY_AT_ONCE_SIZE {
+        /*
         dst.0
             .cast::<CopyType>()
-            .write_unaligned(src.0.cast::<CopyType>().read_unaligned())
+            .write_unaligned(src.0.cast::<CopyType>().read_unaligned());
+        */
+
+        // We need `MaybeUninit` because part of the memory we are copying may be uninitialized
+        dst.0
+            .cast::<MaybeUninit<CopyType>>()
+            .write_unaligned(src.0.cast::<MaybeUninit<CopyType>>().read_unaligned());
     } else {
         let copy_multiple = copy_at_least.next_multiple_of(COPY_AT_ONCE_SIZE);
         // Can copy in multiple simple instructions

@paolobarbolini
Copy link
Contributor

I think I get what's happening. This debug_assert seems to catch the issue:

diff --git a/src/decoding/ringbuffer.rs b/src/decoding/ringbuffer.rs
index e364d90..573cbea 100644
--- a/src/decoding/ringbuffer.rs
+++ b/src/decoding/ringbuffer.rs
@@ -309,7 +309,11 @@ impl RingBuffer {
                 if after_tail < len {
                     let src = (src.0.add(after_tail), src.1 - after_tail);
                     let dst = (self.buf.as_ptr(), self.head);
-                    copy_bytes_overshooting(src, dst, len - after_tail);
+                    let copy_at_least = len - after_tail;
+                    debug_assert!(
+                        (self.head + start) + after_tail + src.1 + copy_at_least <= self.cap
+                    );
+                    copy_bytes_overshooting(src, dst, copy_at_least);
                 }
             }
         } else {

@paolobarbolini
Copy link
Contributor

Could anyone help me test #76?
I found the RingBuffer very low level so I went by intuition and guessing.
It'd be useful to test this with miri on real world .zst files. I'm currently waiting for MIRIFLAGS=-Zmiri-disable-isolation cargo miri test to finish on the test_decode_corpus_files test. Testing with debug assertions could also be useful, if miri is too slow, at least to check that the subtractions I've added can't underflow.

@paolobarbolini
Copy link
Contributor

Meanwhile I've opened a draft for the RustSec advisory at rustsec/advisory-db#2147

@Shnatsel
Copy link
Contributor

I tried to reproduce this on latest master (without the PR with the fix) under Memory Sanitizer, which should catch any reads from uninitialized memory. However, I was not able to trigger this issue.

I wonder if this is a limitation of the fuzzing harness? Could someone try this code under miri and let me know if the issue is observable under miri instead of MSAN?

let mut frame_dec = frame_decoder::FrameDecoder::new();
match frame_dec.reset(&mut content){
Ok(_) => {
let _ = frame_dec.decode_blocks(&mut content,frame_decoder::BlockDecodingStrategy::All);
}
Err(_) => {/* nothing */}
}

@paolobarbolini
Copy link
Contributor

I'm a noob at this, but tests were able to catch the issue both with Memory Sanitizer and Address Sanitizer. I'm not sure how to run fuzzing with miri, but given enough iterations the default config may catch it on decode or interop? I'll see tomorrow.

@Shnatsel
Copy link
Contributor

There's no way to run fuzzing with miri.

@KillingSpark
Copy link
Owner

Sorry for chiming in so late, that's definitely something to wake up to.

I think @paolobarbolini has found and fixed the issue in #76 (Thanks!!)

I'll look into including miri in the CI to hopefully catch something like this earlier in the future.

@KillingSpark
Copy link
Owner

KillingSpark commented Nov 28, 2024

Update:

@KillingSpark
Copy link
Owner

@Shnatsel I wanted to share my thoughts on your input regarding the sanitizers and fuzzing

I wonder if this is a limitation of the fuzzing harness? Could someone try this code under miri and let me know if the issue is observable under miri instead of MSAN?

Actually hitting this unintentionally is pretty unlikely. This only happens when

  • The tail of the ring buffer has wrapped around but the head hasn't
  • A data region is being copied that is also split or comes very close to the end of the allocation

Making the fuzzer generate valid sequences isn't easy, the format is pretty unforgiving in that regard. So the fuzzer doesn't really run this code a lot. In combination this explains why the fuzzer doesn't easily hit this UB.

Thetest_decode_corpus_files test contains at least one file that runs into this under miri (before the fix). It does take a lot of time to get to it though. Does msan maybe catch that? Those files are generated by a tool of the original zstd project, meant to provoke uncommon situations in decoders.

It could be worthwile to fuzz the ringbuffer/sequence-execution separately by generating random sequences though.

@Shnatsel
Copy link
Contributor

@KillingSpark thanks, those steps all seem correct to me!

If miri proves to be too slow, tests with Address Sanitizer should cover most of the same issues but run much faster.

Regarding fuzzing: I've changed the fuzzing harness to use a StreamingDecoder locally and the fuzzer triggered the issue in under a minute. This issue would have been found a lot sooner if the fuzzing harness was representative of how this library is typically used. I'm going to make a PR for that, and run more fuzzing just to be sure.

@Shnatsel
Copy link
Contributor

This issue has been assigned RUSTSEC-2024-0400. The Github Security Advisory system should pick it up in a few hours.

@StarStarJ
Copy link
Author

StarStarJ commented Nov 29, 2024

ASan cannot detect uninitialized memory read/writes sadly.

One alternative to MSan is using valgrind (with memcheck), but it's also pretty slow, but is easier to use than MSan (and miri if non-rust code is used -- not interesting for this crate). For my use case valgrind is faster than miri

(But I'd still recommend Miri, since Rust defines things like two mutable refs to the same address as UB, which can easily happen when dereferencing pointers)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants