Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Unexpected key alterations when writing in LMDB can be caused by an undefined behavior #261

Closed
irevoire opened this issue Jun 24, 2021 · 10 comments · Fixed by #264
Closed
Labels
bug Something isn't working

Comments

@irevoire
Copy link
Member

irevoire commented Jun 24, 2021

Hello,

I was trying to reproduce this bug meilisearch/transplant#237, so I made a simple binary running only the necessary command to reproduce the bug, you can find it here: https://github.com/irevoire/milli_bug. I use this repository to run all of the commands below.

I also inserted a lot of debug print in heed (the LMDB Rust wrapper) so, as you can see in the Cargo.toml, I'm using my own version of heed that you can find here: https://github.com/irevoire/heed, on the "main" branch.

So basically, to summarize what I found, at some point, milli tries to insert an entry with a key equal to "and" by calling heed::mdb::lmdb_ffi::mdb_cursor_put, but in the end, the &[u8] milli gave to mdb_cursor_put got modified and contains "toi". I also find out that this "toi" string is a view inside of the word "antoine", I didn't check but it is likely that it is in the same memory allocation.

It seems like the Mdb_val struct, which is the equivalent of a slice, i.e. a pointer and a length, is modified in a way that it moves the original pointer forward and makes us read memory at the wrong address, making us read "toi" instead of "and", but we keep the correct length. Also, I checked this recent Reddit post about a similar bug, I tried to patch the RwCursor::put_current method, by putting the &mut inside of the unsafe block but it didn't change anything, at least it didn't fix the bug.

@Kerollmops already tried to run cargo miri on the heed project but, unfortunately, as it is a wrapper on top of LMDB and miri doesn't check memory mapping correctness, it can't be checked with this great tool.

If you want to reproduce the bug, you can clone my repository and run cargo build. Then run rust-gdb target/debug/milli_bug then:

  • Put a breakpoint in b lmdb_ffi.rs:118
  • Run the executable: r
  • Continue till the second indexation: c
  • Look at the variable named keyx/s (*key).mv_data, the value should be and.
  • Go just after the call to mdb_cursor_put: n 5
  • Look at the variable named key: x/s key.data_ptr, the value should be toi.

This is not supposed to be possible since LMDB does not modify this variable in place, at least that's what we think, but LMDB is a well-battle-tested library.

@Kerollmops Kerollmops added the bug Something isn't working label Jun 24, 2021
@Kerollmops Kerollmops changed the title Undefined Behaviour? An undefined behavior could be the cause of unexpected key alterations when writing in LMDB Jun 24, 2021
@Kerollmops Kerollmops changed the title An undefined behavior could be the cause of unexpected key alterations when writing in LMDB Unexpected key alterations when writing in LMDB can be caused by an undefined behavior Jun 24, 2021
@Diggsey
Copy link

Diggsey commented Jun 24, 2021

Don't transmute pointers to integers. Just cast them with as: https://github.com/irevoire/heed/blob/b93ff8d4b185e0984396e6778c03f0677957f3eb/heed/src/mdb/lmdb_ffi.rs#L175

There's a chance transmuting pointers to integers could cause LLVM miscompilations (if this turns out to be the cause please LMK!)

@irevoire
Copy link
Member Author

irevoire commented Jun 24, 2021

Thanks for the answer! That was our original code actually but I tried to use a transmute hoping it would change something... It didn't 😔

EDIT: I patched heed to use value.as_ptr() as *mut libc::c_void instead of std::mem::transmute(value.as_ptr()), you can just run cargo update in milli_bug 😁

@Kerollmops
Copy link
Member

Kerollmops commented Jun 25, 2021

Hey @RalfJung,

Can I ask you for a little bit of help on this one? Do you have time to help us?

I am sure the bug is coming from an undefined behavior somewhere, maybe in the RwCursor::put_current function (we use the v0.10.6 tag) as it occurs around it. As explained in the original issue comment I tried to run cargo miri on heed, where there is a lot of unsafe blocks, unfortunately, it is an LMDB wrapper with a lot of memory mapping everywhere and cargo miri doesn't support that.

You can easily reproduce the bug by cloning the miri_bug repository and do a cargo run. The program should panic with a key < &last_key invalid assertion. The reason for this is that somewhere inside of the RwCursor::put_current method (from the heed crate) the key argument is modified in a way that LMDB sees the original "and" key becomes "toi" and writes it inside of the database. Further engine operation sees an unordered "toi" key and makes this assert to panic.

@wagnerf42
Copy link

a watchpoint in gdb should catch the change no ?

@irevoire
Copy link
Member Author

irevoire commented Jun 25, 2021

Yes, it was a good idea!
We probably found the bug. It looks like there is a bug in lmdb, when we call mdb_cursor_put, it modifies our key in place instead of... like... not doing it?
So, either we are not using lmdb right, or there is a bug in it.
Monday, we'll try to reproduce the bug in pure C to create a minimal bug report for Howard Chu.

@RalfJung
Copy link

RalfJung commented Jun 25, 2021

I'm afraid I don't really have the time to dig into large foreign codebases like this. I'm happy to answer specific questions such as "is this small self-contained bit of code UB under these circumstances". :)

But it looks like you got a lead, so I hope you can figure it out that way. :D

@Pointerbender
Copy link

It seems like the Mdb_val struct, which is the equivalent of a slice, i.e. a pointer and a length, is modified in a way that it moves the original pointer forward and makes us read memory at the wrong address, making us read "toi" instead of "and", but we keep the correct length.
(...)
This is not supposed to be possible since LMDB does not modify this variable in place, at least that's what we think, but LMDB is a well-battle-tested library.

LMDB can modify the position of the cursor in case of a failure ("The cursor is positioned at the new item, or on failure usually near it." docs). However, in this case ffi::mdb_cursor_put() returns 0 (success) so it should not move the cursor. If I add some extra debug statements before and after the call to ffi::mdb_cursor_put(cursor, key, data, flags), then the pointer address stored at (*key).mv_data remains the same before and after the call. This means that LMDB indeed does not move the pointer. Rather, somehow the bytes located at *(*key).mv_data appear to be overwritten. It's not much yet and I'm not sure if it helps at all, but hopefully it's a bread crumb that will eventually lead closer to the solution :-)

@irevoire
Copy link
Member Author

Yes, thanks for your investigation!
We noticed the same thing, the pointer does not change, and the data inside key.mv_data get modified. I can even tell you that mdb_cursor_put modify the data at this line exactly:

	memmove(base + sz, base, ptr - mp->mp_upper);

From the function mdb_node_del that was called by at this line in the function mdb_cursor_put
But I still have no idea of why lmdb needs to modify my key 😠

@Pointerbender
Copy link

Thanks for those pointers into the LMDB code base. I have a theory as to why this is happening and why this is undefined behavior on the Rust side.

When the call to ffi::mdb_cursor_put changes the key buffer from "and" to "toi", I believe the following is happening:

It ends up calling mdb_node_del, because even though the key length of 3 stays the same, the length of the data is different (see line 6915). To be more precise, before the insert under observation, the length of the data that is sitting in the B+ tree of LMDB at the current cursor position has a length of 20. The newly inserted data has a length of 18.

As per the comment related to line 6915, LMDB chooses to shorten the data entry in the B+ tree to reclaim unused space on the leaf page (2 bytes). It does this by first deleting the entry with key "and" (the call to memmove that you found, moves all the nodes to the right of the current entry one entry to the left, overwriting the memory area where the "and" node used to be).

Now comes the tricky part. The instances of the MDB_val structs (*key and *value) remain immutable, because the cursor is not moved in case mdb_cursor_put returns 0 (success). But the data located at *(*key).mv_data and *(*data).mv_data do get modified, because of the node being removed. Just by itself, this not undefined behavior yet, because the memory that holds the B+tree page is disjoint from the MDB_val structs.

So where does the UB then come from? Let's take a look at the method signature of pub fn put_current(&mut self, key: &[u8], data: &[u8]) -> Result<bool>. The key part (pun intended) is that key is an immutable reference &[u8]. What this tells the Rust compiler, is that the memory at *key will not be changed anywhere, we have shared read-only access to these bytes. The Rust compiler makes optimizations based on this assumption (hence the call to dbg!(key2); in put_current shows key2 = [97, 110, 100,] = b"and", because the compiler believes it could not have been modified since it was passed to put_current as an immutable argument, and does not re-read the bytes at this memory location).

(Bear with me, I'm getting to the part how this causes UB :-) ). Next, let look at line let mut key_val = unsafe { crate::into_val(&key) }; and the body of that method:

pub unsafe fn into_val(value: &[u8]) -> ffi::MDB_val {
    ffi::MDB_val {
        mv_data: value.as_ptr() as *mut libc::c_void,
        mv_size: value.len(),
    }
}

value.as_ptr() returns a *const u8 pointer which is cast to a *mut libc::c_void pointer. So far, still no undefined behavior. Pointer casts are not UB (in fact, this part is safe Rust). However, modifying the bytes pointed to by a pointer that was derived from a shared reference &[u8] (which is read-only), is definitely UB. Now, we recall that mdb_node_del indeed modifies the bytes at this location. This is UB under the promise of the API signatures of put_current and into_val, where these bytes are claimed to be immutable.

If we had to pinpoint the single point where this goes wrong, then where would it be? I don't believe LMDB is at fault. The documentation at mdb_cursor_put makes no claims about immutability of the MDB_val nor immutability of the memory location it points to. If we look at it's FFI signature pub unsafe fn mdb_cursor_put(cursor: *mut MDB_cursor, key: *mut MDB_val, data: *mut MDB_val, flags: ::libc::c_uint) -> ::libc::c_int, it is also semantically correct in the sense that anyone can read and write to a *mut pointer location. The *mut MDB_val type signals us that the memory at this location may be written to, but this says nothing about the memory location of the node in the B+ tree, it merely says this about the MDB_val struct itself. If we look at the definition of MDB_val, we see that its field mv_data also holds a mutable pointer *mut c_void, so this is also semantically correct when mdb_node_del decides to modify the memory behind mv_data.

The sequence of events that eventually triggers the UB, is that it takes an immutable reference reference &[u8] and derives a shared read-only *mut pointer from it (through into_val), which then actually gets used by LMDB to modify the bytes behind the &[u8] reference. Even though the UB happens inside LMDB, it is a side effect of calling the mdb_cursor_put function when passing a shared read-only *mut pointer from Rust, where a shared read-write *mut pointer is expected. Casting a *const pointer to a *mut pointer, does not automatically make it have "shared read-write" status. It is in fact still a shared read-only *const pointer disguised as a *mut pointer.

But I still have no idea of why lmdb needs to modify my key

As to why LMDB modifies the key, it's because the entire node for that particular key is removed due to the shorter data value that is being upserted (and then a new shorter node is re-inserted, but the program is then already in a state of UB, so all bets are off).

I hope this analysis helps!

@Diggsey
Copy link

Diggsey commented Jun 26, 2021

@Pointerbender I'm not too familiar with the LMDB API, but what you are describing should only be a problem if the key being passed in was borrowed directly from an existing entry.

Specifically, there's this line:

Values returned from the database are valid only until a subsequent update operation, or the end of the transaction.

If it's the case that key is borrowed directly from an existing entry, then the UB here is caused by yielding a reference to the key with an overly long lifetime from an earlier place in the code.

If key is not being borrowed from an existing entry, then mdb_cursor_put should not modify it. The call to mdb_node_del should only modify memory owned by LMDB. Also, the parameter is specifically annotated as an [in] parameter. The fact that it's a mutable pointer means nothing - it is fairly common in C++ to use mutable pointers where immutable ones would suffice.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants