Skip to content
This repository has been archived by the owner on Oct 27, 2023. It is now read-only.

Ord implementations that don't implement a total order can lead to UB #2

Open
Voultapher opened this issue May 22, 2023 · 0 comments
Open

Comments

@Voultapher
Copy link

Voultapher commented May 22, 2023

Classically sort is seen as an operation that only changes the order of the elements. Modifying or duplicating elements is surprising behavior. And in Rust this can make code that relies on this property exhibit UB.

Below is an example of such an application, that when using slice::sort or slice::sort_unstable is fine and exhibits no UB. However when using crumsort::ParCrumSort::par_crumsort it exhibits use-after-free UB. That's because crumsort_rs and the original crumsort may duplicate elements if the comparison function does not implement a total order.

use std::cmp::Ordering;
use std::collections::HashSet;
use std::sync::Mutex;

struct ObjStore {
    scratch_arena: Vec<String>,
    occupied_indices: Vec<ObjIndex>,
}

impl ObjStore {
    fn new(values: &[usize]) -> Self {
        // Ensure that there are no duplicate values.
        let values_check = HashSet::<usize>::from_iter(values.iter().copied());
        assert_eq!(values_check.len(), values.len());

        let max_value = (*values.iter().max().unwrap()) + 1;
        let mut scratch_arena: Vec<String> = Vec::with_capacity(max_value);

        let scratch_arena_ptr = scratch_arena.as_mut_ptr();
        for idx in values {
            // SAFETY: We made sure that scratch_arena is large enough so that
            // scratch_arena_ptr.add(*idx) is a valid pointer.
            unsafe {
                scratch_arena_ptr.add(*idx).write(format!("{idx:010}"));
            }
        }

        Self {
            scratch_arena,
            occupied_indices: values.iter().map(|val| ObjIndex { val: *val }).collect(),
        }
    }
}

impl Drop for ObjStore {
    fn drop(&mut self) {
        let scratch_arena_ptr = self.scratch_arena.as_mut_ptr();
        for idx in &self.occupied_indices {
            // SAFETY: We made sure that scratch_arena is large enough and the ensured that
            // scratch_arena and occupied_indices stay in sync, so that
            // scratch_arena_ptr.add(idx.val) is a valid pointer.
            unsafe {
                std::ptr::drop_in_place(scratch_arena_ptr.add(idx.val));
            }
        }
    }
}

#[derive(Copy, Clone, PartialEq, Eq, Default)]
struct ObjIndex {
    val: usize,
}

const ORD_VALS: [Ordering; 8] = [
    Ordering::Less,
    Ordering::Equal,
    Ordering::Less,
    Ordering::Greater,
    Ordering::Greater,
    Ordering::Greater,
    Ordering::Less,
    Ordering::Equal,
];

static COMP_COUNT: Mutex<usize> = Mutex::new(0);

impl PartialOrd for ObjIndex {
    fn partial_cmp(&self, _other: &Self) -> Option<Ordering> {
        // A logically wrong Ord implementation that doesn't implement a total order is considered
        // safe code in Rust.

        // This is here to have something simple and deterministic, real code would be more complex.
        let comp_count = {
            let mut c = COMP_COUNT.lock().unwrap();
            let comp_count = *c;
            *c += 1;
            comp_count
        };

        Some(ORD_VALS[comp_count % ORD_VALS.len()])
    }
}

impl Ord for ObjIndex {
    fn cmp(&self, other: &Self) -> Ordering {
        self.partial_cmp(other).unwrap()
    }
}

fn main() {
    let mut obj_store = ObjStore::new(&[218, 47, 7, 124, 163, 135, 222, 36, 31, 212]);

    // Duplicates elements -> UB
    crumsort::ParCrumSort::par_crumsort(obj_store.occupied_indices.as_mut_slice());

    // works
    // obj_store.occupied_indices.sort();
    // obj_store.occupied_indices.sort_unstable();
}

And while crumsort::ParCrumSort::par_crumsort doesn't directly trigger UB here, it's entirely conceivable that a user uses indices in a similar fashion in a real application, and that they accidentally implement that looks a lot like a total order but isn't under some circumstances. This topic has been discussed in the context of the standard library sort implementations and the consensus was that not only is this behavior surprising and may lead to other logic issues down the line, it may also lead to UB as seen here.

I discovered this issue by running crumsort_rs in my custom sort implementation test harness https://github.com/Voultapher/sort-research-rs/blob/3d5175737ccc426cbf186ddffcf2f6344171dd98/sort_test_tools/src/tests.rs wich has tests specifically for this property, e.g. violate_ord_retain_original_set.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant