Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rust] Generated enums can cause undefined behavior #5467

Closed
jean-airoldie opened this issue Jul 30, 2019 · 28 comments
Closed

[rust] Generated enums can cause undefined behavior #5467

jean-airoldie opened this issue Jul 30, 2019 · 28 comments
Labels

Comments

@jean-airoldie
Copy link
Contributor

jean-airoldie commented Jul 30, 2019

Just remembered that a discriminant absent from an C-like enum is considered undefined behaviour in rust. This means that all currently generated enums could potentially cause undefined behavior if someone adds a new enum field and sends that buffer to someone using a older version of the schema.

I think the ideal solution would be to use a standard enums and add the appropriate TryFrom implementation (e.g. TryFrom<u8>). Maybe we should returned the unmatched integer as the error type.

@jean-airoldie
Copy link
Contributor Author

I'll take a look at the code generator tomorrow if I find the time.

@rw
Copy link
Collaborator

rw commented Jul 30, 2019

@jean-airoldie Great! You think this is better than using the non-exhaustive attribute you mentioned in #5405 ?

@jean-airoldie
Copy link
Contributor Author

jean-airoldie commented Jul 31, 2019

I think the two should be used together. The problem with only using the non-exhaustive is that it does not prevent UB. As soon as an enum is created with a discriminant not in the type definition we get UB, before even matching it.

Using non-exhaustive is meant to "future-proof" enums so that they can add variants without it being a breaking change (because this is basically enforced by the compiler). So still I think this usage fits flatbuffers.

@rw
Copy link
Collaborator

rw commented Jul 31, 2019

Thanks for thinking this through.

I'm interested in your TryFrom idea. Want to write up some code (in a gist, maybe) that shows what you're thinking?

@jean-airoldie
Copy link
Contributor Author

Sure I'll do that.

@jean-airoldie
Copy link
Contributor Author

jean-airoldie commented Jul 31, 2019

This is the simplest solution I can think of.

// schema.fbs

enum MsgType: uint8 {
    Notification,
    Data,
}
// This is the generated code
#[allow(non_camel_case_types)]
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
enum MsgType {
    Notification = 0,
    Data = 1,
}

const ENUM_MIN_MSG_TYPE: u8 = 0;
const ENUM_MAX_MSG_TYPE: u8 = 1;

impl flatbuffers::EndianScalar for MsgType {
  #[inline]
  fn to_little_endian(self) -> Self {
    let n = u8::to_le(self as u8);
    let p = &n as *const u8 as *const MsgType;
    unsafe { *p }
  }
  #[inline]
  fn try_from_little_endian(self) -> Result<Self, u8> {
    let n = u8::from_le(self as u8);
    if n > ENUM_MAX_MSG_TYPE || n < ENUM_MIN_MSG_TYPE {
        Err(n)
    } else {
        let p = &n as *const u8 as *const MsgType;
        Ok(unsafe { *p })
    }
  }
}

@jean-airoldie

This comment has been minimized.

@theduke
Copy link

theduke commented Aug 27, 2019

Is anyone working on this?

I already ran into this issue after adding a variant.

@jean-airoldie
Copy link
Contributor Author

jean-airoldie commented Aug 28, 2019

I'm not, so my guess is no one is. If you are interested a PR is welcomed, otherwise I'll implement it when I get the time (at last 1 month from now).

You can step around this by always adding a _ => () clause to your match statements, but its easy to forget, and not a good work around.

@jean-airoldie
Copy link
Contributor Author

I think the ideal solution would be to use the FromPrimitive and ToPrimitive traits from the num crate. https://docs.rs/num-derive/0.3.0/num_derive/derive.FromPrimitive.html

@vglavnyy
Copy link
Contributor

Is any progress with this issue?
I'm working with #5108 (connected issue #5194).
It is impossible to assign 0 in Rust, if zero isn't part of an enum.

enum Color:ubyte (bit_flags) { 
  Red = 0, // color Red = (1u << 0)
  Green, 
  Blue = 3, 
}

table Monster {
  color_mix: Color = "Red Blue"; // allowed by fbs grammar but will lead to an error in Rust
  color_none: Color; // error #5194 (0 isn't part of enum)
}

@jean-airoldie
I need advice on 0-default issues (#5194, #5108) .
One way is to inject an implicit 0-default field into the enum at code-generation time (like cpp_gen does).
Is there other solution?

@jean-airoldie
Copy link
Contributor Author

Is any progress with this issue?

I'm not working on it atm.

It is impossible to assign 0 in Rust, if zero isn't part of an enum.

I'm not sure I'm following, are you talking about the generated code or a rust c enum?

What kind of error is produced by the "Red Blue" bitfield in your example in rust?

@jean-airoldie
Copy link
Contributor Author

jean-airoldie commented Oct 17, 2019

For a bitflag in rust you would want to use an integer as opposed to an enum. But ideally you would want something more convenient like the bitflag crate.

@vglavnyy
Copy link
Contributor

I'm not sure I'm following, are you talking about the generated code or a rust c enum?

What kind of error is produced by the "Red Blue" bitfield in your example in rust?

I'm talking about the generated code.

Rust code generator fails with:

/flatbuffers/src/idl_gen_rust.cpp:738: std::__cxx11::string flatbuffers::rust::RustGenerator::GetDefaultScalarValue(const flatbuffers::FieldDef&): Assertion `ev' failed.

At generation time "Red Blue" is equal to ((1u << 0) | (1u << 3)) which isn't part of Color enum.

case ftEnumKey: {
auto ev = field.value.type.enum_def->FindByValue(field.value.constant);
assert(ev);
return WrapInNameSpace(field.value.type.enum_def->defined_namespace,
GetEnumValUse(*field.value.type.enum_def, *ev));
}

The bitflag crate is a good idea but the 0-default issue is a general problem:

enum WithoutZero : ubyte { 
  Two = 2
}
table Monster {
  test: WithoutZero; // error, 0  isn't part of enum
}

What value should the program pass to GetEnumValUse if field.value.constant == "0" and 0 is not part of user-defined fbs-enum?

@jean-airoldie
Copy link
Contributor Author

jean-airoldie commented Oct 18, 2019

At generation time "Red Blue" is equal to ((1u << 0) | (1u << 3)) which isn't part of Color enum.

The problem is that a bitflag really isn't an enum, but rather a combination of enums. So you would need a special case where you check that every individual flag is contained within the enum.
So with Red Blue you would check that both Red and Blue are separately contained in the enum.

What value should the program pass to GetEnumValUse if field.value.constant == "0" and 0 is not part of user-defined fbs-enum?

I think the enum default value should be 0 like you mentioned. However this will cause some issues with the current generated code, since it will guarantee UB for any enum that does not contain 0. This is because the current enum deserialization method is unsafe and flawed.

To solve this issue we need to change the enum generated code. I can help but I'm really busy atm so I can't commit too much time.

This is what the current generated code would look like:

#[allow(non_camel_case_types)]
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
pub enum WithoutZero {
  Two = 2,
}

const ENUM_MIN_WITHOUT_ZERO: u8 = 2;
const ENUM_MAX_WITHOUT_ZERO: u8 = 2;

impl<'a> flatbuffers::Follow<'a> for WithoutZero {
  type Inner = Self;
  #[inline]
  fn follow(buf: &'a [u8], loc: usize) -> Self::Inner {
    // this would cause UB if scalar == 0 && 0 is not a valid enum value
    flatbuffers::read_scalar_at::<Self>(buf, loc) 
  }
}

impl flatbuffers::EndianScalar for WithoutZero {
  #[inline]
  fn to_little_endian(self) -> Self {
    let n = u8::to_le(self as u8);
    let p = &n as *const u8 as *const WithoutZero;
    unsafe { *p } // this is also potentially UB
  }
  #[inline]
  fn from_little_endian(self) -> Self {
    let n = u8::from_le(self as u8);
    let p = &n as *const u8 as *const WithoutZero;
    unsafe { *p } // so is this
  }
}

impl flatbuffers::Push for WithoutZero {
    type Output = WithoutZero;
    #[inline]
    fn push(&self, dst: &mut [u8], _rest: &[u8]) {
        flatbuffers::emplace_scalar::<WithoutZero>(dst, *self);
    }
}

And this is what I think it should look like:

#[allow(non_camel_case_types)]
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Debug, FromPrimitive, ToPrimitive)]
pub enum WithoutZero {
  Two = 2,
}

const ENUM_MIN_WITHOUT_ZERO: u8 = 2;
const ENUM_MAX_WITHOUT_ZERO: u8 = 2;

impl<'a> flatbuffers::Follow<'a> for WithoutZero {
  // We make the de-serialization fallible
  type Inner = Option<Self>;
  #[inline]
  fn follow(buf: &'a [u8], loc: usize) -> Self::Inner {
    // We read as a scalar then do the faillible FromPrimitive conversion
    let scalar = flatbuffers::read_scalar_at::<u8>(buf, loc);
    Self::from_u8(scalar)
  }
}

impl flatbuffers::Push for WithoutZero {
    type Output = u8;
    #[inline]
    fn push(&self, dst: &mut [u8], _rest: &[u8]) {
        let scalar = self.to_u8().unwrap(); // We use the ToPrimitive conversion.
        flatbuffers::emplace_scalar::<u8>(dst, scalar);
    }
}

@jean-airoldie
Copy link
Contributor Author

jean-airoldie commented Oct 18, 2019

As for bitflags, here is what I think the generated code should look like:

enum Color:uint8 (bit_flags) { 
  Red = 0, // color Red = (1u << 0)
  Green, 
  Blue = 3, 
}
bitflags! {
    #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
    pub struct Color: u8 {
        pub const RED = 1 << 0;
        pub const GREEN = 1 << 1;
        pub const BLUE = 1 << 2;
    }
}

impl<'a> flatbuffers::Follow<'a> for Color {
  type Inner = Option<Self>;
  #[inline]
  fn follow(buf: &'a [u8], loc: usize) -> Self::Inner {
    let scalar = flatbuffers::read_scalar_at::<u8>(buf, loc);
    Self::from_bits(scalar)
  }
}

impl flatbuffers::Push for Color {
    type Output = u8;
    #[inline]
    fn push(&self, dst: &mut [u8], _rest: &[u8]) {
        let scalar = self.bits();
        flatbuffers::emplace_scalar::<u8>(dst, scalar);
    }
}

@vglavnyy
Copy link
Contributor

@jean-airoldie I get you.
Thank you for the detailed explanation. I will try this solution.

@jcrevier
Copy link

@jean-airoldie do you know if this also an issue for the enums generated to discriminate flatbuffer unions? A brief glance I took today suggested yes.

@jean-airoldie
Copy link
Contributor Author

Good question, I'm not sure unions are handled internally since I don't use them. I'll take a look.

@jean-airoldie
Copy link
Contributor Author

Yep the same problem occurs in unions unfortunately.

let n = u8::from_le(self as u8);
let p = &n as *const u8 as *const AnyAmbiguousAliases;
unsafe { *p }

@vglavnyy
Copy link
Contributor

@jean-airoldie
I need your help with re-export of FromPrimitive and ToPrimitive.

flatbuffers/src/lib.rs

pub extern crate num_derive;
pub extern crate num_traits;

flatbuffers/cargo.toml

[dependencies]
smallvec = "0.6"
num-traits = "0.2"
num-derive = "0.3"

monster_test_generated.rs

  use self::flatbuffers::num_derive::{FromPrimitive, ToPrimitive}; // resolve #[derive(FromPrimitive)]
  use self::flatbuffers::num_traits::*; // doesn't work

I get the error: "error[E0463]: can't find crate for _num_traits".
Could you advise an example with re-export of num_traits or serdes?

@jean-airoldie
Copy link
Contributor Author

// flatbuffers/src/lib.rs
pub extern crate num_derive;
pub extern crate num_traits;

Is the old style way (pre-edition 2018) of saying that you depend on this crate. This does not reexport the crate.

To reexport the crate:

// flatbuffers/src/lib.rs
pub use num_derive::{FromPrimitive, ToPrimitive};

I haven't use this old style syntax in a while so I'm not sure but its possible that you need to also add #[macro_use] to your num_derive import since you import a proc macro.

// flatbuffers/src/lib.rs
#[macro_use]
pub extern crate num_derive;
pub extern crate num_traits;

@rw
Copy link
Collaborator

rw commented Oct 31, 2019

I'm happy to review PRs to fix what you've been discussing in the thread, @jean-airoldie! If you think you have a solution.

@jean-airoldie
Copy link
Contributor Author

Yeah eventually, I'll find the time.

@vglavnyy
Copy link
Contributor

vglavnyy commented Nov 1, 2019

@jean-airoldie, @rw
I have prepared a solution for this and #5581 issues.
A draft of Optional enums.

This is my first experience with Rust.
If this is can be useful I will fix all notes and prepare PR.
This code doesn't cover bit_flags and has a problem with re-export of crates (I expect that dependencies resolution should be transitive but it doesn't).

@github-actions
Copy link
Contributor

This issue is stale because it has been open 6 months with no activity. Please comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Jun 12, 2020
@jcrevier
Copy link

Still would be nice to see this fixed. Any updates on the status of this / and or PR #5614 ?

@stale stale bot removed the stale label Jun 12, 2020
@CasperN CasperN added the rust label Oct 8, 2020
@CasperN
Copy link
Collaborator

CasperN commented Nov 4, 2020

I think #6098 fixed this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants