Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Option<Entity> to leverage niche optimization #3029

99 changes: 80 additions & 19 deletions crates/bevy_ecs/src/entity/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ use std::{
convert::TryFrom,
fmt, mem,
sync::atomic::{AtomicI64, Ordering},
num::NonZeroU32,
};

/// Lightweight unique ID of an entity.
Expand All @@ -46,7 +47,7 @@ use std::{
/// [`Query::get`](crate::system::Query::get) and related methods.
#[derive(Clone, Copy, Hash, Eq, Ord, PartialEq, PartialOrd)]
pub struct Entity {
pub(crate) generation: u32,
pub(crate) generation: NonZeroU32,
pub(crate) id: u32,
}

Expand All @@ -57,7 +58,7 @@ pub enum AllocAtWithoutReplacement {
}

impl Entity {
/// Creates a new entity reference with a generation of 0.
/// Creates a new entity reference with a generation of 1.
///
/// # Note
///
Expand All @@ -66,7 +67,10 @@ impl Entity {
/// only be used for sharing entities across apps, and only when they have a scheme
/// worked out to share an ID space (which doesn't happen by default).
pub fn new(id: u32) -> Entity {
Entity { id, generation: 0 }
Entity {
id,
generation: unsafe{ NonZeroU32::new_unchecked(1) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

Suggested change
generation: unsafe{ NonZeroU32::new_unchecked(1) }
generation: NonZeroU32::new(1).unwrap(),

as this is pretty much guaranteed to be optimized away anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely should be optimized away like that, but I'm not sure about with debug/non-optimized builds - though I don't imagine it being a bottleneck.

The other option is to introduce a constant in the file, ie. GENERATION_INITIAL, which we almost have anyway via the EntityMeta::EMPTY constant since unsafe is required in that context anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked, compiler seems happy to optimize that even at opt-level=1 but won't at 0:
https://godbolt.org/z/E8EY5Mj8s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option is to introduce a constant in the file, ie. GENERATION_INITIAL, which we almost have anyway via the EntityMeta::EMPTY constant since unsafe is required in that context anyway.

A constant should work. Something like const GENERATION_ONE: NonZeroU32 = if let Some(gen) = NonZeroU32::new(1) { gen } else { panic!() }; should work in safe code and optimize even with opt-level=0. It will require rustc 1.57 for the panic!() though, but replacing panic!() with [][1] works even with the current rustc 1.56.

}
}

/// Convert to a form convenient for passing outside of rust.
Expand All @@ -76,17 +80,17 @@ impl Entity {
///
/// No particular structure is guaranteed for the returned bits.
pub fn to_bits(self) -> u64 {
u64::from(self.generation) << 32 | u64::from(self.id)
u64::from(self.generation.get()) << 32 | u64::from(self.id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
u64::from(self.generation.get()) << 32 | u64::from(self.id)
u64::from(self.generation()) << 32 | u64::from(self.id)

}

/// Reconstruct an `Entity` previously destructured with [`Entity::to_bits`].
///
/// Only useful when applied to results from `to_bits` in the same instance of an application.
pub fn from_bits(bits: u64) -> Self {
Self {
generation: (bits >> 32) as u32,
pub fn from_bits(bits: u64) -> Option<Self> {
Some(Self{
generation: NonZeroU32::new((bits >> 32) as u32)?,
id: bits as u32,
}
})
}

/// Return a transiently unique identifier.
Expand All @@ -104,7 +108,7 @@ impl Entity {
/// given id has been reused (id, generation) pairs uniquely identify a given Entity.
#[inline]
pub fn generation(self) -> u32 {
self.generation
self.generation.get()
}
}

Expand Down Expand Up @@ -147,7 +151,10 @@ impl<'a> Iterator for ReserveEntitiesIterator<'a> {
generation: self.meta[id as usize].generation,
id,
})
.or_else(|| self.id_range.next().map(|id| Entity { generation: 0, id }))
.or_else(|| self.id_range.next().map(|id| Entity {
generation: unsafe{ NonZeroU32::new_unchecked(1) },
id,
}))
}

fn size_hint(&self) -> (usize, Option<usize>) {
Expand Down Expand Up @@ -265,7 +272,7 @@ impl Entities {
// As `self.free_cursor` goes more and more negative, we return IDs farther
// and farther beyond `meta.len()`.
Entity {
generation: 0,
generation: unsafe{ NonZeroU32::new_unchecked(1) },
id: u32::try_from(self.meta.len() as i64 - n).expect("too many entities"),
}
}
Expand Down Expand Up @@ -293,7 +300,10 @@ impl Entities {
} else {
let id = u32::try_from(self.meta.len()).expect("too many entities");
self.meta.push(EntityMeta::EMPTY);
Entity { generation: 0, id }
Entity {
generation: unsafe{ NonZeroU32::new_unchecked(1) },
id,
}
}
}

Expand Down Expand Up @@ -373,7 +383,12 @@ impl Entities {
if meta.generation != entity.generation {
return None;
}
meta.generation += 1;

meta.generation = unsafe{ NonZeroU32::new_unchecked(
meta.generation.get()
.checked_add(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this addition fails, we should instead choose not to add the entity to the pending Vec.
This effectively marks the slot as dead, but avoids re-using generations.
This should probably give a warning though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only issue I can see is that contains, and any similar checks, will see the u32::MAX generation and match it with the last EntityID - which is still a problem with wrapping anyway, just a more likely problem. We can avoid this by losing 1 generation at the top too though, so that's fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah, well spotted. I guess EntityMeta could contain an Option<NonZeroU32> with None for this case?

Copy link
Contributor Author

@notverymoe notverymoe Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be the dead generation way, avoids unwraps on reallocating free'd but not dead entities:

        meta.generation = NonZeroU32::new(meta.generation.get().saturating_add(1)).unwrap();

        if meta.generation != GENERATION_DEAD {
            self.pending.push(entity.id);

            let new_free_cursor = self.pending.len() as i64;
            *self.free_cursor.get_mut() = new_free_cursor;
            self.len -= 1;
        }

        Some(mem::replace(&mut meta.location, EntityMeta::EMPTY.location))

Optimized, this is the same cost as just a saturating_add, so it correctly removes the NonZeroU32::new/unwrap.

I'll give the Option option a go now

Copy link
Contributor Author

@notverymoe notverymoe Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, dinner! Option method was a little more complex, but it's the more correct solution for sure. Thought there were a few issues revealed by using it, but, it turns out that in all those cases we were accessing meta via indices from pending - so they're all safe to unwrap in this case. But, in the previous case most of them had slipped past my scan through.

Only issue I can foresee is that I don't think the optimizer can handle this case, since it won't know that pending can only contain indexes to options with a value, so all those additional unwraps probably won't be optimized out. I don't think that switching to a raw u32 would "fix" that here either, and I'm not even really sure this would be a bottleneck anyway. Beats a panic or bug though, that's for sure.

.unwrap_or(1))
};
Copy link
Contributor

@bjorn3 bjorn3 Oct 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the safe version optimize to the same asm as this code here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I'll confirm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't appear to optimize to the same code unfortunately, though I might've done it wrong:
https://godbolt.org/z/4r6soaTbj

Copy link
Contributor Author

@notverymoe notverymoe Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible solution:

std::num::NonZeroU32::new(match num.get() {
        u32::MAX => 1,
        v        => v+1
    }).unwrap();
  • [Opt-0] Removes overhead of calls to checked_add and unwrap_or, still has the overhead of NonZeroU32::new.
  • [Opt-1] Shorter asm than unsafe option
  • [Opt-2][Opt-3] Identical asm between it and the unsafe version
  • Can mark the slot dead without effecting the asm much, as per: Allow Option<Entity> to leverage niche optimization #3029 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the dead slot, we don't actually need to unwrap where. But we do need to unwrap when we allocate from the freelist. I'm not sure if that's particularly performant - but it's safe!


let loc = mem::replace(&mut meta.location, EntityMeta::EMPTY.location);

Expand Down Expand Up @@ -401,7 +416,7 @@ impl Entities {
// not reallocated since the generation is incremented in `free`
pub fn contains(&self, entity: Entity) -> bool {
self.resolve_from_id(entity.id())
.map_or(false, |e| e.generation() == entity.generation)
.map_or(false, |e| e.generation == entity.generation)
}

pub fn clear(&mut self) {
Expand Down Expand Up @@ -442,7 +457,10 @@ impl Entities {
// If this entity was manually created, then free_cursor might be positive
// Returning None handles that case correctly
let num_pending = usize::try_from(-free_cursor).ok()?;
(idu < self.meta.len() + num_pending).then(|| Entity { generation: 0, id })
(idu < self.meta.len() + num_pending).then(|| Entity {
generation: unsafe{ NonZeroU32::new_unchecked(1) },
id
})
}
}

Expand Down Expand Up @@ -518,13 +536,13 @@ impl Entities {

#[derive(Copy, Clone, Debug)]
pub struct EntityMeta {
pub generation: u32,
pub generation: NonZeroU32,
pub location: EntityLocation,
}

impl EntityMeta {
const EMPTY: EntityMeta = EntityMeta {
generation: 0,
generation: unsafe{ NonZeroU32::new_unchecked(1) },
location: EntityLocation {
archetype_id: ArchetypeId::INVALID,
index: usize::MAX, // dummy value, to be filled in
Expand All @@ -549,10 +567,53 @@ mod tests {
#[test]
fn entity_bits_roundtrip() {
let e = Entity {
generation: 0xDEADBEEF,
generation: NonZeroU32::new(0xDEADBEEF).unwrap(),
id: 0xBAADF00D,
};
assert_eq!(Entity::from_bits(e.to_bits()), e);
assert_eq!(Entity::from_bits(e.to_bits()).unwrap(), e);
}

#[test]
fn entity_bad_bits() {
let bits: u64 = 0xBAADF00D;
assert_eq!(Entity::from_bits(bits), None);
}

#[test]
fn entity_option_size_optimized() {
assert_eq!(
core::mem::size_of::<Option<Entity>>(),
core::mem::size_of::<Entity>()
);
}

#[test]
fn entities_generation_increment() {
let mut entities = Entities::default();

let entity_old = entities.alloc();
entities.free(entity_old);
let entity_new = entities.alloc();

assert_eq!(entity_old.id, entity_new.id );
assert_eq!(entity_old.generation() + 1, entity_new.generation());
}

#[test]
fn entities_generation_wrap() {
let mut entities = Entities::default();
let mut entity_old = entities.alloc();

// Modify generation on entity and entities to cause overflow on free
entity_old.generation = NonZeroU32::new(u32::MAX).unwrap();
entities.meta[entity_old.id as usize].generation = entity_old.generation;
entities.free(entity_old);

// Get just free-d entity back
let entity_new = entities.alloc();

assert_eq!(entity_old.id, entity_new.id);
assert_eq!(entity_new.generation(), 1);
}

#[test]
Expand Down
19 changes: 10 additions & 9 deletions crates/bevy_ecs/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ mod tests {
};
use bevy_tasks::TaskPool;
use parking_lot::Mutex;
use std::num::NonZeroU32;
use std::{
any::TypeId,
sync::{
Expand Down Expand Up @@ -1417,7 +1418,7 @@ mod tests {
assert_eq!(
e4,
Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 3,
},
"new entity is created immediately after world_a's max entity"
Expand Down Expand Up @@ -1451,7 +1452,7 @@ mod tests {
);

let e4_mismatched_generation = Entity {
generation: 1,
generation: NonZeroU32::new(2).unwrap(),
id: 3,
};
assert!(
Expand All @@ -1470,7 +1471,7 @@ mod tests {
);

let high_non_existent_entity = Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 6,
};
world_b
Expand All @@ -1484,7 +1485,7 @@ mod tests {
);

let high_non_existent_but_reserved_entity = Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 5,
};
assert!(
Expand All @@ -1503,19 +1504,19 @@ mod tests {
reserved_entities,
vec![
Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 5
},
Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 4
},
Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 7,
},
Entity {
generation: 0,
generation: NonZeroU32::new(1).unwrap(),
id: 8,
},
],
Expand Down Expand Up @@ -1567,7 +1568,7 @@ mod tests {
let e1 = Entity::new(1);
let e2 = world.spawn().id();
let invalid_e2 = Entity {
generation: 1,
generation: NonZeroU32::new(2).unwrap(),
id: e2.id,
};

Expand Down