Use statically-typed views for better performance #856

adlarkin · 2021-06-10T02:40:08Z

Signed-off-by: Ashton Larkin ashton@openrobotics.org

🎉 New feature

Closes #711

Summary

As part of an ongoing effort to improve runtime performance, #711 points out that EntityComponentManager::Each gets slower as more components are used. I've changed how views are implemented to improve the runtime of EntityComponentManager::Each. The main change is that we now have a statically-typed view, which is defined by the list of component types that the view requires. This lets a view store an entity and pointers to its required components in a std::tuple, and then use std::apply to pass the entity and all of its relevant components to a callback in EntityComponentManager::Each.

Using std::apply allows for better runtime performance because we can apply the entity and associated components to a callback "all at once". This is quicker than the previous view approach, which involved individually looking up each component for an entity (see #711 (comment)).
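To make the idea concrete, here is a minimal sketch of a statically-typed view. It is not the code in this PR: `ToyView`, `Pose`, `Name`, `DemoSumOfX`, and `Entity` as a plain integer are stand-ins invented for illustration, and only the use of `std::tuple` and `std::apply` comes from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for the real entity and component types.
using Entity = std::uint64_t;
struct Pose { double x; };
struct Name { std::string value; };

// A view whose required component types are fixed at compile time.
template <typename... ComponentTs>
class ToyView
{
  public:
    // Store the entity together with pointers to its components.
    void Add(Entity _entity, ComponentTs *... _comps)
    {
      assert(((_comps != nullptr) && ...));
      this->entries.emplace_back(_entity, _comps...);
    }

    // Pass each stored tuple to the callback with one std::apply call.
    // The callback returns false to stop iterating early.
    void Each(
        const std::function<bool(const Entity &, ComponentTs *...)> &_f) const
    {
      for (const auto &entry : this->entries)
      {
        if (!std::apply(_f, entry))
          break;
      }
    }

  private:
    std::vector<std::tuple<Entity, ComponentTs *...>> entries;
};

// Usage: sum the x value of every Pose in a two-entity view.
inline double DemoSumOfX()
{
  Pose p1{1.5}, p2{2.5};
  Name n1{"box"}, n2{"sphere"};

  ToyView<Pose, Name> view;
  view.Add(1, &p1, &n1);
  view.Add(2, &p2, &n2);

  double sum = 0.0;
  view.Each([&](const Entity &, Pose *_pose, Name *) -> bool
  {
    sum += _pose->x;
    return true;
  });
  return sum;
}
```

The point of the sketch is that the callback receives the entity and every component from one stored tuple, so no per-component lookup happens inside the loop.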

The main drawback of a statically-typed view is the added software complexity and maintenance burden. For example, during testing, I found that tuple creation can be costly. To avoid re-creating tuples whenever a component is added or removed, I added an internal ignore flag to components. This flag is set to true whenever a user requests to remove a component that was previously created, and is set to false whenever a user requests to add a component that was previously removed. The ignore flag is hidden from the user and is used by the ECM and its views to determine which entities and components should be used in a particular Each call.
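A rough sketch of the ignore-flag idea, under the assumption that a view keeps one tuple per entity and only consults a flag on the component when iterating. `ToyComponent`, `ToyFlagView`, `Remove`, and `ReAdd` are hypothetical names for illustration, not the classes in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <unordered_map>

using Entity = std::uint64_t;

// Hypothetical component carrying the internal flag described above.
struct ToyComponent
{
  double data{0.0};
  bool ignore{false};
};

class ToyFlagView
{
  public:
    // The tuple for an entity is created once, when it enters the view.
    void Add(Entity _entity, ToyComponent *_comp)
    {
      assert(_comp != nullptr);
      this->entries.emplace(_entity, std::make_tuple(_entity, _comp));
    }

    // Number of entries an Each call would visit (flag not set).
    std::size_t VisibleCount() const
    {
      std::size_t count = 0;
      for (const auto &kv : this->entries)
      {
        if (!std::get<1>(kv.second)->ignore)
          ++count;
      }
      return count;
    }

    // Number of tuples held, whether ignored or not.
    std::size_t StoredCount() const
    {
      return this->entries.size();
    }

  private:
    std::unordered_map<Entity, std::tuple<Entity, ToyComponent *>> entries;
};

// "Removing" only flips the flag; the stored tuple is left alone.
inline void Remove(ToyComponent &_comp) { _comp.ignore = true; }

// "Re-adding" flips it back, again without rebuilding any tuple.
inline void ReAdd(ToyComponent &_comp) { _comp.ignore = false; }
```

In this sketch a remove followed by an add costs two boolean writes, while the number of stored tuples never changes.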

Another thing to note is that the old ComponentStorage approach is no longer used. The new view approach does not require components of the same type to be packed sequentially in memory. In the tests I ran, statically-typed views without sequentially-stored components still performed noticeably better at runtime than the old views with sequentially-stored components. So, hopefully this makes at least the ComponentStorage part of our system simpler to maintain 🙂

A side effect of removing the old ComponentStorage approach is that we also no longer need ComponentId, which means that we also no longer need ComponentKey. I tried to deprecate methods using ComponentKey wherever possible in order to follow the tick-tock model, but there were several methods that had to be deleted completely and replaced with something else (i.e., methods that returned ComponentKey or had ComponentKey as the only method parameter, which meant that there was no longer enough information provided to achieve the desired functionality of the method).

Test it

If you'd like to compare the performance of running Each with the approach proposed in this PR to the existing view approach, you can run the each benchmark. I have already done this on my system, and have included the results below (I ran a comparison against 8f5103b):

Each benchmark test

As seen from the results above, all of the EachCache tests are significantly faster with the statically-typed view approach - in fact, if we look at the very last test in that file (Each10ComponentCache/1000 - so, 1000 entities with 10 components each), we go from 150ms to 2ms 🤯 It's also worth noting that we don't see a decrease in EachNoCache performance either, which makes sense since views are not used for EachNoCache.

Another way to test the changes here is to run a world with 3k shapes. Here are the RTF numbers I'm seeing from testing 3k simple shapes, using both GUI and headless simulation. I used TPE for the physics engine. I also echoed the /world/shapes/dynamic_pose/info topic to see how RTF was impacted using these changes (see #743).

original ECM (main at ca10c77):

static, no gui: 42%
- echoing the /world/shapes/dynamic_pose/info topic: 8.3%
static, gui: bounces around between 11-15%
- echoing the /world/shapes/dynamic_pose/info topic: 4%
non-static, no gui: 8.5%
- echoing the /world/shapes/dynamic_pose/info topic: 3.5%
non-static, gui: 4%
- echoing the /world/shapes/dynamic_pose/info topic: 2.3%

new ECM (this branch):

static, no gui: 50%
- echoing the /world/shapes/dynamic_pose/info topic: 35% (this is a huge increase compared to 8.3% 🤯)
static, gui: bounces around 30-35%
- echoing the /world/shapes/dynamic_pose/info topic: 23% (again, a huge increase compared to 4% 🤯)
non-static, no gui: 8.6%
- echoing the /world/shapes/dynamic_pose/info topic: 5%
non-static, gui: 6%
- echoing the /world/shapes/dynamic_pose/info topic: 3.75%

As the numbers show, there's a performance improvement for almost every single test case (there's not much of an improvement for non-static, no GUI). For use cases where plugins are used that make calls to Each with a large number of components, I suspect that there will be even greater performance improvements.
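For a quick sense of scale, the ratios between the two sets of numbers can be worked out as below. This is only arithmetic on three of the RTF values listed above; `Speedup` and `PrintSpeedups` are helper names made up for this sketch.

```cpp
#include <cassert>
#include <cstdio>

// Ratio of new RTF to old RTF; a value above 1 means the new ECM is faster.
constexpr double Speedup(double _oldRtf, double _newRtf)
{
  assert(_oldRtf > 0.0);
  return _newRtf / _oldRtf;
}

// Print the ratios for three of the cases reported above (values in percent).
inline void PrintSpeedups()
{
  std::printf("static, no gui, echoing topic: %.2fx\n", Speedup(8.3, 35.0));
  std::printf("static, gui, echoing topic:    %.2fx\n", Speedup(4.0, 23.0));
  std::printf("non-static, no gui:            %.2fx\n", Speedup(8.5, 8.6));
}
```

These come out to roughly 4.2x, 5.75x, and 1.01x, which matches the observation that the non-static headless case barely moved.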

It was also pointed out in #856 (comment) that GUI startup time for 3k shapes with the changes in this PR is only a few seconds (it used to take up to a few minutes). Perhaps this is because of some each calls that are used when initializing the GUI?

Another thing to note is that when using the changes in this PR to run 3k shape simulations with a GUI, loading time is a lot faster. In the tests I ran, I found that these new changes allowed the GUI to load and run in a few seconds, while the old ECM/View implementation took a few minutes to load (thanks @iche033 for making this discovery!).

One other scenario that could be worth testing is a benchmark of the time it takes to add/remove components frequently. I did some informal testing locally with the new approach and found that removing components with the statically-typed views and an internal component ignore flag is faster than the current ECM/view approach, but perhaps it would be worth adding a benchmark test for this. This PR is already very large, so I decided to hold off on adding a benchmark test for now, but if anyone would be interested in me adding this in, let me know - I can also make another PR that adds this benchmark test if that is preferable 🙂

Takeaways/Next steps

While it's clear that this PR has improved the runtime performance of EntityComponentManager::Each, there's not much of an improvement in RTF for the headless tests. While #793 should help improve the RTF for running the GUI, we will need to figure out why we still aren't seeing higher RTF for headless simulation (theoretically, headless RTF for 3k static shapes should be 100%).

I would also be interested in figuring out how to generate better testing scenarios/evaluation metrics. Perhaps 3k simple shapes isn't a great way to benchmark performance - we aren't running any plugins other than the physics system, and aren't doing things like adding/removing components. Maybe an external user will try these changes in a context that differs from the 3k simple shapes example and find a nice performance improvement! Considering that most users use various plugins and may or may not add/remove components frequently, I think it'd be worth taking some time to figure out how to benchmark use cases that are frequent among Ignition users. That way, we can get a better feel for whether our changes are being noticed by the community or not.

Also, it would be nice to add some more features/functionality to the EntityComponentManager that make use of the new statically-typed views. For example, perhaps we can use iterators (as done in EnTT) for use cases where a lot of components are used as "filters", and we only need specific components of entities that pass all component filter checks. So, if a user wanted all entities that are static models, but only need the pose of these models, then maybe they could do something like this:

auto entityGroup = ecm.FilterEntities<components::Static, components::Model, components::Pose>();

for (auto entity : entityGroup)
{
  auto pose = entity.Component<components::Pose>();

  // do something with pose here
}
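One way the filtering part of that idea could behave is sketched below. This is a toy, not an existing EntityComponentManager API: `ToyEcm`, `Tag`, and the empty `Static`/`Model`/`Pose` structs are assumptions for illustration, and it returns a plain vector of entities instead of the lazy iterators that EnTT provides.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <typeindex>
#include <typeinfo>
#include <vector>

using Entity = std::uint64_t;

// Hypothetical component types, named after the ones in the proposal.
struct Static {};
struct Model {};
struct Pose {};

class ToyEcm
{
  public:
    // Record that an entity has a component of type ComponentT.
    template <typename ComponentT>
    void Tag(Entity _entity)
    {
      this->types[_entity].insert(std::type_index(typeid(ComponentT)));
    }

    // Return every entity that has all of the listed component types.
    template <typename... ComponentTs>
    std::vector<Entity> FilterEntities() const
    {
      std::vector<Entity> result;
      for (const auto &kv : this->types)
      {
        const bool hasAll =
            ((kv.second.count(std::type_index(typeid(ComponentTs))) > 0)
             && ...);
        if (hasAll)
          result.push_back(kv.first);
      }
      return result;
    }

  private:
    std::map<Entity, std::set<std::type_index>> types;
};

// Usage, mirroring the loop proposed above.
inline void FilterDemo()
{
  ToyEcm ecm;
  ecm.Tag<Static>(1);
  ecm.Tag<Model>(1);
  ecm.Tag<Pose>(1);
  ecm.Tag<Model>(2);  // entity 2 is a model with a pose, but not static
  ecm.Tag<Pose>(2);

  const auto group = ecm.FilterEntities<Static, Model, Pose>();
  assert(group.size() == 1 && group[0] == 1);
}
```

A real version would presumably reuse the statically-typed views rather than scanning every entity, so treat this only as a description of the expected result.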

Related to the idea of providing functionality like iterators as described above, we will eventually need to address #628 and #805.

Checklist

Signed all commits for DCO
Added tests
Added example and/or tutorial
Updated documentation (as needed)
Updated migration guide (as needed)
codecheck passed (See contributing)
All tests passed (See test coverage)
While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge


adlarkin

I've got a few questions/notes for my reviewers.

test/integration/components.cc

include/ignition/gazebo/Types.hh

src/EntityComponentManager_TEST.cc

iche033 · 2021-06-10T20:35:42Z

Do you notice that when launching shapes_population.sdf with GUI, the scene is now loaded faster? It used to take 2 mins on my machine and now it's only just a few seconds.

I am testing with 2 workspaces, one on this branch and the other on main, and both built from source. My RTFs from running the shapes population world with the tpe plugin:

new ECM (this branch):

static, no gui: 35%
static, gui: 16%
non-static, no gui: 10%
non-static, gui: 5%

original ECM (main)

static, no gui: 26%
static, gui: 6%
non-static, no gui: 9%
non-static, gui: 3.5%

So looks like it made a difference on my machine. Could you try comparing against fortress built from source and see if you also see these improvements?

adlarkin · 2021-06-11T15:38:14Z

Do you notice that when launching shapes_population.sdf with GUI, the scene is now loaded faster? It used to take 2 mins on my machine and now it's only just a few seconds.

I never ran this with the GUI the first time around, but now that you have made the point, I went ahead and tested it myself. Yes, the scene loads a lot faster for me now with these changes. I'm also seeing loading times decrease from a few minutes to a few seconds. Thanks for pointing this out!

Could you try comparing against fortress built from source and see if you also see these improvements?

I went ahead and did a comparison with source builds and have updated the PR description with my results. I do notice some improvements now, especially when echoing the /world/shapes/dynamic_pose/info topic 😎 (see #743)

iche033

Left a few minor optimization comments in code.

I'm also randomly testing example worlds. So far I ran into one crash which I'm not able to reproduce on the main branch. To reproduce:

ign gazebo -v 4 -r --levels levels.sdf

Right click and remove the red vehicle -> segfault

include/ignition/gazebo/detail/BaseView.hh

include/ignition/gazebo/components/Factory.hh

include/ignition/gazebo/detail/EntityComponentManager.hh

src/EntityComponentManager.cc


adlarkin · 2021-06-18T21:08:01Z

I've addressed all review feedback, except for updating the migration guide with things that had to be deleted or deprecated (I'll do this once we know no other changes have to be made).

So far I ran into one crash which I'm not able to reproduce on the main branch. To reproduce:
ign gazebo -v 4 -r --levels levels.sdf
Right click and remove the red vehicle -> segfault

I tested this locally as well, and also noticed a segfault. I will debug this and update the PR once I've found a fix.

chapulina

Just did a first pass. There's a lot of changes, so please allow me a few passes until I get the hang of it.

I have some suggestions to allow us to tick-tock most of the API and ease migration in #873.

Meanwhile, can you add some unit tests for the new classes / files? i.e. BaseView, ComponentStorage

include/ignition/gazebo/components/Component.hh

include/ignition/gazebo/Types.hh

src/EntityComponentManager_TEST.cc

include/ignition/gazebo/detail/BaseView.hh

include/ignition/gazebo/detail/ComponentStorage.hh

src/EntityComponentManager.cc


include/ignition/gazebo/detail/BaseView.hh

include/ignition/gazebo/detail/ComponentStorage.hh

src/EntityComponentManager.cc

include/ignition/gazebo/detail/View.hh


adlarkin · 2021-08-11T23:28:53Z

Let's run another round of performance tests to sanity check that we didn't introduce any regressions

I ran performance tests again and did not see any noticeable changes in the numbers. You can see the results in the PR description.

figure out what's happening with CI.

I think at this point, there are 2 lingering CI issues:

Test failures on Ubuntu CI that cannot be reproduced locally - see #856 (comment)
Compilation failure on Windows related to the new templated view class - see #856 (comment)


chapulina · 2021-08-12T02:57:08Z

Compilation failure on windows

I'm blindly trying @j-rivero's approach from gazebosim/gz-gui@5f7f05c in 66bb2a2

It's worth noting that it fails when compiling the ignition-gazebo6-gui component, not the core library.

Test failures on ubuntu CI

I think it may be due to root permissions on the Jenkins node?

Nope, the same happens on GitHub actions, so it's definitely an issue with the PR. Very strange that it fails to create the server.config file exactly for those tests, but succeeds on other tests. I thought it may be some interference between the tests, but I can't reproduce the failure even running all colcon tests locally.

adlarkin · 2021-08-12T03:18:43Z

Nope, the same happens on GitHub actions, so it's definitely an issue with the PR

That's strange - I mentioned in a previous comment that tests started failing in https://build.osrfoundation.org/job/ignition_gazebo-ci-pr_any-ubuntu_auto-amd64/6727/, which was a build for #943. But, I don't see how changes in #943 would cause a break.

Another thing to note is that these tests are not failing on MacOS CI 🤔


chapulina · 2021-08-12T04:49:51Z

I have a promising fix for Server_TEST in d9a49e6, I was able to reproduce the failure locally when the ~/.ignition/gazebo directory didn't exist. It still doesn't make sense to me why this manifested in this PR though 🤷‍♀️

adlarkin · 2021-08-12T04:55:19Z

I was able to reproduce the failure locally when the ~/.ignition/gazebo directory didn't exist.

Ohhh you know what, this has actually happened to me before. I just forgot about it and always ran ign gazebo once (to populate ~/.ignition/gazebo) before running tests to avoid this error locally. So, if anything, it sounds like this PR exposed a bug instead of creating a new one 😉 Hopefully your fix works! 🤞
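The failure mode described here (writing a config file into a directory that does not exist yet) can be illustrated with the sketch below. It is not the change made in d9a49e6; `WriteConfig` is a hypothetical helper that simply creates missing parent directories before opening the file, using `std::filesystem` from C++17.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Write a config file, creating missing parent directories first.
// Returns true if the file was opened and written successfully.
inline bool WriteConfig(const std::filesystem::path &_file,
                        const std::string &_contents)
{
  assert(!_file.empty());

  const auto dir = _file.parent_path();
  if (!dir.empty())
  {
    std::error_code ec;
    // No error is reported if the directory already exists.
    std::filesystem::create_directories(dir, ec);
    if (ec)
      return false;
  }

  // Without the step above, opening the stream fails when the
  // parent directory is missing.
  std::ofstream out(_file);
  if (!out.is_open())
    return false;

  out << _contents;
  return out.good();
}
```

Pointing a test at a scratch directory (for example one under `std::filesystem::temp_directory_path()`) also avoids depending on whatever happens to be in the user's home directory.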

codecov · 2021-08-12T05:48:03Z

Codecov Report

Merging #856 (2dadec6) into main (e3ee0af) will increase coverage by 0.38%.
The diff coverage is 76.83%.

❗ Current head 2dadec6 differs from pull request most recent head abf9d23. Consider uploading reports for the commit abf9d23 to get more accurate results

@@            Coverage Diff             @@
##             main     #856      +/-   ##
==========================================
+ Coverage   64.66%   65.04%   +0.38%     
==========================================
  Files         242      245       +3     
  Lines       19014    19413     +399     
==========================================
+ Hits        12296    12628     +332     
- Misses       6718     6785      +67

Impacted Files	Coverage Δ
include/ignition/gazebo/EntityComponentManager.hh	`100.00% <ø> (ø)`
include/ignition/gazebo/Link.hh	`100.00% <ø> (ø)`
include/ignition/gazebo/Server.hh	`100.00% <ø> (ø)`
include/ignition/gazebo/rendering/RenderUtil.hh	`100.00% <ø> (ø)`
src/gui/Gui.cc	`65.35% <ø> (+0.23%)`	⬆️
src/gui/GuiRunner.cc	`85.91% <0.00%> (-0.34%)`	⬇️
src/gui/plugins/scene3d/Scene3D.hh	`50.00% <ø> (ø)`
src/systems/lift_drag/LiftDrag.hh	`100.00% <ø> (ø)`
src/systems/log/LogRecord.cc	`80.56% <ø> (-0.31%)`	⬇️
src/systems/velocity_control/VelocityControl.hh	`100.00% <ø> (ø)`
... and 54 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data


iche033

I tested this branch with various worlds, spawning and deleting objects, distributed sim, custom plugins and it seems to be working quite well.

The latest Ubuntu build failed due to a network issue. I just queued another one.

chapulina · 2021-08-13T17:12:16Z

That's a lot of failing tests on Ubuntu. The sensors ones should be fixed by #617, but that's only 9 tests and there are 23 failing here.


chapulina · 2021-08-13T18:43:29Z

there are 23 failing here

I'm seeing 23 failing tests on Ubuntu Jenkins for multiple PRs. I believe this is the fallout from gazebo-tooling/release-tools#494.

Let's rely on the other platforms to merge this PR. GitHub actions was happy, as well as homebrew and Windows. I've fixed conflicts and will merge if CI comes back happy for those platforms again.

Use statically-typed views for better performance

046210a

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

adlarkin requested review from iche033 and chapulina June 10, 2021 02:40

github-actions bot added the 🏯 fortress Ignition Fortress label Jun 10, 2021


This was referenced Jun 14, 2021

Performance enhancement to component addition/removal adlarkin/simple_ECM#3

Closed

Use unordered maps for component look-ups in views #752

Closed

Merge branch 'main' into adlarkin/view_restructure

e4a6568

iche033 reviewed Jun 18, 2021


adlarkin added 2 commits June 18, 2021 15:23

clearer deprecation messages

ee9b6c5

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

review feedback

39b499e

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

adlarkin self-assigned this Jun 21, 2021

chapulina reviewed Jun 22, 2021


adlarkin mentioned this pull request Jun 22, 2021

Proposal for #856: Deprecations instead of removals #873

Merged

adlarkin and others added 6 commits June 28, 2021 15:49

use a removed flag on components instead of a ignore flag

cd635dd

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

add nullptr checks for CreateComponent and RemoveComponent in ECM unit test

f8204de

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

updated documentation and other review nits

52f8c7e

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

Merge branch 'main' into adlarkin/view_restructure

023bc7f

added ComponentStorage tests

40101a8

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

added tests for View and BaseView classes

ee3d3d0

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

adlarkin force-pushed the adlarkin/view_restructure branch from 74c5d80 to ee3d3d0 Compare July 9, 2021 22:39

chapulina reviewed Jul 9, 2021


chapulina reviewed Jul 9, 2021


adlarkin mentioned this pull request Jul 12, 2021

Fix crash on GUI entity removal with levels #913

Merged


remove ComponentStorage from detail namespace

4a90e86

Signed-off-by: Ashton Larkin <ashton@openrobotics.org>

set visibility for template specializations

9baa1a8

Signed-off-by: Louise Poubel <louise@openrobotics.org>

chapulina added 2 commits August 11, 2021 19:08

Try the ign-gui plugins approach

66bb2a2

Signed-off-by: Louise Poubel <louise@openrobotics.org>

merged from main

640df12

Signed-off-by: Louise Poubel <louise@openrobotics.org>

Make Server test not write to /home/chapulina, attempt more Windows stuff

d9a49e6

Signed-off-by: Louise Poubel <louise@openrobotics.org>

Remove visibility from templated class and methods

772a5f8

Signed-off-by: Jose Luis Rivero <jrivero@osrfoundation.org>

iche033 reviewed Aug 13, 2021


merged from main

2dadec6

Signed-off-by: Louise Poubel <louise@openrobotics.org>

Merge branch 'main' into adlarkin/view_restructure

abf9d23

chapulina merged commit eaf7aab into main Aug 13, 2021

chapulina deleted the adlarkin/view_restructure branch August 13, 2021 21:48

chapulina mentioned this pull request Aug 26, 2021

Improve the performance of EntityComponentManager::Each #711

Closed

This was referenced Aug 27, 2021

Add support for cloning entities #959

Merged

Lock views during system PostUpdates #1001

Merged

Performance drops significantly when echoing the /world/shapes/dynamic_pose/info topic #743

Closed

adlarkin mentioned this pull request Oct 27, 2021

Use deserialize to remove CreateComponentImplementation's template dependency #1148

Closed


chapulina mentioned this pull request Dec 15, 2021

Data storage classes #494

Open

adlarkin mentioned this pull request Feb 11, 2022

Change various CMDs to use std::optional #1334

Open

azeey mentioned this pull request Mar 13, 2023

Clearing pending commands - unexpected behavior #1926

Closed
