Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance slower than it used to be in c++ version #139

Open
Ughuuu opened this issue Jul 3, 2024 · 6 comments
Open

Performance slower than it used to be in c++ version #139

Ughuuu opened this issue Jul 3, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@Ughuuu
Copy link
Contributor

Ughuuu commented Jul 3, 2024

I started to investigate out why the performance is so slow. On Rust version I am getting 9k circles, on new one I am getting 6k circles. So about 33% slower.

I am using Instruments app on mac which gives very good insight into what takes a lot of time from app and what functions take a lot of time. However as the result of this app is not exportable, I will either write by hand here by copy pasting or make screenshots. The following is a 5s overview of what calls take the most time from a scene where a lot of circles are created and hit the ground:

Initial Overview

image

We have here the main thread with 84% usage, 4.69s out of 5.55s.
The other threads:

  • 3.7%
  • 1.7%
  • ...

seem to be Rayon calling into Rapier, and there isn't much that can be done there. In these threads there is also a callback from our lib where we filter objects, however as the percentage isn't that big, our investigation won't focus on this.

image

That being said, we are left with the Main Thread, so lets go into that.

Note that going forward, percentages are based on the thing we are focused on. Here it's going to be based on Main Thread duration, 4.66s.

Main Thread (4.66s)

The main thread is then split into 4 parts:

image
  1. 43.8% (2.06s) call to RapierPhysicsServer2D. (flush queries)
  2. 37.5% (1.76s) call to RenderingServerDefault::_draw. This is godot code. In here we have most notably vulkan functions, and nothing much to do with us.
  3. 13.6% (640ms) call to RapierPhysicsServer2D. (step)
  4. 3.6% (173ms) call to SceneTree::physics_process. This is godot code. It seems to handle updating of objects to be drawn.

Flow 2 and 4 don't seem to call back into our lib. So we focus on 1 and 4.

Main Thread -> Flow 3(640ms, 13.6%) - Step

The purpose of this function is to simulate the whole physics world with a step interval. Handles collisions and everything.

image

Same as before, gdext hides away the function, but our specific class calls into step function, so it's most likely that.

Here going down into the step function, it splits into:

  1. 73.1% call into rapier lib. Here there isn't much we can do about it, aside for a callback down the line
image

with a small percentage is in our code as well, handle_contact_force :

image

Not much to be said about this, and even if we do take this out of the equation and re-run everything, we don't see a big change in the time.

Also, seeing as the main physics process part, the Step function only takes this little, and the rest alone is just the Flush Queries part, means probably the bottle neck is there.

Main Thread -> Flow 1 (2.05s, 43.8%) - Flush Queries

The purpose of this is to call into the Callables godot offer to some physics objects. These objects call these in case of collision events or active change events, etc.

image

Notice that we have a call into RapierPhysicsServer2D, but it doesn't say specifically where. This is because of how gdext works, it seems to hide away the call. But we do see later down the call it calls into our implementation of flush_queries, so it's coming from the PhysicsServer2D flush_queries method. This method is mainly going to be calling back into godot by calling Callables (that it previously handed down to physics objects) that had collision events or just are active.

image

If we open this flow, we can see that we have:

  1. 69.4% (1.43s) Callable::callv function. We will go into this a bit later, as the other ones are small.
  2. 6.4% (133ms) godot_core::meta::class_name::ClassName::to_string_name
image
  1. 5.5% (108ms) godot_core::builtin::callable::Callable::bindv - quite expensive for just binding the arguments it seems.
image
  1. 3.6% (76ms) godot_core::builtin::callable::Callable::is_valid. I like how here it uses os_unfair_lock. Haha.
image
  1. 2% (54ms) godot_core::gen::builtin_classes::array::InnerArray::push_back
image

Then the rest seem to be like this, keep getting smaller and smaller:

image

Main -> Flow1 -> Callable::callv (1.43s)

image

Going inside the Callable, it seems godot calls back into the library 5 times (from RigidBody2D::_sync_body_state):

  1. 352ms (27.8%) Here we see a split into drop_in_place and RawGd::bind :
image
  1. 323ms (25.5%) Here most of it is from drop_in_place and RawGd::bind:
image
  1. 199ms (15.7%) Same as previous call, drop_in_place and RawGd::bind
image
  1. 168ms (13.3%) Same
image
  1. 164ms (12.9%) Same
image
@Ughuuu Ughuuu added the bug Something isn't working label Jul 3, 2024
@Ughuuu
Copy link
Contributor Author

Ughuuu commented Jul 3, 2024

The following commit was mode to reduce the calls to core::fmt::write
Ughuuu/gdext@934ea4c

This improved a lot performance. There is still bottleneck in the godot_cell::borrow_state::BorrowState::increment_shared function. Other small things that can be improved:

  • class_name::to_string_name - extra calls for this
  • Extra Variant copies
  • callv taking ownership of Array makes clones of array.

@Ughuuu
Copy link
Contributor Author

Ughuuu commented Aug 8, 2024

Another interesting issue to look at is this: godotengine/godot-proposals#10389

@Ughuuu Ughuuu added enhancement New feature or request and removed bug Something isn't working labels Aug 27, 2024
@juanitonelli
Copy link

Hello @Ughuuu, I didn’t want to create a separate issue for this because it’s really difficult for me to pinpoint the cause, but the performance with any version greater than 0.8.4 has become unusable for me: the game runs at 2 or 3 FPS when I enable Rapier. It works fine with the default Physics. It’s the same 2D project I’ve been working on for months now, which is pretty complex, so I cannot upload any isolated parts of it.

Sorry for not being able to give you a more concise description of the problem; for the time being, I had to rollback to Rapier 0.8.4 to be able to use it.

@Ughuuu
Copy link
Contributor Author

Ughuuu commented Nov 1, 2024

Hey, thanks for letting me know. If you can pinpoint better whats causing the slowdown, write here. If not, we will probably find at some point what causes it(I hope)

@juanitonelli
Copy link

Just to add some data to the perfomance issue. This is with Rapier 0.8.8:

With Rapier enabled, the FPS drops to 2-3, and the Physics Process time slows down to 53.13ms (avg):
image

The logs generate lots of this "RapierShapeBase leaked owners" errors:
image

The exact same scene, with Rapier disabled (Godot default physics): 144 FPS (capped by my monitor) and Physics Process time of 0.29ms (avg), no errors on console:
image

@Ughuuu
Copy link
Contributor Author

Ughuuu commented Nov 8, 2024

Error logs are very costly, those in themselves generate a lot of delay. It's probably not issue with performance but with those errors. If you can open an issue with this, I can check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants