Dramatically decrease the startup locking/halt time when the overlay is enabled #27

Merged


@otavepto otavepto commented Aug 21, 2024

  • deprecate lazy_load_achievements_icons in favor of paginated_achievements_icons in configs.main.ini, which controls how many icons to load each iteration
  • new option upload_achievements_icons_to_gpu in configs.overlay.ini which controls whether the overlay should upload the achievements icons to the GPU and display them or not
  • synchronize overlay proc with the periodic steam callback in a better way to avoid FPS drop
  • fix overlay flickering regression
  • upload achievements icons to the GPU in the overlay proc periodically, which dramatically decreases the startup locking/halt time
  • fix a potential deadlock scenario in the overlay as a result of synchronizing with 2 mutex objects

The way this works now is as follows:

Steam_User_Stats class

In the periodic callback of the Steam_User_Stats class, the achievements icons are loaded from disk into memory via the stb library. This is done in batches each time the callback runs, and the number of images to load per batch is controlled by paginated_achievements_icons.

Too many loading operations from disk make the callback take more time, which impacts all steam callbacks, but this happens only once or twice at startup.
This can cause a timeout in some games.

Too few loading operations from disk make the callback take far less time, so all steam callbacks run just fine, but the loading keeps recurring during startup until all icons are loaded.
This can cause an overall slowdown in the steam callback system.

In testing, 20 images seemed to be a moderate amount; a batch of that size takes ~20-30 ms.
This doesn't affect the original startup delay problem, which happens when the overlay is enabled, but it is a much better way to load icons without impacting performance or causing a timeout.
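
The batching described above can be sketched roughly as follows. This is a minimal illustration, not the emu's actual code: `IconEntry`, `PaginatedIconLoader` and `run_callback` are hypothetical names, and the disk read plus stb decode is replaced by setting a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one achievement's icon state.
struct IconEntry {
    std::string path;     // icon file on disk
    bool loaded = false;  // true once decoded into memory
};

// Hypothetical loader: handles at most `page_size` pending icons per call.
// `page_size` plays the role of paginated_achievements_icons.
class PaginatedIconLoader {
public:
    PaginatedIconLoader(std::vector<IconEntry> icons, std::size_t page_size)
        : icons_(std::move(icons)), page_size_(page_size) {
        assert(page_size_ > 0 && "a page must contain at least one icon");
    }

    // Called once per periodic callback; returns how many icons were loaded.
    std::size_t run_callback() {
        std::size_t loaded_now = 0;
        while (next_ < icons_.size() && loaded_now < page_size_) {
            // The real code would read and decode the file here (assumption).
            icons_[next_].loaded = true;
            ++next_;
            ++loaded_now;
        }
        return loaded_now;
    }

    bool done() const { return next_ == icons_.size(); }

private:
    std::vector<IconEntry> icons_;
    std::size_t page_size_;
    std::size_t next_ = 0;  // index of the first icon not yet loaded
};
```

With 45 icons and a page size of 20, three callback runs load 20, 20 and 5 icons; later runs do no loading work at all.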

overlay class

The 2 biggest problems are:

  • Uploading an image resource to the GPU takes ~30 ms per image, especially on DirectX 12; DirectX 11 takes way less time per image

  • The overlay procedure/function is called each frame of the game and it locks the overlay mutex, and the periodic callback of the overlay, which already owns the emu's global mutex, also locks the overlay mutex

The second point caused 2 problems:

  • Deadlock scenario which happens like this:

    1. The steam callback is triggered one way or another, either the game itself called RunCallbacks() or the emu's background thread was now running, in both cases the emu's global mutex is locked
    2. DirectX, which runs on its own thread, decided to invoke the Present() function, which is called each frame of the game, and the overlay proc locked the overlay mutex
    3. Now back to the steam callback, it attempted to call a function in the overlay which locks the overlay mutex, it won't succeed and now the steam callback thread is halted
    4. Back to the overlay proc, it tries to call some steam API which locks the global emu's mutex, but because the steam callback has already locked it, the overlay proc thread is now also halted
  • Frame drops when the overlay is active. This can happen when the steam callback runs too often: it locks the overlay mutex, so the overlay proc, which also locks the overlay mutex, gets fewer chances to run uninterrupted. The time consumed by the steam callback is dead time for the overlay proc, and therefore dead time for DirectX's rendering thread.

To fix the icon uploading problem, the overlay proc now attempts to upload 1 image at a time. Since that takes ~30 ms on DirectX 12, the game's FPS will be low at startup but will eventually rise again. In case uploading keeps failing over and over forever, the user now has the option upload_achievements_icons_to_gpu to disable it.
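
A rough sketch of the one-upload-per-frame idea, under stated assumptions: `OverlayIcon`, `IconUploader` and `on_frame` are hypothetical names, the constructor flag stands in for upload_achievements_icons_to_gpu, and the actual texture creation is replaced by setting a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-icon state tracked by the overlay.
struct OverlayIcon {
    bool in_memory = false;  // decoded by the stats callback
    bool on_gpu = false;     // texture created by the overlay proc
};

class IconUploader {
public:
    // `upload_enabled` plays the role of upload_achievements_icons_to_gpu.
    IconUploader(std::size_t icon_count, bool upload_enabled)
        : icons_(icon_count), upload_enabled_(upload_enabled) {}

    void mark_in_memory(std::size_t index) {
        assert(index < icons_.size());
        icons_[index].in_memory = true;
    }

    // Called once per frame from the overlay proc.
    // Uploads at most one icon, so a frame pays for one upload only.
    bool on_frame() {
        if (!upload_enabled_) return false;
        for (OverlayIcon& icon : icons_) {
            if (icon.in_memory && !icon.on_gpu) {
                icon.on_gpu = true;  // real code would create the texture here
                return true;
            }
        }
        return false;
    }

    std::size_t uploaded() const {
        std::size_t count = 0;
        for (const OverlayIcon& icon : icons_) count += icon.on_gpu ? 1 : 0;
        return count;
    }

private:
    std::vector<OverlayIcon> icons_;
    bool upload_enabled_;
};
```

Spreading the uploads this way trades one long stall for many slightly slower frames, and with the flag off the overlay never touches the GPU for icons.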

To fix the 2-mutex sync problem, the steam callback now tries to lock the overlay mutex instead of locking it unconditionally. If it fails to lock the mutex, it doesn't block the steam thread and instead bails out immediately, leaving the global mutex available. After all, this callback is run periodically, so it should be fine to skip a few iterations.
The other way around, which is making the overlay proc attempt to lock the global emu mutex and bail out immediately if that fails, made the overlay skip a few frames as expected, but that means it is hidden in some frames and visible in others in rapid succession, hence the flickering regression.
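
The try-lock approach on the steam callback side might look like the sketch below. All names are hypothetical stand-ins (`global_mutex`, `overlay_mutex`, `overlay_periodic_callback`), and the helper `callback_ran_during_frame` only exists to simulate a frame in progress on another thread.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the emu's two locks.
std::mutex global_mutex;   // held while steam callbacks run
std::mutex overlay_mutex;  // held by the overlay proc during each frame

// Periodic overlay callback; the caller already holds global_mutex.
// Returns true if the work ran, false if this iteration was skipped.
bool overlay_periodic_callback() {
    std::unique_lock<std::mutex> overlay_lock(overlay_mutex, std::try_to_lock);
    if (!overlay_lock.owns_lock()) {
        return false;  // overlay proc is busy: bail out, retry next period
    }
    // ... update overlay state here ...
    return true;
}

// Simulates a frame in progress on another thread while the callback fires.
bool callback_ran_during_frame() {
    std::promise<void> frame_started;
    std::promise<void> finish_frame;
    std::future<void> started_signal = frame_started.get_future();
    std::future<void> finish_signal = finish_frame.get_future();

    std::thread render_thread([&] {
        std::lock_guard<std::mutex> frame_lock(overlay_mutex);
        frame_started.set_value();
        finish_signal.wait();  // keep the overlay mutex held for now
    });
    started_signal.wait();

    bool ran;
    {
        std::lock_guard<std::mutex> global_lock(global_mutex);
        ran = overlay_periodic_callback();  // returns instead of blocking
    }
    assert(!ran && "callback must skip while the overlay mutex is held");

    finish_frame.set_value();
    render_thread.join();
    return ran;
}
```

Because the callback never waits on the overlay mutex while holding the global one, the circular wait from the deadlock scenario cannot form. Note that `std::mutex::try_lock` is allowed to fail spuriously, which is harmless here since a skipped iteration is simply retried on the next period.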

@otavepto otavepto changed the title * deprecate lazy_load_achievements_icons in favor of `paginated_ach… Dramatically decrease the startup locking/halt time when the overlay is enabled Aug 21, 2024
@otavepto

One piece of reasoning I forgot to mention: why is icon uploading done periodically?
In games like PAYDAY 2, changing the resolution invalidates all icon resources, so they have to be uploaded again.
If the upload was done only once, changing resolutions would mean losing all icons until the next game start.
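
To illustrate why a periodic check recovers from invalidation, here is a small sketch. `IconTextures`, `TextureId` and `on_resources_invalidated` are hypothetical names, and integer ids stand in for real GPU textures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using TextureId = int;               // hypothetical GPU texture handle
constexpr TextureId kNoTexture = 0;  // marks an icon with no live texture

class IconTextures {
public:
    explicit IconTextures(std::size_t count) : textures_(count, kNoTexture) {}

    // Runs every frame: upload one icon that currently has no texture.
    bool upload_one_missing() {
        for (TextureId& tex : textures_) {
            if (tex == kNoTexture) {
                tex = next_id_++;  // real code would create the texture here
                return true;
            }
        }
        return false;
    }

    // Runs when the renderer drops its resources, e.g. a resolution change.
    void on_resources_invalidated() {
        for (TextureId& tex : textures_) tex = kNoTexture;
    }

    bool is_resident(std::size_t index) const {
        assert(index < textures_.size());
        return textures_[index] != kNoTexture;
    }

private:
    std::vector<TextureId> textures_;
    TextureId next_id_ = 1;
};
```

Since `upload_one_missing` keeps running every frame, icons dropped by `on_resources_invalidated` are picked up again on the following frames instead of staying lost.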

@Detanup01 Detanup01 merged commit f981989 into Detanup01:dev Aug 22, 2024
62 checks passed