[Dota2] fossilize eats all RAM and makes the system unresponsive as a result. #84
Comments
Fossilize should already be limiting its memory usage fairly aggressively as this is a known problem. Not sure why this happens in Steam. |
Are you able to reproduce it on your end? (With the steam beta) |
Haven't triaged it, but I'll look at it. |
Ok, I tried replaying locally and I can reproduce rather large memory consumption when replaying on the 5700xt. This does not happen when I run with a dummy device, so the memory consumption must happen in the driver. I suspect it's retaining a lot of memory in the disk caches, and multiply that over lots of processes, it becomes a problem. I've seen this issue before, but to my knowledge it was fixed in Mesa a while back. Seems like I might need to poke around to see where it's hogging so much memory ... |
If it is any help, I am currently using and building mesa from the git master branch. |
Hey, I didn't know that this was already being tracked here before I opened my report at steam for linux. As you can see, this affects Shadow of the Tomb Raider as well. Also, I'm not sure if the following should be treated as a separate issue (or if it is an issue at all), but background processing happens every time I start Steam, even when there are no updates to the games. On top of that, today SotTR was processed twice (a few other games were processed in between) without restarting Steam. |
I noticed that from time to time it starts to release some memory (intended, crashed, or terminated?), but far more memory is consumed than released, so the problem still occurs.
How many processes and threads should it be using in normal conditions? |
I'm also on Arch Linux:
I have installed |
I'm having the same issue on KDE neon with an RX570 (Mesa 19.2.8). Doesn't even help to launch the game with -gl, already got two abandons because I didn't manage to restart Dota within five minutes after a crash, even though I had killed all steam processes in htop. |
You can probably turn off background shader processing for the time being. |
I never had it turned on in the first place. I updated to latest oibaf drivers, let the pregame shader processing run through and am going to test this in low priority now... |
I don't have Dota, so I can't say. Does it have its own shader processing, separate from the Steam-wide one that was introduced recently? Or is the shader processing supposed to run at the start of a game? I haven't played SotTR for a while, so I don't know if it would run at the start, and I haven't really noticed it with other games, but those usually only take a few seconds to compile. |
No I have never seen this shader processing until recently. I made the following observation, there are two types of launches for me:
So I suspect that when I skip shader processing, fossilize occasionally starts itself in the background and wrecks my system. I'll check htop now and then while playing and edit this message if I witness anything. |
@arcsaber there's an option on dota settings to disable "compute shaders", try that |
Thanks, I tried but after launching Dota fossilize still starts. |
When you observe this, do you see the memory usage after Dota has started and when it's shown its window? Are there any contents in the window or just black? When that happens do you see the memory being consumed by fossilize-replay processes, or Dota itself? |
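One way to answer that question is to sum resident memory per process name from `/proc`. The following is a minimal sketch, assuming Linux; the helper names are hypothetical, and the `dota2` process name is an assumption. Matching is done by prefix because the kernel truncates the name in `/proc/<pid>/status` to 15 characters.

```python
import os
import re

def parse_status(text):
    """Return (name, rss_kib) from the text of /proc/<pid>/status.

    Kernel threads have no VmRSS line; they are reported as 0 KiB.
    """
    name = re.search(r"^Name:\s*(.+)$", text, re.MULTILINE)
    rss = re.search(r"^VmRSS:\s*(\d+)\s*kB", text, re.MULTILINE)
    return (name.group(1).strip() if name else "",
            int(rss.group(1)) if rss else 0)

def rss_by_prefix(prefixes, proc_root="/proc"):
    """Count processes and sum resident memory (KiB) per name prefix."""
    totals = {p: {"count": 0, "rss_kib": 0} for p in prefixes}
    if not os.path.isdir(proc_root):
        return totals  # not a Linux-style /proc; nothing to scan
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        for p in prefixes:
            if name.startswith(p):
                totals[p]["count"] += 1
                totals[p]["rss_kib"] += rss
    return totals

if __name__ == "__main__":
    # "dota2" is an assumed process name; adjust to what htop shows.
    for prefix, t in rss_by_prefix(["fossilize", "dota2"]).items():
        print(f"{prefix}: {t['count']} processes, "
              f"{t['rss_kib'] / 1024:.0f} MiB resident")
```

Running it a few times while the problem is visible should show whether the growth is in the replayer processes or in the game itself.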
I narrowed down a few things:
#85 should fix the memory explosion problem. |
Someone reported on Discord that the issue is resolved now, so closing. |
@HansKristian-Work Any ETA for when it will be in the steam beta? |
The reporter said it was already in Steam, I don't know anything beyond that. |
Hmm, I still have the issue then. I just tried 2 hours ago. (Steam says that I'm up to date, and I'm opted into the beta client program) |
Today, 10:40 AM : PC started (cold boot) : Ubuntu 20.04, Radeon RX5700 XT. Steam autostarted. 1st time I see this. |
I just wanted to chime in and say it seems to be solved for me now. I'll leave this open so others can confirm as well. |
The hero selection screen didn't load for me, more specifically only the top row of empty rectangles loaded and I had sound (you may now select your hero), screen was black otherwise. I checked processes and found fossilize_replay. I got an abandon because I couldn't reconnect quickly enough and have to wait an hour. =( Seems like I have to check for fossilize every single time. Edit: I also got low priority, so because of this mess I have/had to play at least five matches in low priority, that's at least around 2,5 hours of my life wasted in toxic environments for bugs which are out of my control. Come on guys... |
I used e-mail due to my general aversion to creating accounts. I might still do it if you think a forum post would get more traction? |
Good spot: "Disabled shader processing on NVIDIA while driver issues are being looked into". December 9th on https://steamcommunity.com/groups/SteamClientBeta/announcements I was already wondering why I saw no "Processing shaders" dialogue yesterday 😅 |
As a background service, we don't want to dominate the cache, so hint the kernel when we no longer need the data in cache: This commit removes the database from cache when we close it. Also hint the kernel at database opening time that we're going to read the database randomly, to prevent excess cache usage introduced by readahead.

Todo:
* This does not yet fix the cache pressure introduced by the shader cache in the graphics driver, neither does it fix an issue in the replayer that duplicates a lot of data in memory.
* It may be better to first set the readahead mode to sequential on initial load, then switch to random mode.
* When starting a game, fossilize-replay should probably pause or exit immediately (fossilize itself is not in control of that).

See-also: ValveSoftware#84
See-also: ValveSoftware#99
Signed-off-by: Kai Krakow <kai@kaishome.de>
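The two kernel hints described in this commit message correspond to `posix_fadvise` with `POSIX_FADV_RANDOM` and `POSIX_FADV_DONTNEED`. The sketch below illustrates those calls from Python; it is not the patch itself, and the function name and its parameters are hypothetical.

```python
import os

def read_random_without_caching(path, offsets, length):
    """Read `length` bytes at each offset, then drop the file's cached pages."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Random access expected: ask the kernel not to read ahead.
        # offset=0, len=0 applies the advice to the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        chunks = [os.pread(fd, length, off) for off in offsets]
        # Done with the data: tell the kernel the cached pages for this
        # file are no longer needed, so they can be evicted first.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return chunks
    finally:
        os.close(fd)
```

Both calls are only advice, so the kernel may ignore them. `POSIX_FADV_DONTNEED` also only affects clean pages, so a file that was just written would need to be synced first for the hint to have much effect.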
This seems to be a coincidence: for me, BL3 is just the first game that is processed, every time. But the issue also occurs with other games. @HansKristian-Work It looks like there's at least one bug in NVIDIA somewhere:
At least my patch referenced above keeps write-back mostly under control, but so far this is just a primitive idea. It needs more tuning to adapt better to changes in system load, and maybe it should synchronize between all of the processes somehow. But maybe you could give it a try and share your thoughts? |
@kakra @HansKristian-Work The issue of RAM consumption and long shader processing was resolved for me in the last Steam update. However, after closing Dota 2 there is often heavy I/O activity (the hard disk read/write indicator flashes constantly and the heads are audibly busy). As a result, when I restart the game, loading often stops at "Vulkan Shader Processing" and I have to restart Steam. |
@alexeysvrv If you're using NVIDIA, this may happen because fossilize is currently disabled there. Also, I'm not sure if the first part of performance updates has already been deployed to public release. @HansKristian-Work may shed some light on this. If the game stops at "Vulkan Shader Processing", could you look into top or htop and see if fossilize is running and not competing for CPU with other processes? |
@kakra I have an AMD graphics card and I use the Mesa driver stack. The processor is not loaded at that time (judging by the task manager). But while these hidden I/O operations are active, the responsiveness of the system drops noticeably. There are stutters and lags (even with the mouse). This can last for 10-20 minutes until all these operations have completely settled down. Rather than wait, it is easier for me to restart Steam. |
I have an AMD card as well, and the long periods of fossilize using all of my CPU cores is pretty much resolved. However, what I do see now is short periods of intense activity, followed by intermittent activity, and some of the processes not completing (zombie processes). e.g:
this does not seem to clear up on its own, and I have to kill steam to clear it up. Simply pressing skip just restarts the process next time. |
@alexeysvrv This should be fixed soon with the IO pressure throttling @HansKristian-Work is currently implementing. I'll test those patches in the weekend. It's based on a simple idea I tried and should work even better. But you'll need a kernel that has the |
Yes, I've seen some children dumping core during tests, that may be the same issue. Does dmesg show any crashes? There's an open issue report by me about this.
It looks like fossilize either forgot about its children or just didn't care about reading the process exit result. If you kill Steam, those orphans will ultimately be re-parented to PID 1 which ultimately takes care of this. No need to worry: except for some status and a PID, zombies no longer use any resources. |
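For context, a zombie is a child that has exited but whose parent has not yet collected its exit status with `wait`/`waitpid`. A minimal sketch of a non-blocking reaper follows. This illustrates the general mechanism only and is not Fossilize's code; the function name is hypothetical.

```python
import os

def reap_finished_children():
    """Collect exit statuses of children that have already exited.

    Returns {pid: exit_code}; a child killed by a signal is recorded
    as the negative signal number. Never blocks.
    """
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            reaped[pid] = os.WEXITSTATUS(status)
        elif os.WIFSIGNALED(status):
            reaped[pid] = -os.WTERMSIG(status)
    return reaped
```

A parent that calls something like this periodically, or from a `SIGCHLD` handler, leaves no zombies behind and also learns which children crashed.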
Hello, per "Fixed a bug where processing Vulkan shaders would run out of memory on NVIDIA Pascal cards and older" in the 2021-01-29 Steam client beta update, please opt into Steam's beta client and retest this issue. |
Yeah, it's running with |
I'm going to test :) //EDIT: so far so good, I saw fossilize working in the background and noticed no issues. I will keep checking, but my first impression is that it is fixed. |
Got the Steam update and updated the NVIDIA drivers to 460 as recommended in the release notes. So far so good: I got through HZD's shader processing without issues; memory consumption is max 250MB per child, CPU utilization is >90% all the time for all children, most > 97%. This means I can turn on background shader processing again, great! |
People still having problems may need to clear their Steam shader cache once (while the client is not running), then let it re-download all shader pre-caches and let it process all games. |
Closing as fixed in the 2021-02-05 Steam client update. |
@kisak-valve I'm on the latest beta client and this is happening again with Shadow of the Tomb Raider. It'll happily gobble up all the system memory and then OOM. I've taken to manually watching it and turning background processing off when it's about to run out of memory and then turning it back on. Fortunately it is incrementally finishing it so it doesn't restart at zero each time. |
Hello @philipl, friendly reminder that I'm a moderator for Valve's issue trackers on Github, and not a Fossilize dev myself. That said, please open a new issue report with your system information so that your issue can be tracked properly. |
Try running a kernel with PSI (keeps resource usage under control) and process autogrouping (keeps CPU usage under control) turned on. If you don't know how to do that, ask in the forums of your distribution. If that does not help, follow @kisak-valve's advice. |
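To check whether a running kernel exposes PSI (pressure stall information), look for the files under `/proc/pressure/`. The sketch below parses one of them and applies a simple threshold. It assumes Linux; the function names and the threshold of 10% are hypothetical and do not reflect how Fossilize or Steam actually throttle.

```python
def parse_psi(text):
    """Parse the text of a /proc/pressure/* file.

    Returns e.g. {"some": {"avg10": 0.0, "avg60": ..., "total": ...}, "full": {...}}.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # First field is "some" or "full"; the rest are key=value pairs.
        result[fields[0]] = {
            key: float(value)
            for key, value in (f.split("=", 1) for f in fields[1:])
        }
    return result

def should_throttle(psi, threshold=10.0):
    """True if tasks were stalled at least `threshold` percent of the last 10 s."""
    return psi.get("some", {}).get("avg10", 0.0) >= threshold

def read_psi(resource="io"):
    """Return parsed PSI for "io", "memory" or "cpu", or None if unavailable."""
    try:
        with open(f"/proc/pressure/{resource}") as fh:
            return parse_psi(fh.read())
    except OSError:
        return None  # kernel built without PSI, or PSI disabled at boot

if __name__ == "__main__":
    psi = read_psi("io")
    if psi is None:
        print("PSI is not available on this kernel")
    else:
        print("I/O pressure:", psi, "throttle:", should_throttle(psi))
```

Autogrouping can be checked in a similar way: `/proc/sys/kernel/sched_autogroup_enabled` contains `1` when it is turned on.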
@kisak-valve Got it. Opened #194 for it. |
So, I realize this is kinda resurrecting a zombie thread, but I figured I should chime in here with almost the same issue, which is still happening now in 2023 on my Ubuntu machine. Notably, I do not play the games outlined in this thread, but rather a few other games, so I question whether it's game-specific or not. (FWIW the ones I have noticed are Borderlands 3 and lately Warframe.) The thing that sets it apart is that it's especially noticeable when the system has been asleep. Consider the following scenario: 7pm, play Warframe (via Steam) for maybe 30 minutes, but then leave the game idle for several hours; play another 30-60 minutes; 11pm, exit the game and put the system to sleep; next day, 7pm, wake the system. For the next 5 minutes, the system thrashes like nobody's business, and running top in a terminal shows 1 or 2 instances of Fossilize using ~150% or more of the processor and significant amounts of RAM. Now, I'm not averse to filing a new report, or even adding some detail here if it's needed. Just note that I don't typically spend a lot of time futzing with my machine these days ... just a little light gaming, and that's about it. Besides, I'm also not exactly sure what to file it under. Long story short, I think there's still something that's not quite right about Steam, Fossilize, Ubuntu (or other Linux), Nvidia, or some combination of things. |
I'm also having this issue. |
Also happening to me. I recently had to downgrade from 32 GB of RAM to 16 GB due to hardware failure, and Steam is basically unusable if I'm not paying attention and don't click "skip" on the shader compilation. |
Specs:
In the new Steam beta, after every update to Dota2 (it doesn't seem to matter how small the update is), fossilize rebuilds the Vulkan shaders when launching Dota2.
While it is rebuilding, it eats all of my RAM (16 GB) and makes my computer swap, leaving it unresponsive for around 1-2 minutes while the shader cache is being built.
Perhaps there could be some mechanism in place to make sure fossilize doesn't go overboard with memory usage if it will go over the available amount of memory?
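One possible shape for such a mechanism is to size the number of worker processes from the kernel's `MemAvailable` estimate instead of from the CPU count alone. This is a hypothetical sketch, not how Fossilize works: the function names, the per-worker memory estimate and the 50% reserve are all assumptions chosen for illustration.

```python
import os
import re

def mem_available_kib(meminfo_text):
    """Extract MemAvailable (KiB) from the text of /proc/meminfo."""
    match = re.search(r"^MemAvailable:\s*(\d+)\s*kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found")
    return int(match.group(1))

def worker_count(available_kib, per_worker_kib, cpu_count, reserve_fraction=0.5):
    """Number of workers that fit in memory, capped by CPUs, at least 1.

    reserve_fraction is the share of available memory left untouched
    for the game and the rest of the desktop.
    """
    budget_kib = int(available_kib * (1.0 - reserve_fraction))
    return max(1, min(cpu_count, budget_kib // per_worker_kib))

if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        available = mem_available_kib(fh.read())
    # Assumed worst case of 1 GiB per worker; purely illustrative.
    workers = worker_count(available, 1024 * 1024, os.cpu_count() or 1)
    print(f"{available // 1024} MiB available -> {workers} worker(s)")
```

A static estimate like this cannot account for memory the graphics driver allocates later, so it would probably need to be combined with re-checking memory during the run.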
I don't know if this is the correct place to report this. Sorry if it is not.