-
-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vsync via the Windows compositor when appropriate. #33145
Conversation
This is a resubmitted PR. The original is here. |
if (SUCCEEDED(DwmIsCompositionEnabled(&dwm_enabled))) { | ||
return dwm_enabled; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to call this function in every frame?
On Windows 8+ it's always true, and for Windows 7 it might be better to call it once at the start and then add WM_DWMCOMPOSITIONCHANGED
message handler. See DwmIsCompositionEnabled docs for more info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what glfw and Chromium do. DwmIsCompositionEnabled()
is being called only when vsync is enabled so the thread will block when it calls either Dwmflush()
or SwapBuffers()
. Because of this, I don't know if it is worth the trouble to listen for WM_DWMCOMPOSITIONCHANGED.
// Note: All Windows versions supported by Godot (Vista and later) | ||
// have a compositor. It can be disabled on earlier Windows versions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK Vista is not supported for the long time (due to some interlocking functions missing), min. supported Windows version is 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. I tried to figure out which versions of Windows were supported and came across a Stack Overflow thread that said Vista so that's what I went with.
Is it possible to provide more information on why this should help with jitter, and by what mechanisms? |
I wish I understood the problem better. As it is, this is a case of monkey see, monkey do. Some of the posters in the issues threads suggested that someone look at other projects to see what they do about the issue and this is what glfw and Chromium do. Those links will take you to the source files in question for those projects. Notice that they are similar enough that one of the projects likely copied from the other. You'll also notice that they aren't happy about having to do this. Both of the projects label this a 'HACK'. So it seems that there's a long-standing (> 5 years) bug either in OpenGL or (nVidia?) video drivers and this is the workaround. I'm one the persons affected by the problem and, from what I can tell just from casual observation, is that I personally don't know if I would consider this a hack or not. If you think about it, the compositor provides an additional buffer and if you are double buffering via OpenGL then you will wind up with three buffers when you really only want two. This will introduce unnecessary latency. Ideally you would want OpenGL to coordinate with the compositor but, if using a single OpenGL buffer and calling DwmFlush() has the same effect then that's the next best thing. |
My speculative and simplified vision of the compositing is as follows: ideal:
practical:
DwmFlush is one of the ways to make sure Godot always renders before the compositor. There are sources claiming using it as wrong (e.g. https://www.vsynctester.com/firefoxisbroken.html#asolution), but IMHO it's way better than just relying on luck. |
I was kind of getting that impression. Don't get me wrong I agree if there is a bug in the compositor on windows in some configurations and we need to work around it we should, and this may well be a good solution. But rather than merge this straight away I think it would be good to do a little more research to confirm if this is the best fix (and try and work out why it works, if it does work!) 😄 . Having the frame present at an unpredictable time isn't the direct problem (its not ideal), the problems really come because the game simulation needs some kind of idea of the presentation time in order to simulate objects in the correct positions. This is a major flaw in the graphics APIs currently, and is even best case a guessing game on desktop (although there is a vulkan extension in the works to help with this I believe). If the estimation of the frame time is off by varying amounts, you get jitter. If the estimation is off, but by a fairly regular gap, you get smoother playback. See #30791 and also the links within, particularly: The vsynctester article is useful, he mentions some other features of the DWM_TIMING_INFO structure which may be an alternative? qpcVBlank for instance. He also mentions that chrome uses a timer. I already have a fork of Godot that adds delta smoothing, this may be equivalent to the effect chrome is using. The reason I suggested in the issue to try using If the command line gives less jitter, the problem is at least partly due to input timings and would be helped by delta smoothing (which is something all platforms can benefit from). If the command line gives the same amount or more jitter, then that gives further weight to either using the dwmFlush fix or something similar. It would be a good idea to find this out, because from the article it appears as though dwmFlush will introduce timing discrepancies itself. Edit: https://docs.microsoft.com/en-us/windows/win32/api/dwmapi/ns-dwmapi-dwm_timing_info Unfortunately although I know the exact bits to try my main machine is Linux, I don't have a windows machine readily available (and am currently working in another area). However if you want to test this yourself, you could try using the queryperformancecounter values from this structure and pass them into main_timer_sync in main::iteration. I can help with this if you are not sure where to apply. This may provide far better results than dwmFlush. |
@lawnjelly Yes, I agree that we want to test and understand this as well as we can before merging it since the change is in a fairly critical path for both the (Windows) editor and the end-user's game. Assuming that those other projects really do use this same strategy then I'd say we are in pretty good company. From my testing, the call to This is what So, in theory, you get a little extra processing time and less latency by using the compositor for vsync. Later today I will post some results of the timing regarding |
What I'm experiencing is the sprite goes back and forth, by ~1/3 of its size. Either from time to time (stutter?), or over and over again like mad - jitter. This only ever happens with Godot. |
How are you measuring the time of the vertical blank? |
I was using I was basically using some version of the following code:
After the file was created I had to sort the data out but the gist of it was as I stated earlier. |
I think you should try using the qpcVBlank value or the qpcCompose value to pass to main_timer_sync in main::iteration. If you look in
This is the main frame time that gives the delta which drives godot. (Yes you can smooth it close to here, I do it in main_timer_sync!). Now instead of randomly grabbing whatever the current time is when this is reached, try substituting in the value (converted to usec) from qpcVBlank etc from DWM_TIMING_INFO. For a test you can just create a global, store this in your log_vblank_info() cpp, then extern this in main.cpp and load it directly instead of the call to OS::get_singleton()->get_ticks_usec(). This could well give far better deltas. EDIT .. incidentally, reminder to try the |
@lawnjelly That sounds like an interesting strategy to get more consistent delta times but I don't think it is going to help for the situation that this particular PR is trying to fix. (Although, it might help mask it.) When I first encountered the problem that this PR tries to address I thought that it was caused by noisy delta times but they were actually very stable. This fix is a lot more specific than what you are proposing. This fix applies only to the Windows OS and only when the game is in windowed mode and the compositor is enabled. |
There's a good discussion here: You may well have read this already, as you were talking about chrome method.
This is what the command line argument can test, once again, I would suggest trying this, to eliminate input delta as a source of the jitter. 👍 The idea of reading the frame timings from the DWM info is that getting a godot update at a close relationship to the vsync is desirable, but only indirectly, as a source of:
If we can read (1) directly it alleviates the need to have a constant relationship with vsync. The chrome guys seem to be barking up this tree too (of close relationship to vsync), I'm not sure if it is merited, if it is possible to read the frame timings by another method. Instead of thinking of the frame renders as realtime, think of it as pre-rendered. It isn't super important that we start doing calculations near vsync (other than to keep the pipeline fed), but what IS important are the deltas so we render the objects in the right place when that frame is displayed. On the other hand whether this DWM_TIMING_INFO provides any kind of reliable results remains to be seen, the chrome thread seems to suggest they might not be that useful. It actually only needs to be approximately correct anyway, as we know deltas can only really be multiples of the refresh rate. I've not even started with the problem of the read delta being the delta from 3 frames ago... Off to bed now, will have a look in the morning. |
Yes, in the context that the author of that article is referring, using We're not using the function for that. We are using it only to emulate the behavior that we need from an apparently broken |
I've got a bit more insight into the nature of this problem to report. I changed The interesting thing about all of the logs is that they indicate that there aren't any problems! Nothing to see here. Carry on. The two logs that don't use This was puzzling but I believe it has a simple explanation. My theory is that the OS isn't drawing some of the frames in the case where |
I did some more research on this issue. If you Google Here are a couple of other projects that use |
I did a before and after screen recording of this problem. First I show the 3.2 alpha3 build then a custom build with the changes I made. Note that I moved the call to |
It does seem to be improved. Have you tried the test with |
If I create a Win64 export target with the 3.2 alpha3 build and run it it will behave fairly well whether I use the If I run the OBS screen recorder then it will start jittering/stuttering regardless of the I had the same result when using my test build with the I'm guessing the vsync feature is turned off when this option is used. If that's the case then my changes wouldn't be in play. If so then this lends credence to my theory that the frames never get painted to the screen. As an aside, at one point I was looking at the Windows implementation of this feature and I noticed that it is using the Windows |
This is interesting. There's a lot of stuff the game does in the editor which it doesn't do when exported. If you search through the source you'll find loads of area use As such the gold standard for checking anything to do with jitter / performance should be a release export rather than in the editor. In the editor jitter free would be 'nice to have', but is not necessary for most evaluation. If you don't get the same jitter (or differences in jitter from the PR) on an export (i.e. there is only evidence of it helping in the IDE) then that is a possible argument for wrapping the PR in TOOLS_ENABLED or is_editor_hint.
I have a vague recollection of reading that OBS does some funky stuff with timers to help the screen recording, that could be having effects. I'll try and dig up the article.
Now we've identified it as being worse in the editor, I'm not quite sure off hand where you put the command line to get fed to the spawned game from the editor. You might have to have a look through the source. If you use the command line to the editor, it may mean the editor will use it, but not the spawned game (I'm not sure on this, needs checking). If you use it directly to run an exported game, it will work, but then you say you can't notice a difference from the PR in exports...
No, the
This is absolutely right in that Sleep() is useless for precision on windows, and under normal running circumstances it shouldn't be called in a main game thread. However I think this is associated with I haven't mentioned this so far but one of the reasons of being wary of this kind of approach is that it seems to be a 'workaround' to a possible issue in the OS - something they may at a later point fix to work in a different way. As such you can end up 'fixing' the current situation on your PC, but that fix can also break the situation on other configurations. You have used checks for DWM being used, which is good, but it doesn't deal with the situation that if Microsoft later decide they made a mistake and change the implementation, your fix may now make the situation worse rather than better. This sounds contrived but it unfortunately happens all the time. Try getting any old windows game working on later versions of the same OS and you often end up having workaround hacks that the game put it for the behaviour of the current OS (at the time of development). It also happens all the time in GPU commands, some hardware may have bugs etc. There is no ideal solution to this, however it can be a good idea as much as possible to conform to the exact specification of the commands, rather than the behaviour on any particular configuration. If you do add in methods that could be considered 'hacks' for the current behaviour of a particular configuration (which I'm not totally against myself, although some may be), I personally believe it can be a good idea to make them switchable somehow by the user, to make them more future proof. |
TLDR - For my post above:
Also I think in practice, this is something reduz will need to look over and decide as he is most familiar with this area. During the PR meeting when we discussed it, he wasn't sure about it:
|
If, as you say, vsync is still enabled when using the As reduz suspects, there is, in fact, a cost to using When OpenGL's vsync implemenation works properly--with or without the compositor--the call to If the thread does non-OpenGL work right after the call to
At any rate, since there doesn't seem to be much of an appetite for this PR, I will likely close it here shortly. I'll attach a patch file to this thread so that the changes aren't lost. |
I wouldn't say that, I think everyone is super keen to find changes that can improve jitter, and it is an area ripe for investigation, and I for one appreciate the efforts you have gone to here. It definitely looks improved on your video. Because the timing / vsync are so central to the engine, peer review will tend to be quite scrutinising, this is to be expected, so please don't get discouraged. The more people investigating, experimenting and the more ideas the better imo, I myself wasn't aware of DwmFlush as an option on windows until your posts! 😄 It would be great to have some input from either someone from the windows team or someone working on the hardware side.
If you do a find in files the command line fixed-fps is loaded in Main::setup and stored in fixed_fps, a global in main.cpp. This is only used to pass to main_timer_sync which determines the delta, if it is present it uses 1/fixed_fps as the delta. Some particles also use variable called fixed_fps but this is not related as far as I can see. It might be worth putting in a debug output (e.g. a print_line("fixed_fps is active")) to double check it is being used by the game if you aren't sure. It is real easy to mess up the testing, I make mistakes all the time with things like this. Running with a fixed delta I think will be invaluable to understanding why particular things help with jitter. This is because there can be a feedback loop between output jitter and input - if for example there's a long gap at frame 3 (for whatever reason) the current logic means this will give a greater delta for perhaps frame 5 or 6, which if you think about it makes absolutely no sense. This is really an oversight in the API, there have been some attempts to fix this in vulkan I believe. My personal preference for an approach would be for the game to make each simulated frame with a frame number, at a given frames per second, say 60... Frame 0, 1, 2, 3 etc. The GPU then receives these numbered frames and attempts to render and display the relevant frame at the relevant time in the future. If it drops a frame it should delete the frame and start working on the next one, and NOT display it on the next vsync (which is the situation now). This is what leads to a discrepancy between the position objects should be on a frame and where they are at a frame. Instead it should attempt to reduce the work done so that the frame rate can be maintained (e.g. decrease the resolution), or drop to a lower frame rate (say 30fps) until it can consistently keep up at 60fps again. But am I correct in thinking that you currently now believe the fix only works when running from the editor? |
Well, to be honest, I haven't used my code to build an export template. When I tested my changes with the I'll play around with it some more. I'm still not convinced that vsync isn't disabled when using the
It will be easy enough to add some logging or print statements to see if From what I understand about the issue, though, the problem should be independent from running under the editor. I think having the editor running just exacerbates the problem. My theory still is that the OS isn't drawing some of the frames unless I might see what it will take to make this a configuration/command-line option. That's what some of the other projects do. That way it won't interfere with the editor or any existing games. People can explicitly enable the option if and when they need to. Having the ability to easily enable this might be a good way to gauge whether or not it truly fixes this particular problem as well as gauge how many people are having the problem. I know that if I had noticed the problem when I started using Godot that I would have been very discouraged. (The project that I used to learn Godot was a full screen project so I never ran into this particular issue.) But if I had this problem and a way to resolve it then I would be happy. Right now the only real resolution for the problem (if your hardware has this problem) is to only run in full screen mode. |
I am making changes that will make this a project setting as well as a command line option (which, if present, will override the project setting.) I am going to close this PR and resubmit it as a different PR soon. |
Some users are reporting bad jitter when running in windowed mode on the
Windows OS. This change will cause the OS's compositor to be used for
vsync when it is appropriate. This is a strategy that is used by other
projects such as Chromium and glfw.
fixes #19783
fixes #27211
Bugsquad edit: Superseded by #33414.