Vsync via the Windows compositor when appropriate. #33145

Closed
wants to merge 3 commits into from

TerminalJack
Contributor

@TerminalJack TerminalJack commented Oct 28, 2019

Some users are reporting bad jitter when running in windowed mode on the
Windows OS. This change will cause the OS's compositor to be used for
vsync when it is appropriate. This is a strategy that is used by other
projects such as Chromium and glfw.

fixes #19783
fixes #27211

Bugsquad edit: Superseded by #33414.

@TerminalJack
Contributor Author

This is a resubmitted PR. The original is here.

Comment on lines +78 to +80
if (SUCCEEDED(DwmIsCompositionEnabled(&dwm_enabled))) {
return dwm_enabled;
}
Member

Do we need to call this function in every frame?

On Windows 8+ it's always true, and for Windows 7 it might be better to call it once at the start and then add WM_DWMCOMPOSITIONCHANGED message handler. See DwmIsCompositionEnabled docs for more info.
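
A minimal sketch of the caching idea suggested in this comment, written without Windows headers so it stays portable. `CompositionState`, its `query` callback, and `on_composition_changed()` are hypothetical names and not part of the PR; in a real build the callback would presumably wrap `DwmIsCompositionEnabled()` and the refresh would be triggered from the `WM_DWMCOMPOSITIONCHANGED` handler.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: caches the compositor state instead of querying it
// every frame. The query callback stands in for DwmIsCompositionEnabled().
class CompositionState {
public:
	explicit CompositionState(std::function<bool()> query) :
			query_(std::move(query)) {
		assert(query_); // A query callback is required.
		enabled_ = query_(); // Query once at startup.
	}

	// Cheap per-frame check: the callback is not invoked here.
	bool is_enabled() const { return enabled_; }

	// Intended to be called when the window receives
	// WM_DWMCOMPOSITIONCHANGED (assumption: only needed on Windows 7).
	void on_composition_changed() { enabled_ = query_(); }

private:
	std::function<bool()> query_;
	bool enabled_ = false;
};
```

Whether this is worth the extra message handler is exactly the trade-off discussed in the reply that follows.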

Contributor Author

This is what glfw and Chromium do. DwmIsCompositionEnabled() is being called only when vsync is enabled so the thread will block when it calls either DwmFlush() or SwapBuffers(). Because of this, I don't know if it is worth the trouble to listen for WM_DWMCOMPOSITIONCHANGED.

Comment on lines 74 to 75
// Note: All Windows versions supported by Godot (Vista and later)
// have a compositor. It can be disabled on earlier Windows versions.
Member

AFAIK Vista has not been supported for a long time (due to some interlocking functions missing); the minimum supported Windows version is 7.

Contributor Author

Whoops. I tried to figure out which versions of Windows were supported and came across a Stack Overflow thread that said Vista so that's what I went with.

@lawnjelly
Member

Is it possible to provide more information on why this should help with jitter, and by what mechanisms?

@akien-mga changed the title from "Vsync via the Windows compositor when appropriate. (Resubmitted.)" to "Vsync via the Windows compositor when appropriate." on Oct 29, 2019
@akien-mga added this to the 3.2 milestone on Oct 29, 2019
@TerminalJack
Contributor Author

Is it possible to provide more information on why this should help with jitter, and by what mechanisms?

I wish I understood the problem better. As it is, this is a case of monkey see, monkey do. Some of the posters in the issues threads suggested that someone look at other projects to see what they do about the issue and this is what glfw and Chromium do.

Those links will take you to the source files in question for those projects. Notice that they are similar enough that one of the projects likely copied from the other. You'll also notice that they aren't happy about having to do this. Both of the projects label this a 'HACK'.

So it seems that there's a long-standing (> 5 years) bug either in OpenGL or (nVidia?) video drivers and this is the workaround.

I'm one of the people affected by the problem and, from what I can tell just from casual observation, SwapBuffers() either a) sometimes doesn't block and wait for the vertical blank period, or b) sometimes blocks for too long. This is just a guess. I can write some code to see what the timings look like and post the results.

I personally don't know if I would consider this a hack or not. If you think about it, the compositor provides an additional buffer and if you are double buffering via OpenGL then you will wind up with three buffers when you really only want two. This will introduce unnecessary latency. Ideally you would want OpenGL to coordinate with the compositor but, if using a single OpenGL buffer and calling DwmFlush() has the same effect then that's the next best thing.

@starry-abyss
Contributor

My speculative and simplified vision of the compositing is as follows:

ideal:

  1. Godot renders in the window
  2. Firefox renders in the window
  3. Compositor waits for v-blank and renders the mix of all windows
  4. We have current frame of Godot shown

practical:

  1. Godot waits for v-blank
  2. Firefox renders in the window
  3. Compositor waits for v-blank and renders the mix of all windows
  4. Depending on whether compositor or Godot was faster, we have current or previous frame of Godot shown

DwmFlush is one of the ways to make sure Godot always renders before the compositor. There are sources claiming that using it is wrong (e.g. https://www.vsynctester.com/firefoxisbroken.html#asolution), but IMHO it's way better than just relying on luck.

@lawnjelly
Member

lawnjelly commented Oct 29, 2019

I wish I understood the problem better. As it is, this is a case of monkey see, monkey do. Some of the posters in the issues threads suggested that someone look at other projects to see what they do about the issue and this is what glfw and Chromium do.

I was kind of getting that impression. Don't get me wrong: I agree that if there is a bug in the compositor on Windows in some configurations and we need to work around it, we should, and this may well be a good solution. But rather than merge this straight away, I think it would be good to do a little more research to confirm whether this is the best fix (and try to work out why it works, if it does work!) 😄

Having the frame present at an unpredictable time isn't the direct problem (although it's not ideal). The problems really come because the game simulation needs some idea of the presentation time in order to simulate objects in the correct positions. This is a major flaw in the graphics APIs currently, and even in the best case it is a guessing game on desktop (although there is a Vulkan extension in the works to help with this, I believe). If the estimation of the frame time is off by varying amounts, you get jitter. If the estimation is off, but by a fairly regular gap, you get smoother playback.
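
To make the difference between a varying and a constant timing error concrete, here is a small sketch. It assumes an object moving at constant speed whose position is computed from an estimated presentation time; `per_frame_steps` and its parameters are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the on-screen movement between consecutive frames for an object
// moving at constant speed, when each frame is simulated at its true
// presentation time plus an estimation error (one error per frame).
std::vector<double> per_frame_steps(double speed_px_per_s, double frame_s,
		const std::vector<double> &timing_error_s) {
	assert(frame_s > 0.0);
	std::vector<double> steps;
	for (std::size_t i = 1; i < timing_error_s.size(); ++i) {
		// Simulated time = ideal presentation time + estimation error.
		const double prev_t = (i - 1) * frame_s + timing_error_s[i - 1];
		const double curr_t = i * frame_s + timing_error_s[i];
		steps.push_back(speed_px_per_s * (curr_t - prev_t));
	}
	return steps;
}
```

With the same error on every frame the steps come out equal, so motion looks smooth even though every position is slightly off. With errors that change from frame to frame the steps alternate between too long and too short, which is what shows up as jitter.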

See #30791 and also the links within, particularly:
https://medium.com/@alen.ladavac/the-elusive-frame-timing-168f899aec92

The vsynctester article is useful. He mentions some other fields of the DWM_TIMING_INFO structure which may be an alternative, qpcVBlank for instance. He also mentions that Chrome uses a timer.

I already have a fork of Godot that adds delta smoothing; this may be equivalent to the effect Chrome is using. The reason I suggested in the issue to try using --fixed-fps 60 on the command line is that we can test the effect of fixing the delta input on such a system. This will help us understand whether the jitter is caused by the input timings or by random delays in the output frame (much as starry-abyss suggests above).

If the command line option gives less jitter, the problem is at least partly due to input timings and would be helped by delta smoothing (which is something all platforms can benefit from). If it gives the same amount or more jitter, then that gives further weight to either using the DwmFlush fix or something similar. It would be a good idea to find this out, because from the article it appears as though DwmFlush will introduce timing discrepancies itself.

Edit:
Indeed, DWM_TIMING_INFO does seem to contain a rich amount of information that could be useful for driving the Godot delta on Windows (where available), in preference to the current method of sampling QueryPerformanceCounter at the start of Main::iteration.

https://docs.microsoft.com/en-us/windows/win32/api/dwmapi/ns-dwmapi-dwm_timing_info

Unfortunately, although I know the exact bits to try, my main machine is Linux and I don't have a Windows machine readily available (and I am currently working in another area). However, if you want to test this yourself, you could try using the QueryPerformanceCounter values from this structure and pass them into main_timer_sync in Main::iteration. I can help with this if you are not sure where to apply it. This may provide far better results than DwmFlush.

@TerminalJack
Contributor Author

@lawnjelly Yes, I agree that we want to test and understand this as well as we can before merging it since the change is in a fairly critical path for both the (Windows) editor and the end-user's game.

Assuming that those other projects really do use this same strategy then I'd say we are in pretty good company.

From my testing, the call to DwmFlush() blocks the thread until the vertical blank period and returns once the compositor is done with its work. For the game I was testing it will return about 1ms to 2ms after the vertical blank.

This is what SwapBuffers() does as well (when it works correctly) when using a swap interval of 1 (double buffering, syncing to vertical blank.) The only difference between the two that I can see is that SwapBuffers() takes longer to run. It doesn't return until 3ms to 4ms after the vertical blank.

So, in theory, you get a little extra processing time and less latency by using the compositor for vsync.

Later today I will post some results of the timing regarding DwmFlush() and SwapBuffers() in relation to the vertical blank period. I'll do this for when SwapBuffers() is behaving as it should (when in full screen) and when it is not (old code, windowed mode.)

@starry-abyss
Contributor

This is a major flaw in the graphics APIs currently

What I'm experiencing is the sprite goes back and forth, by ~1/3 of its size. Either from time to time (stutter?), or over and over again like mad - jitter. This only ever happens with Godot.
It would be cool if microscopic imprecisions are solved as well, but currently the engine produces severely broken visuals.

@lawnjelly
Member

From my testing, the call to DwmFlush() blocks the thread until the vertical blank period and returns once the compositor is done with its work. For the game I was testing it will return about 1ms to 2ms after the vertical blank.

How are you measuring the time of the vertical blank?

@TerminalJack
Contributor Author

I was using DwmGetCompositionTimingInfo(). It will tell you when the next vertical blank period is.

I was basically using some version of the following code:

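#include <windows.h>
#include <dwmapi.h> // Link against dwmapi.lib.

#include <cstdint>
#include <iomanip>
#include <sstream>

// Logs the difference (in ms) between the vblank time reported by DWM and now.
static void log_vblank_info() {
	LARGE_INTEGER now;
	QueryPerformanceCounter(&now);

	// cbSize must be set before the call; the remaining fields are zeroed.
	DWM_TIMING_INFO timing_info = { sizeof(timing_info) };

	// Skip logging if timing info is unavailable (e.g. composition is off).
	if (FAILED(DwmGetCompositionTimingInfo(NULL, &timing_info))) {
		return;
	}

	LARGE_INTEGER frequency;
	QueryPerformanceFrequency(&frequency);

	// Convert QPC ticks to microseconds.
	int64_t now_us     = now.QuadPart * 1000000 / frequency.QuadPart;
	int64_t vblank_us  = timing_info.qpcVBlank * 1000000 / frequency.QuadPart;
	// int64_t compose_us = timing_info.qpcCompose * 1000000 / frequency.QuadPart;

	std::stringstream msg;
	//msg << "now_us = " << now_us << ", vblank_us = " << vblank_us;
	//msg << "now_us = " << now_us << ", compose_us = " << compose_us;
	msg << std::setprecision(5) << double(vblank_us - now_us) / 1000.0;

	log_msg(msg.str());
}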

log_msg() uses a background thread to write the message to file.

log_vblank_info() was being called in ContextGL_Windows::swap_buffers()...

void ContextGL_Windows::swap_buffers() {

	...

	SwapBuffers(hDC);

	if (use_vsync) {
		log_vblank_info();
	}
}

After the file was created I had to sort the data out but the gist of it was as I stated earlier.

@lawnjelly
Member

lawnjelly commented Oct 29, 2019

I think you should try using the qpcVBlank value or the qpcCompose value to pass to main_timer_sync in main::iteration.

If you look in main/main.cpp line 1906 you will see this:

	uint64_t ticks = OS::get_singleton()->get_ticks_usec();
	Engine::get_singleton()->_frame_ticks = ticks;
	main_timer_sync.set_cpu_ticks_usec(ticks);
	main_timer_sync.set_fixed_fps(fixed_fps);

This is the main frame time that gives the delta which drives godot. (Yes you can smooth it close to here, I do it in main_timer_sync!).

Now instead of randomly grabbing whatever the current time is when this is reached, try substituting in the value (converted to usec) from qpcVBlank etc from DWM_TIMING_INFO.

For a test you can just create a global, store this in your log_vblank_info() cpp, then extern this in main.cpp and load it directly instead of the call to OS::get_singleton()->get_ticks_usec().

This could well give far better deltas.
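
The "converted to usec" step can be sketched as below. This is a hedged, portable illustration rather than code from the PR: `qpc_ticks_to_usec` is a hypothetical helper, and it assumes the tick count and frequency come from the QueryPerformanceCounter family (as qpcVBlank does). Splitting the value into whole seconds and a remainder avoids the overflow that a plain `ticks * 1000000` can hit once the counter grows large.

```cpp
#include <cassert>
#include <cstdint>

// Converts a QueryPerformanceCounter-style tick count to microseconds.
// Whole seconds and the sub-second remainder are scaled separately so the
// intermediate products stay small.
std::uint64_t qpc_ticks_to_usec(std::uint64_t ticks, std::uint64_t ticks_per_second) {
	assert(ticks_per_second > 0);
	const std::uint64_t seconds = ticks / ticks_per_second;
	const std::uint64_t remainder = ticks % ticks_per_second;
	return seconds * 1000000 + remainder * 1000000 / ticks_per_second;
}
```

The result has the same unit as the `ticks` value passed to `main_timer_sync.set_cpu_ticks_usec()` in the snippet above, so for a quick experiment it could be substituted there.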

EDIT .. incidentally, reminder to try the --fixed-fps 60 test in the command line argument to godot.exe

@TerminalJack
Contributor Author

@lawnjelly That sounds like an interesting strategy to get more consistent delta times but I don't think it is going to help for the situation that this particular PR is trying to fix. (Although, it might help mask it.)

When I first encountered the problem that this PR tries to address I thought that it was caused by noisy delta times but they were actually very stable.

This fix is a lot more specific than what you are proposing. This fix applies only to the Windows OS and only when the game is in windowed mode and the compositor is enabled.
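
The conditions listed above can be summarised in a small decision function. This is only a sketch of the logic as described in this comment; `VsyncContext`, `VsyncPath`, and `choose_vsync_path` are hypothetical names and do not appear in the PR's diff.

```cpp
// Hypothetical inputs for the decision; on Windows they would come from the
// project's vsync setting, the window mode, and the DWM composition state.
struct VsyncContext {
	bool vsync_enabled;
	bool fullscreen;
	bool compositor_enabled;
};

enum class VsyncPath {
	None, // vsync is off: swap without waiting
	SwapInterval, // let the OpenGL driver wait for the vertical blank
	Compositor // wait on the compositor instead (e.g. via DwmFlush())
};

// Uses the compositor only for windowed mode with composition enabled.
VsyncPath choose_vsync_path(const VsyncContext &ctx) {
	if (!ctx.vsync_enabled) {
		return VsyncPath::None;
	}
	if (!ctx.fullscreen && ctx.compositor_enabled) {
		return VsyncPath::Compositor;
	}
	return VsyncPath::SwapInterval;
}
```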

@lawnjelly
Member

There's a good discussion here:
https://bugs.chromium.org/p/chromium/issues/detail?id=467617

You may well have read this already, as you were talking about the Chrome method.

When I first encountered the problem that this PR tries to address I thought that it was caused by noisy delta times but they were actually very stable.

This is what the command line argument can test, once again, I would suggest trying this, to eliminate input delta as a source of the jitter. 👍

The idea of reading the frame timings from the DWM info is that getting a godot update at a close relationship to the vsync is desirable, but only indirectly, as a source of:

  1. Getting a good input delta (as currently this is done by measuring time at the start of the godot update)
  2. Getting started on the computations to make the next frame as fast as possible (keep the pipeline fed).

If we can read (1) directly it alleviates the need to have a constant relationship with vsync.

The chrome guys seem to be barking up this tree too (of close relationship to vsync), I'm not sure if it is merited, if it is possible to read the frame timings by another method.

Instead of thinking of the frame renders as realtime, think of it as pre-rendered. It isn't super important that we start doing calculations near vsync (other than to keep the pipeline fed), but what IS important are the deltas so we render the objects in the right place when that frame is displayed.

On the other hand whether this DWM_TIMING_INFO provides any kind of reliable results remains to be seen, the chrome thread seems to suggest they might not be that useful.

It actually only needs to be approximately correct anyway, as we know deltas can only really be multiples of the refresh interval. I've not even started with the problem of the read delta being the delta from 3 frames ago...
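
One way to read the point about deltas is that a noisy measured delta can be snapped to the nearest whole number of refresh intervals. The sketch below illustrates that idea only; it is not the delta smoothing from the fork mentioned earlier, and `snap_delta_usec` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Rounds a measured frame delta to the nearest whole number of refresh
// intervals, with a minimum of one interval.
std::uint64_t snap_delta_usec(std::uint64_t measured_usec, std::uint64_t refresh_interval_usec) {
	assert(refresh_interval_usec > 0);
	// Adding half an interval before dividing rounds to nearest.
	std::uint64_t intervals = (measured_usec + refresh_interval_usec / 2) / refresh_interval_usec;
	if (intervals == 0) {
		intervals = 1;
	}
	return intervals * refresh_interval_usec;
}
```

On a 60 Hz display (about 16667 usec per refresh), measured deltas of 16100 and 17500 usec would both snap to one interval, while a delta of 33000 usec would snap to two.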

Off to bed now, will have a look in the morning. ☺️

@TerminalJack
Contributor Author

TerminalJack commented Oct 29, 2019

@starry-abyss

DwmFlush is one of the ways to make sure Godot always renders before the compositor. There are sources claiming using it as wrong (e.g. https://www.vsynctester.com/firefoxisbroken.html#asolution), but IMHO it's way better than just relying on luck.

Yes, in the context that the author of that article is referring to, using DwmFlush() is less than ideal. Apparently, Firefox was using it to compute when the vertical blank period starts, and the author was pointing out that the function doesn't return when the vertical blank period starts; it returns only after it has finished processing the buffer, which will be a couple of milliseconds after the vertical blank period starts.

We're not using the function for that. We are using it only to emulate the behavior that we need from an apparently broken SwapBuffers().

@TerminalJack
Contributor Author

I've got a bit more insight into the nature of this problem to report. I changed context_gl_windows.cpp (gist) and had it log a bunch of frame timing information. I then ran a game in windowed mode both with and without the call to DwmFlush(). Also, for reference, I ran a game in full screen mode, which isn't affected by this problem.

The interesting thing about all of the logs is that they indicate that there aren't any problems! Nothing to see here. Carry on.

The two logs that don't use DwmFlush() look similar. They show that the bulk of the time for each frame is spent in SwapBuffers(). The log of the game that uses DwmFlush() shows the bulk of the time for each frame being spent in DwmFlush(). This seems perfectly reasonable. Each of the frame lengths was approximately 1/60 of a second, which corresponds to the monitor's refresh rate. For the data that I inspected, the vertical blank period occurred during either SwapBuffers() or DwmFlush().

This was puzzling but I believe it has a simple explanation. My theory is that the OS isn't drawing some of the frames in the case where DwmFlush() isn't being used. The game engine is drawing all of them just fine but some of them are never making it to the screen. Something that supports this theory is that if you click on another window (a command prompt in my case) and make it active then the jitter gets much worse.

@TerminalJack
Contributor Author

I did some more research on this issue. If you Google "opengl windows windowed mode dwmflush" you will find a lot of useful information regarding the issue.

Here are a couple of other projects that use DwmFlush(): mpv and snes9x. Search those projects for 'dwmflush'. Something to note is that they call DwmFlush() after swapping buffers. That seems more correct to me. I implemented it the way I did since the projects I based my code on put it before the call to SwapBuffers().

@TerminalJack
Contributor Author

I did a before and after screen recording of this problem.

First I show the 3.2 alpha3 build then a custom build with the changes I made. Note that I moved the call to DwmFlush() after SwapBuffers() but that change hasn't been checked in. It doesn't seem to make any difference but it may help get things to the screen faster and reduce latency a bit.

@lawnjelly
Member

I did a before and after screen recording of this problem.

First I show the 3.2 alpha3 build then a custom build with the changes I made. Note that I moved the call to DwmFlush() after SwapBuffers() but that change hasn't been checked in. It doesn't seem to make any difference but it may help get things to the screen faster and reduce latency a bit.

It does seem to be improved. Have you tried the test with --fixed-fps 60 yet? (or whatever your monitor's refresh rate is)

@TerminalJack
Contributor Author

It does seem to be improved. Have you tried the test with --fixed-fps 60 yet? (or whatever your monitor's refresh rate is)

If I create a Win64 export target with the 3.2 alpha3 build and run it, it will behave fairly well whether I use the --fixed-fps 60 option or not. This is using the same conditions as I was testing in the video. The only difference is that I didn't launch the game from the editor. (It is running in the background, however.)

If I run the OBS screen recorder then it will start jittering/stuttering regardless of the --fixed-fps 60 option.

I had the same result when using my test build with the --fixed-fps 60 option.

I'm guessing the vsync feature is turned off when this option is used. If that's the case then my changes wouldn't be in play. If so then this lends credence to my theory that the frames never get painted to the screen.

As an aside, at one point I was looking at the Windows implementation of this feature and I noticed that it is using the Windows Sleep() function for the delays. Microsoft says you shouldn't use this function for any type of precision timing so I'm curious as to how well it would work when the system is under a light load. (I doubt this is the problem in this particular case.)

@lawnjelly
Member

If I create a Win64 export target with the 3.2 alpha3 build and run it, it will behave fairly well whether I use the --fixed-fps 60 option or not. This is using the same conditions as I was testing in the video. The only difference is that I didn't launch the game from the editor. (It is running in the background, however.)

This is interesting. There's a lot of stuff the game does in the editor which it doesn't do when exported. If you search through the source you'll find loads of areas that use Engine::get_singleton()->is_editor_hint() and #ifdef TOOLS_ENABLED. One example is the debugging communication between the game and the IDE, which I vaguely remember can cause stalls etc.

As such the gold standard for checking anything to do with jitter / performance should be a release export rather than in the editor. In the editor jitter free would be 'nice to have', but is not necessary for most evaluation.

If you don't get the same jitter (or differences in jitter from the PR) on an export (i.e. there is only evidence of it helping in the IDE) then that is a possible argument for wrapping the PR in TOOLS_ENABLED or is_editor_hint.

If I run the OBS screen recorder then it will start jittering/stuttering regardless of the --fixed-fps 60 option.

I have a vague recollection of reading that OBS does some funky stuff with timers to help the screen recording, that could be having effects. I'll try and dig up the article.

I had the same result when using my test build with the --fixed-fps 60 option.

Now we've identified it as being worse in the editor, I'm not quite sure off hand where you put the command line to get fed to the spawned game from the editor. You might have to have a look through the source. If you use the command line to the editor, it may mean the editor will use it, but not the spawned game (I'm not sure on this, needs checking). If you use it directly to run an exported game, it will work, but then you say you can't notice a difference from the PR in exports...

I'm guessing the vsync feature is turned off when this option is used. If that's the case then my changes wouldn't be in play.

No, the --fixed-fps is confusingly named, it should really be called fixed-fps-delta or something like this. I've mentioned the naming as a problem before, we might see if it can be changed as it causes confusion. The only thing it changes is it fixes the frame delta used for timing in the main::iteration. There is another setting in project-settings which you may be confusing with this.

As an aside, at one point I was looking at the Windows implementation of this feature and I noticed that it is using the Windows Sleep() function for the delays. Microsoft says you shouldn't use this function for any type of precision timing so I'm curious as to how well it would work when the system is under a light load.

This is absolutely right in that Sleep() is useless for precision on Windows, and under normal running circumstances it shouldn't be called in a main game thread. However, I think this is associated with is_in_low_processor_usage_mode and get_target_fps rather than fixed-fps.

I haven't mentioned this so far, but one of the reasons for being wary of this kind of approach is that it seems to be a 'workaround' for a possible issue in the OS, something they may at a later point fix to work in a different way. As such you can end up 'fixing' the current situation on your PC, but that fix can also break the situation on other configurations. You have used checks for DWM being used, which is good, but that doesn't deal with the situation where Microsoft later decides they made a mistake and changes the implementation, so that your fix now makes the situation worse rather than better.

This sounds contrived but it unfortunately happens all the time. Try getting any old Windows game working on later versions of the same OS and you often end up having workaround hacks that the game put in for the behaviour of the current OS (at the time of development). It also happens all the time in GPU commands, some hardware may have bugs etc.

There is no ideal solution to this, however it can be a good idea as much as possible to conform to the exact specification of the commands, rather than the behaviour on any particular configuration. If you do add in methods that could be considered 'hacks' for the current behaviour of a particular configuration (which I'm not totally against myself, although some may be), I personally believe it can be a good idea to make them switchable somehow by the user, to make them more future proof.

@lawnjelly
Member

lawnjelly commented Nov 3, 2019

TLDR - For my post above:

  1. If the PR is only useful for running within the editor, and it runs as editor only, there are far fewer dangers from merging it
  2. On the other hand there is less pressing need to merge something that only improves the editor experience
  3. If it does improve situation in editor only, it may not be via the mechanism suggested (it may be interacting with Sleep for example)
  4. If we were going to merge for non-editor (if it were useful for game releases, which now seems less clear based on TerminalJack's post) I would suggest making it optional at least at first, so it can be more widely tested (either a command line argument, or something in project settings)

Also I think in practice, this is something reduz will need to look over and decide as he is most familiar with this area. During the PR meeting when we discussed it, he wasn't sure about it:

[14:03:21] <Akien> reduz: Needs further testing reports, but on the principle WDYT about letting the compositor handle vsync as suggested?
[14:05:54] <reduz> Akien: I think its an enhancement and it should not go to 3.2
[14:06:20] <Akien> You must be kidding.
[14:06:42] <Akien> How is fixing a stutter cause an enhancement?
[14:08:04] <reduz> It should not really have to do with the compositor tbh, and you also risk wasting more CPU time, I think it should be an option and probably better tested
[14:08:17] <reduz> on Vulkan you cant even do that
[14:08:29] <reduz> so to me its probably not the way to go
[14:08:54] <reduz> as in, to me it sounds like it has a cost and it needs to be understood better
[14:09:23] <Akien> Ok, let's wait for more data from users who actually see a change with this PR.

@TerminalJack
Contributor Author

If, as you say, vsync is still enabled when using the --fixed-fps 60 option then I find it very perplexing that I still had stuttering/jitters with my changes. Either I messed something up (ran the wrong EXE) or that isn't actually true because it behaved exactly like it does without the call to DwmFlush().

As reduz suspects, there is, in fact, a cost to using DwmFlush(). Unfortunately, it isn't functionally equivalent to SwapBuffers() with a swap interval of 1.

When OpenGL's vsync implementation works properly--with or without the compositor--the call to SwapBuffers() doesn't have to block the calling thread while it waits for the vertical blank period. It can return immediately and let the calling thread continue running and block at some later point while drawing (maybe even the next call to SwapBuffers().)

If the thread does non-OpenGL work right after the call to SwapBuffers() (physics processing, for example, as Godot does) then this is basically extra processing time that you wouldn't otherwise get if SwapBuffers() blocked waiting for the vertical blank.

DwmFlush() doesn't seem to operate this way. From what I can tell it will always block and wait for the vertical blank. Because of this you lose this extra processing time and would want to use DwmFlush() only as needed. Unfortunately, I'm not sure how you would determine that it is necessary without some kind of user interaction. All of the other projects that I looked at basically do what I did in my code.

At any rate, since there doesn't seem to be much of an appetite for this PR, I will likely close it here shortly. I'll attach a patch file to this thread so that the changes aren't lost.

@lawnjelly
Member

At any rate, since there doesn't seem to be much of an appetite for this PR, I will likely close it here shortly. I'll attach a patch file to this thread so that the changes aren't lost.

I wouldn't say that, I think everyone is super keen to find changes that can improve jitter, and it is an area ripe for investigation, and I for one appreciate the efforts you have gone to here. It definitely looks improved on your video.

Because the timing / vsync are so central to the engine, peer review will tend to be quite scrutinising, this is to be expected, so please don't get discouraged. The more people investigating, experimenting and the more ideas the better imo, I myself wasn't aware of DwmFlush as an option on windows until your posts! 😄 It would be great to have some input from either someone from the windows team or someone working on the hardware side.

Either I messed something up (ran the wrong EXE) or that isn't actually true because it behaved exactly like it does without the call to DwmFlush().

If you do a find in files the command line fixed-fps is loaded in Main::setup and stored in fixed_fps, a global in main.cpp. This is only used to pass to main_timer_sync which determines the delta, if it is present it uses 1/fixed_fps as the delta. Some particles also use variable called fixed_fps but this is not related as far as I can see.
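
As a rough illustration of the behaviour described here (not the actual main_timer_sync code), the delta selection might look like the following. `choose_frame_delta` is a hypothetical name, and treating a non-positive `fixed_fps` as "option not given" is an assumption of this sketch.

```cpp
// Picks the frame delta in seconds. A positive fixed_fps replaces the
// measured delta with 1 / fixed_fps; otherwise the measured value is used.
// Assumption: non-positive fixed_fps means the option was not supplied.
double choose_frame_delta(int fixed_fps, double measured_delta_s) {
	if (fixed_fps > 0) {
		return 1.0 / fixed_fps;
	}
	return measured_delta_s;
}
```

Nothing in this selection touches vsync or buffer swapping, which matches the point above that the option only changes the delta used for timing.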

It might be worth putting in a debug output (e.g. a print_line("fixed_fps is active")) to double check it is being used by the game if you aren't sure. It is real easy to mess up the testing, I make mistakes all the time with things like this.

Running with a fixed delta I think will be invaluable to understanding why particular things help with jitter. This is because there can be a feedback loop between output jitter and input - if for example there's a long gap at frame 3 (for whatever reason) the current logic means this will give a greater delta for perhaps frame 5 or 6, which if you think about it makes absolutely no sense. This is really an oversight in the API; there have been some attempts to fix this in Vulkan, I believe.

My personal preference for an approach would be for the game to mark each simulated frame with a frame number, at a given frames per second, say 60... Frame 0, 1, 2, 3 etc. The GPU then receives these numbered frames and attempts to render and display the relevant frame at the relevant time in the future. If it drops a frame it should delete the frame and start working on the next one, and NOT display it on the next vsync (which is the situation now). This is what leads to a discrepancy between the position objects should be on a frame and where they are at a frame. Instead it should attempt to reduce the work done so that the frame rate can be maintained (e.g. decrease the resolution), or drop to a lower frame rate (say 30fps) until it can consistently keep up at 60fps again.

But am I correct in thinking that you currently now believe the fix only works when running from the editor?

@TerminalJack
Contributor Author

But am I correct in thinking that you currently now believe the fix only works when running from the editor?

Well, to be honest, I haven't used my code to build an export template. When I tested my changes with the --fixed-fps 60 command line option I used the editor EXE that I built and ran it with the options --fixed-fps 60 --main-pack <file> to run the package file. I don't know if that's a factor or not.

I'll play around with it some more. I'm still not convinced that vsync isn't disabled when using the --fixed-fps option. The usage (godot --help) for it says...

Force a fixed number of frames per second. This setting disables real-time synchronization.

It will be easy enough to add some logging or print statements to see if DwmFlush() is being called.

From what I understand about the issue, though, the problem should be independent from running under the editor. I think having the editor running just exacerbates the problem. My theory still is that the OS isn't drawing some of the frames unless DwmFlush() is called and that shouldn't have anything to do with whether the game is running from the editor or not.

I might see what it will take to make this a configuration/command-line option. That's what some of the other projects do. That way it won't interfere with the editor or any existing games. People can explicitly enable the option if and when they need to.

Having the ability to easily enable this might be a good way to gauge whether or not it truly fixes this particular problem as well as gauge how many people are having the problem. I know that if I had noticed the problem when I started using Godot that I would have been very discouraged. (The project that I used to learn Godot was a full screen project so I never ran into this particular issue.) But if I had this problem and a way to resolve it then I would be happy. Right now the only real resolution for the problem (if your hardware has this problem) is to only run in full screen mode.

@TerminalJack TerminalJack requested a review from a team as a code owner November 5, 2019 19:01
@TerminalJack
Contributor Author

I am making changes that will make this a project setting as well as a command line option (which, if present, will override the project setting.) I am going to close this PR and resubmit it as a different PR soon.
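
The precedence described here (command line overrides the project setting when present) can be sketched as follows. The names `CliFlag` and `resolve_compositor_vsync` are hypothetical and are not taken from the follow-up PR.

```cpp
// Tri-state for a flag that may be absent from the command line.
enum class CliFlag { Unset, Off, On };

// Resolves the effective value of the option: a value given on the command
// line wins, otherwise the project setting applies.
bool resolve_compositor_vsync(CliFlag command_line, bool project_setting) {
	switch (command_line) {
		case CliFlag::On:
			return true;
		case CliFlag::Off:
			return false;
		case CliFlag::Unset:
			break;
	}
	return project_setting;
}
```

Keeping "unset" distinct from "off" is what lets a user force the behaviour off from the command line even when the project enables it.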

Successfully merging this pull request may close these issues.

Very bad FPS stability. Heavy stuttering issue in simple 2D game [Windows 10, Nvidia]
5 participants