Dante's Inferno buffer copy performance problem #17030

Closed
hrydgard opened this issue Mar 1, 2023 · 1 comment · Fixed by #17032
Labels
GE emulation Backend-independent GPU issues

Comments

@hrydgard
Owner

hrydgard commented Mar 1, 2023

Broken out from #16638.

Dante is doing one of those blooms where it textures from one part of a framebuffer while drawing to another part of the same one, leading to copies. However, a lot of these copies are unnecessary, since the game often executes multiple draws in a row where the source region does not overlap the destination. Still, we end up doing a separate framebuffer copy per draw, which is very expensive since there are so many.

I believe there are a few other games similarly affected to various degrees.

The framebuffer where this processing happens is located at 110000. It starts with a single draw into it, downscaling the main framebuffer, which is located at 44000 or 00000, alternately. Then a sequence of draws starts where both texaddr and the framebuffer address are 110000, which is what we're talking about here, around draw 250-ish in the first scene.

Conveniently, these draws are all done in through mode, making it easy to keep track of texture and vertex coordinates without having to bother with transforms. So conceivably, we could do dirty-rectangle tracking here.
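
To make the dirty-rectangle idea concrete, here is a minimal sketch. It is not PPSSPP code: `Rect`, `Overlaps` and `DirtyTracker` are hypothetical names, and it assumes the through-mode texture and vertex coordinates have already been turned into pixel rectangles within the framebuffer. A new copy is only requested when the region being textured from overlaps something drawn since the last copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration only. Coordinates are framebuffer
// pixels, with x2/y2 exclusive.
struct Rect {
	int x1, y1, x2, y2;
};

// True if the two rectangles share at least one pixel.
inline bool Overlaps(const Rect &a, const Rect &b) {
	return a.x1 < b.x2 && b.x1 < a.x2 && a.y1 < b.y2 && b.y1 < a.y2;
}

// Remembers which regions have been drawn to since the last framebuffer copy.
class DirtyTracker {
public:
	// Call once per draw. 'src' is the region textured from, 'dst' the region
	// drawn to. Returns true if a fresh copy is needed before this draw.
	bool NeedsCopy(const Rect &src, const Rect &dst) {
		bool needs = !hasCopy_;
		for (const Rect &d : dirty_) {
			if (Overlaps(src, d)) {
				needs = true;
				break;
			}
		}
		if (needs) {
			// A fresh copy captures everything drawn so far.
			dirty_.clear();
			hasCopy_ = true;
		}
		dirty_.push_back(dst);
		return needs;
	}

	// Call when the copy can no longer be trusted at all.
	void Reset() {
		dirty_.clear();
		hasCopy_ = false;
	}

private:
	std::vector<Rect> dirty_;
	bool hasCopy_ = false;
};

// Small usage example: only the first and third draws need a copy.
inline void DirtyTrackerExample() {
	DirtyTracker t;
	assert(t.NeedsCopy({0, 0, 64, 64}, {64, 0, 128, 64}));      // no copy yet
	assert(!t.NeedsCopy({0, 64, 64, 128}, {64, 64, 128, 128})); // source untouched
	assert(t.NeedsCopy({64, 0, 128, 64}, {0, 0, 64, 64}));      // reads a drawn region
}
```

A linear list of rectangles is probably fine for a handful of draws per pass; whether it holds up for the number of draws seen here is untested.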

Now, it does seem like some of these flushes could be skipped, which would reduce the need for said tracking. I haven't quite tracked down exactly why all of them happen yet.

This is one that gets triggered, for example:

	if (prim == GE_PRIM_RECTANGLES && (gstate.getTextureAddress(0) & 0x3FFFFFFF) == (gstate.getFrameBufAddress() & 0x3FFFFFFF)) {
		// Rendertarget == texture? Shouldn't happen. Still, try some mitigations.
		gstate_c.Dirty(DIRTY_TEXTURE_PARAMS);
		DispatchFlush();
	}

But the game helpfully does a CB000000 (TexFlush) instruction when this flush actually needs to happen (texturing from a part of the framebuffer that's just been drawn to), so maybe we could rely on that instead, causing a flush directly from that in this case.

@hrydgard hrydgard added the GE emulation Backend-independent GPU issues label Mar 1, 2023
@hrydgard
Owner Author

hrydgard commented Mar 1, 2023

Actually, during this process, the game does shuffle around the texAddr to different locations within the framebuffer (110400, 110600, 130400 etc), while keeping the framebuffer address the same. Well, we already handle that by treating those addresses as texturing from an offset within the framebuffer, but it defeats simple checks like the one above (which indeed didn't seem to trigger enough times per frame). Still, something else is causing the "unnecessary" flushes.
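
As an illustration of what "texturing from an offset within the framebuffer" means, the sketch below turns a texture address into an (x, y) pixel offset. It is not the PPSSPP implementation: `TexOffset` and `OffsetInFramebuffer` are hypothetical names, and the stride of 512 pixels and 4 bytes per pixel used in the example are assumed values, not taken from the game. The addresses are the ones quoted above, read as hex, and the `0x3FFFFFFF` mask is the same one used in the check shown earlier.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: pixel offset of the texture inside the framebuffer.
struct TexOffset {
	bool valid;
	int x;
	int y;
};

// Converts a texture address into a pixel offset from the framebuffer base.
// strideInPixels and bytesPerPixel describe the framebuffer layout.
inline TexOffset OffsetInFramebuffer(uint32_t texAddr, uint32_t fbAddr,
                                     int strideInPixels, int bytesPerPixel) {
	texAddr &= 0x3FFFFFFF;
	fbAddr &= 0x3FFFFFFF;
	if (texAddr < fbAddr || strideInPixels <= 0 || bytesPerPixel <= 0)
		return {false, 0, 0};

	const uint32_t bpp = static_cast<uint32_t>(bytesPerPixel);
	const uint32_t rowBytes = static_cast<uint32_t>(strideInPixels) * bpp;
	const uint32_t byteOffset = texAddr - fbAddr;
	const uint32_t inRow = byteOffset % rowBytes;
	if (inRow % bpp != 0)
		return {false, 0, 0};  // not aligned to a whole pixel

	return {true, static_cast<int>(inRow / bpp),
	        static_cast<int>(byteOffset / rowBytes)};
}

// Usage with the ASSUMED layout (stride 512, 32-bit pixels, so 0x800 bytes per row).
inline void OffsetExample() {
	TexOffset a = OffsetInFramebuffer(0x110400, 0x110000, 512, 4);
	assert(a.valid && a.x == 256 && a.y == 0);
	TexOffset b = OffsetInFramebuffer(0x130400, 0x110000, 512, 4);
	assert(b.valid && b.x == 256 && b.y == 64);
}
```

With an offset like this in hand, an exact address comparison could in principle be replaced by a range check, though a real version would also need the framebuffer height to reject addresses past its end.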

Hm, turns out the "unnecessary" flushes happen due to FinishDeferred. We don't flush there in the other backends, only in Vulkan...

Though it might be better not to focus on minimizing the flushes. Instead, the dirty-rectangle tracking mentioned above to avoid copies, or just reusing copies until a TexFlush or framebuffer change, is seeming more and more promising...
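
The "reuse copies until a TexFlush or framebuffer change" option could look roughly like the sketch below. This is a simplified model, not the actual change: `FramebufferCopyCache` and its methods are hypothetical names, and it assumes the game reliably issues TexFlush before texturing from freshly drawn pixels.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cache that keeps one framebuffer copy alive across draws.
class FramebufferCopyCache {
public:
	// Returns true if a new copy has to be made before texturing from fbAddr,
	// false if the existing copy can be reused.
	bool AcquireCopy(uint32_t fbAddr) {
		fbAddr &= 0x3FFFFFFF;
		if (valid_ && fbAddr == copiedFrom_)
			return false;
		copiedFrom_ = fbAddr;
		valid_ = true;
		++copiesMade_;
		return true;
	}

	// The game signalled that texture data may have changed.
	void OnTexFlush() { valid_ = false; }

	// The render target was switched, so the old copy no longer applies.
	void OnFramebufferChange() { valid_ = false; }

	int CopiesMade() const { return copiesMade_; }

private:
	uint32_t copiedFrom_ = 0;
	bool valid_ = false;
	int copiesMade_ = 0;
};

// Usage: three draws between flushes share one copy.
inline void CopyCacheExample() {
	FramebufferCopyCache cache;
	assert(cache.AcquireCopy(0x110000));   // first draw makes the copy
	assert(!cache.AcquireCopy(0x110000));  // reused
	assert(!cache.AcquireCopy(0x110000));  // reused
	cache.OnTexFlush();
	assert(cache.AcquireCopy(0x110000));   // stale after TexFlush
	assert(cache.CopiesMade() == 2);
}
```

The trade-off compared with dirty-rectangle tracking is that correctness depends entirely on the game's own TexFlush placement, in exchange for almost no bookkeeping per draw.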

hrydgard added a commit that referenced this issue Mar 1, 2023
… instruction.

Fixes #17030, or at least improves on it - for optimal performance that
big framebuffer used for bloom should be split like in Killzone, but it's not trivial.

The regression in 1.14 is fixed with this, at least.

I tried it with a few other games with no issues - it seems games are
using TexFlush when needed. But let's see if it really is safe to rely
on that...

There might also be other places we should call DiscardFramebufferCopy
in.