Handle guardband clip/cull better for hardware backends #14833

Merged
merged 17 commits into from
Oct 19, 2021

unknownbrackets
Collaborator

@unknownbrackets unknownbrackets commented Sep 12, 2021

This doesn't really fix software transform, but it does implement user clip planes for Vulkan, GL/GLES, and D3D11. There is a way to set a clip plane with a constant buffer in D3D9, but it's more work and I don't think it can do the culling anyway.

This is probably dangerous to merge before 1.12.x, but it does fix a bunch of annoying/visible bugs in certain games and things work generally correctly. I fully expect weird driver bugs.

Features still missing (for follow up):

  • Clipping is applied to rectangles, and shouldn't be (debating sending a uniform or VSID bit...)
  • Only works on devices that support clip and cull distance, which notably excludes Mali.
  • Sengoku Cannon - pink graphics problem on stage 2 #10914 still doesn't show as expected, and I think (?) it may be a precision issue, or else it means I implemented clipping wrong (but everything else seems right...)

For software transform specifically, I did implement a change that implements culling, at least:
unknownbrackets@da50f87

But I don't really like it for a number of reasons:

  • Of course, it'd mean Mali users need to use software transform to get proper rendering... though that's still better than today...
  • It could be more optimal: it currently redoes the same verts when they're reused between indices. I was just proving that it worked. And adding projection to the sw transform steps isn't great (what's more, the GPU does it again anyway.)
  • It doesn't do clipping, so a lot of the issues are not properly fixed anyway.
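One way the repeated per-vertex work mentioned above could be avoided is to cache a per-vertex result and consult it from the index loop. The following is only a sketch under assumptions: `Vec4`, `ProjectVertex`, `OutsideZ`, and `CullTrianglesByZ` are hypothetical names, the projection is a placeholder, and the rule used (drop a triangle if any vertex is outside -1..1 in Z) is just the non-clamping case. It does not bounds-check indices and is not the code in the linked commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; these are not PPSSPP's actual structures.
struct Vec4 { float x, y, z, w; };

// Stand-in for the projection step. Here it only counts calls and returns the
// input unchanged; a real version would multiply by the projection matrix.
inline Vec4 ProjectVertex(const Vec4 &v, int &projectCount) {
    ++projectCount;
    return v;
}

// True if the projected vertex falls outside -1..1 in pre-viewport Z.
inline bool OutsideZ(const Vec4 &p) {
    const float z = p.z / p.w;
    return z < -1.0f || z > 1.0f;
}

// Keeps only triangles whose vertices are all inside the Z range, projecting
// each referenced vertex at most once even if several indices reuse it.
inline std::vector<uint16_t> CullTrianglesByZ(const std::vector<Vec4> &verts,
        const std::vector<uint16_t> &inds, int &projectCount) {
    assert(inds.size() % 3 == 0);
    enum : uint8_t { UNKNOWN = 0, INSIDE = 1, OUTSIDE = 2 };
    std::vector<uint8_t> state(verts.size(), UNKNOWN);

    auto isOutside = [&](uint16_t i) {
        if (state[i] == UNKNOWN)
            state[i] = OutsideZ(ProjectVertex(verts[i], projectCount)) ? OUTSIDE : INSIDE;
        return state[i] == OUTSIDE;
    };

    std::vector<uint16_t> kept;
    for (size_t i = 0; i < inds.size(); i += 3) {
        // Evaluate all three so every vertex gets cached regardless of order.
        const bool a = isOutside(inds[i]);
        const bool b = isOutside(inds[i + 1]);
        const bool c = isOutside(inds[i + 2]);
        if (!(a || b || c))
            kept.insert(kept.end(), {inds[i], inds[i + 1], inds[i + 2]});
    }
    return kept;
}
```

With two triangles sharing two vertices, the counter passed as `projectCount` ends up at four rather than six, which is the saving being described.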

Anyway, outside #10914, this takes care of the issues linked to #12058 that I have dumps for, at least on my PC. I haven't tested a wide range of devices at this point. Any reports of success or failure are welcome.

-[Unknown]

@unknownbrackets unknownbrackets added this to the Future-Prio milestone Sep 12, 2021
@hrydgard
Owner

This is really super cool! Finally we have an explanation for most/all of the culling weirdness, and I think we'll be able to delete a lot of stuff from compat.ini. Even if it's not fully implemented on D3D9, at least the guardband culling should work better with this.

I'll look at this in detail later today.

But yes, will probably wait until post 1.12 - the main danger is indeed driver bugs.

As for Mali, I want to try later to implement a geometry shader path that does the clipping and culling, and see if it's too slow or not. As previously mentioned, most recent Mali devices do (reluctantly!) support geometry shaders.

@hrydgard hrydgard added the GE emulation (Backend-independent GPU issues), Guardband / Range Culling (Involves vertices outside frustum), and Depth / Z (Issue involves depth drawing parameters) labels Sep 12, 2021
@ghost

ghost commented Sep 14, 2021

Does this PR also help #1788?

@ghost ghost mentioned this pull request Sep 14, 2021
@unknownbrackets
Collaborator Author

> Does this PR also help #1788?

I didn't try it, but I already thought accurate depth avoided those issues. I assume these changes would make any remaining issues work better too. That said, this doesn't remove any of those hack settings - yet. So it doesn't change the Phantasy Star Portable behavior.

-[Unknown]

@vit9696
Contributor

vit9696 commented Sep 19, 2021

Tried experimenting with that, and apparently this causes all kinds of issues on macOS 11.5 with OpenGL backend on AMD RX 590 with Valkyria Chronicles II. Is that known? Vulkan seems to be ok.

Screenshot 2021-09-19 at 14 27 40

Screenshot 2021-09-19 at 14 28 02

@hrydgard
Owner

No, but now it is! Thanks for reporting.

@vit9696
Contributor

vit9696 commented Sep 19, 2021

Sounds like it is having a hard time with the trash GL backend on macOS:

log
VulkanMayBeAvailable: Device allowed ('SDL:macOS')
Vulkan loader: Library not available
DEBUG: Vulkan is not available, not using Vulkan.
2021-09-19 14:43:50.549532+0300 PPSSPPSDL[64947:3082283] [plugin] AddInstanceForFactory: No factory registered for id <CFUUID 0x60000390a560> F8BB1C28-BAE8-11D6-9C31-00039315CD46
Info: We compiled against SDL version 2.0.16 and we are linking against SDL version 2.0.16. :)
ThreadManager::Init(compute threads: 16, all: 20)
43:50:604 Core/Config.cpp:627 I[G3D]: Longest display side: -1 pixels. Choosing scale 1
Pixels: 960 x 544
Virtual pixels: 960 x 544
Vulkan init error 'Failed to load Vulkan driver library' - falling back to GL
OpenGL 2.0 or higher.
loading control pad mappings from gamecontrollerdb.txt: SUCCESS!
found control pad: PS3 Controller, loading mapping: SUCCESS, mapping is:
030000004c0500006802000000010000,PS3 Controller,a:b14,b:b13,back:b0,dpdown:b6,dpleft:b7,dpright:b5,dpup:b4,guide:b16,leftshoulder:b10,leftstick:b1,lefttrigger:b8,leftx:a0,lefty:a1,rightshoulder:b11,rightstick:b2,righttrigger:b9,rightx:a2,righty:a3,start:b3,x:b15,y:b12,
pad 1 has been assigned to control pad: PS3 Controller
found control pad: PS3 Controller, loading mapping: SUCCESS, mapping is:
030000004c0500006802000000010000,PS3 Controller,a:b14,b:b13,back:b0,dpdown:b6,dpleft:b7,dpright:b5,dpup:b4,guide:b16,leftshoulder:b10,leftstick:b1,lefttrigger:b8,leftx:a0,lefty:a1,rightshoulder:b11,rightstick:b2,righttrigger:b9,rightx:a2,righty:a3,start:b3,x:b15,y:b12,
43:54:624 root         N[BOOT]: UI/EmuScreen.cpp:341 Loading /Volumes/C0/Soft/Soft (PSP)/ISO/Valkyria Chronicles 2 [Undub].iso...
43:54:813 Odin_Main    E[SCEUTIL]: HLE/sceUtility.cpp:487 80111102=sceUtilityLoadModule(00000301): already loaded
43:54:813 Odin_Main    E[SCEUTIL]: HLE/sceUtility.cpp:487 80111102=sceUtilityLoadModule(00000300): already loaded
43:54:813 Odin_Main    E[SCEUTIL]: HLE/sceUtility.cpp:487 80111102=sceUtilityLoadModule(00000302): already loaded
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Error in shader compilation for: 01070000:00000938 HWX C T Tex Light: MatUp:7 Cull 
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Info log: ERROR: 0:51: Use of undeclared identifier 'gl_CullDistance'
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 ERROR: 0:52: Use of undeclared identifier 'gl_CullDistance'
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Shader source:
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    1:  #version 410
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    2:  // AMD Radeon RX 590 OpenGL Engine - GLSL 410
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    3:  #define gl_VertexIndex gl_VertexID
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    4:  #define lowp
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    5:  #define mediump
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    6:  #define highp
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    7:  #define splat3(x) vec3(x)
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    8:  #define mul(x, y) ((x) * (y))
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    9:  in vec3 position;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   10:  in vec2 texcoord;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   11:  in lowp vec4 color0;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   12:  uniform mat4 u_proj;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   13:  uniform mat4 u_world;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   14:  uniform mat4 u_view;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   15:  uniform vec4 u_uvscaleoffset;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   16:  uniform lowp vec4 u_ambient;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   17:  uniform lowp vec4 u_matspecular;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   18:  uniform lowp vec3 u_matemissive;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   19:  uniform lowp vec4 u_matambientalpha;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   20:  uniform highp vec4 u_depthRange;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   21:  uniform highp vec4 u_cullRangeMin;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   22:  uniform highp vec4 u_cullRangeMax;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   23:  out lowp vec4 v_color0;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   24:  out mediump vec3 v_texcoord;
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   25:  vec3 normalizeOr001(vec3 v) {
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   26:     return length(v) == 0.0 ? vec3(0.0, 0.0, 1.0) : normalize(v);
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   27:  }
43:55:215 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   28:  void main() {
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   29:    vec3 worldpos = mul(vec4(position, 1.0), u_world).xyz;
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   30:    mediump vec3 worldnormal = vec3(0.0, 0.0, 1.0);
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   31:    vec4 viewPos = vec4(mul(vec4(worldpos, 1.0), u_view).xyz, 1.0);
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   32:    vec4 outPos = mul(u_proj, viewPos);
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   33:    lowp vec4 lightSum0 = u_ambient * color0 + vec4(u_matemissive, 0.0);
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   34:    mediump float ldot;
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   35:    v_color0 = clamp(lightSum0, 0.0, 1.0);
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   36:    v_texcoord = vec3(texcoord.xy * u_uvscaleoffset.xy, 0.0);
43:55:219 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   37:    vec3 projPos = outPos.xyz / outPos.w;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   38:    float projZ = (projPos.z - u_depthRange.z) * u_depthRange.w;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   39:    if (u_cullRangeMin.w <= 0.0 || projZ * outPos.w > -outPos.w) {
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   40:      if (projPos.x < u_cullRangeMin.x || projPos.y < u_cullRangeMin.y || projPos.x > u_cullRangeMax.x || projPos.y > u_cullRangeMax.y) {
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   41:        outPos.xyzw = u_cullRangeMax.wwww;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   42:      }
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   43:    }
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   44:    if (u_cullRangeMin.w <= 0.0) {
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   45:      if (projPos.z < u_cullRangeMin.z || projPos.z > u_cullRangeMax.z) {
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   46:        outPos.xyzw = u_cullRangeMax.wwww;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   47:      }
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   48:    }
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   49:    gl_ClipDistance[0] = projZ * outPos.w + outPos.w;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   50:    if (u_cullRangeMin.w > 0.0) {
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   51:      gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   52:      gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   53:    }
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   54:    gl_Position = outPos;
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   55:  }
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 //END
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:233 Could not link program:
 ERROR: One or more attached shaders not successfully compiled

43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:234 VS desc:
01070000:00000938 HWX C T Tex Light: MatUp:7 Cull  (failed)
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:235 FS desc:
00000000:00000022 Tex TexAlpha TFuncMod 
43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:236 VS:
#version 410
// AMD Radeon RX 590 OpenGL Engine - GLSL 410
#define gl_VertexIndex gl_VertexID
#define lowp
#define mediump
#define highp
#define splat3(x) vec3(x)
#define mul(x, y) ((x) * (y))
in vec3 position;
in vec2 texcoord;
in lowp vec4 color0;
uniform mat4 u_proj;
uniform mat4 u_world;
uniform mat4 u_view;
uniform vec4 u_uvscaleoffset;
uniform lowp vec4 u_ambient;
uniform lowp vec4 u_matspecular;
uniform lowp vec3 u_matemissive;
uniform lowp vec4 u_matambientalpha;
uniform highp vec4 u_depthRange;
uniform highp vec4 u_cullRangeMin;
uniform highp vec4 u_cullRangeMax;
out lowp vec4 v_color0;
out mediump vec3 v_texcoord;
vec3 normalizeOr001(vec3 v) {
   return length(v) == 0.0 ? vec3(0.0, 0.0, 1.0) : normalize(v);
}
void main() {
  vec3 worldpos = mul(vec4(position, 1.0), u_world).xyz;
  mediump vec3 worldnormal = vec3(0.0, 0.0, 1.0);
  vec4 viewPos = vec4(mul(vec4(worldpos, 1.0), u_view).xyz, 1.0);
  vec4 outPos = mul(u_proj, viewPos);
  lowp vec4 lightSum0 = u_ambient * color0 + vec4(u_matemissive, 0.0);
  mediump float ldot;
  v_color0 = clamp(lightSum0, 0.0, 1.0);
  v_texcoord = vec3(texcoord.xy * u_uvscaleoffset.xy, 0.0);
  vec3 projPos = outPos.xyz / outPos.w;
  float projZ = (projPos.z - u_depthRange.z) * u_depthRange.w;
  if (u_cullRangeMin.w <= 0.0 || projZ * outPos.w > -outPos.w) {
    if (projPos.x < u_cullRangeMin.x || projPos.y < u_cullRangeMin.y || projPos.x > u_cullRangeMax.x || projPos.y > u_cullRangeMax.y) {
      outPos.xyzw = u_cullRangeMax.wwww;
    }
  }
  if (u_cullRangeMin.w <= 0.0) {
    if (projPos.z < u_cullRangeMin.z || projPos.z > u_cullRangeMax.z) {
      outPos.xyzw = u_cullRangeMax.wwww;
    }
  }
  gl_ClipDistance[0] = projZ * outPos.w + outPos.w;
  if (u_cullRangeMin.w > 0.0) {
    gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;
    gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;
  }
  gl_Position = outPos;
}


43:55:220 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:237 FS:
#version 410
// AMD Radeon RX 590 OpenGL Engine - GLSL 410
#define DISCARD discard
#define lowp
#define mediump
#define highp
#define splat3(x) vec3(x)
#define mul(x, y) ((x) * (y))
uniform sampler2D tex;
 in lowp vec4 v_color0;
in mediump vec3 v_texcoord;
out vec4 fragColor0;
void main() {
  vec4 t = texture(tex, v_texcoord.xy);
  vec4 p = v_color0;
  vec4 v = p * t;
  fragColor0 = v;
}


43:59:552 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Error in shader compilation for: 01070000:00000128 HWX C Light: MatUp:7 Cull 
43:59:552 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Info log: ERROR: 0:47: Use of undeclared identifier 'gl_CullDistance'
43:59:552 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 ERROR: 0:48: Use of undeclared identifier 'gl_CullDistance'
43:59:552 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Shader source:
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    1:  #version 410
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    2:  // AMD Radeon RX 590 OpenGL Engine - GLSL 410
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    3:  #define gl_VertexIndex gl_VertexID
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    4:  #define lowp
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    5:  #define mediump
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    6:  #define highp
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    7:  #define splat3(x) vec3(x)
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    8:  #define mul(x, y) ((x) * (y))
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    9:  in vec3 position;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   10:  in lowp vec4 color0;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   11:  uniform mat4 u_proj;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   12:  uniform mat4 u_world;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   13:  uniform mat4 u_view;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   14:  uniform lowp vec4 u_ambient;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   15:  uniform lowp vec4 u_matspecular;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   16:  uniform lowp vec3 u_matemissive;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   17:  uniform lowp vec4 u_matambientalpha;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   18:  uniform highp vec4 u_depthRange;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   19:  uniform highp vec4 u_cullRangeMin;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   20:  uniform highp vec4 u_cullRangeMax;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   21:  out lowp vec4 v_color0;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   22:  vec3 normalizeOr001(vec3 v) {
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   23:     return length(v) == 0.0 ? vec3(0.0, 0.0, 1.0) : normalize(v);
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   24:  }
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   25:  void main() {
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   26:    vec3 worldpos = mul(vec4(position, 1.0), u_world).xyz;
43:59:553 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   27:    mediump vec3 worldnormal = vec3(0.0, 0.0, 1.0);
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   28:    vec4 viewPos = vec4(mul(vec4(worldpos, 1.0), u_view).xyz, 1.0);
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   29:    vec4 outPos = mul(u_proj, viewPos);
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   30:    lowp vec4 lightSum0 = u_ambient * color0 + vec4(u_matemissive, 0.0);
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   31:    mediump float ldot;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   32:    v_color0 = clamp(lightSum0, 0.0, 1.0);
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   33:    vec3 projPos = outPos.xyz / outPos.w;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   34:    float projZ = (projPos.z - u_depthRange.z) * u_depthRange.w;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   35:    if (u_cullRangeMin.w <= 0.0 || projZ * outPos.w > -outPos.w) {
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   36:      if (projPos.x < u_cullRangeMin.x || projPos.y < u_cullRangeMin.y || projPos.x > u_cullRangeMax.x || projPos.y > u_cullRangeMax.y) {
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   37:        outPos.xyzw = u_cullRangeMax.wwww;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   38:      }
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   39:    }
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   40:    if (u_cullRangeMin.w <= 0.0) {
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   41:      if (projPos.z < u_cullRangeMin.z || projPos.z > u_cullRangeMax.z) {
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   42:        outPos.xyzw = u_cullRangeMax.wwww;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   43:      }
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   44:    }
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   45:    gl_ClipDistance[0] = projZ * outPos.w + outPos.w;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   46:    if (u_cullRangeMin.w > 0.0) {
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   47:      gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   48:      gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   49:    }
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   50:    gl_Position = outPos;
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   51:  }
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 //END
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:233 Could not link program:
 ERROR: One or more attached shaders not successfully compiled

43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:234 VS desc:
01070000:00000128 HWX C Light: MatUp:7 Cull  (failed)
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:235 FS desc:
00000000:00000000 
43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:236 VS:
#version 410
// AMD Radeon RX 590 OpenGL Engine - GLSL 410
#define gl_VertexIndex gl_VertexID
#define lowp
#define mediump
#define highp
#define splat3(x) vec3(x)
#define mul(x, y) ((x) * (y))
in vec3 position;
in lowp vec4 color0;
uniform mat4 u_proj;
uniform mat4 u_world;
uniform mat4 u_view;
uniform lowp vec4 u_ambient;
uniform lowp vec4 u_matspecular;
uniform lowp vec3 u_matemissive;
uniform lowp vec4 u_matambientalpha;
uniform highp vec4 u_depthRange;
uniform highp vec4 u_cullRangeMin;
uniform highp vec4 u_cullRangeMax;
out lowp vec4 v_color0;
vec3 normalizeOr001(vec3 v) {
   return length(v) == 0.0 ? vec3(0.0, 0.0, 1.0) : normalize(v);
}
void main() {
  vec3 worldpos = mul(vec4(position, 1.0), u_world).xyz;
  mediump vec3 worldnormal = vec3(0.0, 0.0, 1.0);
  vec4 viewPos = vec4(mul(vec4(worldpos, 1.0), u_view).xyz, 1.0);
  vec4 outPos = mul(u_proj, viewPos);
  lowp vec4 lightSum0 = u_ambient * color0 + vec4(u_matemissive, 0.0);
  mediump float ldot;
  v_color0 = clamp(lightSum0, 0.0, 1.0);
  vec3 projPos = outPos.xyz / outPos.w;
  float projZ = (projPos.z - u_depthRange.z) * u_depthRange.w;
  if (u_cullRangeMin.w <= 0.0 || projZ * outPos.w > -outPos.w) {
    if (projPos.x < u_cullRangeMin.x || projPos.y < u_cullRangeMin.y || projPos.x > u_cullRangeMax.x || projPos.y > u_cullRangeMax.y) {
      outPos.xyzw = u_cullRangeMax.wwww;
    }
  }
  if (u_cullRangeMin.w <= 0.0) {
    if (projPos.z < u_cullRangeMin.z || projPos.z > u_cullRangeMax.z) {
      outPos.xyzw = u_cullRangeMax.wwww;
    }
  }
  gl_ClipDistance[0] = projZ * outPos.w + outPos.w;
  if (u_cullRangeMin.w > 0.0) {
    gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;
    gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;
  }
  gl_Position = outPos;
}


43:59:558 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:237 FS:
#version 410
// AMD Radeon RX 590 OpenGL Engine - GLSL 410
#define DISCARD discard
#define lowp
#define mediump
#define highp
#define splat3(x) vec3(x)
#define mul(x, y) ((x) * (y))
 in lowp vec4 v_color0;
out vec4 fragColor0;
void main() {
  vec4 v = v_color0 ;
  fragColor0 = v;
}


43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Error in shader compilation for: 00000000:00000038 C Tex Cull 
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Info log: ERROR: 0:36: Use of undeclared identifier 'gl_CullDistance'
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 ERROR: 0:37: Use of undeclared identifier 'gl_CullDistance'
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 Shader source:
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    1:  #version 410
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    2:  // AMD Radeon RX 590 OpenGL Engine - GLSL 410
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    3:  #define gl_VertexIndex gl_VertexID
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    4:  #define lowp
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    5:  #define mediump
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    6:  #define highp
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    7:  #define splat3(x) vec3(x)
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    8:  #define mul(x, y) ((x) * (y))
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296    9:  in vec4 position;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   10:  in vec2 texcoord;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   11:  in lowp vec4 color0;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   12:  uniform mat4 u_proj;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   13:  uniform highp vec4 u_depthRange;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   14:  uniform highp vec4 u_cullRangeMin;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   15:  uniform highp vec4 u_cullRangeMax;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   16:  out lowp vec4 v_color0;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   17:  out mediump vec3 v_texcoord;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   18:  void main() {
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   19:    v_texcoord = vec3(texcoord, 1.0);
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   20:    v_color0 = color0;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   21:    vec4 outPos = mul(u_proj, vec4(position.xyz, 1.0));
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   22:    vec3 projPos = outPos.xyz / outPos.w;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   23:    float projZ = (projPos.z - u_depthRange.z) * u_depthRange.w;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   24:    if (u_cullRangeMin.w <= 0.0 || projZ * outPos.w > -outPos.w) {
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   25:      if (projPos.x < u_cullRangeMin.x || projPos.y < u_cullRangeMin.y || projPos.x > u_cullRangeMax.x || projPos.y > u_cullRangeMax.y) {
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   26:        outPos.xyzw = u_cullRangeMax.wwww;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   27:      }
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   28:    }
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   29:    if (u_cullRangeMin.w <= 0.0) {
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   30:      if (projPos.z < u_cullRangeMin.z || projPos.z > u_cullRangeMax.z) {
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   31:        outPos.xyzw = u_cullRangeMax.wwww;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   32:      }
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   33:    }
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   34:    gl_ClipDistance[0] = projZ * outPos.w + outPos.w;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   35:    if (u_cullRangeMin.w > 0.0) {
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   36:      gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   37:      gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   38:    }
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   39:    gl_Position = outPos;
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296   40:  }
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 //END
43:59:559 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:296 
43:59:560 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:233 Could not link program:
 ERROR: One or more attached shaders not successfully compiled

43:59:560 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:234 VS desc:
00000000:00000038 C Tex Cull  (failed)
43:59:560 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:235 FS desc:
00000000:00000022 Tex TexAlpha TFuncMod 
43:59:560 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:236 VS:
#version 410
// AMD Radeon RX 590 OpenGL Engine - GLSL 410
#define gl_VertexIndex gl_VertexID
#define lowp
#define mediump
#define highp
#define splat3(x) vec3(x)
#define mul(x, y) ((x) * (y))
in vec4 position;
in vec2 texcoord;
in lowp vec4 color0;
uniform mat4 u_proj;
uniform highp vec4 u_depthRange;
uniform highp vec4 u_cullRangeMin;
uniform highp vec4 u_cullRangeMax;
out lowp vec4 v_color0;
out mediump vec3 v_texcoord;
void main() {
  v_texcoord = vec3(texcoord, 1.0);
  v_color0 = color0;
  vec4 outPos = mul(u_proj, vec4(position.xyz, 1.0));
  vec3 projPos = outPos.xyz / outPos.w;
  float projZ = (projPos.z - u_depthRange.z) * u_depthRange.w;
  if (u_cullRangeMin.w <= 0.0 || projZ * outPos.w > -outPos.w) {
    if (projPos.x < u_cullRangeMin.x || projPos.y < u_cullRangeMin.y || projPos.x > u_cullRangeMax.x || projPos.y > u_cullRangeMax.y) {
      outPos.xyzw = u_cullRangeMax.wwww;
    }
  }
  if (u_cullRangeMin.w <= 0.0) {
    if (projPos.z < u_cullRangeMin.z || projPos.z > u_cullRangeMax.z) {
      outPos.xyzw = u_cullRangeMax.wwww;
    }
  }
  gl_ClipDistance[0] = projZ * outPos.w + outPos.w;
  if (u_cullRangeMin.w > 0.0) {
    gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;
    gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;
  }
  gl_Position = outPos;
}


43:59:560 idle0        E[G3D]: OpenGL/GLQueueRunner.cpp:237 FS:
#version 410
// AMD Radeon RX 590 OpenGL Engine - GLSL 410
#define DISCARD discard
#define lowp
#define mediump
#define highp
#define splat3(x) vec3(x)
#define mul(x, y) ((x) * (y))
uniform sampler2D tex;
 in lowp vec4 v_color0;
in mediump vec3 v_texcoord;
out vec4 fragColor0;
void main() {
  vec4 t = texture(tex, v_texcoord.xy);
  vec4 p = v_color0;
  vec4 v = p * t;
  fragColor0 = v;
}

@vit9696
Contributor

vit9696 commented Sep 19, 2021

This looks incorrect to me:

	caps_.clipCullDistanceSupported = gl_extensions.EXT_clip_cull_distance || (!gl_extensions.IsGLES && gl_extensions.VersionGEThan(3, 0));
  • gl_ClipDistance is part of OpenGL 3.0.
  • gl_CullDistance is part of OpenGL 4.5.

Can we implement the feature without relying on gl_CullDistance?

@hrydgard
Owner

Yeah, seems we need to split that caps_ bool into two. Clearly not supported by Apple's outdated OpenGL implementation.
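A rough sketch of what splitting the single capability bool into two might look like, using the version numbers quoted above (clip distance from desktop GL 3.0, cull distance from desktop GL 4.5). `GLInfo`, `ClipCullCaps`, `VersionAtLeast`, and `DetectClipCull` are hypothetical stand-ins, not the actual `gl_extensions` or `caps_` members, and other extensions that may expose either feature are left out here.

```cpp
#include <cassert>

// Hypothetical stand-in for the detected GL state; field names are illustrative.
struct GLInfo {
    bool isGLES;
    int major;
    int minor;
    bool EXT_clip_cull_distance;
};

// Two separate flags instead of one combined clip/cull flag.
struct ClipCullCaps {
    bool clipDistanceSupported;
    bool cullDistanceSupported;
};

inline bool VersionAtLeast(const GLInfo &gl, int major, int minor) {
    return gl.major > major || (gl.major == major && gl.minor >= minor);
}

inline ClipCullCaps DetectClipCull(const GLInfo &gl) {
    assert(gl.major > 0);  // The version must have been queried first.
    const bool desktop = !gl.isGLES;
    ClipCullCaps caps;
    // gl_ClipDistance: desktop GL 3.0+, or the combined GLES extension.
    caps.clipDistanceSupported = gl.EXT_clip_cull_distance || (desktop && VersionAtLeast(gl, 3, 0));
    // gl_CullDistance: desktop GL 4.5+, or the combined GLES extension.
    caps.cullDistanceSupported = gl.EXT_clip_cull_distance || (desktop && VersionAtLeast(gl, 4, 5));
    return caps;
}
```

For a desktop GL 4.1 context like the one in the log above, this would report clip distance as available and cull distance as unavailable, so shaders using `gl_CullDistance` would not be generated there.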

@@ -534,6 +534,7 @@ OpenGLContext::OpenGLContext() {
caps_.framebufferBlitSupported = gl_extensions.NV_framebuffer_blit || gl_extensions.ARB_framebuffer_object;
caps_.framebufferDepthBlitSupported = caps_.framebufferBlitSupported;
caps_.depthClampSupported = gl_extensions.ARB_depth_clamp;
caps_.clipCullDistanceSupported = gl_extensions.EXT_clip_cull_distance || (!gl_extensions.IsGLES && gl_extensions.VersionGEThan(3, 0));
Contributor

gl_CullDistance is part of OpenGL 4.5, not 3.0.

Suggested change
- caps_.clipCullDistanceSupported = gl_extensions.EXT_clip_cull_distance || (!gl_extensions.IsGLES && gl_extensions.VersionGEThan(3, 0));
+ caps_.clipCullDistanceSupported = gl_extensions.EXT_clip_cull_distance || (!gl_extensions.IsGLES && gl_extensions.VersionGEThan(4, 5));

@unknownbrackets
Collaborator Author

Might be nice to support clip only if needed, but wasn't sure if I wanted to dedicate a whole supports bit to it since we're running low... I guess could bite the bullet and switch to 64-bit if needed...
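To illustrate the "running low on bits" concern: with a 32-bit mask, bit 31 is the last usable flag, and the next one only fits once the mask is widened to 64 bits. This is a generic sketch with made-up names (`FEATURE_CLIP_DISTANCE`, `FEATURE_CULL_DISTANCE`, `FeatureSet`) and made-up bit positions, not the project's actual feature flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits; names and positions are illustrative only.
enum : uint64_t {
    FEATURE_CLIP_DISTANCE = 1ULL << 31,  // last bit that fits in a 32-bit mask
    FEATURE_CULL_DISTANCE = 1ULL << 32,  // needs a 64-bit mask
};

struct FeatureSet {
    uint64_t bits = 0;

    void Enable(uint64_t feature) {
        assert(feature != 0);
        bits |= feature;
    }
    bool Has(uint64_t feature) const {
        return (bits & feature) != 0;
    }
};
```

Truncating `bits` to `uint32_t` would silently drop `FEATURE_CULL_DISTANCE`, which is why every place that stores or compares the mask would need to change along with the type.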

-[Unknown]

@unknownbrackets
Collaborator Author

Added (untested) support for APPLE_clip_distance, since those devices apparently typically don't support cull distance. Also made it so clip-only is supported, which at least fixes the bugs that clipping alone would fix.
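A minimal sketch of how a shader generator could support the clip-only case: emit the `gl_ClipDistance` write whenever clip distance is available, and emit the `gl_CullDistance` writes only when cull distance is available too. The GLSL lines are copied from the shader in the log above; `EmitClipCullOutputs` and its parameters are hypothetical, and what the real generator does when cull distance is missing is not shown here.

```cpp
#include <cassert>
#include <string>

// Returns the GLSL lines that write the clip/cull outputs, given what the
// device supports. Intended to be appended before "gl_Position = outPos;".
inline std::string EmitClipCullOutputs(bool clipSupported, bool cullSupported) {
    // This sketch does not handle cull distance without clip distance.
    assert(clipSupported || !cullSupported);

    std::string out;
    if (clipSupported) {
        out += "  gl_ClipDistance[0] = projZ * outPos.w + outPos.w;\n";
    }
    if (cullSupported) {
        out += "  if (u_cullRangeMin.w > 0.0) {\n";
        out += "    gl_CullDistance[0] = projPos.z - u_cullRangeMin.z;\n";
        out += "    gl_CullDistance[1] = u_cullRangeMax.z - projPos.z;\n";
        out += "  }\n";
    }
    return out;
}
```

On a clip-only device the result contains no reference to `gl_CullDistance`, so the compile error from the log would not occur, at the cost of losing the cull-distance behavior.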

-[Unknown]

@Anuskuss
Contributor

Good job @unknownbrackets, this fixes #11399 (comment). I understand that it could be risky, but this fixes an annoying regression so I hope it finds its way into 1.12.

@ghost

ghost commented Sep 22, 2021

> Good job @unknownbrackets, this fixes #11399 (comment). I understand that it could be risky, but this fixes an annoying regression so I hope it finds its way into 1.12.

Yes, I agree this should be added to the v1.12.3 milestone, as it fixes a lot of graphical issues.

  • Both TL and BR must be outside in the same direction to be culled when depth clamp is enabled.
  • If any vert is outside Z, it's culled when not clamping/clipping.
  • This culls based on pre-viewport Z and avoids culling based on the clip range at negative Z.
  • Pretty limited on GLES3+.  Also D3D11.
  • Seems like doing it on D3D9 might be a bit tricky.
  • On GLES, saw a texture bound to slot 1 when UI started to draw after an emu frame, which caused a crash because there was no sampler.  Let's just explicitly flush.
  • Following PSP rules of -1 to 1 pre-viewport Z.  This also enables it for GLES/OpenGL.
  • GL_ARB_cull_distance is needed, sometimes available on older GL.
  • It's really a bug (might even ideally cap the version?), and we already have other bugs handled the same way.
  • Older GL devices, and it seems Apple devices, may not support cull.
  • Seems modern Apple mobile chips only support clip.
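For reference, the clip and cull distance math from the vertex shader in the log earlier in this thread can be mirrored on the CPU as below. This is a sketch only: `Vec4`, `Distances`, and `ComputeClipCull` are hypothetical names, the step where the shader overwrites `outPos` for vertices outside the X/Y cull range is left out, and reading `u_cullRangeMin.w > 0.0` as "depth clamp enabled" is an assumption based on the notes above rather than something the log states.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Vec4 { float x, y, z, w; };

struct Distances {
    float clip;        // mirrors gl_ClipDistance[0]
    float cull[2];     // mirrors gl_CullDistance[0..1]
    bool cullWritten;  // whether the shader would write the cull distances
};

inline Distances ComputeClipCull(const Vec4 &outPos, const Vec4 &depthRange,
        const Vec4 &cullRangeMin, const Vec4 &cullRangeMax) {
    assert(outPos.w != 0.0f);
    const float projPosZ = outPos.z / outPos.w;
    // Undo the depth range scale/offset to get pre-viewport Z.
    const float projZ = (projPosZ - depthRange.z) * depthRange.w;

    Distances d{};
    // Equals w * (projZ + 1): negative when projZ < -1 for positive w.
    d.clip = projZ * outPos.w + outPos.w;
    d.cullWritten = cullRangeMin.w > 0.0f;
    if (d.cullWritten) {
        d.cull[0] = projPosZ - cullRangeMin.z;
        d.cull[1] = cullRangeMax.z - projPosZ;
    }
    return d;
}
```

A negative clip distance means that part of the primitive is clipped away, while a primitive is culled only if one of the cull distances is negative at every one of its vertices, which matches the "-1 to 1 pre-viewport Z" rule in the notes.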
@ghost

ghost commented Oct 19, 2021

clash-of-the-titans-kraken

@hrydgard hrydgard merged commit 16bf519 into hrydgard:master Oct 19, 2021
@hrydgard
Owner

There, Kraken released as requested, @Gamemulatorer.

Will let the buildbot complete this one before merging any more, want a separate build for this...

@ghost

ghost commented Oct 19, 2021

After this PR, I think the DisableRangeCulling section in compat.ini can be removed now. I tested a few of the games listed there, such as Metal Gear Solid Peace Walker, Asphalt2, N.O.V.A, and Street Riders, and saw no issues after removing them from DisableRangeCulling in compat.ini.

@hrydgard
Owner

That's right, but this method is not supported yet on D3D9 and also not fully on Mali GPUs, so some additional work is needed before we remove the section.

However, we can probably just start ignoring it entirely where this method works.
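Ignoring the compat entry only where the new method works could look roughly like this. It is a sketch under assumptions: `DeviceCaps` and `ShouldDisableRangeCulling` are hypothetical names, and treating "works" as "both clip and cull distance are supported" is a guess at the condition, not something decided in this thread.

```cpp
#include <cassert>

// Hypothetical capability flags for the active backend.
struct DeviceCaps {
    bool clipDistanceSupported;
    bool cullDistanceSupported;
};

// Decides whether the DisableRangeCulling compat entry should still apply.
inline bool ShouldDisableRangeCulling(bool compatDisableRangeCulling, const DeviceCaps &caps) {
    // Where hardware clip and cull are both available, ignore the compat entry.
    if (caps.clipDistanceSupported && caps.cullDistanceSupported)
        return false;
    // Otherwise (e.g. D3D9, or devices without cull distance), keep honoring it.
    return compatDisableRangeCulling;
}
```

This keeps the compat.ini section meaningful for backends that still need it, while newer paths stop consulting it.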

@unknownbrackets
Collaborator Author

I'll do a separate pull soon to ignore that compat setting where these things are fully supported. Hopefully we can indeed remove it entirely.

-[Unknown]

Labels
Depth / Z (Issue involves depth drawing parameters), GE emulation (Backend-independent GPU issues), Guardband / Range Culling (Involves vertices outside frustum)

4 participants