[spirv] Inout semantics in presence of early thread termination do not match DXIL codegen #5158
Comments
I did a test that uses discard. The pixel shader I tried was
The DXIL is
Note that the stores are after the discard. I'll have to find out what is happening for the ray-tracing example. |
I looked at it a bit more. It seems like the DXIL path "optimizes" the case when an This is the snippet where the DXIL path does not do the copy-in and copy-out:
|
Thank you @s-perron for pointing this issue out to me. I'm working on a big chunk of changes in this area that impact both DXIL and SPIR-V, and I am trying to get the DXIL and SPIR-V code generation to match. SPIR-V doesn't seem to correctly implement copy-in/copy-out argument passing at all. I'm also actually not sure DXIL's current behavior is correct in the presence of thread termination, so I'll need to think that through.
This branch has my current work in progress changes. There are still some remaining bugs on the DXIL side relating to matrix orientation annotations, but otherwise the DXIL implementation is functional. I just started the SPIR-V implementation, and it only works for extremely trivial cases.
In general the goal of those changes is to try and more accurately capture the parameter passing semantics in the AST for
To give a concrete illustration on one of the extremely simple deviations between DXIL and SPIR-V take this simple example:
    RWBuffer<int> Buf;
    void fn(inout int X, inout int Y) {
      Y = 2;
      X = 1;
    }
    [numthreads(1,1,1)]
    void main(int GI : SV_GroupIndex) {
      int Val = 3;
      fn(Val, Val);
      Buf[GI] = Val;
    }
Compiler output:
If you look at the examples, DXIL stores 2, and SPIR-V stores 1. This is because SPIR-V more or less passes
In DXIL (and in theory by HLSL's definition), function parameters are copy-in/copy-out so they are always guaranteed to reference unique memory inside |
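To make the 2 versus 1 difference easier to see, here is a minimal C++ sketch of the two argument-passing strategies applied to the fn(Val, Val) example. It is only a model and not DXC output. The function names call_copy_in_copy_out and call_aliased are hypothetical, and the left-to-right order of the copy-out writes is an assumption chosen because it reproduces the value 2 reported for DXIL.

```cpp
// C++ model of `fn(Val, Val)` from the HLSL example.
// It mirrors the two argument-passing strategies only; it is not compiler output.

// Same body as the HLSL function: writes Y first, then X.
static void fn(int &X, int &Y) {
    Y = 2;
    X = 1;
}

// Copy-in/copy-out: each parameter gets its own temporary.
// Assumption: temporaries are written back left to right.
int call_copy_in_copy_out() {
    int Val = 3;
    int TmpX = Val;   // copy-in for X
    int TmpY = Val;   // copy-in for Y
    fn(TmpX, TmpY);   // TmpX == 1, TmpY == 2 afterwards
    Val = TmpX;       // copy-out for X
    Val = TmpY;       // copy-out for Y, the last write wins
    return Val;
}

// Shared storage: both parameters refer to the same location,
// so the last store inside fn (X = 1) is what the caller sees.
int call_aliased() {
    int Val = 3;
    fn(Val, Val);
    return Val;
}
```

Under these assumptions call_copy_in_copy_out() yields 2 and call_aliased() yields 1, which lines up with the DXIL and SPIR-V results quoted in the comment.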
Interesting. I looked at the SPIR-V code generated by DXC pre-optimization. It looks like the SPIR-V code is doing a copy-in and copy-out, but it reuses the same "local" copy for both parameters because they are the same input value. I won't do anything, and I'll wait for your AST changes to be ready for a more detailed review. |
FYI, for reference this was a cut-down test from a customer-reported issue. If I follow HLSL's current language semantics for 'inout' being copy-in and copy-out, it becomes impossible to write to the payload in a user function in the presence of thread-terminating instructions. |
We are working on writing a formal specification for HLSL. The current behavior right now in DXC is that we try to eliminate copies where we can, so it is actually more like the value of an That may actually be what the language in the spec will need to say. |
There is a proposal to add reference parameters to HLSL. That will be what you want. https://github.com/microsoft/hlsl-specs/blob/main/proposals/0006-reference-types.md |
I will close this bug. SPIR-V is not breaking the spec. It is the intended behaviour. As Chris mentioned, the spec will probably state the behaviour is undefined. You will need to wait for the references proposal to be implemented or find some other workaround. |
@s-perron Because proposals take quite a while to get through (I'd imagine especially for something as broad as adding references to HLSL), would it be possible to issue at least a compiler warning that tells the developer that the code they've written will result in undefined behavior? It's an incredibly difficult issue to track down otherwise, and vendors seem to be taking the blame for what's really a DXC issue. For what it's worth, I myself and others are running into this problem: https://forums.developer.nvidia.com/t/anyhit-payload-lost-when-calling-accepthitandendsearch-or-ignorehit/242880/9 |
I mentioned on the other thread, but I have a kludgy workaround for the issue. The idea is to relax the behavior of "AcceptHitAndEndSearch" to not terminate the thread early. I create two static booleans above my entrypoint:
Then, I add the following code to the end of my anyhit entrypoint:
This allows for payload values to be written to before the thread is terminated. Is it even correct behavior to terminate threads early when these intrinsics are called? Doesn't that violate the RT spec? I believe that NVIDIA's OptiX platform continues execution to the end of the function body. |
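The original snippets for this workaround are not preserved in this thread, so the following is only a C++ sketch of the general idea as described: record the request in a flag instead of terminating, let the copy-out happen, and act on the flag at the end of the entry point. Every name here (Payload, Action, g_accept_requested, g_ignore_requested, user_func, any_hit_deferred) is hypothetical and is not the commenter's actual code.

```cpp
// C++ model of deferring a thread-terminating call to the end of the entry point.
// All names are hypothetical stand-ins; this is not the commenter's HLSL.

struct Payload { int hit = 0; };

enum class Action { None, AcceptHitAndEndSearch, IgnoreHit };

// Stand-ins for the two static booleans declared above the entry point.
static bool g_accept_requested = false;
static bool g_ignore_requested = false;

// User function taking the payload as an 'inout'-style parameter.
// Instead of terminating the thread here, it only records the request.
static void user_func(Payload &p) {
    p.hit = 1;
    g_accept_requested = true;
}

// Entry point modeled with explicit copy-in/copy-out around the call.
Action any_hit_deferred(Payload &payload) {
    g_accept_requested = false;
    g_ignore_requested = false;

    Payload tmp = payload;  // copy-in
    user_func(tmp);
    payload = tmp;          // copy-out runs because nothing terminated early

    // End of the entry point: only now act on the recorded request.
    if (g_accept_requested) return Action::AcceptHitAndEndSearch;
    if (g_ignore_requested) return Action::IgnoreHit;
    return Action::None;
}
```

In this model the payload write survives because the terminating action is taken only after the copy-out, at the cost of running the rest of the entry point first.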
I'll see what we can do. Once the code has been translated to spir-v, we can no longer identify the potential problem. I'm not very familiar with the clang part of dxc to know what we can do in that part. |
I suppose my bigger question is why these intrinsics terminate the thread in HLSL in the first place. They modify the traversal state to stop traversing, but I don’t see why the thread couldn’t continue on to the end of the kernel. I’m not even sure if it’s part of the DXR/RTX specification to terminate the threads in this way. Perhaps this is just a misunderstanding of the spec during implementation? Just removing the requirement that threads terminate on these calls would seemingly fix the issue, right? |
For the following any-hit shader
DXC generates the following DXIL
Here the payload passed in to 'userFunc' as an 'inout' argument is passed in by reference and hence mutated
Note that the HLSL spec (https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-function-parameters) states that values are copied in and copied out. (I guess pass by reference is a valid implementation of the semantics.)
The code generated for SPIR-V is
Note that the modification of the payload is completely lost.
This happens because for SPIR-V codegen, 'inout' parameter semantics are implemented as pass by value-result, i.e., they are copied in and then copied out (this can be seen by generating code at O0).
However, since 'AcceptHitAndEndSearch' is a thread-terminating instruction, we never perform the copy-out.
This is technically a legal implementation of 'inout'.
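The lost write can be modeled outside of HLSL. The C++ sketch below is only an illustration: thread termination is simulated with an exception, and Payload, ThreadTerminated, terminate_thread, user_func, run_value_result and run_by_reference are hypothetical names, not anything DXC generates.

```cpp
// C++ model of why the copy-out is skipped when the callee terminates the thread.
// Termination is simulated with an exception; all names are hypothetical.

struct Payload { int hit = 0; };

// Stand-in for the effect of a thread-terminating intrinsic.
struct ThreadTerminated {};

[[noreturn]] static void terminate_thread() { throw ThreadTerminated{}; }

// User function: mutates its parameter, then terminates the thread.
static void user_func(Payload &p) {
    p.hit = 1;
    terminate_thread();
}

// Pass by value-result: the callee works on a temporary copy.
int run_value_result() {
    Payload payload;              // the payload the caller can observe
    try {
        Payload tmp = payload;    // copy-in
        user_func(tmp);           // terminates before returning
        payload = tmp;            // copy-out, never reached
    } catch (const ThreadTerminated &) {
        // the simulated thread ends here
    }
    return payload.hit;
}

// Pass by reference: the callee writes straight into the payload.
int run_by_reference() {
    Payload payload;
    try {
        user_func(payload);       // write lands before termination
    } catch (const ThreadTerminated &) {
        // the simulated thread ends here
    }
    return payload.hit;
}
```

In this model run_value_result() returns 0 and run_by_reference() returns 1, which matches the difference described between the SPIR-V and DXIL paths.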
A couple of questions arise from this:
HLSL folks, I guess this was done specifically for payloads, since otherwise you could never write to a payload in any non-entry function.
Are there any other scenarios where DXC does something special?
DXC folks, if we were to make this work, would we have to implement some form of unwind/cleanup for this special case?
(An ugly way would be to track termination using a boolean, unwind all the way to main, and then write out to the payload before exiting.)