Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bindBuffer errors in webgl bevy game #2378

Closed
rparrett opened this issue Jan 11, 2022 · 12 comments · Fixed by #2391
Closed

bindBuffer errors in webgl bevy game #2378

rparrett opened this issue Jan 11, 2022 · 12 comments · Fixed by #2391

Comments

@rparrett
Copy link
Contributor

rparrett commented Jan 11, 2022

Description

Hundreds of this warning, which does not crash the game

"WebGL: INVALID_OPERATION: bindBuffer: buffers bound to non ELEMENT_ARRAY_BUFFER targets can not be bound to ELEMENT_ARRAY_BUFFER target

Followed eventually by this one, which does

Uncaught TypeError: Failed to execute 'bindBuffer' on 'WebGL2RenderingContext': parameter 2 is not of type 'WebGLBuffer

Repro steps

git checkout https://github.com/rparrett/pixie_wrangler
cd pixie_wrangler
git checkout broken
cargo make --profile release serve

Now play the game naturally. Complete each level in sequence, returning to the level select screen and advancing to the next one. Eventually, clicking the next level will crash the game before the level even appears on the screen. I have had this happen mostly when advancing to level 4 or 5.

When you have played this far already and have solutions saved, you can skip to the level prior to the crash, send the pixies, wait for the popup, and then proceed to the next level.

Extra materials

Traces of the full repro were hundreds of megabytes to gigabytes. Here are what I think are the relevant bits that I was able to extract using a webgl2 context shim and by enabling chrome logging (dev tools will crash)

Initial non-fatal error spam context

This general pattern will repeat many times. As far as I can tell, this doesn't make any sense. It's possible that somewhere earlier that buffer was bound to something else.

[31424:259:0110/141557.353487:INFO:CONSOLE(2493)] "gl.bindBuffer(36663,buffer4187591)" // GL_COPY_READ_BUFFER
[31424:259:0110/141557.353507:INFO:CONSOLE(2493)] "gl.bufferSubData(36663,0,255,214,2,60,50,55,49,60,85,124,/* and so on */)" // GL_COPY_WRITE_BUFFER
[31424:259:0110/141557.353527:INFO:CONSOLE(2493)] "gl.bindBuffer(36662,)" // GL_COPY_READ_BUFFER
[31424:259:0110/141557.353546:INFO:CONSOLE(2493)] "gl.bindBuffer(36663,)" // GL_COPY_WRITE_BUFFER
[31424:259:0110/141557.353566:INFO:CONSOLE(2493)] "gl.bindBuffer(34963,buffer4187591)" // GL_ELEMENT_ARRAY_BUFFER
[31424:259:0110/141557.353585:INFO:CONSOLE(2493)] "gl.bufferSubData(34963,0,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,0,/* and so on */)" // GL_ELEMENT_ARRAY_BUFFER
[31424:259:0110/141557.353607:INFO:CONSOLE(2493)] "gl.bindBuffer(36662,)" // GL_COPY_READ_BUFFER
[31424:259:0110/141557.353626:INFO:CONSOLE(2493)] "gl.bindBuffer(34963,buffer4187591)" // GL_ELEMENT_ARRAY_BUFFER
[31424:259:0110/141557.353635:INFO:CONSOLE(2500)] "WebGL: INVALID_OPERATION: bindBuffer: buffers bound to non ELEMENT_ARRAY_BUFFER targets can not be bound to ELEMENT_ARRAY_BUFFER target"

Fatal error context

Dropped buffer later bound?

[99151:259:0109/151936.814784:INFO:CONSOLE(451)] "%cINFO%c %c Created buffer Valid((4220, 1, Gl)) with BufferDescriptor { label: Some("Mesh Index Buffer"), size: 48, usage: INDEX, mapped_at_creation: true } log.target = "wgpu_core::device";
[99151:259:0109/151936.814795:INFO:CONSOLE(451)] "%cDEBUG%c %c Buffer (4220, 1, Gl) map state -> Idle log.target = "wgpu_core::device";
/* clipped */
[99151:259:0109/151942.507203:INFO:CONSOLE(451)] "%cINFO%c %c Buffer (4220, 1, Gl) is dropped log.target = "wgpu_core::device";
[99151:259:0109/151942.841790:INFO:CONSOLE(451)] "%cDEBUG%c %c Buffer Valid((4220, 1, Gl)) is detached log.target = "wgpu_core::device::life";
/* clipped */
[99151:259:0109/151951.204260:INFO:CONSOLE(2493)] "gl.bindBuffer(34963,4220)"
[99151:259:0109/151951.204270:INFO:CONSOLE(2500)] "Uncaught TypeError: Failed to execute 'bindBuffer' on 'WebGL2RenderingContext': parameter 2 is not of type 'WebGLBuffer'."

Workaround

It seemed based on the "non-fatal" error traces that "copy buffer to buffer" was failing in some way. Having no idea what this code really does, I attempted this hack: rparrett@12ea780 which I thought made sense at the time, but now I'm pretty sure is just coincidentally making my stuff work.

This did fix the issue for me though. And surprisingly didn't seem to harm other wgpu examples (I could have easily missed something there). Also, another game of mine that was working now continues to work with that patch applied.

Context

The game is

Platform

MacOS 12.0.1 (m1)
Chrome 97, Firefox 95
bevy 0.6 from crates.io (also with patched wgpu dep)
wgpu master 01f62ba

Also tested in Chrome in a Windows/nvidia environment.

Next

I am dumping all the information I have about this issue so that error messages are searchable in case someone else has the same issue.

Please let me know if you are looking into this and I can be of any help.

@rparrett
Copy link
Contributor Author

rparrett commented Jan 11, 2022

Here's a dump from what I believe is the start of a frame to the first INVALID_OPERATION.

edit: reuploaded twice

invalid_operation.txt

Here's the webgl2context shim I'm using (originally from bjorn3). I'm cramming into the middle of the getContext function in wasm.js.

webgl2context_shim.txt

@rparrett
Copy link
Contributor Author

So I'm not entirely sure what the expected order of these console logs and error messages coming in would be, but it seems possible that the next lines are actually triggering the error, perhaps?

And if that's the case, then it indeed seems like buffer4187591 is being bound as an GL_ELEMENT_ARRAY_BUFFER and then immediately again as a GL_COPY_WRITE_BUFFER.

My patch has the effect of changing the "rebinding as GL_ELEMENT_ARRAY_BUFFER" into an "unbinding" and preventing this error.

But it seems like the actual problem may be elsewhere?

[31424:259:0110/141557.310528:INFO:CONSOLE(2493)] "gl.bindBuffer(GL_ELEMENT_ARRAY_BUFFER,buffer4187591)"
[31424:259:0110/141557.310539:INFO:CONSOLE(2500)] "WebGL: INVALID_OPERATION: bindBuffer: buffers bound to non ELEMENT_ARRAY_BUFFER targets can not be bound to ELEMENT_ARRAY_BUFFER target"
[31424:259:0110/141557.310549:INFO:CONSOLE(2502)] "=  undefined"
[31424:259:0110/141557.310559:INFO:CONSOLE(2493)] "gl.bindBuffer(GL_COPY_WRITE_BUFFER,buffer4187591)"

@rparrett
Copy link
Contributor Author

@zicklag apologies for the unsolicited ping, but this code seems to be yours (thanks!) so I'm wondering if you might have any insight.

@zicklag
Copy link
Contributor

zicklag commented Jan 11, 2022

No problem! Actually as I look closer at that code again, it looks like your "hack" is actually the correct code and I think you have discovered a legit oversight in my code!

A little bit higher up we have this:

                let is_index_buffer_only_element_dst = !self
                    .shared
                    .private_caps
                    .contains(super::PrivateCapabilities::INDEX_BUFFER_ROLE_CHANGE)
                    && dst_target == glow::ELEMENT_ARRAY_BUFFER
                    || src_target == glow::ELEMENT_ARRAY_BUFFER;
                // WebGL not allowed to copy data from other targets to element buffer and can't copy element data to other buffers
                let copy_dst_target = if is_index_buffer_only_element_dst {
                    glow::ELEMENT_ARRAY_BUFFER
                } else {
                    glow::COPY_WRITE_BUFFER
                };

This checks for platforms such as WebGL which can't bind an element array buffer as a different kind of buffer. Then we set the copy_dst_target according to that check.

At the end where you have made your code modification:

                gl.bind_buffer(copy_src_target, None);
                if is_index_buffer_only_element_dst {
                    gl.bind_buffer(glow::ELEMENT_ARRAY_BUFFER, self.current_index_buffer);
                } else {
                    gl.bind_buffer(copy_dst_target, None);
                }

We're really just trying to unbind the buffers that we bind in the code above to set things back to "normal".

In this case your new code:

                gl.bind_buffer(copy_src_target, None);
                gl.bind_buffer(copy_dst_target, None);

That is actually what it should have been in the first place, I think!

You can probably open a PR and someone who understands it better than me will have more insight on whether that's actually the right fix. :)

@rparrett
Copy link
Contributor Author

Thanks, I may do that.

Interestingly, it seems like this is the only actual usage of current_index_buffer.

Ideally, I'd like to understand what's going on here well enough to create a more minimal reproduction of the crash.

@zicklag
Copy link
Contributor

zicklag commented Jan 12, 2022

Oh, you're right. Your change isn't what it's supposed to be. Now I'm remembering more of how this worked.

So...

  • if we are copying to an element array buffer on WebGL, we have to bind it as an ELEMENT_ARRAY_BUFFER, not as a COPY_WRITE_BUFFER, because WebGL doesn't support INDEX_BUFFER_ROLE_CHANGE.
  • There is a possibility, though, that we currently have something bound to our element array buffer and it may need to stay set to whatever it was. Whatever is currently bound to ELEMENT_ARRAY_BUFFER is stored inside the current_index_buffer variable.
  • To make sure that the state is properly reset at the end of this command, we bind the ELEMENT_ARRAY_BUFFER back to whatever it was set to before by setting it to the current_index_buffer.

So that code actually looks fine.


The error you are getting might not be due to this code at all then.

It might be that some code in wgpu-hal is calling C::SetIndexBuffer on a buffer that isn't an index buffer? 🤷‍♂️

I'm not totally sure what's wrong, but now I'm thinking that the code you were modifying is probably correct. At least that last if statement.

@rparrett
Copy link
Contributor Author

My last train thought that I'll pick up later was that perhaps there's a command out there somewhere that is binding to ELEMENT_ARRAY_BUFFER but not updating current_index_buffer, or maybe it's not being cleared properly between frames or something?

@rparrett
Copy link
Contributor Author

rparrett commented Jan 13, 2022

I think I'm understanding what's going on here a whole lot better now after diving into the webgl 2 spec. (Relevant sections 5.1, 5.2)

Still, I think something fishy may be going on with current_index_buffer, because working on solutions involving that hasn't been very fruitful. And just ripping out all of the "no index buffer role change" related code in CopyBufferToBuffer results in a working game for me, so I'm not sure that any "index buffer" <-> "other data" copying is actually happening in my game.

But current_index_buffer is only used in CopyBufferToBuffer, and I don't think it's actually needed there, because we are given the source and destination buffers already.

So here's my latest attempt: rparrett@856d86e

This seems to be working well for me.

Next on my list:

  • Create raw wgpu example that reproduces bad behavior
  • Add some of my notes as comments in CopyBufferToBuffer
  • ? Remove current_index_buffer?

@zicklag
Copy link
Contributor

zicklag commented Jan 13, 2022

Like most of the WebGL backend, I actually copied it from the GFX GL backend: https://github.com/gfx-rs/gfx/blame/bc77309afdb0829605982370a3e17382c5968071/src/backend/gl/src/queue.rs#L661.

It looks like @Gordon-F first made the commit that used the current index buffer tracking. @Gordon-F, do you have any idea what might be going on?

@rparrett
Copy link
Contributor Author

rparrett commented Jan 13, 2022

Oh, that's good to know, thanks.

Hm, I'm seeing that in the gfx gl backend, reset_state is resetting their version of current_index_buffer but we're not doing that. I'll test out an equivalent change in a bit.

@rparrett
Copy link
Contributor Author

rparrett commented Jan 13, 2022

That on its own seems to also resolve my issue.

Although I think that the current CopyBufferToBuffer is likely incorrect. It seems like it's now (vs gfx gl) glomping a src ELEMENT_ARRAY_BUFFER into the special casing but not actually handling that possibility.

Still, this makes for a very digestible PR so maybe I'll go for that.

@zicklag
Copy link
Contributor

zicklag commented Jan 13, 2022

Awesome, good catch! I must have missed that as I migrated it from gfx. 👍 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants