-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a bug which could cause a crash during composition #1568
Fix a bug which could cause a crash during composition #1568
Conversation
after a change to a stage. Sufficiently frequent/repetitive calls to _EvalRefOrPayloadArcs in primIndex.cpp, where ComputeLayerStack is called many times for the same layer could result in Pcp_LayerStackRegistry returning a PcpLayerStack that was either already deleted, or about to be deleted. Fixing this required moe careful management of the table mapping identifiers to layer stacks in the registry object.
PcpLayerStack if that layer stack isn't actually in the registry (it may have already been replaced by a call to FindOrCreate on another thread).
FWIW, here is a simple Houdini file that demonstrates the crash: |
Filed as internal issue #USD-6804 |
I believe I'm also seeing this bug pop up. I haven't found a consistent repro with just USD either, but if I do find one I'll try testing this fix against it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find! I added a few notes, let me know what you think.
… was renamed to _SetLayersAndRemove).
in the identifierToLayerStack map, along with a comment explaining why it should always be called.
Thanks @marktucker! Changes look good to me, I've merged them internally and they should land in dev soon. |
Just to follow up with some more affirmation, I found a couple fairly consistent repro, the best being rapidly running LoadAndUnload on some heavy assets at my studio (failing on anywhere between 1 to 20 load/unloads), and this change completely fixes the problem! It hasn't crashed once, even on 50 load/unloads. Sadly the repro requires production assets that I can't share so I can't pass along a test, but I at least wanted to confirm the stability of the fix in case there was any uncertainty about the fix working with more cases. |
Thanks @lbiasco-dwa! It's great to have additional evidence that the bug is not Houdini-specific, and that this fix works in more scenarios than the one I was testing with. |
Fix a bug which could cause a crash during composition or recomposition
after a change to a stage. Sufficiently frequent/repetitive calls to
_EvalRefOrPayloadArcs in primIndex.cpp, where ComputeLayerStack is called
many times for the same layer could result in Pcp_LayerStackRegistry
returning a PcpLayerStack that was either already deleted, or about to be
deleted. Fixing this required moe careful management of the table mapping
identifiers to layer stacks in the registry object.
Description of Change(s)
I believe there was unsafe code in the layer stack registry. Given enough threads simultaneously being in _EvalRefOrPayloadArcs in pcp/primIndex.cpp, with all those threads calling ComputeLayerStack for the same layer file, there is lots of opportunity for the registry cleanup code in the PcpLayerStack destructor to interact badly with the Pcp_LayerStackRegistry::FindOrCreate method.
I wish I could provide a reproduceable test case for this crash, but I was only able to make it happen within Houdini. But the crash would happen within a call to UsdUtilsFlattenLayerStack, in the SdfChangeBlock destructor when exiting UsdFlattenLayerStack. With this change I can no longer reproduce the crash in Houdini, and all the USD tests still pass (so at least I'm pretty sure I didn't break anything). Really curious to hear thoughts about this change...