-
-
Notifications
You must be signed in to change notification settings - Fork 21.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cascaded canvas groups and canvas group shader revamp #74859
base: master
Are you sure you want to change the base?
Conversation
903ffaf
to
25bdbd5
Compare
I am not sure what (Well, nesting and subtract will go a long way, but the 2 features I described should make certain things more direct) Also, any change for MSAA support of the clipped nodes? Though if some type of slight blur can be had that might work too, especially with individual tuning. |
@insomniacUNDERSCORElemon There are two ways to achieve "behind parent". The main part of it is that it distinguishes between "transparent" and "not drawn". For a parent that's a polygon2D, this is pretty straightforward. The polygon's border is enforced by the graphic's card, as being part of a mesh, while the alpha exists in the polygon's texture. So what you can do is take the mask texture, and mix the polygon's texture on top of it. It would be something like this:
For a sprite, the original drawing region is a square, and this is probably not what you would want as the masking for the "children behind the parent". So, the next best option is to separate near-0 alpha from any other alpha with an The best solution, of course, is to clip the child with a "full" parent sprite, and then draw the "cut out" parent sprite above that. Also, I'm not sure what issues you have with MSAA, but if you want nodes that are "blurred a bit", you can just use mipmaps:
In other words, you have pretty crazy options to play with. The only limitation with this PR is one-parent many-children. And I'm hoping to fix this with a following PR, if this gets merged. |
25bdbd5
to
68fa4d2
Compare
68fa4d2
to
c5da523
Compare
It doesn't work here. #66005 Forgive the outdated picture, but it still applies in 4.0.
I was not asking how, I was asking for said basic functionality to not need to be added by (or unknown to) every user. Clipping to the parent's shape is already done so you just need a way to send it to the back after is has been rendered after subtractions you don't want applied to it. It's either that or a way to make a node un-subtractable (or I would say unclippable because that'd be useful too, but in this case it still needs to be clipped by the parent), but existing options being helpful would be more elegant. Restore alpha is similar (it allows more complex designs that you might not want to/be-able-to do with subtract, also easier editing), though I get that's further down and it's reasonable not wanting to clutter the options of a new feature. |
@insomniacUNDERSCORElemon I see the issue with MSAA. It seems the AA settings aren't being inherited by the canvas framebuffer. I'll take a look at it.
It's not basic functionality, though. The notion of
If manually extracting color and alpha from the parent is too much, it is possible to achieve it via shaders. But really not recommended, because you want the color to be padded past the clipped area, to avoid artifacts. Automating this whole thing for any user with no artist effort is not practical for the same reason Godot doesn't have a one-click RTS creator template. Godot is only meant to enable you to do what you want, with a little bit of ingenuity from yourself. Doing it any other way would not allow Godot's runtime to remain a tiny 30mb download, and the editor itself to be just over 100mb. A good artist would already know how to mix effects in a way that achieves what they want. |
e2ebde0
to
11ec9d4
Compare
Fixed MSAA inside canvas group buffer. This only applies to Vulkan, since GLES3 doesn't seem to have MSAA implemented. I was kind of surprised to find that default AA doesn't seem to work for 2D polygons. |
Finally tested this. MSAA working and it's a definite improvement. I don't like the setup (and resulting limitations) of subtract here (particularly if you want clipping+subtraction at the same level in the same tree). Though to be clear it is easy and perfect for many workflows despite that limit. I had multiply/subtract working with transparency in the material blend modes before (note: invisible-parent option isn't relevant because clip-only option does that), and it would be a nice option to have here to not have that limit: It should just be a tiny shader adjustment, unless there is some issue with this? EDIT: Ok I see, the blend mode can no longer change parent opacity (according to my previous comments, it could at one point even before my edits), but the current implementation can affect all the On top of subtraction alongside clipping, the PR here doesn't seem to break when order changes (like older versions) so a blend mode and then a -1 z index node after it might work. If that does work, it would allow my example in my proposal... though without restore alpha (which is a better name than add) option, it would require more nodes. Which honestly can be worse or better to edit depending on the situation, they are opposite workflows. Also an option to exclude even below/above nodes (first/last in tree? as those are the most logical to exclude) from being clipped would make non-static designs less tedious. EDIT: I did a character test, 4 unique uses of nested clipping (eye shine, pupils, eyelids, and the entire face) I was happy to see that subtract mode also nests with itself, as I highlighted in one of the eyes (the other eye shine is 2 overlapping lines that aren't subtracted). This is actually animated as well (psuedo-3D rotation), but aside from lacking an easy way to show it in decent quality the animation also doesn't look good when the face is at the edge of the head (fault of the skew and/or me) Better example of nested subtract (3 levels)Nesting subtraction with partial transparency is also a bit wonky (understandable), even aside from fonts. With an opaque parent (final subtract) and partially transparent child, any overlapping sections of the parent is visible for some unexpected reason: (I would expect an intersection of 2 opaque areas to not create an opacity difference) |
ERR_FAIL_COND(rt->direct_to_screen); | ||
// Caller should already check both of these, and assume there's a valid backbuffer on return | ||
DEV_ASSERT(rt->backbuffer_fbo == 0); | ||
DEV_ASSERT(rt->direct_to_screen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DEV_ASSERT(rt->direct_to_screen); | |
DEV_ASSERT(!rt->direct_to_screen); |
You've kept the condition unchanged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I adjusted the list of things to test at the top, to better cover issues like this.
@@ -355,7 +353,7 @@ class RasterizerCanvasGLES3 : public RendererCanvasRender { | |||
void _prepare_canvas_texture(RID p_texture, RS::CanvasItemTextureFilter p_base_filter, RS::CanvasItemTextureRepeat p_base_repeat, uint32_t &r_index, Size2 &r_texpixel_size); | |||
|
|||
void canvas_render_items(RID p_to_render_target, Item *p_item_list, const Color &p_modulate, Light *p_light_list, Light *p_directional_list, const Transform2D &p_canvas_transform, RS::CanvasItemTextureFilter p_default_filter, RS::CanvasItemTextureRepeat p_default_repeat, bool p_snap_2d_vertices_to_pixel, bool &r_sdf_used) override; | |||
void _render_items(RID p_to_render_target, int p_item_count, const Transform2D &p_canvas_transform_inverse, Light *p_lights, bool &r_sdf_used, bool p_to_backbuffer = false); | |||
void _render_items(RID p_to_render_target, int p_item_count, const Transform2D &p_canvas_transform_inverse, Light *p_lights, bool &r_sdf_used, uint32_t canvas_group_level, bool p_alias_screen_to_mask = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void _render_items(RID p_to_render_target, int p_item_count, const Transform2D &p_canvas_transform_inverse, Light *p_lights, bool &r_sdf_used, uint32_t canvas_group_level, bool p_alias_screen_to_mask = false); | |
void _render_items(RID p_to_render_target, int p_item_count, const Transform2D &p_canvas_transform_inverse, Light *p_lights, bool &r_sdf_used, uint32_t p_canvas_group_level, bool p_alias_screen_to_mask = false); |
00d690f
to
beac7b7
Compare
@insomniacUNDERSCORElemon I tried to explain it before, but "restore parent alpha" is not a possibility, because the alpha is multiplied during render, meaning, the use of alpha is lossy. As in, pixels that are deleted using This is an inevitable effect of how alpha blending works in the GPU. The only way to circumvent it is to first do drawing on a black and white image/canvas representing the alpha mask, and then using a custom shader to turn it into an actual alpha mask multiplied with the parent. Nevertheless, if you believe a "tiny shader change" can fix this, you can always create your own shader that blends the mask with the parent's colors and alpha in any way you desire. So you're free to try. As for Z-index, I explained it elsewhere multiple times, but Z-index completely bypasses the tree order, meaning items with a different Z-index, for all intents and purposes, exist in a separate scene tree. So they are no longer considered children of the matching canvas group. Canvas group effects will therefore not apply to a child with a different Z-index. Incidentally, a similar but slightly different effect applies to Y-sorting. A child of a Y-sorted node is transformed into a sibling of that node, so the parent is sorted as one of the children. That means that if said parent is a canvas group, its effects no longer apply to its children. This can be worked around by putting a sorting parent inside a canvas group parent, so the canvas group parent itself is not sorting. |
With that bit I was only talking about final transparency with the multiply material, and the context was even a bit wrong (as older versions of the clip shader actually did allow it, as seen below). FWIW I did get a basic restore alpha with an override shader (red remove, green restore, render everything else) working here (removing
Overriding the clip shader for all child nodes is the problem though, similar with subtraction working that way too. I want a way to also work up (or onto things that are rendered, not ones that will render) or at least something that is optional. Particularly I'm not aware of any option for differentiation with a general shader other than to use color (which is made worse because MSAA introduces an outline that isn't purely the intended color). With the old behavior, this (the orange/red) is 1 parent node with 3 child nodes (1 clipping, 2 deleting alpha via multiply): I tried to clarify in my edit, but the actual issue with multiply is just the behavior of parent alpha as the multiply material otherwise works as expected. The old and new behaviors are both useful for different reasons.
Sorry, I saw z-index having some effect while clipped... I missed that on its own it looks like it's clamped to the parent index (allowing children to be re-ordered) and that it actually can bug onto other visible (clip/CG) nodes (which is not a new thing). Also a weird wobble (at least in the editor) when zooming in and out. I looked at old versions again, the ability to break out of clipping was useful. And is probably a better behavior particularly as it makes it immediately obvious that it isn't properly integrated. Although from what you're saying it sounds like there is not much point for z-index anyway, at least in this context. |
I may have found a bigger issue: performance can easily be tanked depending on MSAA settings (context: 1050Ti, 1080p window size). With x8MSAA, there is a 20FPS drop with 6 nodes of canvasgroups/clipping. FPS returns to 60 if clipping is disabled instead. This does not seem to depend on the type of content much (CG, what type of clipping, transparency, texture vs polygon, put onto a tilemap, animation doesn't matter). Except that nested clipping seems to perform better, x8MSAA only a 15FPS drop at 6 layers (and seeming to not drop much at 7 layers). Perhaps it belongs to the MSAA implementation and thus not this PR (and maybe even discussed/tried/working-on etc already), I hope optimization is possible (less calls for static objects/discrete animation, batching on tilemaps) as it's obvious for doing less (a simpler/stylized style or just... nothing happening) to have less hardware strain than having everything zip-zooping everywhere.
|
I don't think this closes the Mask2D proposal as it is still the parent which decides to mask itself. It also chooses all children. We could require the user to assign particular children for masking, but it is already far less user firendly as the proposed Mask2D. The whole purpose of the Mask2D proposal was about usability and making it easily discoverable, intuitive and simple to mask any CanvasItem. This proposal does not cover what is proposed in the Mask2D proposal to make this happen. For that I still think we need a Mask2D node. |
Masking one child at a time wouldn't make much sense though, because you'd need to copy the back buffer for each child you add to the mask. That would just be horribly inefficient to the point of running at a fraction of a FPS. Also, if it's able to have some children as masks and some as not, then it's masking previous siblings, not just the parent. Or, alternately, if it's only applying to the parent, then there's no point to the order of the children, and you should just apply all masks before drawing subsequent children. This can be done by making said "subsequent children" actually be following siblings of the parent instead of children. From the get-go, the fact that a mixed situation is ambiguous in how it should be rendered makes the application of it undesirable. Render results should not be ambiguous. This PR achieves the desired outcome for using multiple masked children, with the same scene structure, and just slightly different property settings. And it's about as intuitive as any other feature in Godot. |
beac7b7
to
9b9c295
Compare
Thus, it will effectively not provide what is required from the mask. This seems to be a fairly common case of laying out a tree using nested sprites. This approach makes this impossible with clip_children. Node with clip_children enabled is the last visible node in the branch, since all its descendants are transparent masks of it. Is there a workaround here? I think that proposal is mostly about usability and it's definitely a different point of view. Is it possible to implement within this PR a (possibly ineffective) Mask2D node, but more convenient for use? |
Again, trying to mix mask and non-mask nodes under the same parent is ambiguous. As in, if you ask 10 different people what the render result of such a set-up should be, you'll get 10 different answers. To call this "intuitive" or "convenient" is a stretch, to put it mildly. In theory, you could create a "cut into previous siblings" node by using backbuffer effects. But that would make it an issue when mixing with the parent. And, again, we're talking under 1 FPS results, if you have even a handful of sprites on the screen. Setting up a clear parent to be cut, and cutting children is a lot more clear and intuitive. And the tree order is unchanged from a parent with multiple "Mask2D" children (And no non-"Mask2D" children). It doesn't get more intuitive than that. What I do see possibly lacking is multiple children cut by multiple children. That requires accessing multiple buffers at the same time. This could be done by combining the backbuffer with a canvas group buffer, but the aliasing of |
9b9c295
to
e45d961
Compare
Changed it so |
You are right, there can be many opinions. Also I'm not sure what exactly is happening with performance inside the engine, I think you are much more competent here. As I understand it, clip_children was intended as a simple way to mask without using shaders. It's definitely useful in many cases, although in other situations I just don't understand how to work with it. Perhaps the use of this function is simply intended to be limited due to the rendering structure and the binding of the mask to the parent. I think the proposal noted above is about creating the mask as a separate entity that is somewhat easier to use. Is it possible to discuss the creation of such an abstract node here or is this offtopic? Here is a simple description of the proposed node. If this is off-topic, then it’s hardly worth closing the proposal if it asks for another functionality. Here is a basic list of differences between the current method and the proposed one:
This is more of a consultation than a critique of the current implementation. Is it possible to create a similar node based on BackBuffer effects? |
da05fad
to
5c916f3
Compare
5c916f3
to
23dde77
Compare
My project's design has visuals that depend on the ability to nest multiple canvas items with clip_children enabled, animating nested and parent nodes, along with some basic fragment shaders. I'm making my project on this branch and it's working well so far, so I personally would really like to see this merged. This resolves the up-front nesting and canvas group limitations of clip_children, more functionality can always be added later. Regarding the Mask2D proposal,
I designed my scene and art with clip_children in mind. The downward-only behavior of clip_children makes it easy for me to understand clipping when arranging the scene tree. Mask2D nodes could get the job done but there would be a lot of clutter since I use 10+ unique masks in a scene, not to mention the complication of using clip + draw. That's just to say there are different use cases for a clip_children property versus Mask2D as a node which would make other use cases easier to do without the aforementioned change/workarounds, inefficiency or ambiguity aside. |
@404Fox have you evaluated your project for performance? As said in my previous comment, it doesn't take many parallel usages (350) of this feature to cause a 20FPS drop without adding anything else, but with MSAA turned on the usage budget becomes comically low (6-20) that most games (tilemaps, spawning enemies) will have absolutely no problem smashing past. When I tested it at least*. I suspect this feature might be intended for simple things like a title screen or other one-at-a-time uses, so if this is not viable as the backbone for a project's art style that might end up as a won'tfix. *=Small update: I just ran my test project again for a sanity check, EDIT: nevermind I accidentally tested the stable version of clipping in my last edit, testing the actual cascade build gives me the same low-performance results as before. Possibly more viable with some other AA technique (downscaling?). |
@insomniacUNDERSCORElemon I put together a performance test that constantly spawns a scene with x5 uses of clipping, running at 4K resolution. You can observe the deltas between the current clip and this PR's clip, with and without MSAA. This is to try and rule out the combination of MSAA+this PR's clipping having some unusual performance loss over the base line. MSAA+clip is already expensive in the current implementation (yellow), but the clip cost from beta to this PR is too (red -> orange; no MSAA) and (yellow -> blue; with MSAA). We can also see by the purple and light-blue lines (versus others) that clipping in general has a significant performance cost, and that the PR does not harm non-clipping performance. When you are using lots of clipping you will eventually reach a point of poor performance even in the current implementation. But that limit is lower here. I don't think it is at a level that makes this unusable for some games but it is worth seeing if there are ways performance can be improved, ideally if it could be close to the current clip implementation, or reach that under certain conditions such as if nesting is not being used. Or perhaps you could look at it the other way around, that we need better AA solutions, like a method that produces a result just as good as MSAA but with a lower cost. This is run on a 7900XT so don't take the instance (x5 uses) number/framerate as a representation of end-user use case like a minimum spec PC or steam deck, it's just to illustrate the delta between different conditions. Edit: |
The reason clipping is slow is because it needs to switch drawing to a different target, and also stop any draw calls batching in between. GPUs are optimized to make lots of draws into a single buffer, not to draw into multiple buffers. However, to do any form of complex compositing, there isn't really much in terms of an alternative to using multiple buffers. Maybe combining a mega-buffer with compute shader rendering, or writable textures. But either option would reduce GPU compatibility, as well as require overhauling fundamental parts of the renderer. Ultimately, GPUs aren't made for reading and writing to the same buffer by default, and switching back and forth between two buffers introduces an inevitable pause. There are a few places where performance could maybe be improved a bit, with regards to blitting and dirty regions, or reuse of buffers. But not to the point of hitting significant performance gains. |
23dde77
to
c4021fa
Compare
@SlugFiller I should've pointed it out after doing the testing, but the big point is that for the same performance impact this PR can only render 22-25% of clipping instances compared to the implementation in 4.2.2. Which is to say the impact with this PR is there whether or not you're actually using clipping to do something complex (my tester is a literal tile design on a tilemap with clipping just used to put a line onto a polygon). Though the discussions related to what 404Fox linked to (mention of cache, proposal 8747) is relevant here too, as allowing explicitly static art (to not impact every frame) would open up a lot of possibility. It would make sense too if low-FPS frame animation was less costly than linear animation. Though ideally this would be some backend optimization rather than another thing the user needs to change on every node (assuming it's viable, I mean it seems like the engine should know if something is never/sometimes/always updated). EDIT: Made a comment on that proposal Also small note: I tried the newest version (same performance), crashes on -1 Z index w/clipping |
Are performance issues with clipping the primary thing keeping this PR from being merged? I for one would love to see it merged if only to have nested CanvasGroup nodes and more flexible render modes. Would it be possible to separate this functionality from the clipping and merge these changes sooner rather than later, or are these features too tightly coupled at this point? Regardless, thanks for all your hard work on this. I've been really struggling with the current CanvasGroup implementation, and these changes would fix a lot of problems for me. |
What this PR is
Add the ability of putting canvas group instead of another canvas group.
What to test
Big list of changes related to canvas groups and clip children:
n
canvas groups and noBackBufferCopy
nodes may perform up to2n+1
back buffer copies (one inside, and one after each canvas group, plus one for the main scene).hint_screen_texture
for the drawn children texture. Instead, a newhint_mask_texture
is added. However,hint_screen_texture
is still used as an alias forhint_mask_texture
if used inside a canvas group or canvas item with clip children set, to preserve compatibility with existing shaders.hint_mask_texture
(norhint_screen_texture
) is added to a canvas group or parent with clip children, the masking effect is applied automatically, and the shader can control the tint and opacity applied to the children, viaCOLOR
. Ifhint_mask_texture
is used, the default masking operation is disabled, and a custom blending effect can be applied. Closes Improve CanvasItem Clip Children for more flexibility godot-proposals#5595CanvasItemMaterial
works as expected, allowing to change the blend mode without losing the default masking effect. (Re)fixes CanvasGroup breaks with any material applied #51204Breaking changes:
use_mipmaps
onCanvasGroup
is now largely pointless, since any shader that uses a mipmap-enabled sampler on the mask texture will automatically trigger mipmap generation. The value still works, and will generate mipmaps even when unused by the shader.Video memory usage: Cascading canvas groups and nodes with clip children creates full-screen buffers based on the cascading depth. Allocated buffers are cleared after a lower depth of canvas groups has been maintained for more than 2 seconds.