diff --git a/contributing/development/core_and_modules/img/rendering_architecture_diagram.webp b/contributing/development/core_and_modules/img/rendering_architecture_diagram.webp new file mode 100644 index 000000000000..f44a54a738f8 Binary files /dev/null and b/contributing/development/core_and_modules/img/rendering_architecture_diagram.webp differ diff --git a/contributing/development/core_and_modules/index.rst b/contributing/development/core_and_modules/index.rst index 8c5036e25b37..c261fa75de44 100644 --- a/contributing/development/core_and_modules/index.rst +++ b/contributing/development/core_and_modules/index.rst @@ -19,6 +19,7 @@ This section covers the basics that you will encounter in (almost) every source variant_class object_class inheritance_class_tree + internal_rendering_architecture Extending Godot by modifying its source code -------------------------------------------- diff --git a/contributing/development/core_and_modules/internal_rendering_architecture.rst b/contributing/development/core_and_modules/internal_rendering_architecture.rst new file mode 100644 index 000000000000..8c1abfeaafc3 --- /dev/null +++ b/contributing/development/core_and_modules/internal_rendering_architecture.rst @@ -0,0 +1,749 @@ +.. _doc_internal_rendering_architecture: + +Internal rendering architecture +=============================== + +This page is a high-level overview of Godot 4's internal renderer design. +It does not apply to previous Godot versions. + +The goal of this page is to document design decisions taken to best suit +:ref:`Godot's design philosophy `, +while providing a starting point for new rendering contributors. + +If you have questions about rendering internals not answered here, feel free to +ask in the ``#rendering`` channel of the +`Godot Contributors Chat `__. + +Rendering drivers +----------------- + +Godot 4 supports the following graphics APIs: + +Vulkan +^^^^^^ + +This is the main driver in Godot 4, with most of the development focus going +towards this driver. + +Vulkan 1.0 is required as a baseline, with optional Vulkan 1.1 and 1.2 features +used when available. `volk `__ is used as a Vulkan +loader, and +`Vulkan Memory Allocator `__ +is used for memory management. + +Both the Clustered Forward and Forward Mobile +:ref:`doc_internal_rendering_architecture_backends` are supported when using the +Vulkan driver. + +**Vulkan context creation:** + +- `drivers/vulkan/vulkan_context.cpp `__ + +Direct3D 12 +^^^^^^^^^^^ + +Like Vulkan, the Direct3D 12 driver targets modern platforms only. +**TODO:** Mention required feature level/optional feature levels. + +Both the clustered and mobile :ref:`doc_internal_rendering_architecture_backends` can be +used with Direct3D 12. + +:ref:`doc_internal_rendering_architecture_core_shaders` are shared with the +Vulkan renderer. Shaders are transpiled from GLSL to HLSL using +`SPIRV-Cross `__. +This means you don't need to know HLSL to work on the Direct3D 12 renderer, +although knowing the language's basics is recommended to ease debugging. + +**As of September 2022, this backend is still in development.** While Direct3D 12 +allows supporting Direct3D-exclusive features on Windows 11 such as windowed +optimizations and Auto HDR, Vulkan is still recommended for most projects. +See the +`pull request that introduced Direct3D 12 support `__ +for more information. + +Metal +^^^^^ + +Godot supports Metal rendering via `MoltenVK `__, +as macOS and iOS do not support Vulkan natively. +This is done automatically when specifying the Vulkan driver in the Project Settings. + +MoltenVK makes driver maintenance easy at the cost of some performance overhead. +Also, MoltenVK has several limitations that a native Metal driver implementation +wouldn't have. + +A native Metal driver is planned in the future for better performance and +compatibility. + +OpenGL +^^^^^^ + +This driver uses OpenGL ES 3.0 and targets legacy and low-end devices that don't +support Vulkan. OpenGL 3.3 Core Profile is used on desktop platforms to run this +backend, as most graphics drivers on desktop don't actually support OpenGL ES. + +Only the :ref:`doc_internal_rendering_architecture_compatibility` backend can be used +with the OpenGL driver. + +:ref:`doc_internal_rendering_architecture_core_shaders` are entirely different +from the Vulkan renderer. + +**As of February 2023, this backend is still in development.** Many features +are still not implemented. + +.. _doc_internal_rendering_architecture_backends: + +Rendering backends +------------------ + +Clustered Forward +^^^^^^^^^^^^^^^^^ + +This is a forward renderer that uses a *clustered* approach to lighting. + +Intended for desktop platforms only. This backend will **not** run on mobile +platforms as it requires higher limits than what is available on mobile GPUs as +a baseline (such as the number of texture units). + +Forward Mobile +^^^^^^^^^^^^^^ + +This is a forward renderer that uses a traditional single-pass approach to lighting. + +Intended for mobile platforms, but can also run on desktop platforms. This +backend is optimized to perform well on mobile GPUs. Mobile GPUs have a very +different architecture compared to desktop GPUs due to their unique constraints +around battery usage, heat, and overall bandwidth limitations of reading and +writing data. Compute shaders also have very limited support or aren't supported +at all. As a result, the mobile renderer purely uses raster-based shaders +(fragment/vertex). + +Unlike desktop GPUs, mobile GPUs perform *tile-based rendering*. Instead of +rendering the whole image as a single unit, the image is divided in smaller +tiles that fit within the faster internal memory of the mobile GPU. Each tile is +rendered and then written out to the destination texture. This all happens +automatically on the graphics driver. + +The problem is that this introduces bottlenecks in our traditional approach. For +desktop rendering, we render all opaque geometry, then handle the background, +then transparent geometry, then post-processing. Each pass will need to read the +current result into tile memory, perform its operations and then write it out +again. We then wait for all tiles to be completed before moving on to the next +pass. + +The first important change in the mobile renderer is that the mobile renderer +does not use the RGBA16F texture formats that the desktop renderer does. +Instead, it is using a R10B10G10A2 texture format. This halves the bandwidth +required and has further improvements as mobile hardware often further optimises +for 32-bit formats. The tradeoff is that the mobile renderer has limited HDR +capabilities due to the reduced precision in the color data. + +The second important change is the use of sub-passes whenever possible. +Sub-passes allows us to perform the rendering steps end-to-end per tile saving +on the overhead introduced by reading from and writing to the tiles between each +rendering pass. The ability to use sub-passes is limited by the inability to +read neighboring pixels, as we're constrained to working within a single tile. + +This limitation of subpasses results in not being able to implement features +such as glow and depth of field. Similarly, if there is a requirement to read +from the screen texture or depth texture, we must fully write out the rendering +result limiting our ability to use sub-passes. When such features are enabled, a +mix of sub-passes and normal passes are used, and these features result in a +notable performance penalty. + +On desktop platforms, the use of sub-passes won't have any impact on +performance. However, this backend can still perform better than Clustered +Forward in simple scenes thanks to its lower complexity and lower bandwidth +usage. This is especially noticeable on low-end GPUs, integrated graphics or in +VR applications. + +Given its low-end focus, this backend does not provide high-end rendering +features such as SDFGI and :ref:`doc_volumetric_fog`. Most post-processing +effects are also not available. + +.. _doc_internal_rendering_architecture_compatibility: + +Compatibility +^^^^^^^^^^^^^ + +.. note:: + + This is the only backend available when using the OpenGL driver. This + backend is not available when using Vulkan or Direct3D 12. + +This is a traditional (non-clustered) forward renderer. It's intended for old +GPUs that don't have Vulkan support, as well as HTML5 (with WebGL 2.0). + +Given its low-end focus, this backend does not provide high-end rendering +features (even less so compared to Forward Mobile). Most post-processing effects +are not available. + +Why not deferred rendering? +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Forward rendering generally provides a better tradeoff for performance versus +flexibility, especially when a clustered approach to lighting is used. While +deferred rendering can be faster in some cases, it's also less flexible and +requires using hacks to be able to use MSAA. Since games with a less realistic +art style can benefit a lot from MSAA, we chose to go with forward rendering for +Godot 4 (like Godot 3). + +That said, parts of the forward renderer *are* performed with a deferred approach +to allow for some optimizations when possible. This applies to VoxelGI and SDFGI +in particular. + +A clustered deferred renderer may be developed in the future. This renderer +could be used in situations where performance is favored over flexibility. + +Summary of rendering drivers/backends +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following rendering API + backend combinations are currently possible: + +- Vulkan + Clustered Forward +- Vulkan + Forward Mobile +- Direct3D 12 + Clustered Forward +- Direct3D 12 + Forward Mobile +- Metal + Clustered Forward (via MoltenVK) +- Metal + Forward Mobile (via MoltenVK) +- OpenGL + Compatibility + +Each combination has its own limitations and performance characteristics. Make +sure to test your changes on all backends if possible before opening a pull +request. + +RenderingDevice abstraction +--------------------------- + +.. note:: + + The OpenGL driver does not use the RenderingDevice abstraction. + +To make the complexity of modern low-level graphics APIs more manageable, +Godot uses its own abstraction called RenderingDevice. + +This means that when writing code for modern rendering backends, you don't +actually use the Vulkan or Direct3D 12 APIs directly. While this is still +lower-level than an API like OpenGL, this makes working on the renderer easier, +as RenderingDevice will abstract many API-specific quirks for you. + +**Vulkan RenderingDevice implementation:** + +- `drivers/vulkan/rendering_device_vulkan.cpp `__ + +Core rendering classes architecture +----------------------------------- + +This diagram represents the structure of rendering classes in Godot, including the RenderingDevice abstraction: + +.. image:: img/rendering_architecture_diagram.webp + +`View at full size `__ + +.. _doc_internal_rendering_architecture_core_shaders: + +Core shaders +------------ + +While shaders in Godot projects are written using a +:ref:`custom language inspired by GLSL `, core shaders are +written directly in GLSL. + +These core shaders are embedded in the editor and export template binaries at +compile-time. To see any changes you've made to those GLSL shaders, you need to +recompile the editor or export template binary. + +Some material features such as height mapping, refraction and proximity fade are +not part of core shaders, and are performed in the default BaseMaterial3D using +the Godot shader language instead (not GLSL). This is done by procedurally +generating the required shader code depending on the features enabled in the +material. + +By convention, shader files with ``_inc`` in their name are included in other +GLSL files for better code reuse. Standard GLSL preprocessing is used to achieve +this. + +.. warning:: + + Core material shaders will be used by every material in the scene – both + with the default BaseMaterial3D and custom shaders. As a result, these + shaders must be kept as simple as possible to avoid performance issues and + ensure shader compilation doesn't become too slow. + + If you use ``if`` branching in a shader, performance may decrease as + :abbr`VGPR (Vector General-Purpose Register)` usage will increase in the + shader. This happens even if all pixels evaluate to ``true`` or ``false`` in + a given frame. + + If you use ``#if`` preprocessor branching, the number of required shader + versions will increase in the scene. In a worst-case scenario, adding a + single boolean ``#define`` can *double* the number of shader versions that + may need to be compiled in a given scene. In some cases, Vulkan + specialization constants can be used as a faster (but more limited) + alternative. + + This means there is a high barrier to adding new built-in material features + in Godot, both in the core shaders and BaseMaterial3D. While BaseMaterial3D + can make use of dynamic code generation to only include the shader code if + the feature is enabled, it'll still require generating more shader versions + when these features are used in a project. This can make shader compilation + stutter more noticeable in complex 3D scenes. + + See + `The Shader Permutation Problem `__ + and + `Branching on a GPU `__ + blog posts for more information. + +**Core GLSL material shaders:** + +- Clustered Forward: `servers/rendering/renderer_rd/shaders/scene_forward_clustered.glsl `__ +- Forward Mobile: `servers/rendering/renderer_rd/shaders/scene_forward_mobile.glsl `__ +- Compatibility: `drivers/gles3/shaders/scene.glsl `__ + +**Material shader generation:** + +- `scene/resources/material.cpp `__ + +**Other GLSL shaders for Clustered Forward and Forward Mobile backends:** + +- `servers/rendering/renderer_rd/shaders/ `__ +- `modules/lightmapper_rd/ `__ + +**Other GLSL shaders for the Compatibility backend:** + +- `drivers/gles3/shaders/ `__ + +2D and 3D rendering separation +------------------------------ + +.. note:: + + Only available in the Clustered Forward and Forward Mobile backends, not the + Compatibility backend. Multiple Viewports can still be used to emulate this + when using the Compatibility backend, or to perform 2D resolution scaling. + +2D and 3D are rendered to separate buffers, as 2D rendering in Godot is performed +in :abbr:`LDR (Low Dynamic Range)` sRGB-space while 3D rendering uses +:abbr:`HDR (High Dynamic Range)` linear space. + +3D resolution scaling is performed differently depending on whether bilinear or +FSR 1.0 scaling is used. When bilinear scaling is used, no special upscaling +shader is run. Instead, the viewport's texture is stretched and displayed with a +linear sampler (which makes the filtering happen directly on the hardware). This +allows maximizing the performance of bilinear 3D scaling. + +The ``configure()`` function in RenderSceneBuffersRD reallocates the 2D/3D +buffers when the resolution or scaling changes. + +Dynamic resolution scaling isn't supported yet, but is planned in a future Godot +release. + +**2D and 3D rendering buffer configuration C++ code:** + +- `servers/rendering/renderer_rd/storage_rd/render_scene_buffers_rd.cpp `__ + +2D rendering techniques +----------------------- + +2D light rendering is performed in a single pass to allow for better performance +with large amounts of lights. + +The clustered and mobile backends don't feature 2D batching yet, but it's +planned for a future release. + +The compatibility backend features 2D batching to improve performance, which is +especially noticeable with lots of text on screen. + +MSAA can be enabled in 2D to provide line and polygon antialiasing, but FXAA +will never affect 2D rendering as it would look too blurry. + +A 2D signed distance field representing the viewport is automatically generated +if an user shader requests it. This can be used for various effects in custom +shaders, such as 2D global illumination. + +**2D SDF generation GLSL shader:** + +- `https://github.com/godotengine/godot/blob/master/servers/rendering/renderer_rd/shaders/canvas_sdf.glsl `__ + +3D rendering techniques +----------------------- + +Batching and instancing +^^^^^^^^^^^^^^^^^^^^^^^ + +In the Clustered Forward backend, Vulkan instancing is used to group rendering +of identical objects for performance. This is not as fast as static mesh +merging, but it still allows instances to be culled individually. + +Light, decal and reflection probe rendering +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. note:: + + Reflection probe and decal rendering are currently not available in the + Compatibility backend. + +As its name implies, the Clustered Forward backend uses clustered lighting. This +allows using as many lights as you want; performance largely depends on screen +coverage. Shadow-less lights can be almost free if they don't occupy much space +on screen. + +The Clustered Forward backend also supports rendering several directional lights +at the same time (albeit with lower shadow quality when more than one has +shadows enabled). Other backends are limited to only one DirectionalLight +rendered at a given time. + +The Forward Mobile backend uses a single-pass lighting approach, with a +limitation of 8 OmniLights + 8 SpotLights affecting each Mesh *resource*. This +limit is hardcoded and can't be adjusted in the project settings. + +The Compatibility backend uses a hybrid single-pass + multi-pass lighting +approach. Lights without shadows are rendered in a single pass (using less +accurate additive blending for performance reasons). Lights with shadows are +rendered in multiple passes. This is required for compatibility with low-end +mobile devices, but it has a huge cost to performance. As a result, it's +recommended to only have a handful of lights with shadows in the camera frustum +at most. The maximum number of lights visible at once can be adjusted in the +project settings. + +In all 3 backends, lights without shadows are much cheaper than lights with +shadows. To improve performance, lights are only updated when the light is +modified or when objects in its radius are modified. Godot currently doesn't +separate static shadow rendering from dynamic shadow rendering, but this is +planned in a future release. + +Clustering is also used for reflection probes and decal rendering in the +Clustered Forward backend. + +Shadow mapping +^^^^^^^^^^^^^^ + +Both Clustered Forward and Forward Mobile backends use +:abbr:`PCF (Percentage Closer Filtering)` to filter shadow maps and create a +soft penumbra. Instead of using a fixed PCF pattern, these backends use a vogel +disk pattern which allows for changing the number of samples and smoothly +changing the quality. + +Godot also supports percentage-closer soft shadows (PCSS) for more realistic +shadow penumbra rendering. PCSS shadows are limited to the Clustered Forward +backend as they're too demanding to be usable in the Forward Mobile backend. +PCSS also uses a vogel-disk shaped kernel. + +Additionally, both shadow-mapping techniques rotate the kernel on a per-pixel +basis to help soften under-sampling artifacts. + +The Compatibility backend doesn't support shadow mapping for any light types yet. + +Temporal antialiasing +^^^^^^^^^^^^^^^^^^^^^ + +.. note:: + + Only available in the Clustered Forward backend, not the Forward Mobile or + Compatibility backends. + +Godot uses a custom TAA implementation based on the old TAA implementation from +`Spartan Engine `__. + +Temporal antialiasing requires motion vectors to work. If motion vectors +are not correctly generated, ghosting will occur when the camera or objects move. + +Motion vectors are generated on the GPU in the main material shader. This is +done by running the vertex shader corresponding to the previous rendered frame +(with the previous camera transform) in addition to the vertex shader for the +current rendered frame, then storing the difference between them in a color buffer. + +Using `FSR 2.0 `__ instead +of a custom TAA implementation is planned in a future release. + +**TAA resolve:** + +- `servers/rendering/renderer_rd/shaders/effects/taa_resolve.glsl `__ + +Global illumination +^^^^^^^^^^^^^^^^^^^ + +.. note:: + + VoxelGI and SDFGI are only available in the Clustered Forward backend, not the + Forward Mobile or Compatibility backends. + + LightmapGI *baking* is only available in the Clustered Forward and Forward + Mobile backends, and can only be performed within the editor (not in an + exported project). LightmapGI *rendering* is available in all backends + including Compatibility. + +Godot supports voxel-based GI (VoxelGI), signed distance field GI (SDFGI) and +lightmap baking and rendering (LightmapGI). These techniques can be used +simultaneously if desired. + +Lightmap baking happens on the GPU using Vulkan compute shaders – CUDA or OpenCL +are **not** used. Using Vulkan compute allows for greater compatibility and +performance on all kinds of GPUs. In contrast, CUDA only runs on NVIDIA hardware +and OpenCL performance is crippled on NVIDIA GPUs. + +The GPU-based lightmapper is implemented in the LightmapperRD class, which inherits +from the Lightmapper class. This allows for implementing additional lightmappers, +paving the way for a future port of the CPU-based lightmapper present in Godot 3.x. +This would allow baking lightmaps while using the Compatibility backend. + +**Core GI C++ code:** + +- `servers/rendering/renderer_rd/environment/gi.cpp `__ +- `scene/3d/voxel_gi.cpp `__ - VoxelGI node +- `editor/plugins/voxel_gi_editor_plugin.cpp `__ - Editor UI for the VoxelGI node + +**Core GI GLSL shaders:** + +- `servers/rendering/renderer_rd/shaders/environment/voxel_gi.glsl `__ +- `servers/rendering/renderer_rd/shaders/environment/voxel_gi_debug.glsl `__ - VoxelGI debug draw mode +- `servers/rendering/renderer_rd/shaders/environment/sdfgi_debug.glsl `__ - SDFGI Cascades debug draw mode +- `servers/rendering/renderer_rd/shaders/environment/sdfgi_debug_probes.glsl `__ - SDFGI Probes debug draw mode +- `servers/rendering/renderer_rd/shaders/environment/sdfgi_integrate.glsl `__ +- `servers/rendering/renderer_rd/shaders/environment/sdfgi_preprocess.glsl `__ +- `servers/rendering/renderer_rd/shaders/environment/sdfgi_direct_light.glsl `__ + +**Lightmapper C++ code:** + +- `scene/3d/lightmap_gi.cpp `__ - LightmapGI node +- `editor/plugins/lightmap_gi_editor_plugin.cpp `__ - Editor UI for the LightmapGI node +- `scene/3d/lightmapper.cpp `__ - Abstract class +- `modules/lightmapper_rd/lightmapper_rd.cpp `__ - GPU-based lightmapper implementation + +**Lightmapper GLSL shaders:** + +- `modules/lightmapper_rd/lm_raster.glsl `__ +- `modules/lightmapper_rd/lm_compute.glsl `__ +- `modules/lightmapper_rd/lm_blendseams.glsl `__ + +Depth of field +^^^^^^^^^^^^^^ + +.. note:: + + Only available in the Clustered Forward and Forward Mobile backends, not the + Compatibility backend. + +The Clustered Forward and Forward Mobile backends use different approaches to +DOF rendering, with different visual results. This is done to best +match the performance characteristics of the target hardware. In Clustered +Forward, DOF is performed using a compute shader. In Forward Mobile, DOF is +performed using a fragment shader (raster). + +Box, hexagon and circle bokeh shapes are available (from fastest to slowest). +Depth of field can optionally be jittered every frame to improve its appearance +when temporal antialiasing is enabled. + +**Depth of field C++ code:** + +- `servers/rendering/renderer_rd/effects/bokeh_dof.cpp `__ + +**Depth of field GLSL shader (compute - used for Clustered Forward):** + +- `servers/rendering/renderer_rd/shaders/effects/bokeh_dof.glsl `__ + +**Depth of field GLSL shader (raster - used for Forward Mobile):** + +- `servers/rendering/renderer_rd/shaders/effects/bokeh_dof_raster.glsl `__ + +Screen-space effects (SSAO, SSIL, SSR, SSS) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. note:: + + Only available in the Clustered Forward backend, not the Forward Mobile or + Compatibility backends. + +The Clustered Forward backend supports screen-space ambient occlusion, +screen-space indirect lighting, screen-space reflections and subsurface scattering. + +SSAO uses an implementation derived from Intel's +`ASSAO `__ +(converted to Vulkan). SSIL is derived from SSAO to provide high-performance +indirect lighting. + +When both SSAO and SSIL are enabled, parts of SSAO and SSIL are run in parallel +to reduce the performance impact. + +SSAO and SSIL are performed at half resolution by default to improve performance. +SSR is always performed at half resolution to improve performance. + +**Screen-space effects C++ code:** + +- `servers/rendering/renderer_rd/effects/ss_effects.cpp `__ + +**Screen-space ambient occlusion GLSL shader:** + +- `servers/rendering/renderer_rd/shaders/effects/ssao.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/ssao_blur.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/ssao_interleave.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/ssao_importance_map.glsl `__ + +**Screen-space indirect lighting GLSL shader:** + +- `servers/rendering/renderer_rd/shaders/effects/ssil.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/ssil_blur.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/ssil_interleave.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/ssil_importance_map.glsl `__ + +**Screen-space reflections GLSL shader:** + +- `servers/rendering/renderer_rd/shaders/effects/screen_space_reflection.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/screen_space_reflection_scale.glsl `__ +- `servers/rendering/renderer_rd/shaders/effects/screen_space_reflection_filter.glsl `__ + +**Subsurface scattering GLSL:** + +- `servers/rendering/renderer_rd/shaders/effects/subsurface_scattering.glsl `__ + +Sky rendering +^^^^^^^^^^^^^ + +.. seealso:: + + :ref:`doc_sky_shader` + +Godot supports using shaders to render the sky background. The radiance map +(which is used to provide ambient light and reflections for PBR materials) is +automatically updated based on the sky shader. + +The SkyMaterial resources such as ProceduralSkyMaterial, PhysicalSkyMaterial and +PanoramaSkyMaterial generate a built-in shader for sky rendering. This is +similar to what BaseMaterial3D provides for 3D scene materials. + +A detailed technical implementation can be found in the +`Custom sky shaders in Godot 4.0 `__ +article. + +**Sky rendering C++ code:** + +- ` `__ - Sky rendering +- ` `__ - Sky resource (not to be confused with sky rendering) +- ` `__ SkyMaterial resources (used in the Sky resource) + +**Sky rendering GLSL shader:** + +Volumetric fog +^^^^^^^^^^^^^^ + +.. note:: + + Only available in the Clustered Forward backend, not the Forward Mobile or + Compatibility backends. + +.. seealso:: + + :ref:`doc_fog_shader` + +Godot supports a frustum-aligned voxel (froxel) approach to volumetric fog +rendering. As opposed to a post-processing filter, this approach is more +general-purpose as it can work with any light type. Fog can also use shaders for +custom behavior, which allows animating the fog or using a 3D texture to +represent density. + +The FogMaterial resource generates a built-in shader for FogVolume nodes. This is +similar to what BaseMaterial3D provides for 3D scene materials. + +A detailed technical explanation can be found in the +`Fog Volumes arrive in Godot 4.0 `__ +article. + +**Volumetric fog C++ code:** + +- `servers/rendering/renderer_rd/environment/fog.cpp `__ - General volumetric fog +- `scene/3d/fog_volume.cpp `__ - FogVolume node +- `scene/resources/fog_material.cpp `__ - FogMaterial resource (used by FogVolume) + +**Volumetric fog GLSL shaders:** + +- `servers/rendering/renderer_rd/shaders/environment/volumetric_fog.glsl `__ +- `servers/rendering/renderer_rd/shaders/environment/volumetric_fog_process.glsl `__ + +Occlusion culling +^^^^^^^^^^^^^^^^^ + +While modern GPUs can handle drawing a lot of triangles, the number of draw +calls in complex scenes can still be a bottleneck (even with Vulkan and Direct3D +12). + +Godot 4 supports occlusion culling to reduce overdraw (when the depth prepass +is disabled) and reduce vertex throughput. +This is done by rasterizing a low-resolution buffer on the CPU using +`Embree `__. The buffer's resolution depends +on the number of CPU threads on the system, as this is done in parallel. +This buffer includes occluder shapes that were baked in the editor or created at +run-time. + +As complex occluders can introduce a lot of strain on the CPU, baked occluders +can be simplified automatically when generated in the editor. + +Godot's occlusion culling doesn't support dynamic occluders yet, but +OccluderInstance3D nodes can still have their visibility toggled or be moved. +However, this will be slow when updating complex occluders this way. Therefore, +updating occluders at run-time is best done only on simple occluder shapes such +as quads or cuboids. + +This CPU-based approach has a few advantages over other solutions, such as +portals and rooms or a GPU-based culling solution: + +- No manual setup required. +- No frame delay, which is problematic in cutscenes during camera cuts or when + the camera moves fast behind a wall. +- Works the same on all rendering drivers and backends, with no unpredictable + behavior depending on the driver or GPU hardware. + +Occlusion culling is performed by registering occluder meshes, which is done +using OccluderInstance3D *nodes* (which themselves use Occluder3D *resources*). +RenderingServer then performs occlusion culling by calling Embree in +RendererSceneOcclusionCull. + +**Occlusion culling C++ code:** + +- `scene/3d/occluder_instance_3d.cpp `__ +- `servers/rendering/renderer_scene_occlusion_cull.cpp `__ + +Visibility range (LOD) +^^^^^^^^^^^^^^^^^^^^^^^ + +Godot supports manually authored hierarchical level of detail (HLOD), with +distances specified by the user in the inspector. + +In RenderingSceneCull, the ``_scene_cull()`` and ``_render_scene()`` functions +are where most of the LOD determination happens. Each viewport can render the +same mesh with different LODs (to allow for split screen rendering to look correct). + +**Visibility range C++ code:** + +- `servers/rendering/renderer_scene_cull.cpp `__ + +Automatic mesh LOD +^^^^^^^^^^^^^^^^^^ + +The ImporterMesh class is used for the 3D mesh import workflow in the editor. +Its ``generate_lods()`` function handles generating using the +`meshoptimizer `__ library. + +LOD mesh generation also generates shadow meshes at the same time. These are +meshes that have their vertices welded regardless of smoothing and materials. +This is used to improve shadow rendering performance by lowering the vertex +throughput required to render shadows. + +The RenderingSceneCull class's ``_render_scene()`` function determines which +mesh LOD should be used when rendering. Each viewport can render the +same mesh with different LODs (to allow for split screen rendering to look correct). + +The mesh LOD is automatically chosen based on a screen coverage metric. This +takes resolution and camera FOV changes into account without requiring user +intervention. The threshold multiplier can be adjusted in the project settings. + +To improve performance, shadow rendering and reflection probe rendering also choose +their own mesh LOD thresholds (which can be different from the main scene rendering). + +**Mesh LOD generation on import C++ code:** + +- `scene/resources/importer_mesh.cpp `__ + +**Mesh LOD determination C++ code:** + +- `servers/rendering/renderer_scene_cull.cpp `__