
Unit/integration testing: Testing graphical and UI code. #1760

Open
bruvzg opened this issue Nov 2, 2020 · 15 comments

@bruvzg
Member

bruvzg commented Nov 2, 2020

Describe the project you are working on:
Godot engine.

Describe the problem or limitation you are having in your project:
Unit testing was introduced in godotengine/godot#40148, but currently there is no way to automatically test GUI- and rendering-related code.

Related proposals: #1307 (testing contexts), #1533 (the old tests had at least some rendering and UI tests)

Describe the feature / enhancement and how it helps to overcome the problem or limitation:
Implement an off-screen DisplayServer for use on headless CI, make it compatible with software Vulkan (SwiftShader) / OpenGL (OSMesa) implementations so it can run on CI without a GPU, and add a testing framework context with an active rendering pipeline (initialized display and rendering servers, and the normal project main loop).

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:

  1. The testing framework renders small, simple scenes for isolated graphical features (materials, shaders, lighting/shadows, etc.) or for the reaction of GUI elements to simulated input events, using fixed time steps for deterministic behavior.
  2. It takes screenshots at predefined points in time (to test multiple rendering steps in succession and to test particles/animations) and stores them, probably downscaled to avoid overly large files and to smooth out minor differences (see the sketch after this list).
  3. Screenshots are compared (by the engine or an external script) to the reference images, and marked for manual inspection if they have substantial differences (for example by adding a thick, red border to the image).
  4. Screenshots are uploaded as a build artifact (an archive with one image per test suite).
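
For illustration only, a minimal GDScript sketch of steps 1–2, assuming the Godot 4 API; the capture frame, output size, and output path are hypothetical placeholders, the scene setup is not shown, and --fixed-fps would be passed on the command line for determinism:

extends Node

# Hypothetical capture script attached to the root of a small test scene.
const CAPTURE_FRAME := 10               # frame at which the screenshot is taken
const OUTPUT_SIZE := Vector2i(256, 144) # downscaled to keep artifacts small

func _process(_delta):
    if Engine.get_frames_drawn() < CAPTURE_FRAME:
        return
    # Grab the last rendered frame, downscale it, and store it for later comparison.
    var img := get_viewport().get_texture().get_image()
    img.resize(OUTPUT_SIZE.x, OUTPUT_SIZE.y, Image.INTERPOLATE_LANCZOS)
    img.save_png("user://test_artifacts/lighting_shadows.png")
    get_tree().quit()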


If this enhancement will not be used often, can it be worked around with a few lines of script?:
It can be used as part of CI to detect rendering, physics and GUI regressions, and to quickly test specific hardware or driver versions for rendering issues (the same context should be usable with the normal DisplayServers as well).

Is there a reason why this should be core and not an add-on in the asset library?:
It should be possible to achieve this with a module or a GDScript project, but it's probably better to have testing-related functionality in core, for cleaner CI configs and to avoid duplicating code across multiple test projects.

@Xrayez
Contributor

Xrayez commented Nov 2, 2020

Test contexts

The minimal testing context was introduced in godotengine/godot#40980 without rendering capabilities, but has been working alright for unit testing specifically so far.

The way I see it, it may be feasible to just introduce another integration test context manually. I've previously attempted to create test contexts using doctest's dynamic filtering in https://github.com/Xrayez/godot/tree/test-contexts, but it may be too complex to maintain and error-prone.

The main challenge is registering setup/teardown methods with doctest, which is not something doctest supports out of the box (at least not without code duplication). The suggested setup/teardown mechanism in doctest is to use SUBCASEs, but I think that works better for avoiding duplication within a test case itself rather than for setting up the test environment.

The entry point for unit and integration testing could be rewritten to accept things like:

  • --test unit
  • --test integration
  • --test project
  • --test rendering

This way, I think it would still be possible to use doctest for those (like godotengine/godot#42938). It means that the entry point would go through an additional interface layer, so to speak.

This kind of setup would also help #1533, because it means no compatibility breakage would be needed in the first place. That said, godotengine/godot#40148 didn't preserve compatibility with the old tests.

Graphical and UI code testing

I think testing graphical and UI code requires a MainLoop to be running. It's totally possible to feed input events via code, as seen in Xrayez/godot-testbed#5:

extends "res://addons/gut/test.gd"

# https://github.com/godotengine/godot/issues/32597

class TabContainerGuiInputCrash extends TabContainer:

    var ev = InputEventMouseButton.new()

    func _ready():
        var pm := PopupMenu.new()
        set_popup(pm)
        pm.queue_free()

        yield(get_tree(), "idle_frame")
        yield(get_tree(), "idle_frame")
        yield(get_tree(), "idle_frame")

        ev.pressed = true
        ev.button_index = BUTTON_LEFT
        ev.button_mask = BUTTON_LEFT
        ev.position = Vector2(0, 14)

        Input.parse_input_event(ev)

        yield(get_tree(), "idle_frame")
        yield(get_tree(), "idle_frame")

        Input.parse_input_event(ev)
        Input.parse_input_event(ev)

var container

func setup():
    var gut_window = get_parent().get_node('Gut')
    gut_window.hide() # need to hide to properly detect input event

    container = TabContainerGuiInputCrash.new()
    add_child(container)


func test_tab_container_gui_input():
    yield(yield_for(1.0, 'Hopefully no crash happens.'), YIELD)
    assert_true(true, "No crash, great!")


func teardown():
    container.queue_free()

The --fixed-fps and --disable-render-loop command-line options could potentially be used to speed up the simulation and to control the rendering loop via code with RenderingServer.force_draw(). See also godotengine/godot#43260; I'm not sure whether those methods would actually be useful for this.
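
If those methods do turn out to be usable, a rough, untested sketch of the idea might look like this, assuming Godot 4 names and that the project is launched with --disable-render-loop:

extends Node

func _ready():
    # With the render loop disabled, nothing is drawn unless we request it,
    # so each frame is advanced explicitly and deterministically.
    for i in 3:
        RenderingServer.force_draw()
        await RenderingServer.frame_post_draw

    # Feed a simulated left click, then draw one more frame to observe the result.
    var ev := InputEventMouseButton.new()
    ev.button_index = MOUSE_BUTTON_LEFT
    ev.pressed = true
    ev.position = Vector2(0, 14)
    Input.parse_input_event(ev)

    RenderingServer.force_draw()
    await RenderingServer.frame_post_draw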

3. Screenshots are compared (by the engine or an external script) to the reference images, and marked for manual inspection if they have substantial differences (for example by adding a thick, red border to the image).

doctest could be used for this, as for GDScript integration tests (#1429), but it may be overkill, so perhaps an extra step would indeed be required.

But in theory, all this could be done from within a Godot project running on CI. This is where testing frameworks like GUT shine, in my opinion. For instance, I've been successfully running unit tests in Goost, but we still need a way to render stuff on CI.

@c0d1f1ed

c0d1f1ed commented Jan 7, 2021

I noticed at https://bruvzg.github.io/using-godot-with-swiftshader-software-vulkan-emulation.html that you had to increase SwiftShader's bound descriptor set limit to 16 to get it to work with Godot. I'm curious why that's required. Currently only just over half of the Vulkan drivers support 16 or more: https://vulkan.gpuinfo.org/displaydevicelimit.php?name=maxBoundDescriptorSets. While this metric does not take deployments into account, it still seems to me that important classes of GPUs only support 8, or 4, bound descriptor sets.

I don't mind upstreaming this change to permanently increase it, but I'd love to understand how an engine like Godot uses more than 4 descriptor sets, and what might be a good balance. It seems like no GPU has 16, so I guess 8 would already suffice? Any significant advantage from increasing it to 32? Thanks!

@bruvzg
Member Author

bruvzg commented Jan 7, 2021

I noticed at https://bruvzg.github.io/using-godot-with-swiftshader-software-vulkan-emulation.html that you had to increase SwiftShader's bound descriptor set limit to 16 to get it to work with Godot.

Godot's RenderingDeviceVulkan supports up to 16 descriptor sets, but 6 should be fine for the current version.

Edit: Actually it might work with 4 since godotengine/godot#44175 was merged.

@bruvzg
Member Author

bruvzg commented Jan 7, 2021

Actually it might work with 4 since godotengine/godot#44175 was merged.

I have checked the current master of Godot, and it's working with a limit of 4 descriptor sets, so this change is not necessary anymore.

@fire
Member

fire commented Jan 18, 2021

Did anyone try Robot Framework to provide visual tests?

We need to evaluate:

https://robotframework.org/#documentation

We could use Robot Framework and pick one of the available libraries that support Vulkan.

@fire
Member

fire commented Jan 19, 2021

I have made a prototype using robotframework.

This sample does two things:

  • Execute Godot on Windows
    • Close project manager
  • Execute notepad
    • Type text
    • Close

https://github.com/fire/robotframework-godot

@fire
Member

fire commented Jan 19, 2021

Added a video recording task.

Using vmaf we were able to get a score of 97.430362 for the same video and 65.083790 for different videos.

.\data\ffmpeg-N-100672-gf3f5ba0bf8-win64-lgpl-shared-vulkan\bin\ffmpeg.exe -i default_1.webm -pix_fmt yuv420p default_1.y4m
.\data\ffmpeg-N-100672-gf3f5ba0bf8-win64-lgpl-shared-vulkan\bin\ffmpeg.exe -i godot_1.webm -pix_fmt yuv420p godot_1.y4m
copy godot_1.y4m godot_2.y4m
.\data\vmaf.exe --reference .\godot_1.y4m --distorted .\default_1.y4m
.\data\vmaf.exe --reference .\godot_1.y4m --distorted .\godot_2.y4m 

I haven't written a script for it yet, but it's also possible to take a screenshot, run comparison stats, and get a visual diff. I used reg-cli built as a single executable.

@Calinou
Member

Calinou commented Feb 23, 2022

I have a proof of concept that uses Nut.js here: https://github.com/Calinou/godot/tree/add-editor-ui-tests/misc/ui_tests

For the editor, I don't know what kind of "workflows" would be best to apply within the automated tests, though. Creating a basic project automatically, running it, then stopping it would be useful, but it wouldn't be testing a whole lot of functionality.

Also, I haven't figured out how to run it on a headless server (with Xvfb + Lavapipe/SwiftShader) yet.

@fire
Member

fire commented Feb 23, 2022

I was using Robot Framework because it can run the editor, using image recognition to find buttons, and then execute the process under SwiftShader.

@nikitalita worked on SwiftShader CI/CD integration.

Edited:

I evaluated Nut.js; it doesn't seem to have support for everything. https://robotframework.org/#resources

@nikitalita

My initial attempts at visual regression testing have revealed that output can vary wildly between video cards and even between driver versions. It's not really noticeable to the human eye, but a 1-to-1 comparison, or even a fuzzy comparison at >95% similarity, of frame captures will fail unless the test environment is set up exactly the same way for the baseline and the subsequent tests (preferably the exact same machine). @myaaaaaaaaa have you encountered this?

@Calinou
Member

Calinou commented Feb 14, 2023

My initial attempts at visual regression testing have revealed that output can vary wildly between video cards and even between driver versions. It's not really noticeable to the human eye, but a 1-to-1 comparison, or even a fuzzy comparison at >95% similarity, of frame captures will fail unless the test environment is set up exactly the same way for the baseline and the subsequent tests (preferably the exact same machine). @myaaaaaaaaa have you encountered this?

See How (not) to test graphics algorithms. A dssim check should be able to work out decently if it has a large enough threshold, but in general, it's recommended to have a few "complete" test images over a lot of "partial" tests covering isolated features. This may be counter-intuitive, but it makes checking for regressions a lot less time-consuming. We should be careful about "alarm fatigue" in general when it comes to this kind of regression testing, as it's an easy trap to fall into.
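
This is not dssim, but as a crude GDScript illustration of the "large enough threshold" idea (the function name and the threshold value are arbitrary and would need tuning):

# Compare a capture against a reference image and only fail past a generous
# threshold, to reduce false positives from GPU/driver differences.
func images_roughly_match(reference: Image, capture: Image, threshold := 0.05) -> bool:
    if reference.get_size() != capture.get_size():
        return false
    var sum := 0.0
    for y in reference.get_height():
        for x in reference.get_width():
            var a := reference.get_pixel(x, y)
            var b := capture.get_pixel(x, y)
            sum += pow(a.r - b.r, 2) + pow(a.g - b.g, 2) + pow(a.b - b.b, 2)
    var rmse := sqrt(sum / float(reference.get_width() * reference.get_height() * 3))
    return rmse <= threshold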

@mariomadproductions

I wonder if this would be useful for "whole game" tests. The developer would record the inputs, RNG seed and movie for a playthrough. The movie or perceptual hash of the movie would be stored, and then the inputs and RNG seed would be used to replay the movie and compare with the developer's playthrough. This could be useful to automatically test if a game still functions correctly when ported to another platform/godot version. A self-test option could also be included in published builds, for players to use. For the self-test, as the full thing might take too long for large and performance-heavy games, there could just be an option for a cut-down playthrough, or playthrough of a test level/test suite.
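
As a very rough sketch of the replay half of that idea (the structure is hypothetical; recording, the storage format, and the determinism caveats discussed below are left out):

extends Node

# Pairs of [frame: int, event: InputEvent], loaded from a previous recording.
var playback: Array = []
var frame := 0

func _ready():
    seed(12345)  # fixed seed so RNG-driven gameplay repeats between runs

func _process(_delta):
    # Re-inject every event that was originally recorded on this frame.
    while not playback.is_empty() and playback[0][0] == frame:
        Input.parse_input_event(playback.pop_front()[1])
    frame += 1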

@Calinou
Member

Calinou commented Apr 24, 2024

I wonder if this would be useful for "whole game" tests.

Godot's physics engines are not deterministic, so this wouldn't be useful unless your game doesn't rely on the physics engine at all (and uses its own deterministic physics implementation).

Subtle differences in rendering (due to different GPU hardware or driver versions) can also be introduced, which would cause the hash to be invalid.

@mariomadproductions

mariomadproductions commented Apr 24, 2024

Makes sense, regarding the physics.

For the differences in rendering, I think perceptual hashes are designed to allow leeway for small changes. And I'd think you'd want to detect large differences in a game when using different GPU/driver configurations?

But maybe this should be a separate discussion thread, actually.

@Calinou
Member

Calinou commented Apr 24, 2024

For the differences in rendering, I think perceptual hashes are designed to allow leeway for small changes. And I'd think you'd want to detect large differences in a game when using different GPU/driver configurations?

Yes, tools like dssim can be used to calculate a similarity score between two images. Tweaking the value threshold is an art in itself though, and you need to record your videos using lossless compression, which results in huge files.
