AtlasEngine: Import 19 commits from release-1.16
AtlasEngine: Implement LRU invalidation for glyph tiles (#13458)

So far AtlasEngine could only grow the backing texture atlas once it got full,
without the ability to reuse existing tiles. This commit adds LRU
capabilities to the glyph-to-tile hashmap, allowing us to reuse the least
recently used tiles for new glyphs once the atlas texture is full.
This commit uses a quadratic growth factor with power-of-2 textures,
resulting in a backing atlas of 1x to 2x the size of the window.
While AtlasEngine is still incapable of shrinking the texture, it'll now at
least not grow to 128MB or result in weird glitches under most circumstances.
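
The LRU bookkeeping described above could look roughly like the following. This is a minimal sketch and not AtlasEngine's actual implementation: `LruTileMap`, `acquire`, and the pairing of `std::list` with `std::unordered_map` are assumptions made for illustration, and tiles are reduced to plain indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a glyph-to-tile map with a fixed number of tiles.
class LruTileMap
{
public:
    explicit LruTileMap(std::size_t capacity) :
        _capacity{ capacity }
    {
        assert(capacity > 0);
    }

    // Returns the tile index for `glyph` and marks it as most recently used.
    // `inserted` is set to true if the caller still has to rasterize the glyph,
    // either into a fresh tile or into a recycled one.
    std::uint32_t acquire(char32_t glyph, bool& inserted)
    {
        if (const auto it = _index.find(glyph); it != _index.end())
        {
            // Cache hit: move the entry to the front of the usage order.
            _order.splice(_order.begin(), _order, it->second);
            inserted = false;
            return it->second->second;
        }

        inserted = true;
        std::uint32_t tile = 0;
        if (_order.size() < _capacity)
        {
            // The atlas still has unused tiles.
            tile = static_cast<std::uint32_t>(_order.size());
        }
        else
        {
            // The atlas is full: recycle the least recently used tile.
            tile = _order.back().second;
            _index.erase(_order.back().first);
            _order.pop_back();
        }
        _order.emplace_front(glyph, tile);
        _index[glyph] = _order.begin();
        return tile;
    }

private:
    using Entry = std::pair<char32_t, std::uint32_t>;

    std::size_t _capacity;
    std::list<Entry> _order; // front = most recently used
    std::unordered_map<char32_t, std::list<Entry>::iterator> _index;
};
```

With a capacity of 2, acquiring `a`, `b`, `a`, `c` in that order recycles the tile of `b`, because `a` was touched more recently.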

* Print `utf8_sequence_0-0x2ffff_assigned_printable_unseparated.txt`
  from https://github.com/bits/UTF-8-Unicode-Test-Documents
* Scroll back up to the top
* PowerShell input line is still there rendering as ASCII. ✅

AtlasEngine: Improve glyph generation performance (#13477)

PR #13458 added LRU invalidation to AtlasEngine,
so that we stop running out of GPU memory for complex Unicode.
This however can result in our glyph generation being a performance issue in
edge cases, to the point that the application may feel outright unusable.

CJK glyphs for instance can easily exceed the maximum atlas texture size
(twice the window size), but take a significant amount of CPU and GPU time to
rasterize and draw, which results in "jelly scrolling" down to ~1 FPS.
This PR improves the GPU half of that cost by drawing glyphs directly
into the texture atlas without an intermediate scratchpad texture.

This reduces GPU usage by 96% on my system (33% -> 2%) which improves general
render performance by ~100% (15 -> 30 FPS). CPU usage remains the same however,
but that's not really something we can do anything about at this time.
The atlas texture is already our primary means to reduce the CPU cost after all.

* Disable V-Sync for OpenConsole in NVIDIA Control Panel
* Enable `debugGlyphGenerationPerformance`
* Print the entire CJK block U+4E00..U+9FFF
* Measure the above GPU usage and FPS improvements ✅
  (Alternatively: Just scroll around and judge the "jellyness".)

AtlasEngine: Fix bugs introduced in 66f4f9d and d74b66a (#13496)

We only process glyphs within the dirtyRect, but glyphs outside of the dirtyRect
are still in use and shouldn't be discarded. This is critical if someone uses
a tool like tmux to split the terminal horizontally. If they then print a lot
of Unicode text on just one side, we have to ensure that the (for example)
plain ASCII glyphs on the other half of the viewport are still retained.

The cursor was drawn without a clip rect, causing the entire atlas
texture to be filled with black. This just so happened to work fine
in Windows Terminal but relied on a race condition.

Closes #13490

* Disappearing glyphs
  * Start `tmux` in `wsl`
  * Split horizontally with `Ctrl+B`, `"`
  * `cat` a huge Unicode text file on the bottom
  * Ensure ASCII glyphs in the top half don't disappear ✅
* Black viewport after font changes
  * Start `OpenConsole` with `AtlasEngine`
  * Open Properties dialog and click "Ok"
  * Viewport content doesn't disappear ✅

AtlasEngine: Improve robustness against TextBuffer bugs (#13530)

The current TextBuffer implementation will happily overwrite the
leading/trailing half of a wide glyph with a narrow one without
padding the other half with whitespace. This could crash AtlasEngine
which aggressively guarded against such inconsistencies.

Closes #13522

* Run .bat file linked in #13522
  (Override wide glyph with a single space.)
* `AtlasEngine` doesn't crash ✅

AtlasEngine: Handle IntenseIsBold (#13577)

This change adds support for the `IntenseIsBold` rendering setting.
Windows Terminal for instance defaults to `false` here, causing
intense colors to only be bright but not bold.

* Set "Intense text style" to "Bright colors"
* Enable AtlasEngine
* Print ``echo "`e[1mtest`e[0m"``
* "test" appears as bright without being bold ✅

AtlasEngine: Fix LRU state after scrolling (#13607)

66f4f9d had another bug: Just like how we scroll our viewport by `memmove`ing
the `_r.cells` array, we also have to `memmove` the new `_r.cellGlyphMapping`.
Without this fix drawing lots of glyphs while only scrolling slightly
(= not invalidating the entire viewport), would erroneously call
`makeNewest` on glyphs now outside of the viewport. This would cause
actually visible glyphs to be recycled and overwritten by new ones.

* Switch to Cascadia Code
* Print some text that fills the glyph atlas
* Scroll down by a few rows
* Write a long "==========" ligature (this quickly fills up
  any remaining space in the atlas and exacerbates the issue)
* Unrelated rows don't get corrupted ✅

AtlasEngine: Remove support for Windows 7 (#13608)

We recently figured that we can drop support for Windows 7. Coincidentally
AtlasEngine never actually supported Windows 7 properly, because it called
`ResizeBuffers` with `DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT`
no matter whether the swap chain was created with it enabled.

The new minimally supported version is Windows 8.1.

AtlasEngine: Implement remaining grid lines (#13587)

This commit implements the remaining 5 of 8 grid lines:
left/top/right/bottom (COMMON_LVB) borders and double underline

`AtlasEngine::_resolveFontMetrics` was partially refactored to use `float`s
instead of `double`s, because that's what the remaining code uses as well.
It also helps the new, slightly more complex double underline calculation.

* Print characters with the `COMMON_LVB_GRID_HORIZONTAL`, `GRID_LVERTICAL`,
  `GRID_RVERTICAL` and `UNDERSCORE` attributes via `WriteConsoleOutputW`
* All 4 grid lines are visible ✅
* Grid lines correctly scale according to the `lineWidth` ✅
* Print a double underline with `printf "\033[21mtest\033[0m"`
* A double underline is fully visible ✅

AtlasEngine: Scale glyphs to better fit the cell size (#13549)

This commit contains 3 improvements for glyph rendering:
* Scale block element and box drawing characters to fit the cell size
  "perfectly" without leaving pixel gaps between cells.
* Use `IDWriteTextLayout::GetOverhangMetrics` to determine whether glyphs
  are outside the given layout box and if they are, offset their position
  to fit them back in. If that still fails to fit, we downscale them.
* Always scale up glyphs that are more than 2 cells wide
  This ensures that long ligatures that mimic box drawing characters like
  "===" under Cascadia Code are upscaled just like regular box drawings.
  Unfortunately this causes ligature-heavy text (like Myanmar) to get an
  "uneven" appearance because some ligatures can suddenly appear too large.
  It's difficult to come up with a good heuristic here.
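
The "offset first, downscale as a fallback" idea from the second bullet can be sketched in one dimension as follows. This is only an illustration under stated assumptions: `HorizontalFit` and `fitHorizontally` are hypothetical names, the overhang values follow the DirectWrite sign convention (positive means ink outside the layout box, negative means spare room inside it), and the placement of a downscaled glyph is left out.

```cpp
#include <cassert>

// Hypothetical result type: how far to shift the glyph and how much to scale it.
struct HorizontalFit
{
    float offset; // positive = shift right, negative = shift left
    float scale;  // 1.0f = no scaling
};

// boxWidth: width of the layout box in pixels.
// overhangLeft/overhangRight: how far the ink overshoots the respective edge.
inline HorizontalFit fitHorizontally(float boxWidth, float overhangLeft, float overhangRight)
{
    assert(boxWidth > 0.0f);

    // Total width of the visible ink.
    const float inkWidth = boxWidth + overhangLeft + overhangRight;
    if (inkWidth > boxWidth)
    {
        // No offset can make the ink fit, so it has to be scaled down.
        return { 0.0f, boxWidth / inkWidth };
    }

    // The ink fits into the box. Shift it back in if it sticks out on one side.
    if (overhangLeft > 0.0f)
    {
        return { overhangLeft, 1.0f };
    }
    if (overhangRight > 0.0f)
    {
        return { -overhangRight, 1.0f };
    }
    return { 0.0f, 1.0f };
}
```

A glyph that overshoots the left edge by 2 pixels but has 3 pixels of room on the right is shifted right by 2 pixels, while a glyph that overshoots both edges is scaled instead.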

Closes #12512

* Print UTF-8-demo.txt
* Block characters don't leave gaps ✅
* Print a lorem-ipsum in Myanmar
* Glyphs aren't cut off anymore ✅
* Print a long "===" ligature under Cascadia Code
* The ligature is as wide as the number of cells used ✅

AtlasEngine: Recognize Powerline glyphs (#13650)

This commit makes AtlasEngine recognize Powerline glyphs as box drawing ones.
The extra pixel offsets when determining the `scale` caused weird artifacts
and thus were removed. It seems like this causes no noticeable regressions.

Closes #13029

* Run all values of `wchar_t` through `isInInversionList`
  and ensure it produces the expected value ✅
* Powerline glyphs are correctly scaled with Cascadia Code PL ✅
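
For readers unfamiliar with inversion lists: the lookup is a binary search over sorted range boundaries, and a character is a member if an odd number of boundaries is less than or equal to it. The sketch below is a hedged illustration, not the function from this commit: `isInExampleInversionList` and `exampleBoundaries` are hypothetical, and the table only covers Box Drawing plus Block Elements (U+2500..U+259F) and the four Powerline separators U+E0B0..U+E0B3.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Boundaries alternate between "range starts here" and "range ends before here".
// Illustrative table only: [U+2500, U+25A0) and [U+E0B0, U+E0B4).
const std::array<char16_t, 4> exampleBoundaries{ 0x2500, 0x25A0, 0xE0B0, 0xE0B4 };

template<std::size_t N>
bool isInExampleInversionList(const std::array<char16_t, N>& boundaries, char16_t ch)
{
    static_assert(N % 2 == 0, "every range needs a start and an end boundary");
    assert(std::is_sorted(boundaries.begin(), boundaries.end()));

    // Find the first boundary greater than ch. Its index equals the number
    // of boundaries <= ch, which is odd exactly when ch lies inside a range.
    const auto it = std::upper_bound(boundaries.begin(), boundaries.end(), ch);
    return ((it - boundaries.begin()) % 2) != 0;
}
```

The exhaustive check mentioned in the validation steps maps naturally onto this shape: loop over all 65536 code units and compare the result against plain range comparisons.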

AtlasEngine: Fix debugGlyphGenerationPerformance (#13757)

`debugGlyphGenerationPerformance` used to only test the performance of
text segmentation/parsing, so I renamed it to `debugTextParsingPerformance`.
The new `debugGlyphGenerationPerformance` actually clears the glyph atlas now.

Additionally this fixes a bug with `debugGeneralPerformance`:
If a `DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT` is requested,
it needs to be used. Since `debugGeneralPerformance` is for testing without
V-Sync, we need to ensure that the waitable object is properly disabled.

AtlasEngine: Fix the fix for LRU state after scrolling (#13784)

The absolute disgrace of a fix called 65b71ff failed to account for `std::move`
being unsafe to use for overlapping ranges. While `std::move` works for trivial
types (it happens to delegate to `memmove`), we need to dynamically switch
between that and `std::move_backward` to be correct.

Without this fix the LRU refresh is incorrect and might lead to crashes.
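
The direction-dependent choice between `std::move` and `std::move_backward` can be illustrated with a small helper. This is a sketch under assumptions: `scrollRows` is a hypothetical function operating on a `std::vector`, not the code that shifts `_r.cellGlyphMapping`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Shifts the contents of `rows` by `offset` elements within the same buffer.
// A positive offset moves elements towards the end, a negative one towards
// the beginning. Elements that are not overwritten keep their old values.
template<typename T>
void scrollRows(std::vector<T>& rows, std::ptrdiff_t offset)
{
    const auto size = static_cast<std::ptrdiff_t>(rows.size());
    if (offset == 0 || offset <= -size || offset >= size)
    {
        return; // nothing to move, or everything scrolled out of range
    }

    if (offset < 0)
    {
        // The destination starts before the source, so copying front to
        // back never overwrites an element that still has to be read.
        std::move(rows.begin() - offset, rows.end(), rows.begin());
    }
    else
    {
        // The destination starts after the source. Copying front to back
        // would clobber unread elements, so copy back to front instead.
        std::move_backward(rows.begin(), rows.end() - offset, rows.end());
    }
}
```

Scrolling `{ 1, 2, 3, 4, 5 }` by 2 yields `{ 1, 2, 1, 2, 3 }`, and scrolling it by -2 yields `{ 3, 4, 5, 4, 5 }`.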

I'm working on a new, pure D2D renderer inside AtlasEngine, which uses
the iterators contained in `_r.cellGlyphMapping` to draw text.
I noticed the bug, because scrolling up caused the text to be garbled
and with this fix applied it works as expected.

AtlasEngine: Round cell sizes to nearest instead of up (#13833)

After some deliberation I noticed that rounding the glyph advance up to yield
the cell width is at least just as wrong as rounding it to the nearest. This is because
we draw glyphs centered, meaning that (at least in theory) anti-aliased
pixels might clip outside of the layout box on _both_ sides of the glyph
adding not 1 but 2 extra pixels to the glyph size. Instead of just `ceilf`
we would have had to use `ceilf(advanceWidth / 2) * 2` to account for that.

This commit simplifies our issue by just going with what other applications do:
Round all sizes (cell width and height) to the nearest pixel size.
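
To make the three rounding variants from the paragraphs above concrete, here is a small sketch. `CellWidthCandidates` and `cellWidthCandidates` are hypothetical names, and using `std::round` for "nearest" is an assumption about how the rounding is done.

```cpp
#include <cassert>
#include <cmath>

// The three ways of turning a fractional glyph advance into a cell width
// that are discussed above.
struct CellWidthCandidates
{
    float roundedUp;          // ceilf(advanceWidth), the previous behavior
    float roundedUpBothSides; // ceilf(advanceWidth / 2) * 2, rounding up per side
    float roundedToNearest;   // what this commit switches to
};

inline CellWidthCandidates cellWidthCandidates(float advanceWidth)
{
    assert(advanceWidth >= 0.0f);
    return {
        std::ceil(advanceWidth),
        std::ceil(advanceWidth / 2.0f) * 2.0f,
        std::round(advanceWidth),
    };
}
```

For an advance of 8.4 pixels the three candidates are 9, 10 and 8 pixels respectively.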

Closes #13812

* Set a breakpoint on `scalingRequired == true` in `AtlasEngine::_drawGlyph`
* Test an assortment of Cascadia Mono, Consolas, MS Gothic, Lucida Console
  at various font sizes (6, 7, 8, 10, 12, 24, ...)
* Ensure breakpoint isn't hit ✅
  This tells us that no glyph resizing was necessary

AtlasEngine: Improve RDP performance (#13816)

Direct2D is able to detect remote connections and will switch to sending
draw commands across RDP instead of rendering the data on the server.
This reduces the amount of data that needs to be transmitted as well
as the CPU load of the server, if it has no GPU installed.
This commit changes `AtlasEngine` to render with just Direct2D if a software or
remote device was chosen by `D3D11CreateDevice`. Selecting the DXGI adapter the
window is on explicitly in the future would allow us to be more precise here.

This new rendering mode doesn't implement some of the more fancy features just
yet, like inverted cursors or coloring a single wide glyph in multiple colors.
It reuses most existing facilities and uses the existing tile hash map to cache
DirectWrite text layouts to improve performance. Unfortunately this does incur
a fairly high memory overhead of approximately 25MB for a 120x30 viewport.

Additional drive-by changes include:
* Treat the given font size exactly as it's given without rounding
  Apparently we don't really need to round the font size to whole pixels
* Stop updating the const buffer on every frame
* Support window resizing if `debugGeneralPerformance` is enabled

Closes #13079

* Tested fairly exhaustively over RDP ✅

AtlasEngine: Implement support for custom shaders (#13885)

This commit implements support for custom shaders in AtlasEngine
(`experimental.retroTerminalEffect` and `experimental.pixelShaderPath`).
Setting these properties invalidates the device because that made it
the easiest to implement this less often used feature.
The retro shader was slightly rewritten so that it compiles without warnings.

Additionally we noticed that AtlasEngine works well with D3D 10.0 hardware,
so support for that was added bringing feature parity with DxRenderer.

Closes #13853

* Default settings (Independent Flip) ✅
* ClearType (Independent Flip) ✅
* Retro Terminal Effect (Composed Flip) ✅
* Use wallpaper as background image (Composed Flip) ✅
  * Running `color 40` draws everything red ✅
  * With Retro Terminal Effect ✅

AtlasEngine: Add support for SetSoftwareRendering (#13886)

This commit implements support for `experimental.rendering.software`.
There's not much to it. It's just another 2 if conditions.

* `"experimental.rendering.software": false` renders with D3D ✅
* `"experimental.rendering.software": true` triggers the new code path ✅

atlas: only enable continuous redraw if the shader needs it (#13903)

We do this by detecting whether the shader is using variable 0 in
constant buffer 0 (typically "time", but it can go by many names.)

Closes #13901

AtlasEngine: Fix various bugs found in testing (#13906)

In testing the following issues were found in AtlasEngine and fixed:
1. "Toggle terminal visual effects" action not working
2. `d2dMode` failed to work with transparent backgrounds
3. `GetSwapChainHandle()` is thread-unsafe due to it being called outside
  of the console lock and with single-threaded Direct2D enabled
4. 2 swap chain buffers are less performant than 3
5. Flip-Discard and `Present()` are less energy efficient than
  Flip-Sequential and `Present1()`
6. `d2dMode` used to copy the front to back buffer for partial rendering,
  but always redrew the entire dirty region anyways
7. Added support for DirectX 9 hardware
8. If custom shaders are used not all pixels would be presented

Closes #13906

1. Toggling visual effects runs retro shader ✅
   With a custom shader set, it toggles the shader ✅
   Toggling `experimental.rendering.software` toggles the shader ✅
2. `"backgroundImage": "desktopWallpaper"` works with D2D ✅ and D3D ✅
3. Adding a `Sleep(3000)` in `_AttachDxgiSwapChainToXaml` doesn't break
   Windows 10 ✅ nor Windows 11 ✅
4. Screen animations run at 144 FPS ✅ even while moving the window ✅
5. No weird artefacts during cursor movement or scrolling ✅
6. No weird artefacts during cursor movement or scrolling ✅
7. Forcing DirectX 9.3 in `dxcpl` runs fine ✅

AtlasEngine: Fix a correctness bug (#13956)

`ATLAS_POD_OPS` doesn't check for `has_unique_object_representations` and so a
bug exists where `CachedCursorOptions` comparisons invoke undefined behavior.
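
The kind of guard that was missing can be sketched like this, assuming the comparison is done bytewise with `memcmp`. `Padded`, `Packed` and `bytewiseEqual` are hypothetical and not the types or macro from the codebase.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// On common ABIs this struct has 3 padding bytes after `kind`. Their values
// are unspecified, so comparing two instances bytewise is unreliable.
struct Padded
{
    std::uint8_t kind;
    std::uint32_t color;
};

// The same data with the padding made explicit, so every byte is part of a member.
struct Packed
{
    std::uint32_t color;
    std::uint8_t kind;
    std::uint8_t reserved[3];
};

// Bytewise comparison that refuses to compile for types where equal values
// may have different object representations (for example due to padding).
// bytewiseEqual<Padded> is rejected at compile time on common ABIs.
template<typename T>
bool bytewiseEqual(const T& a, const T& b) noexcept
{
    static_assert(std::has_unique_object_representations_v<T>,
                  "T must not contain padding or other non-value bits");
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}
```
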
lhecker authored and DHowett committed Sep 9, 2022
1 parent 377cc05 commit 495bb78
Showing 19 changed files with 2,295 additions and 869 deletions.
55 changes: 27 additions & 28 deletions samples/PixelShaders/Retro.hlsl
@@ -2,47 +2,46 @@
 Texture2D shaderTexture;
 SamplerState samplerState;

-cbuffer PixelShaderSettings {
-    float Time;
-    float Scale;
-    float2 Resolution;
-    float4 Background;
+cbuffer PixelShaderSettings
+{
+    float time;
+    float scale;
+    float2 resolution;
+    float4 background;
 };

-#define SCANLINE_FACTOR 0.5
-#define SCALED_SCANLINE_PERIOD Scale
-#define SCALED_GAUSSIAN_SIGMA (2.0*Scale)
+#define SCANLINE_FACTOR 0.5f
+#define SCALED_SCANLINE_PERIOD scale
+#define SCALED_GAUSSIAN_SIGMA (2.0f * scale)

 static const float M_PI = 3.14159265f;

 float Gaussian2D(float x, float y, float sigma)
 {
-    return 1/(sigma*sqrt(2*M_PI)) * exp(-0.5*(x*x + y*y)/sigma/sigma);
+    return 1 / (sigma * sqrt(2 * M_PI)) * exp(-0.5 * (x * x + y * y) / sigma / sigma);
 }

 float4 Blur(Texture2D input, float2 tex_coord, float sigma)
 {
-    uint width, height;
+    float width, height;
     shaderTexture.GetDimensions(width, height);

-    float texelWidth = 1.0f/width;
-    float texelHeight = 1.0f/height;
+    float texelWidth = 1.0f / width;
+    float texelHeight = 1.0f / height;

     float4 color = { 0, 0, 0, 0 };

-    int sampleCount = 13;
+    float sampleCount = 13;

-    for (int x = 0; x < sampleCount; x++)
+    for (float x = 0; x < sampleCount; x++)
     {
         float2 samplePos = { 0, 0 };
+        samplePos.x = tex_coord.x + (x - sampleCount / 2.0f) * texelWidth;

-        samplePos.x = tex_coord.x + (x - sampleCount/2) * texelWidth;
-        for (int y = 0; y < sampleCount; y++)
+        for (float y = 0; y < sampleCount; y++)
         {
-            samplePos.y = tex_coord.y + (y - sampleCount/2) * texelHeight;
-            if (samplePos.x <= 0 || samplePos.y <= 0 || samplePos.x >= width || samplePos.y >= height) continue;
-
-            color += input.Sample(samplerState, samplePos) * Gaussian2D((x - sampleCount/2), (y - sampleCount/2), sigma);
+            samplePos.y = tex_coord.y + (y - sampleCount / 2.0f) * texelHeight;
+            color += input.Sample(samplerState, samplePos) * Gaussian2D(x - sampleCount / 2.0f, y - sampleCount / 2.0f, sigma);
         }
     }

@@ -51,7 +50,7 @@ float4 Blur(Texture2D input, float2 tex_coord, float sigma)

 float SquareWave(float y)
 {
-    return 1 - (floor(y / SCALED_SCANLINE_PERIOD) % 2) * SCANLINE_FACTOR;
+    return 1.0f - (floor(y / SCALED_SCANLINE_PERIOD) % 2.0f) * SCANLINE_FACTOR;
 }

 float4 Scanline(float4 color, float4 pos)
@@ -60,24 +59,24 @@ float4 Scanline(float4 color, float4 pos)

     // TODO:GH#3929 make this configurable.
     // Remove the && false to draw scanlines everywhere.
-    if (length(color.rgb) < 0.2 && false)
+    if (length(color.rgb) < 0.2f && false)
     {
-        return color + wave*0.1;
+        return color + wave * 0.1f;
     }
     else
     {
         return color * wave;
     }
 }

+// clang-format off
 float4 main(float4 pos : SV_POSITION, float2 tex : TEXCOORD) : SV_TARGET
+// clang-format on
 {
-    Texture2D input = shaderTexture;
-
     // TODO:GH#3930 Make these configurable in some way.
-    float4 color = input.Sample(samplerState, tex);
-    color += Blur(input, tex, SCALED_GAUSSIAN_SIGMA)*0.3;
+    float4 color = shaderTexture.Sample(samplerState, tex);
+    color += Blur(shaderTexture, tex, SCALED_GAUSSIAN_SIGMA) * 0.3f;
     color = Scanline(color, pos);

     return color;
-}
+}
29 changes: 7 additions & 22 deletions src/cascadia/TerminalControl/ControlCore.cpp
@@ -313,7 +313,7 @@ namespace winrt::Microsoft::Terminal::Control::implementation

             // Tell the DX Engine to notify us when the swap chain changes.
             // We do this after we initially set the swapchain so as to avoid unnecessary callbacks (and locking problems)
-            _renderEngine->SetCallback(std::bind(&ControlCore::_renderEngineSwapChainChanged, this));
+            _renderEngine->SetCallback([this](auto handle) { _renderEngineSwapChainChanged(handle); });

             _renderEngine->SetRetroTerminalEffect(_settings->RetroTerminalEffect());
             _renderEngine->SetPixelShaderPath(_settings->PixelShaderPath());
@@ -566,24 +566,20 @@ namespace winrt::Microsoft::Terminal::Control::implementation

     void ControlCore::ToggleShaderEffects()
     {
+        const auto path = _settings->PixelShaderPath();
         auto lock = _terminal->LockForWriting();
         // Originally, this action could be used to enable the retro effects
         // even when they're set to `false` in the settings. If the user didn't
         // specify a custom pixel shader, manually enable the legacy retro
         // effect first. This will ensure that a toggle off->on will still work,
         // even if they currently have retro effect off.
-        if (_settings->PixelShaderPath().empty() && !_renderEngine->GetRetroTerminalEffect())
+        if (path.empty())
         {
-            // SetRetroTerminalEffect to true will enable the effect. In this
-            // case, the shader effect will already be disabled (because neither
-            // a pixel shader nor the retro effects were originally requested).
-            // So we _don't_ want to toggle it again below, because that would
-            // toggle it back off.
-            _renderEngine->SetRetroTerminalEffect(true);
+            _renderEngine->SetRetroTerminalEffect(!_renderEngine->GetRetroTerminalEffect());
         }
         else
         {
-            _renderEngine->ToggleShaderEffects();
+            _renderEngine->SetPixelShaderPath(_renderEngine->GetPixelShaderPath().empty() ? std::wstring_view{ path } : std::wstring_view{});
         }
         // Always redraw after toggling effects. This way even if the control
         // does not have focus it will update immediately.
@@ -1517,25 +1513,14 @@ namespace winrt::Microsoft::Terminal::Control::implementation
         }
     }

-    uint64_t ControlCore::SwapChainHandle() const
-    {
-        // This is called by:
-        // * TermControl::RenderEngineSwapChainChanged, who is only registered
-        // after Core::Initialize() is called.
-        // * TermControl::_InitializeTerminal, after the call to Initialize, for
-        // _AttachDxgiSwapChainToXaml.
-        // In both cases, we'll have a _renderEngine by then.
-        return reinterpret_cast<uint64_t>(_renderEngine->GetSwapChainHandle());
-    }
-
     void ControlCore::_rendererWarning(const HRESULT hr)
     {
         _RendererWarningHandlers(*this, winrt::make<RendererWarningArgs>(hr));
     }

-    void ControlCore::_renderEngineSwapChainChanged()
+    void ControlCore::_renderEngineSwapChainChanged(const HANDLE handle)
     {
-        _SwapChainChangedHandlers(*this, nullptr);
+        _SwapChainChangedHandlers(*this, winrt::box_value<uint64_t>(reinterpret_cast<uint64_t>(handle)));
     }

     void ControlCore::_rendererBackgroundColorChanged()
3 changes: 1 addition & 2 deletions src/cascadia/TerminalControl/ControlCore.h
@@ -65,7 +65,6 @@ namespace winrt::Microsoft::Terminal::Control::implementation

         void SizeChanged(const double width, const double height);
         void ScaleChanged(const double scale);
-        uint64_t SwapChainHandle() const;

         void AdjustFontSize(int fontSizeDelta);
         void ResetFontSize();
@@ -301,7 +300,7 @@ namespace winrt::Microsoft::Terminal::Control::implementation

 #pragma region RendererCallbacks
         void _rendererWarning(const HRESULT hr);
-        void _renderEngineSwapChainChanged();
+        void _renderEngineSwapChainChanged(const HANDLE handle);
         void _rendererBackgroundColorChanged();
         void _rendererTabColorChanged();
 #pragma endregion
2 changes: 0 additions & 2 deletions src/cascadia/TerminalControl/ControlCore.idl
@@ -67,8 +67,6 @@ namespace Microsoft.Terminal.Control
         IControlAppearance UnfocusedAppearance { get; };
         Boolean HasUnfocusedAppearance();

-        UInt64 SwapChainHandle { get; };
-
         Windows.Foundation.Size FontSize { get; };
         String FontFaceName { get; };
         UInt16 FontWeight { get; };
29 changes: 10 additions & 19 deletions src/cascadia/TerminalControl/TermControl.cpp
@@ -703,19 +703,24 @@ namespace winrt::Microsoft::Terminal::Control::implementation
         return _core.ConnectionState();
     }

-    winrt::fire_and_forget TermControl::RenderEngineSwapChainChanged(IInspectable /*sender*/, IInspectable /*args*/)
+    winrt::fire_and_forget TermControl::RenderEngineSwapChainChanged(IInspectable /*sender*/, IInspectable args)
     {
         // This event is only registered during terminal initialization,
         // so we don't need to check _initializedTerminal.
         // We also don't lock for things that come back from the renderer.
-        auto weakThis{ get_weak() };
+        const auto weakThis{ get_weak() };

+        // Create a copy of the swap chain HANDLE in args, since we don't own that parameter.
+        // By the time we return from the co_await below, it might be deleted already.
+        winrt::handle handle;
+        const auto processHandle = GetCurrentProcess();
+        const auto sourceHandle = reinterpret_cast<HANDLE>(winrt::unbox_value<uint64_t>(args));
+        THROW_IF_WIN32_BOOL_FALSE(DuplicateHandle(processHandle, sourceHandle, processHandle, handle.put(), 0, FALSE, DUPLICATE_SAME_ACCESS));
+
         co_await wil::resume_foreground(Dispatcher());
-
         if (auto control{ weakThis.get() })
         {
-            const auto chainHandle = reinterpret_cast<HANDLE>(control->_core.SwapChainHandle());
-            _AttachDxgiSwapChainToXaml(chainHandle);
+            _AttachDxgiSwapChainToXaml(handle.get());
         }
     }

@@ -802,21 +807,7 @@ namespace winrt::Microsoft::Terminal::Control::implementation
         }
         _interactivity.Initialize();

-        _AttachDxgiSwapChainToXaml(reinterpret_cast<HANDLE>(_core.SwapChainHandle()));
-
-        // Tell the DX Engine to notify us when the swap chain changes. We do
-        // this after we initially set the swapchain so as to avoid unnecessary
-        // callbacks (and locking problems)
         _core.SwapChainChanged({ get_weak(), &TermControl::RenderEngineSwapChainChanged });
-
-        // !! LOAD BEARING !!
-        // Make sure you enable painting _AFTER_ calling _AttachDxgiSwapChainToXaml
-        //
-        // If you EnablePainting first, then you almost certainly won't have any
-        // problems when running in Debug. However, in Release, you'll run into
-        // issues where the Renderer starts trying to paint before we've
-        // actually attached the swapchain to anything, and the DxEngine is not
-        // prepared to handle that.
         _core.EnablePainting();

         auto bufferHeight = _core.BufferHeight();