Reduce cost of resetting row attributes #15497

lhecker · 2023-06-02T00:01:05Z

Performance of printing enwik8.txt at the following block sizes:
4KiB (printf): 54MB/s -> 54MB/s
128KiB (cat): 101MB/s -> 104MB/s

Validation Steps Performed

This change is easily verifiable via review.

PankajBhojwani · 2023-06-02T18:10:42Z

src/inc/til/small_vector.h

+        // This is a very unsafe shortcut to free the buffer and get a direct
+        // hold to the _buffer. The caller can then fill it with `size` items.
+        [[nodiscard]] T* unsafe_shrink_to_size(size_t size) noexcept


Could you elaborate on why this is unsafe and why it is okay for us to do it anyway?

Reading from uninitialized memory with automatic storage duration (= allocated on the stack) is technically undefined behavior, even for trivial structs and arrays. Trivial structs are basically structs that consist entirely of primitive types like char or int. The pointer this function returns points into the small-vector's internal buffer that contains exactly such uninitialized memory and so the caller has to technically make sure to not read from it, or otherwise the CPU might trap / throw an exception, etc. (This doesn't really apply to any modern CPU architecture.)

More importantly however, since the array is not initialized, the returned pointer cannot be used for non-trivial structs. For instance, imagine I have this code:

// A vector with space for 4 items on the stack, without the need for heap allocations.
til::small_vector<std::string, 4> vec;
// Make room for 1 item on the stack.
const auto ptr = vec.unsafe_shrink_to_size(1);
// Initialize that first item in the vector.
*ptr = std::string{ "foobar" };

This would fail in terrible ways, because the backing buffer on the stack is uninitialized. Its memory may contain any potential garbage, leftover data. When you then assign a std::string to it, the string's operator=() will try to free any potential memory the target (*ptr) may have had, in order to assign the new string contents. When it then reads the uninitialized/garbage values, it'll think that the target / *ptr had something allocated and try to free it. But that's wrong. It didn't have anything.

For cases like this the right choice is usually to use std::uninitialized_default_construct_n, which would initialize the memory and ensure that the returned pointer always points to valid data, in case of non-trivial structs. And even better, for trivial structs uninitialized_default_construct_n will leave the memory uninitialized, which is great for performance. (If this behavior is surprising, you can compare it with how in C/C++ you need to always write char c = 0; and int x = 123;, because int x; by itself will leave x with an undefined value. If this behavior is unwanted, the alternative is std::uninitialized_value_construct_n.)
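To make the paragraph above concrete, here is a minimal sketch of constructing strings in raw stack storage before assigning to them. The helper name `fill_raw_storage` and the local buffer are assumptions made for illustration; they are not part of til::small_vector, and the sketch only shows the general technique with the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical helper for illustration only (not from the PR).
// It builds `count` strings in raw stack storage, assigns to each of them,
// and returns the summed length of all strings.
inline std::size_t fill_raw_storage(std::size_t count)
{
    constexpr std::size_t capacity = 4;
    assert(count <= capacity);

    // Raw, suitably aligned bytes. No std::string object lives here yet.
    alignas(std::string) unsigned char storage[capacity * sizeof(std::string)];
    auto* raw = reinterpret_cast<std::string*>(storage);

    // Start the lifetime of `count` strings. Only now is operator= safe,
    // because each target is a valid (empty) string instead of garbage.
    std::uninitialized_default_construct_n(raw, count);
    auto* items = std::launder(raw);

    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        items[i] = std::string{ "foobar" };
        total += items[i].size();
    }

    // Objects that were constructed by hand must be destroyed by hand.
    std::destroy_n(items, count);
    return total;
}
```

Skipping the `std::uninitialized_default_construct_n` call would reproduce the failure described earlier: the assignment would operate on bytes that were never turned into a string.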

But there's a problem... TextAttributes is not a trivial struct. Any struct that has a constructor - or even worse, destructor - cannot be trivial anymore. After all, if it has a constructor it surely wants to be initialized to some value. For instance, this struct:

struct Foo { int x = 0; };

is not trivial anymore, because the = 0 forces the addition of an implicit constructor, which initializes x to 0 during construction. Now you can safely write Foo foo; in some function and rest assured that reading foo.x will always return an initialized value. But while this is great for the general case, "97%" of all code basically, it is not great for the "3%" case, the code on the hot path. There, the added initialization has a noticeable impact if the compiler fails to optimize it away. (And in this case, it does fail to.)
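The difference can be checked at compile time with the standard type traits. The sketch below uses two made-up structs, `Plain` and `Defaulted`, which are assumptions for illustration and do not appear in the Terminal code base.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical structs for illustration only.
struct Plain
{
    int x; // no initializer: default construction leaves x untouched
};

struct Defaulted
{
    int x = 0; // default member initializer: construction writes x
};

// Only the struct without the initializer has a trivial default constructor.
static_assert(std::is_trivially_default_constructible_v<Plain>);
static_assert(!std::is_trivially_default_constructible_v<Defaulted>);

// The initializer does not affect copying or destruction.
static_assert(std::is_trivially_copyable_v<Defaulted>);
static_assert(std::is_trivially_destructible_v<Defaulted>);

// Reading the member right after default construction is well defined here.
inline int read_defaulted()
{
    Defaulted d;
    assert(d.x == 0);
    return d.x;
}
```

An array of `Defaulted` therefore costs one store per element when it is default-constructed, unless the optimizer can prove the stores are dead.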

Leonard, quick question: why is it faster for us to do this (poke at the innards) than to have an operator= or a .resize_to_one_and_reset() or something on rle?

resize_to_one_and_reset would only work for one specific use case: writing one value. An operator= would have the same limitation unless we have it accept a std::span or similar, but a std::span is not a trivial struct and thus not cost-free (especially not on the Windows x64 ABI and especially-especially not with MSVC's still poor ability to remove struct parameter copies after inlining a call). The unsafe_shrink_to_size function on the other hand works with arbitrary size parameter values and can be fully inlined into the caller / optimized away.
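As a rough sketch of the idea, assuming a much simpler container than the real til::small_vector: the class `tiny_vector`, the struct `run`, and the helper `reset_to_single_run` below are all hypothetical names, and the element type is restricted to trivial types so that writing into the inline buffer without construction is safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical, heavily simplified stand-in for til::small_vector.
// It only accepts trivially copyable, trivially destructible element types.
template<typename T, std::size_t N>
class tiny_vector
{
    static_assert(N > 0);
    static_assert(std::is_trivially_copyable_v<T> && std::is_trivially_destructible_v<T>);

public:
    tiny_vector() = default;
    tiny_vector(const tiny_vector&) = delete;
    tiny_vector& operator=(const tiny_vector&) = delete;
    ~tiny_vector() { release_heap(); }

    void push_back(const T& value)
    {
        if (_size == _capacity)
        {
            grow();
        }
        _data[_size++] = value;
    }

    // Drops any heap allocation, points back at the inline buffer and
    // pretends it already holds `size` items. The caller must write all
    // of them before anything reads them.
    [[nodiscard]] T* unsafe_shrink_to_size(std::size_t size) noexcept
    {
        assert(size <= N);
        release_heap();
        _data = _buffer;
        _capacity = N;
        _size = size;
        return _buffer;
    }

    std::size_t size() const noexcept { return _size; }
    const T& operator[](std::size_t i) const noexcept { return _data[i]; }
    bool on_heap() const noexcept { return _data != _buffer; }

private:
    void release_heap() noexcept
    {
        if (_data != _buffer)
        {
            delete[] _data;
        }
    }

    void grow()
    {
        const std::size_t new_capacity = _capacity * 2;
        T* fresh = new T[new_capacity];
        std::copy_n(_data, _size, fresh);
        release_heap();
        _data = fresh;
        _capacity = new_capacity;
    }

    T _buffer[N];
    T* _data = _buffer;
    std::size_t _capacity = N;
    std::size_t _size = 0;
};

// Stand-in for til::rle_pair: a value plus the number of cells it covers.
struct run
{
    int value;
    std::size_t length;
};

// Mirrors the shape of the call in Row.cpp: collapse everything into one run.
inline std::size_t reset_to_single_run(tiny_vector<run, 2>& runs, int value, std::size_t columns)
{
    *runs.unsafe_shrink_to_size(1) = run{ value, columns };
    return runs.size();
}
```

Because the function takes a plain count and returns a plain pointer, the same primitive could fill two or three runs just as cheaply, which is the flexibility argued for above.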

Basically, IMO we should either keep code simple or fast. So, if an optimization can't be simple, it better be fast. I'm basically min-maxing. 😄

DHowett · 2023-06-02T18:18:07Z

src/buffer/out/Row.cpp

-    _attr = { _columnCount, attr };
+    // Constructing and then moving objects into place isn't free.
+    // Modifying the existing object is _much_ faster.
+    *_attr.runs().unsafe_shrink_to_size(1) = til::rle_pair{ attr, _columnCount };


DHowett · 2023-06-02T18:20:13Z

src/inc/til/small_vector.h

+        // This is a very unsafe shortcut to free the buffer and get a direct
+        // hold to the _buffer. The caller can then fill it with `size` items.
+        [[nodiscard]] T* unsafe_shrink_to_size(size_t size) noexcept


That's an excellent question. I would love to have it in a comment.

My best guess is, "it returns a raw pointer to the internal storage of the small_vector"... which literally lets us poke at its innards.

Leonard, quick question: why is it faster for us to do this (poke at the innards) than to have an operator= or a .resize_to_one_and_reset() or something on rle?

Does this introduce a primitive that is used in future PRs?

DHowett · 2023-06-14T20:51:31Z

@lhecker if you want to update the numbers before merging, I'm cool with that. Otherwise, leave 'em! :)

lhecker · 2023-06-15T15:01:46Z

@carlos-zamora Could you give this another ✅?
Edit: Seems like it wasn't necessary? Weird.

lhecker added the Product-Conhost (For issues in the Console codebase) and Area-Performance (Performance-related issue) labels Jun 2, 2023

Reduce cost of resetting row attributes

4d6a2ba

lhecker force-pushed the dev/lhecker/vt-perf1 branch from cf9356d to 4d6a2ba June 2, 2023 00:12

PankajBhojwani reviewed Jun 2, 2023

DHowett reviewed Jun 2, 2023

Fix AuditMode failures

6fc567b

carlos-zamora approved these changes Jun 5, 2023

carlos-zamora added the Needs-Second (It's a PR that needs another sign-off) label Jun 5, 2023

microsoft-github-policy-service bot requested review from zadjii-msft, PankajBhojwani, carlos-zamora and DHowett June 5, 2023 23:20

DHowett approved these changes Jun 14, 2023

microsoft-github-policy-service bot removed the Needs-Second (It's a PR that needs another sign-off) label Jun 14, 2023

Merge remote-tracking branch 'origin/main' into dev/lhecker/vt-perf1

1b09f36

lhecker added the AutoMerge (Marked for automatic merge by the bot when requirements are met) label Jun 15, 2023

microsoft-github-policy-service bot enabled auto-merge (squash) June 15, 2023 15:02

microsoft-github-policy-service bot merged commit b8f402f into main Jun 15, 2023

microsoft-github-policy-service bot deleted the dev/lhecker/vt-perf1 branch June 15, 2023 15:34

zadjii-msft mentioned this pull request Sep 5, 2023

Very slow rendering of colored text #4129

Closed
