Optimize hot path in OutputCellIterator #10811
This has been sitting in my local repo for so long that I might as well publish it and see whether people like it. This is the OutputCellIterator version of #10621. The approach is similar, but this one is less impactful.
OpenConsole.exe (in ConPTY mode): [before and after screenshots]
The time needed for catting enwiki8 dropped from 16s to 15s. Before #10621, it took about 21s. I think this would be more impactful after PGO, because it clearly separates the hot path from the cold path.
@@ -206,6 +206,40 @@ OutputCellIterator& OutputCellIterator::operator++()
    // Keep track of total distance moved (cells filled)
    _distance++;

    if (_mode == Mode::Loose && _currentView.DbcsAttr().IsSingle())
This branch needs to be highly compact and optimized to avoid a potential "I-Cache spill" (similar to the one in #10621), so I only handle the case where _currentView.DbcsAttr().IsSingle() is true.
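The idea of keeping the common case small can be illustrated outside the Terminal codebase. The following is a minimal sketch, not the code in this PR: `Cursor`, `IsNarrow`, `Advance` and `AdvanceSlow` are hypothetical names, and treating printable ASCII as the only narrow characters is a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for an output iterator; not the real OutputCellIterator.
struct Cursor
{
    std::wstring_view run;     // text being written
    std::size_t pos = 0;       // index of the current character in run
    std::size_t distance = 0;  // total cells filled so far
    bool trailing = false;     // true while on the second cell of a wide character

    // Assumption for this sketch: printable ASCII is narrow, everything else is wide.
    static bool IsNarrow(wchar_t wch) noexcept
    {
        return 0x20 <= wch && wch <= 0x7e;
    }

    void Advance() noexcept
    {
        assert(pos < run.size());
        ++distance;

        // Hot path: one narrow character occupies exactly one cell.
        if (!trailing && IsNarrow(run[pos]))
        {
            ++pos;
            return;
        }

        // Everything else is handled out of line.
        AdvanceSlow();
    }

    void AdvanceSlow() noexcept
    {
        if (trailing)
        {
            // Leaving the second cell of a wide character.
            trailing = false;
            ++pos;
        }
        else
        {
            // Entering the second cell of a wide character.
            trailing = true;
        }
    }
};
```

Calling `Advance()` four times over a run such as `L"a\u4e2db"` fills four cells while consuming three characters; only the wide character takes the slow function.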
    const wchar_t wch = run.at(pos);
    if (0x20 <= wch && wch <= 0x7e)
    {
        _currentView.UpdateText(run.substr(pos, 1));
Again, inspired by #10621, use UpdateText instead of calling expensive constructors.
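To make the "update instead of construct" point concrete, here is a small sketch under stated assumptions: `CellView` is a hypothetical, heavily simplified view type (the real OutputCellView carries more state), and `MoveTo` is an illustrative helper that does not exist in the codebase. Only the text member is reassigned; the remaining fields are left alone.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical, simplified cell view; only the shape matters for this sketch.
struct CellView
{
    std::wstring_view text; // the characters shown in this cell
    int attr = 0;           // stand-in for colors, DBCS info, and so on

    // Repoint the view at new text without touching the other fields.
    void UpdateText(std::wstring_view t) noexcept
    {
        text = t;
    }
};

// Illustrative helper: point the view at the single character at pos.
inline void MoveTo(CellView& view, std::wstring_view run, std::size_t pos)
{
    assert(pos < run.size());
    // substr on a string_view only adjusts a pointer and a length; nothing is copied.
    view.UpdateText(run.substr(pos, 1));
}
```

After `MoveTo(view, run, 1)`, `view.text` refers to the second character of `run` in place, and `view.attr` keeps whatever value it had before.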
@@ -100,6 +100,8 @@ class OutputCellIterator final

    bool _TryMoveTrailing() noexcept;

    _declspec(noinline) void _MoveSlowPath() noexcept;
Wait, _declspec works the same as __declspec?... I need this to be not-inlined to reduce the instruction count.
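As far as I know, MSVC accepts `_declspec` as a synonym for `__declspec` unless language extensions are disabled with `/Za`, though the double-underscore spelling is the documented one. A hedged sketch of keeping a slow path out of line in a portable way follows; `SKETCH_NOINLINE`, `Counter`, `Step` and `StepSlow` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical macro: choose the compiler-specific "do not inline" spelling.
#if defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

struct Counter
{
    int fast = 0;
    int slow = 0;

    // Small and inlinable: the common case returns early.
    void Step(int value) noexcept
    {
        if (value >= 0)
        {
            ++fast;
            return;
        }
        StepSlow(value);
    }

    // Kept out of line so its instructions do not bloat callers of Step.
    SKETCH_NOINLINE void StepSlow(int value) noexcept;
};

void Counter::StepSlow(int value) noexcept
{
    assert(value < 0);
    ++slow;
}
```

The attribute only controls inlining; whether it actually helps depends on the compiler and on profile data, so it is worth measuring before and after.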
To be honest I'd personally prefer if we replaced
Thanks @lhecker for the suggestion. A template sounds like a wonderful idea. I actually have the same concern as you, and I won't be upset if people don't like this PR. Feel free to do whatever may be suitable with this PR.
In my FHL branch I've simply deleted OutputCellIterator and friends ;P
Glad OutputCellIterator won’t stay with us forever. I am closing this for now.