Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize hot path in OutputCellIterator #10811

Closed

Conversation

skyline75489
Copy link
Collaborator

@skyline75489 skyline75489 commented Jul 28, 2021

Summary of the Pull Request

References

PR Checklist

  • Supports [Performance] vtebench tracking issue #10563
  • CLA signed. If not, go over here and sign the CLA
  • Tests added/passed
  • Documentation updated. If checked, please file a pull request on our docs repo and link it here: #xxx
  • Schema updated.
  • I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

Detailed Description of the Pull Request / Additional comments

Validation Steps Performed

@skyline75489
Copy link
Collaborator Author

skyline75489 commented Jul 28, 2021

This has been sitting in my local repo for so long, might as well publish it to see if people like it. This is the OutputCellIterator version of #10621. Similar approach but this one is less impactful.

OpenConsole.exe (in ConPTY mode)

Before:

before

After:

after

Time needed for catting enwiki8 dropped from 16s to 15s. Before #10621, it takes about 21s.

I think after PGO this would be more impactful, because it clearly separates the hot path from the cold path.

@skyline75489
Copy link
Collaborator Author

@lhecker this is the OutpuCellIterator work that I mentioned before. I think it's still not ideal without the TextBuffer rewrite work, as you can see in previous comments. But it's better than nothing, I guess? I'll just leave the decision to you and @miniksa .

@@ -206,6 +206,40 @@ OutputCellIterator& OutputCellIterator::operator++()
// Keep track of total distance moved (cells filled)
_distance++;

if (_mode == Mode::Loose && _currentView.DbcsAttr().IsSingle())
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch here needs to be highly compact & optimized to avoid potential "I-Cache spill" (similar to the one in #10621), so I only handle the condition when _currentView.DbcsAttr().IsSingle()

const wchar_t wch = run.at(pos);
if (0x20 <= wch && wch <= 0x7e)
{
_currentView.UpdateText(run.substr(pos, 1));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, inspired by #10621, use UpdateText instead of calling expensive constructors.

@@ -100,6 +100,8 @@ class OutputCellIterator final

bool _TryMoveTrailing() noexcept;

_declspec(noinline) void _MoveSlowPath() noexcept;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, _declspec works the same as __declspec?... I need this to be not-inlined to reduce the instruction count.

@lhecker
Copy link
Member

lhecker commented Jul 29, 2021

To be honest I'd personally prefer if we replaced OutputCellIterator entirely in our hot paths.
It's exists as our data model is complex and requires a complex iterator to easily navigate. This shouldn't mean though that the design of our data model is an immutable fact and not just a coincidence of the way it was built over the years. We can restructure it to allow more efficient iteration.
Furthermore the _mode field in OutputCellIterator strongly indicates that this class should for instance be a template where _mode is a template parameter, or any other equivalent alternative. Because at the moment this class is sort of a "jack of all trades" (or the funnier German equivalent: an "egg-laying woolly milk sheep"), which heavily inhibits code clarity. Coincidentally it also inhibits performance.

@skyline75489
Copy link
Collaborator Author

Thanks @lhecker for the suggestion. Template sounds like a wonderful idea. I actually have the same concern as you and I won't be upset if people don't like this PR. Feel free do whatever may be suitable with this PR.

@DHowett
Copy link
Member

DHowett commented Jul 30, 2021

In my FHL branch I've simply deleted OutputCellIterator and friends ;P

@skyline75489
Copy link
Collaborator Author

Glad OutputCellIterator won’t stay with us forever. I am closing this for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants