Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a mechanism for passing through DCS data strings #9307

Merged
5 commits merged into from
Apr 30, 2021

Conversation

j4james
Copy link
Collaborator

@j4james j4james commented Feb 28, 2021

This PR introduces a mechanism via which DCS data strings can be passed
through directly to the dispatch method that will be handling them, so
the data can be processed as it is received, rather than being buffered
in the state machine. This also simplifies the way string termination is
handled, so it now more closely matches the behaviour of the original
DEC terminals.

The way this now works, a DCS sequence is dispatched as soon as the
final character of the VTID is received. Based on that ID, the
OutputStateMachineEngine should forward the call to the corresponding
dispatch method, and its the responsibility of that method to return an
appropriate handler function for the sequence.

From then on, the StateMachine will pass on all of the remaining bytes
in the data string to the handler function. When a data string is
terminated (with CAN, SUB, or ESC), the StateMachine will pass
on one final ESC character to let the handler know that the sequence
is finished. The handler can also end a sequence prematurely by
returning false, and then all remaining data bytes will be ignored.

Note that once a DCS sequence has been dispatched, it's not possible
to abort the data string. Both CAN and SUB are considered valid
forms of termination, and an ESC doesn't necessarily have to be
followed by a \ for the string terminator. This is because the data
string is typically processed as it's received. For example, when
outputting a Sixel image, you wouldn't erase the parts that had already
been displayed if the data string is terminated early.

With this new way of handling the string termination, I was also able to
simplify some of the StateMachine processing, and get rid of a few
states that are no longer necessary. These changes don't apply to the
OSC sequences, though, since we're more likely to want to match the
XTerm behavior for those cases (which requires a valid ST control for
the sequence to be accepted).

Validation Steps Performed

For the unit tests, I've had to make a few changes to some of the
OutputEngineTests to account for the updated StateMachine
processing. I've also added a new StateMachineTest to confirm that the
data strings are correctly passed through to the string handler under
all forms of termination.

To test whether the framework is actually usable, I've been working on
DRCS Soft Font support branched off of this PR, and haven't encountered
any problems. To test the throughput speed, I also hacked together a
basic Sixel parser, and that seemed to perform reasonably well.

Closes #7316

@ghost ghost added Area-VT Virtual Terminal sequence support Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Product-Conhost For issues in the Console codebase labels Feb 28, 2021
@j4james
Copy link
Collaborator Author

j4james commented Feb 28, 2021

I'm leaving this as a draft for now, because it's possible I may find I need to tweak things when I try and use it in different scenarios, but I think it's probably OK as is. If anyone is waiting for this functionality, I'd be happy to make it ready for review.

@skyline75489
Copy link
Collaborator

Personally I love to see it land. It opens up all the possibility for DCS sequences. The core devs are kind of busy these days. I'm kindly asking @j4james to have more patience on this one.

@zadjii-msft
Copy link
Member

FYI we're working on some internal deadlines that's slowing down the works. Thanks always for your patience ☺️

@@ -20,6 +20,8 @@ namespace Microsoft::Console::VirtualTerminal
class IStateMachineEngine
{
public:
using StringHandler = std::function<bool(const wchar_t)>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this question back when I was baking the Sixel implementation. And now I'm trying to make tmux control mode happen, which also uses DCS. The same question emerged: sending the data 1 wchar_t a time might also slow down the parsing process when the DCS sequence is very large, because the handler method was called so frequently that it hurts. Perhaps the soft font feature would not suffer from the performance downgrade, because the length of the sequence is limited?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If your concern is about passing the data over conpty, that's not something I've really tried to cover with this PR. Technically I think it could be done, and for something like tmux cc you could probably buffer a line at a time if that makes a difference, but I don't think it's practical for something like Sixel. I've become more and more convinced that the only way we're going to get some of these more complicated VT operations working reasonably in Windows Terminal is with a rewrite of conpty. The VT data stream really has to be passed through uninterrupted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mainly concerned about the application side, because I'm not using conpty when prototyping tmux cc. After parsing the output from tmux, I choose to send the parsed data over NamedPipe, a nice Win32 feature, to avoid the complicated logic of passing data around. Also using NamedPipe will enable tmux cc under cross-instance scenarios (in the future, after the tab-tearing and all that). However, I think this use case might not be generally feasible for every DCS sequences.

Copy link
Collaborator

@skyline75489 skyline75489 Mar 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually for a serious implementation, I think I still need to pass all the DCS sequences across the ConPTY layer for WT to handle in a upper layer. Yeah you're probably right about "the VT data stream really has to be passed through uninterrupted". This would save so much effort & improve the overall performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure what your concern is regarding the handler. The input is already being processed one character at a time by the state machine, so I wouldn't have thought the overhead of a std::function call would make a significant difference to that process. And what would the alternative be anyway? Buffer the data before passing it through to the handler? That's just another form of overhead - potentially even worse? - and then sequences like Sixel and REGIS wouldn't be able to handle real-time updates.

Maybe I need to hack together a POC of Sixel in conhost to see how well it performs. But I really can't think of a better way of handling this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is not about using std::function as the mechanism for DCS handler. My concern is mainly about the actual code that handles the DCS data. For example, the sixel parser that parses the sixel data, the network handler that transfers data for tmux cc, etc.

Because the data is sent per wchar_t, the actual code that handles them need to be fast enough, to avoid damaging the performance. But you're right. "Buffering the data before passing it through to the handler" is just another form of overhead. And I myself cannot think of a better way of handling this, either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, I did eventually hack together a basic Sixel parser branched off of this, and it seemed to perform reasonably well. I was just blitting the resulting image to the screen to test, so obviously there would be more overhead for proper storage and rendering, but the parsing seemed instantaneous. So I don't think the way I've implemented the string handler is going to be a bottleneck.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly agree that the VT parsing has always operated character by character so I don't think this will be the thing to suddenly take it down a notch.

I do know from profiling that we should be looking for more opportunities to bulk dispatch content... but it tends to be elsewhere in the codebase that could use bulk optimization more desperately than here (like the output buffer storage itself) .

We'll cross the bridge when we get to it.

Also I'm excited by

I did eventually hack together a basic Sixel parser branched off of this

@j4james j4james marked this pull request as ready for review March 28, 2021 23:04
@j4james
Copy link
Collaborator Author

j4james commented Mar 28, 2021

FYI, I've marked this as ready for review now. I've been using it for a while and haven't felt the need to change anything, so I think it's probably OK. It's not blocking anything for me at the moment, though, so there's no hurry to get it merged.

@@ -20,6 +20,8 @@ namespace Microsoft::Console::VirtualTerminal
class IStateMachineEngine
{
public:
using StringHandler = std::function<bool(const wchar_t)>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly agree that the VT parsing has always operated character by character so I don't think this will be the thing to suddenly take it down a notch.

I do know from profiling that we should be looking for more opportunities to bulk dispatch content... but it tends to be elsewhere in the codebase that could use bulk optimization more desperately than here (like the output buffer storage itself) .

We'll cross the bridge when we get to it.

Also I'm excited by

I did eventually hack together a basic Sixel parser branched off of this

// Arguments:
// - wch - Character to dispatch.
// Return Value:
// - <none>
void StateMachine::_ActionDcsPassThrough(const wchar_t wch)
void StateMachine::_ActionDcsDispatch(const wchar_t wch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the cleanup in here. I'm fine with fewer states as it's less difficult to mentally juggle.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally.

Copy link
Collaborator

@skyline75489 skyline75489 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited about the actual usage of DCS sequences 😄

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the case question. Clear for approval otherwise -- just like all your other work, this advances the state of the console significantly. Thank you!

{
Log::Comment(L"Escape from DcsTermination");
mach._state = StateMachine::VTStates::DcsTermination;
mach._dcsStringHandler = [](const auto) { return true; };
break;
}
case 18:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about this -- it looks like we're going from case 16 to 18, but the test data only runs to 17. 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. That was a dumb mistake. Thanks for catching that.

@ghost ghost added the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Apr 28, 2021
@ghost ghost removed the Needs-Author-Feedback The original author of the issue/PR needs to come back and respond to something label Apr 29, 2021
@DHowett DHowett added the AutoMerge Marked for automatic merge by the bot when requirements are met label Apr 30, 2021
@ghost
Copy link

ghost commented Apr 30, 2021

Hello @DHowett!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!!

@ghost ghost merged commit 2559ed6 into microsoft:main Apr 30, 2021
@j4james j4james deleted the dcs-passthrough branch May 6, 2021 14:46
@ghost
Copy link

ghost commented May 25, 2021

🎉Windows Terminal Preview v1.9.1445.0 has been released which incorporates this pull request.:tada:

Handy links:

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-VT Virtual Terminal sequence support AutoMerge Marked for automatic merge by the bot when requirements are met Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Product-Conhost For issues in the Console codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Send the DCS passthrough sequence to the engine
5 participants