Introduce a mechanism for passing through DCS data strings #9307
I'm leaving this as a draft for now, because it's possible I may find I need to tweak things when I try to use it in different scenarios, but I think it's probably OK as is. If anyone is waiting for this functionality, I'd be happy to make it ready for review.
Personally, I'd love to see it land. It opens up all sorts of possibilities for DCS sequences. The core devs are kind of busy these days, so I'm kindly asking @j4james to have more patience on this one.
FYI, we're working toward some internal deadlines, which is slowing things down. Thanks as always for your patience.
@@ -20,6 +20,8 @@ namespace Microsoft::Console::VirtualTerminal
class IStateMachineEngine
{
public:
using StringHandler = std::function<bool(const wchar_t)>;
I had this question back when I was working on the Sixel implementation. Now I'm trying to make tmux control mode happen, which also uses DCS, and the same question has come up: sending the data one wchar_t at a time might slow down parsing when the DCS sequence is very large, because the handler is called so frequently that it hurts. Perhaps the soft font feature would not suffer from the performance hit, because the length of the sequence is limited?
If your concern is about passing the data over conpty, that's not something I've really tried to cover with this PR. Technically I think it could be done, and for something like tmux cc you could probably buffer a line at a time if that makes a difference, but I don't think it's practical for something like Sixel. I've become more and more convinced that the only way we're going to get some of these more complicated VT operations working reasonably in Windows Terminal is with a rewrite of conpty. The VT data stream really has to be passed through uninterrupted.
I was mainly concerned about the application side, because I'm not using conpty when prototyping tmux cc. After parsing the output from tmux, I send the parsed data over a named pipe, a nice Win32 feature, to avoid the complicated logic of passing data around. Using a named pipe will also enable tmux cc in cross-instance scenarios (in the future, after tab tearing and all that). However, I think this approach might not be feasible for every DCS sequence.
Eventually, for a serious implementation, I think I still need to pass all the DCS sequences across the ConPTY layer for WT to handle in an upper layer. Yeah, you're probably right that "the VT data stream really has to be passed through uninterrupted". This would save so much effort and improve the overall performance.
I'm still not sure what your concern is regarding the handler. The input is already being processed one character at a time by the state machine, so I wouldn't have thought the overhead of a std::function call would make a significant difference to that process. And what would the alternative be anyway? Buffer the data before passing it through to the handler? That's just another form of overhead - potentially even worse? - and then sequences like Sixel and REGIS wouldn't be able to handle real-time updates.
Maybe I need to hack together a POC of Sixel in conhost to see how well it performs. But I really can't think of a better way of handling this.
My concern is not about using `std::function` as the mechanism for the DCS handler. My concern is mainly about the actual code that handles the DCS data: for example, the Sixel parser that parses the Sixel data, the network handler that transfers data for tmux cc, etc.
Because the data is sent per `wchar_t`, the actual code that handles it needs to be fast enough to avoid hurting performance. But you're right. "Buffering the data before passing it through to the handler" is just another form of overhead. And I can't think of a better way of handling this, either.
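For readers following the buffering discussion, here is a minimal sketch of how a consumer such as tmux cc could collect a line at a time on top of the per-character handler, so that only the expensive work is batched. The names `LineHandler` and `MakeLineBufferedHandler` are hypothetical and not part of this PR; only the `StringHandler` alias comes from the diff.

```cpp
#include <functional>
#include <string>
#include <utility>

// Same shape as the alias added to IStateMachineEngine in this PR.
using StringHandler = std::function<bool(const wchar_t)>;

// Hypothetical per-line callback; returns false to stop receiving data.
using LineHandler = std::function<bool(const std::wstring&)>;

// Hypothetical adapter: wraps a per-line callback in a per-character handler.
StringHandler MakeLineBufferedHandler(LineHandler onLine)
{
    // The lambda is mutable so the captured line buffer persists between calls.
    return [onLine = std::move(onLine), line = std::wstring{}](const wchar_t wch) mutable {
        if (wch == L'\n')
        {
            // A complete line is available: hand it over in one call.
            const bool keepGoing = onLine(line);
            line.clear();
            return keepGoing;
        }
        if (wch == L'\x1b')
        {
            // The final ESC marks the end of the data string, so flush
            // whatever partial line is left.
            if (!line.empty())
            {
                onLine(line);
            }
            line.clear();
            return false;
        }
        line.push_back(wch);
        return true;
    };
}
```

The per-character call stays cheap (a comparison and a `push_back`), while the per-line callback is free to do heavier work. This would not suit Sixel or ReGIS, which need to render as the data arrives.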
Btw, I did eventually hack together a basic Sixel parser branched off of this, and it seemed to perform reasonably well. I was just blitting the resulting image to the screen to test, so obviously there would be more overhead for proper storage and rendering, but the parsing seemed instantaneous. So I don't think the way I've implemented the string handler is going to be a bottleneck.
I mostly agree that the VT parsing has always operated character by character so I don't think this will be the thing to suddenly take it down a notch.
I do know from profiling that we should be looking for more opportunities to bulk dispatch content... but it tends to be elsewhere in the codebase that could use bulk optimization more desperately than here (like the output buffer storage itself).
We'll cross the bridge when we get to it.
Also I'm excited by "I did eventually hack together a basic Sixel parser branched off of this".
FYI, I've marked this as ready for review now. I've been using it for a while and haven't felt the need to change anything, so I think it's probably OK. It's not blocking anything for me at the moment, though, so there's no hurry to get it merged.
// Arguments:
// - wch - Character to dispatch.
// Return Value:
// - <none>
void StateMachine::_ActionDcsPassThrough(const wchar_t wch)
void StateMachine::_ActionDcsDispatch(const wchar_t wch)
I love the cleanup in here. I'm fine with fewer states as it's less difficult to mentally juggle.
Totally.
Excited about the actual usage of DCS sequences 😄
Just the case question. Clear for approval otherwise -- just like all your other work, this advances the state of the console significantly. Thank you!
{
    Log::Comment(L"Escape from DcsTermination");
    mach._state = StateMachine::VTStates::DcsTermination;
    mach._dcsStringHandler = [](const auto) { return true; };
    break;
}
case 18:
I'm concerned about this -- it looks like we're going from case 16 to 18, but the test data only runs to 17. 😄
Oops. That was a dumb mistake. Thanks for catching that.
thanks!!
This PR introduces a mechanism via which DCS data strings can be passed
through directly to the dispatch method that will be handling them, so
the data can be processed as it is received, rather than being buffered
in the state machine. This also simplifies the way string termination is
handled, so it now more closely matches the behaviour of the original
DEC terminals.
The way this now works, a `DCS` sequence is dispatched as soon as the
final character of the `VTID` is received. Based on that ID, the
`OutputStateMachineEngine` should forward the call to the corresponding
dispatch method, and it's the responsibility of that method to return an
appropriate handler function for the sequence.
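As a rough illustration of that contract, the sketch below shows a dispatch method that returns a handler, and an engine-side routing function that picks the dispatch method from the sequence ID. Everything except the `StringHandler` alias is hypothetical: `SoftFontBuffer`, `DownloadSoftFont`, and `DispatchDcs` are made-up names, and the ID is simplified to a single final character rather than a full `VTID`.

```cpp
#include <functional>
#include <string>

// Same shape as the alias added to IStateMachineEngine in this PR.
using StringHandler = std::function<bool(const wchar_t)>;

// Hypothetical sink that a dispatch method might fill in.
struct SoftFontBuffer
{
    std::wstring data;
    bool complete = false;
};

// Hypothetical dispatch method. It is called once, when the sequence is
// dispatched, and returns the handler that will receive the data string.
StringHandler DownloadSoftFont(SoftFontBuffer& buffer)
{
    return [&buffer](const wchar_t wch) {
        if (wch == L'\x1b')
        {
            // A final ESC signals that the data string has ended.
            buffer.complete = true;
            return false;
        }
        buffer.data.push_back(wch);
        return true;
    };
}

// Hypothetical engine-side routing, keyed on the final character only.
// An empty handler means the sequence is not supported.
StringHandler DispatchDcs(const wchar_t finalChar, SoftFontBuffer& buffer)
{
    switch (finalChar)
    {
    case L'{': // final character of DECDLD (soft font download)
        return DownloadSoftFont(buffer);
    default:
        return nullptr;
    }
}
```

Because the handler is an ordinary closure, any per-sequence state (here, a reference to the buffer) travels with it, and the state machine needs no knowledge of what the data means.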
From then on, the `StateMachine` will pass on all of the remaining bytes
in the data string to the handler function. When a data string is
terminated (with `CAN`, `SUB`, or `ESC`), the `StateMachine` will pass
on one final `ESC` character to let the handler know that the sequence
is finished. The handler can also end a sequence prematurely by
returning false, and then all remaining data bytes will be ignored.
Note that once a `DCS` sequence has been dispatched, it's not possible
to abort the data string. Both `CAN` and `SUB` are considered valid
forms of termination, and an `ESC` doesn't necessarily have to be
followed by a `\` for the string terminator. This is because the data
string is typically processed as it's received. For example, when
outputting a Sixel image, you wouldn't erase the parts that had already
been displayed if the data string is terminated early.
With this new way of handling the string termination, I was also able to
simplify some of the `StateMachine` processing, and get rid of a few
states that are no longer necessary. These changes don't apply to the
`OSC` sequences, though, since we're more likely to want to match the
XTerm behavior for those cases (which requires a valid `ST` control for
the sequence to be accepted).
Validation Steps Performed
For the unit tests, I've had to make a few changes to some of the
`OutputEngineTests` to account for the updated `StateMachine`
processing. I've also added a new `StateMachineTest` to confirm that the
data strings are correctly passed through to the string handler under
all forms of termination.
To test whether the framework is actually usable, I've been working on
DRCS Soft Font support branched off of this PR, and haven't encountered
any problems. To test the throughput speed, I also hacked together a
basic Sixel parser, and that seemed to perform reasonably well.
Closes #7316