Ensure til::u8u16 still works if the string consists of just a lead byte #4685

Merged
1 commit merged into from
Feb 21, 2020

Contributor

@german-one german-one commented Feb 21, 2020

Summary of the Pull Request

Fixes a flaw that occurred when til::u8u16 received a string consisting of a single lead byte.

PR Checklist

Detailed Description of the Pull Request / Additional comments

The loop that caches partials never ran, so the lead byte was
converted to U+FFFD. The loop starts with sequenceLen initialized
to 1, so for a string of length 1 the initial condition is 1<1,
which evaluates to false, and the body of the loop was never
executed.
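
The off-by-one described above can be reproduced with a minimal sketch. This is not the actual code from `u8u16convert.h`: the function names, the `fixed` switch, and the exact bound used for the repaired loop are assumptions made for illustration. Only `sequenceLen`, `stopLen`, and the `1<1` condition come from the description.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper: total length of the UTF-8 sequence announced by a
// lead byte, or 0 if the byte is not a lead byte (ASCII or continuation).
inline std::size_t expectedLength(unsigned char b)
{
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 0;
}

// Returns how many trailing bytes of `in` form an incomplete UTF-8 sequence.
// fixed == false mimics the described flaw: the bound is min(size, 4), so a
// 1-byte string yields the condition 1 < 1 and the loop body never runs.
// fixed == true uses an assumed repair: scan up to min(size, 3) bytes.
inline std::size_t trailingPartial(std::string_view in, bool fixed)
{
    if (in.empty()) return 0;
    const std::size_t stopLen = fixed
        ? std::min<std::size_t>(in.size(), 3) + 1
        : std::min<std::size_t>(in.size(), 4);
    for (std::size_t sequenceLen = 1; sequenceLen < stopLen; ++sequenceLen)
    {
        const auto b = static_cast<unsigned char>(in[in.size() - sequenceLen]);
        const std::size_t need = expectedLength(b);
        if (need != 0)
        {
            // Lead byte found: it is a partial only if bytes are still missing.
            return need > sequenceLen ? sequenceLen : 0;
        }
        if ((b & 0xC0) != 0x80) return 0; // ASCII byte: nothing to cache
    }
    return 0;
}
```

With this sketch, `trailingPartial("\xE2", false)` reports no partial (the lead byte would then be treated as invalid), while `trailingPartial("\xE2", true)` reports one byte to cache.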

Validation Steps Performed

  1. updated the code of the state class and tested manually that printf "\xE2"; printf "\x98\xBA\n" prints a U+263A character
  2. updated the unit tests to make sure that up to 3 partials are still
    cached
  3. updated the unit tests to make sure caching also works if the string
    consists of a lead byte only
  4. tested manually that issue #4086 ("Certain invalid UTF-8 sequences can cause the output to fail") is still resolved
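
The manual test in step 1 splits one character across two writes. A rough sketch of that scenario follows, assuming a stateful buffer that holds back an incomplete trailing sequence and prepends it to the next chunk. `Utf8ChunkBuffer` and its members are hypothetical names, not the real til::u8u16 state class, and the UTF-16 conversion itself is left out.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a converter state object.
class Utf8ChunkBuffer
{
public:
    // Returns the bytes that are complete and safe to convert now. An
    // incomplete trailing sequence (at most 3 bytes) is cached and
    // prepended to the next chunk.
    std::string feed(std::string_view chunk)
    {
        std::string data = std::move(_partials);
        _partials.clear();
        data.append(chunk);
        const std::size_t cut = incompleteTail(data);
        _partials.assign(data, data.size() - cut, cut);
        data.resize(data.size() - cut);
        return data;
    }

private:
    // Number of trailing bytes that belong to an unfinished sequence.
    static std::size_t incompleteTail(std::string_view s)
    {
        for (std::size_t n = 1; n <= 3 && n <= s.size(); ++n)
        {
            const auto b = static_cast<unsigned char>(s[s.size() - n]);
            if ((b & 0xC0) == 0x80) continue; // continuation byte, keep looking
            const std::size_t need = (b & 0xE0) == 0xC0 ? 2
                                   : (b & 0xF0) == 0xE0 ? 3
                                   : (b & 0xF8) == 0xF0 ? 4
                                                        : 0;
            return need > n ? n : 0;
        }
        return 0;
    }

    std::string _partials;
};
```

Feeding `"\xE2"` first returns an empty string because the lead byte is cached. Feeding `"\x98\xBA\n"` next returns the complete bytes `"\xE2\x98\xBA\n"`, which is the UTF-8 encoding of U+263A followed by a newline.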

… single lead byte only (GH#4673)

# Conflicts:
#	src/inc/til/u8u16convert.h
#	src/til/ut_til/u8u16convertTests.cpp
@german-one german-one changed the title from "Make sure caching of partials still works if the string consists of a single lead byte only" to "Make sure caching of UTF-8 partials still works if the string consists of a single lead byte only" Feb 21, 2020
@DHowett-MSFT DHowett-MSFT changed the title from "Make sure caching of UTF-8 partials still works if the string consists of a single lead byte only" to "Ensure til::u8u16 still works if the string consists of just a lead byte" Feb 21, 2020
Contributor

@DHowett-MSFT DHowett-MSFT left a comment

I'm happy with this. I had to sit down with a pad and pencil to make sure I understood how sequenceLen/stopLen worked w/ the partial masks, but I do understand it now and this seems like the correct fix. I changed the title a little bit to make it fit in a single git commit title. 😄

@german-one
Contributor Author

Love it. Thanks!

@DHowett-MSFT DHowett-MSFT added the Needs-Second (It's a PR that needs another sign-off) and AutoMerge (Marked for automatic merge by the bot when requirements are met) labels Feb 21, 2020
@DHowett-MSFT
Contributor

@msftbot merge this in 1 minute

@ghost
ghost commented Feb 21, 2020

Hello @DHowett-MSFT!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

  • I won't merge this pull request until after the UTC date Fri, 21 Feb 2020 20:36:49 GMT, which is in 1 minute

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

@ghost
ghost commented Feb 21, 2020

Hello @DHowett-MSFT!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

Do note that I've been instructed to only help merge pull requests of this repository that have been opened for at least 8 hours, a condition that will be fulfilled in about 51 seconds. No worries though, I will be back when the time is right! 😉

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit b8e3356 into microsoft:master Feb 21, 2020
@DHowett-MSFT DHowett-MSFT deleted the dev/duhowett/cherry-pick-3003 branch February 21, 2020 20:46
@DHowett-MSFT
Contributor

🎉 Once again, thanks for the contribution!

This pull request was included in a set of conhost changes that was just
released with Windows Insider Build 19603.

This pull request was closed.
Labels
AutoMerge: Marked for automatic merge by the bot when requirements are met
Needs-Second: It's a PR that needs another sign-off
Development

Successfully merging this pull request may close these issues.

Recognizing of UTF-8 partials in til::u8u16 may fail
3 participants