-
-
Notifications
You must be signed in to change notification settings - Fork 650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unicode normalization follow up, adding character navigation and several fixes #16622
Conversation
Hi @LeonarddeR - just noting that #16616 hasn't been triaged yet. |
Thanks for pointing that out. Note that I was aware of the standpoint to delay this pr to 2024.4, but others might not have been. |
@seanbudd this seems quite stable, which points do you think need still clarification? If you need more community feedback, this needs to be merged into alpha. I don‘t think there is anythink open regarding speech in #16616. you can also keep this behavior in alpha until you think it is stable enough to bring it into Beta. At least this is how we dealt with new features in the past as well, see e.g. cancellable speech which was enabled only in alpha for a longer period of time until it finally reached the stable version. |
@seanbudd would you accept milestone 2024.3 for this PR if we change default value to disabled (see #16624 (comment))? This would at least avoid to ship 2024.3 with a buggy feature, closing #16624. A subsequent PR for 2024.4 could then switch the default value to enabled. |
@CyrilleB79 - yes as requested in #16624 (comment), however @LeonarddeR seems to suggest in #16624 (comment) that this is not possible for whatever reason |
@seanbudd I must have misunderstood you then. I thought you were asking in #16624 (comment) whether it would be possible to fix #16624 without changing character navigation behavior. That would be very difficult. If you'd agree changing this pr to disable normalization again by default but yet change character navigation to also normalize, that is certainly possibly and I can change the pr as requested by @CyrilleB79. |
That would be great yes. The normalization fix can go into 2024.3, but we don't want to turn it on by default until it has had some wider testing in a release |
See test results for failed build of commit 8e0d3c7fdb |
See test results for failed build of commit de6705d08d |
Tip For best results, initiate chat on the files or code changes. @LeonarddeR, after reviewing the To reduce code duplication, you could consider merging these two functions into a single function that accepts a bool calculateOffsets(UNIT unit, wchar_t* text, int textLength, int offset, int* startOffset, int* end, int* endOffset) {
return _calculateUniscribeOffsets(unit, text, textLength, offset, startOffset, endOffset);
} You can then replace calls to Let me know if you need further assistance with this refactoring or any other part of your code! Analysis chainScripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Locate the textUtils.cpp file in the repository
fd textUtils.cpp
Length of output: 49 Script: #!/bin/bash
# Display the contents of the textUtils.cpp file
cat nvdaHelper/local/textUtils.cpp
Length of output: 4852 |
@coderabbitai I'd rather fix duplication between calculateCharacterBoundaries and _calculateUniscribeOffsets. Note that the former swaps some c style code for c++, but a major part of both is very similar. |
Tip For best results, initiate chat on the files or code changes. @LeonarddeR, after reviewing the implementations of Observations:
Proposed Refactoring:To reduce duplication, you can abstract the common parts of these functions into helper functions. Here’s a proposed refactoring:
This approach not only reduces duplication but also enhances maintainability by centralizing the error handling and memory management, making the code cleaner and easier to manage. Let me know if this refactoring aligns with your expectations or if you need further modifications! Analysis chainScripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Extract the implementations of calculateCharacterBoundaries and _calculateUniscribeOffsets from textUtils.cpp
ast-grep --lang cpp --pattern $'$_ $_(calculateCharacterBoundaries, _calculateUniscribeOffsets)($$$) {
$$$
}' --file textUtils.cpp
Length of output: 327 Script: #!/bin/bash
# Correctly extract the implementations of calculateCharacterBoundaries and _calculateUniscribeOffsets from textUtils.cpp
ast-grep --lang cpp --pattern $'$_ $_(calculateCharacterBoundaries, _calculateUniscribeOffsets)($$$) {
$$$
}'
Length of output: 112 Script: #!/bin/bash
# Fetch the entire content of textUtils.cpp
cat textUtils.cpp
Length of output: 65 Script: #!/bin/bash
# Search for the textUtils.cpp file in the repository to find the correct path
fd textUtils.cpp
Length of output: 49 Script: #!/bin/bash
# Fetch the entire content of textUtils.cpp from the correct path
cat nvdaHelper/local/textUtils.cpp
Length of output: 4852 |
@coderabbitai I didn't like your suggestions much. I refactored it in the last commit. |
Tip For best results, initiate chat on the files or code changes. @LeonarddeR, I've reviewed the latest changes in Using If there's anything specific you'd like feedback on or further improvements you're considering, feel free to let me know! Great work on these refinements. Analysis chainScripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Fetch the latest changes made to the textUtils.cpp file in the last commit
git diff HEAD~1 -- nvdaHelper/local/textUtils.cpp
Length of output: 3463 |
@LeonarddeR -is this ready for re-review? I'd encourage any change of policy/usage of feature flags to be moved to a separate discussion. |
@seanbudd Yes, I think the unit tests, including some edge cases, seem to prove that this works as advertised. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UserGuide looks good.
This is because people didn't pay attention to the actual purpose of feature flags when reviewing the coresponding PRs. I think it makes sense to clearly not encourage using these feature flags outside of the advanced settings panel anymore, and state the actuual purpose in the policy docs. |
is this ready for re-review? |
Yes, sorry. Forgot to mark it as ready |
…symbol definition, the symbol replacement is spoken (#16950) Fixup for #16622 Summary of the issue: When unicode normalization of a character (e.g. ·) resulted into a character that had a symbol definition (e.g. ·, middle dot), the symbol definition wasn't applied to the normalization. This resulted in NVDA speaking nothing or only the word normalized. Description of user facing changes NVDA will now properly speak the · character (Greek Ano Teleia) as middle dot when normalizing. This also applies to other characters where normalization results in a character that's part of the symbol dictionary. Description of development approach When normalizing a character, ensure it is thrown through characterProcessing.processSpeechSymbol.
Link to issue number:
Fixes #16622
fixes #16640
Summary of the issue:
It has been discussed that normalization would also be helpful for character navigation. There's also an issue where character descriptions and symbol pronunciation didn't work correctly because normalization took place before symbol processing. Furthermore, for the UnicodeNormalizationOffsetConverter used for braille, it was discovered that diffing didn't turn out to be accurate enough.
Description of user facing changes
Report normalized when navigating by character
in the speech settings.Description of development approach
SuppressUnicodeNormalizationCommand
that allows you to suppress global normalization within a speech sequence. This command is used when creating a spelling sequence, because spelling has its own normalization logic now. It also ensures that when spelling a character or providing a character description (i.e. NVDA+. double press), normalization does not occur. It can even be used to disable character normalization altogether if necessary.UnicodeNormalizationOffsetConverter
to use a new function in NVDAHelper local that uses uniscribe to give a list of offsets for character boundaries. This allows us to split a string into glyphs and then apply normalization to every glyph, which results in more reliable offset calculation and less complex code.Testing strategy:
a accute
when navigating by character, thereby improving that behavior as wellKnown issues with pull request:
None known
Code Review Checklist:
Summary by CodeRabbit
New Features
Enhancements
Tests
Documentation