Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use locale.getpreferredencoding rather than locale.getlocale as latter can fail if Python doesn't know the locale #11384

Merged
merged 3 commits into from
Jul 20, 2020

Conversation

lukaszgo1
Copy link
Contributor

Link to issue number:

Fixes #11155

Summary of the issue:

Whenever we needed to access the ANSI Windows code page we were using locale.getlocale. This was problematic for two reasons:

  1. `locale.getlocale`` returns the Python locale which we're changing when setting NVDA language so returned code page may not correspond to the user code page for example when someone's system is in Polish but NVDA in English.
  2. More importantly Python, or rather the win32 function which Python invokes under the hood, cannot determine what code page should be used for some locales e.g. Aragonese.

Description of how this pull request fixes the issue:

In places where locale.getlocale was formerly used I've switched to locale.getpreferredencoding as this method returns ANSI code page of the current Windows user.

Testing performed:

Switched NVDA language to Aragonese, navigated in Notepad, run dialog and cmd - previously navigation was not possible due to the fact that AN was not recognized as a valid locale by Python.

Known issues with pull request:

Change log entry:

Bug fixes

It is once again possible to navigate in various controls when NVDA is set to Aragonese.

@michaelDCurran
Copy link
Member

I'm not sure that locale.getpreferredencoding actually honours the locale NVDA is set to, rather it only honours the locale the Windows user account is configured for.
At least, for me:
On my system:
locale.getlocale returns:
('English_United Kingdom', '1252')
locale.getpreferredencoding returns:
'cp1252'
Which is fine. But:
Then if I call
locale.setlocale(locale.LC_ALL, 'zh')
I get:
locale.getlocale returns:
('zh_CN', 'eucCN')
locale.getpreferredencoding returns:
'cp1252'
This last one looks wrong to me.

Perhaps we can catch the exception from locale.getlocale and fall back to locale.getpreferredencoding?

@LeonarddeR
Copy link
Collaborator

I actually think this pr is right, and @michaelDCurran's concern doesn't apply here.

  1. First of all, the only reason why i decided to use locale.getlocale here was that it was used all over the place in the code pre textUtils module to deal with offset differences between Python 3 strings and Windows wide character strings with surrogate characters #9545, so it was merely copying existing behaviour. The reason why this fails in Python 3 versions of NVDA is that this code is used more widely now and that we're fetching the locale more often.
  2. locale.getlocale returns locale and codepage of NVDA while locale.getpreferredencoding returns the code page of the system. I believe that we really should use the code page of the system in this code, which can be fetched with locale.getpreferredencoding.

Copy link
Collaborator

@LeonarddeR LeonarddeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this pr doesn't fix the issue that locale.getlocale fails when NVDA is set to Aragonese. I think that's still a thing that should be addressed in a follow up. May be we should file this against Python or something.

source/textUtils.py Outdated Show resolved Hide resolved
@@ -18,6 +17,8 @@
from logHandler import log

WCHAR_ENCODING = "utf_16_le"
USERANSICODEPAGE = locale.getpreferredencoding()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the style used for constants here:

Suggested change
USERANSICODEPAGE = locale.getpreferredencoding()
USER_ANSI_CODE_PAGE = locale.getpreferredencoding()

Or rather

Suggested change
USERANSICODEPAGE = locale.getpreferredencoding()
SYSTEM_PREFERRED_ENCODING = locale.getpreferredencoding()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed it to USER_ANSI_CODE_PAGE because it can be set per Windows user account and is not global to the system.

source/textInfos/offsets.py Outdated Show resolved Hide resolved
@josephsl
Copy link
Collaborator

josephsl commented Jul 16, 2020 via email

@lukaszgo1
Copy link
Contributor Author

@LeonarddeR wrote:

Note that this pr doesn't fix the issue that locale.getlocale fails when NVDA is set to Aragonese. I think that's still a thing that should be addressed in a follow up. May be we should file this against Python or something.

I don't think there is much we can do about that - an as a locale code is simply not known to Windows that's why Python fails.

@lukaszgo1
Copy link
Contributor Author

@LeonarddeR All your commends are now addressed.

@michaelDCurran
Copy link
Member

michaelDCurran commented Jul 16, 2020 via email

@michaelDCurran michaelDCurran merged commit 41970ca into nvaccess:master Jul 20, 2020
@nvaccessAuto nvaccessAuto added this to the 2020.3 milestone Jul 20, 2020
michaelDCurran added a commit that referenced this pull request Jul 20, 2020
@lukaszgo1 lukaszgo1 deleted the I11155 branch July 20, 2020 09:00
@josephsl josephsl mentioned this pull request Sep 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Odd behaviour when NVDA is in aragonese
5 participants