How to not use wchar/wstring on windows? #239

Closed
nuuSolutions opened this issue Feb 16, 2023 · 6 comments

@nuuSolutions

First up, I like the simplicity of this.

I'm on Windows, but I always try to avoid using wchar_t, std::wstring, etc.

There is the PLOG_ENABLE_WCHAR_INPUT flag, but setting it to 0 does nothing except prevent direct use of L"Hello".
Internally you still use std::wstring and std::wostringstream (as far as I understand).
I'm worried about the conversions that silently happen; they are just not necessary.
Maybe you assume that's what Windows users want, but that's not always the case.

@SergiusTheBest
Owner

wchar is your friend on Windows. Windows stores everything in wchar internally and converts char calls to wchar calls (for example, CreateFileA performs a conversion and calls CreateFileW under the hood). For historical reasons wchar is the only way to deal reliably with non-ASCII characters on Windows, and that's why it's used in plog.

If you're using only ASCII and a specific code page (ANSI), then the conversion looks unnecessary. But it's still required if you want UTF-8 logs. There is no direct ANSI-->UTF-8 conversion available, so you have to take the longer path ANSI-->WCHAR-->UTF-8. So wchar can't be dropped on Windows.
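
To make the two-step path concrete, here is a minimal portable sketch. It is not plog's code: on Windows the two steps would normally be done with `MultiByteToWideChar` and `WideCharToMultiByte`. As a stand-in for an ANSI code page, the sketch assumes ISO-8859-1, whose byte values equal the Unicode code points U+0000 to U+00FF. All function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Step 1 (ANSI --> WCHAR): widen each byte to a UTF-16 code unit.
// Assumption: the "ANSI" code page is ISO-8859-1, where byte value == code point.
std::u16string latin1_to_utf16(const std::string& ansi) {
    std::u16string wide;
    wide.reserve(ansi.size());
    for (unsigned char byte : ansi) {
        wide.push_back(static_cast<char16_t>(byte));
    }
    return wide;
}

// Step 2 (WCHAR --> UTF-8): encode each code unit as UTF-8.
// Only code points below U+0800 are covered, so surrogate pairs are not handled.
std::string utf16_to_utf8(const std::u16string& wide) {
    std::string utf8;
    for (char16_t unit : wide) {
        assert(unit < 0x800 && "sketch only covers one- and two-byte UTF-8 sequences");
        if (unit < 0x80) {
            utf8.push_back(static_cast<char>(unit));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (unit >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (unit & 0x3F)));
        }
    }
    return utf8;
}

// The full path ANSI --> WCHAR --> UTF-8.
std::string ansi_to_utf8(const std::string& ansi) {
    return utf16_to_utf8(latin1_to_utf16(ansi));
}
```

For example, `ansi_to_utf8("caf\xE9")` yields the bytes `63 61 66 C3 A9`, while pure ASCII input comes back unchanged.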

With other OSes the situation is different. wchar is almost never used and char is already UTF-8 encoded. So PLOG_ENABLE_WCHAR_INPUT is meant for non-Windows OSes where wchar is optional.

@dgrunwald

dgrunwald commented Feb 23, 2023

> wchar is your friend on Windows.

I strongly disagree there. UTF-8 everywhere is the way to go.
A logging framework should not force an ANSI->WCHAR->ANSI conversion (when logging to a redirected stdout) on a program that prefers to use UTF-8 everywhere.

Yes, OS system calls need to use wchar_t, but everything else doesn't. Libraries like poco use UTF-8 for all their std::strings. A program following this approach is currently unable to use plog.

plog needs a mode where it
a) assumes char* and std::string are already UTF-8
b) doesn't bother with conversions to UTF-16 since those are unnecessary in almost all cases (WriteConsoleW is the exception)
wchar input would not be necessary in this mode, as Unicode strings in such programs use std::string, not std::wstring.
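
A rough sketch of what such a mode implies: narrow strings pass through untouched, and UTF-16 is produced only at the one boundary that needs it (for example, right before a `WriteConsoleW` call). The `utf8_to_utf16` helper below is hypothetical and not plog code; it assumes its input is already valid UTF-8 and performs no validation beyond a truncation check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decodes UTF-8 into UTF-16 code units. Assumes valid UTF-8 input.
std::u16string utf8_to_utf16(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        // The lead byte determines the length of the sequence.
        const std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        assert(i + len <= utf8.size() && "truncated UTF-8 sequence");
        // Payload bits of the lead byte, then 6 bits per continuation byte.
        char32_t cp = (len == 1) ? lead : (lead & (0xFFu >> (len + 1)));
        for (std::size_t k = 1; k < len; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With this split, file and pipe output would write the `std::string` bytes as they are, and only the console path would call the decoder.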

@SergiusTheBest
Owner

@dgrunwald UTF-8 everywhere has drawbacks on Windows:

  • there will be more char conversions than when using the native char encoding
  • no tools, including debuggers, assume char is UTF-8, so you won't see the correct string content
  • WinAPI and 3rd-party libraries don't expect UTF-8 char (some libraries support such mode though)
  • int main(int argc, char** argv) is not UTF-8
  • you can misinterpret what char is: is it UTF-8, or did it come from WinAPI and you haven't converted it yet? Did you forget to convert it, or did you convert it twice? No one knows :(
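
The last point can be reproduced in a few lines. The sketch below uses a hypothetical `latin1_to_utf8` helper (ISO-8859-1 standing in for an ANSI code page) and applies it twice to the same data, which is what happens when a string that is already UTF-8 is mistaken for an unconverted one.

```cpp
#include <cassert>
#include <string>

// Re-encodes ISO-8859-1 bytes as UTF-8. The function cannot tell whether its
// input was already converted; a std::string carries no encoding information.
std::string latin1_to_utf8(const std::string& input) {
    std::string utf8;
    for (unsigned char byte : input) {
        if (byte < 0x80) {
            utf8.push_back(static_cast<char>(byte));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return utf8;
}

// "é" is the single byte E9 in ISO-8859-1. One conversion gives C3 A9 (correct),
// a second one gives C3 83 C2 A9, which displays as "Ã©".
void demonstrate_double_conversion() {
    const std::string once = latin1_to_utf8("\xE9");
    const std::string twice = latin1_to_utf8(once);
    assert(once == "\xC3\xA9");
    assert(twice == "\xC3\x83\xC2\xA9");
}
```

ASCII-only strings survive any number of conversions, which is why this kind of bug tends to show up late.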

IMHO you'll have more pain than gain using such an approach on Windows. However, YMMV; for your case it could be a good option.

I'm adding support for such a mode. Luckily it doesn't require a lot of changes.

@dgrunwald

This is an existing, decades-old multi-platform codebase. I can't change its Unicode strategy without a massive multi-year conversion project (which no sane person would attempt).
Having had a decade of experience with the UTF-8 everywhere approach, I'd say you're overstating its downsides. I'd use this approach again for new projects!

Also, all third-party libraries we've used so far have been compatible with this approach (by default! without having to enable "a mode"). plog was the first library giving us trouble. For now we've forked plog and replaced the occurrences of CP_ACP with CP_UTF8 to make it work for us.

@SergiusTheBest
Owner

@dgrunwald Recently I worked with sentry, and their library uses wchar_t for file paths. LZMA SDK is also wchar_t. As far as I remember, boost doesn't support UTF-8 chars on Windows either. So the experience may vary. I'm glad you shared yours.

@SergiusTheBest SergiusTheBest added this to the 1.1.10 milestone Mar 3, 2023
SergiusTheBest added a commit that referenced this issue Mar 3, 2023
SergiusTheBest added a commit that referenced this issue Mar 3, 2023
@SergiusTheBest
Owner

@nuuSolutions I've added support for Utf8Everywhere. Just compile your project with the /utf-8 command line switch or define PLOG_CHAR_IS_UTF8 to 1.
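
Whether narrow literals are actually UTF-8 in a given build can be checked independently of plog. The sketch below is only an illustration and not part of plog: the hypothetical `narrow_literals_are_utf8` helper inspects how the compiler encoded `\u00E9`, which takes two bytes (`C3 A9`) when the execution character set is UTF-8, as it is when MSVC is invoked with /utf-8.

```cpp
#include <cassert>

// "\u00E9" is stored in the execution character set chosen at compile time.
// In UTF-8 it is the two bytes C3 A9, plus the terminating NUL.
constexpr char kEAcute[] = "\u00E9";

constexpr bool narrow_literals_are_utf8() {
    return sizeof(kEAcute) == 3 &&
           static_cast<unsigned char>(kEAcute[0]) == 0xC3 &&
           static_cast<unsigned char>(kEAcute[1]) == 0xA9;
}

// Runtime guard for debug builds; a static_assert on the same condition
// would reject a non-UTF-8 build at compile time instead.
void check_utf8_literals() {
    assert(narrow_literals_are_utf8() && "narrow literals are not UTF-8 in this build");
}
```

Note that this only covers string literals. Strings that arrive at run time (file names, command line arguments, WinAPI results) still have to be UTF-8 by the program's own convention.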
