Dump throwing exception #1416

Closed
Duchex opened this issue Jan 7, 2019 · 7 comments
Labels
kind: question, solution: proposed fix (a fix for the issue has been proposed and waits for confirmation)

Comments

@Duchex

Duchex commented Jan 7, 2019

Hi
If I run the following code, the dump method throws an exception:

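#include <iostream>
#include <string>
#include "json.hpp"

int main() {
    nlohmann::json d;
    // The literal's bytes depend on the source/execution charset chosen by the compiler.
    std::string test = "Test string [1094:!èéçßëêâú_{}_[+!£$#*=`~]_&34;B31-TestTest] - Test";
    d["test"] = test;

    // Throws if the stored string is not valid UTF-8.
    std::cout << d.dump(4) << std::endl;
}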

I understand that the json library uses UTF-8, but why does it not encode what I give it into its correct format please? (Or am I expected to do that?)

@nlohmann
Owner

nlohmann commented Jan 7, 2019

Yes, the library only supports UTF-8, see the README:

Note the library only supports UTF-8. When you store strings with different encodings in the library, calling dump() may throw an exception unless json::error_handler_t::replace or json::error_handler_t::ignore are used as error handlers.

The error handlers are described here.

The library does not do re-encoding, because this is a very hard thing to do.

@Duchex
Author

Duchex commented Jan 8, 2019

So, just to clarify please, any strings I pass to the json object must be UTF-8 encoded beforehand?

@nlohmann
Owner

nlohmann commented Jan 8, 2019

Yes.
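
Since the caller is responsible for supplying UTF-8, it can help to check a string before storing it. The following is a minimal sketch of such a check using only the standard library; `is_valid_utf8` is a hypothetical helper name, not part of the JSON library, and it is not claimed to match the library's internal validation exactly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of nlohmann::json): returns true if `s` is
// well-formed UTF-8. Rejects truncated sequences, stray continuation bytes,
// overlong encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (c < 0x80)                { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;              // invalid lead byte
        if (i + len > n) return false;  // sequence runs past the end
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (len == 2 && cp < 0x80) return false;        // overlong
        if (len == 3 && cp < 0x800) return false;       // overlong
        if (len == 4 && cp < 0x10000) return false;     // overlong
        if (cp > 0x10FFFF) return false;                // outside Unicode
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate
        i += len;
    }
    return true;
}
```

A caller could run this on each string before assigning it to the JSON object and convert or reject anything that fails.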

@jaredgrubb
Contributor

The encoding used for string literals ("this") is chosen by the compiler. When your compiler is digesting the bytes of your source files, it's making choices about what those bytes mean and, equally important, how they will be represented in the sections of your binary. By the time std::string or nlohmann::json see the strings, they have a fixed encoding, and if that's not UTF-8, then you'll see the weirdness you see.

Many OSes (and their compilers) natively use UTF-8, so all of this just works.

However, I'm guessing you're on Windows, which has a rich history on the topics of codepages, Unicode, etc. I don't know if there's a way to ask MSVC to read your source files as UTF-8, but you could look around (e.g., this StackOverflow question). Doing that could cause you other headaches, as this could break how you interact with other non-UTF-8 libraries. You might have some luck using the C++11 u8 string prefix (u8"this"), but I've never used it personally, so I'm not sure how that would interact with the fact that your source files would still be interpreted by the compiler in some other code page.
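
One way to sidestep the source-file encoding question when trying the u8 prefix is to write non-ASCII characters as universal character names (\uXXXX), which are plain ASCII in the source. The sketch below assumes a C++11 or later compiler; `utf8_sample` is a hypothetical name chosen for illustration, and the cast is only there so the same code also compiles under C++20, where u8 literals have type char8_t.

```cpp
#include <string>

// Hypothetical helper: returns "èé£" as UTF-8 bytes. The source text is pure
// ASCII, so the result does not depend on how the compiler decodes the file.
std::string utf8_sample() {
    // \u00e8 = è, \u00e9 = é, \u00a3 = £
    // Before C++20 the cast is a no-op (the literal is already char-based);
    // in C++20 it converts from const char8_t* to const char*.
    return std::string(reinterpret_cast<const char*>(u8"\u00e8\u00e9\u00a3"));
}
```

Each of the three characters encodes to two bytes in UTF-8, so the returned string has six bytes.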

@nlohmann added the labels "kind: question" and "solution: proposed fix" (a fix for the issue has been proposed and waits for confirmation) on Jan 9, 2019
@nlohmann
Owner

@Duchex Do you need further assistance?

@Duchex
Author

Duchex commented Jan 14, 2019

Thank you for the help all. Encoding to UTF-8 before passing to the JSON object solved the issue.
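
The thread does not say which conversion was used. As an illustration only, here is a minimal sketch that assumes the input bytes are ISO-8859-1 (Latin-1), where every byte maps directly to the Unicode code point of the same value. `latin1_to_utf8` is a hypothetical helper name. Windows-1252 differs from Latin-1 in the 0x80 to 0x9F range, so text in that code page would need a lookup table or a platform conversion API instead.

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// ASSUMPTION: the input really is Latin-1; other code pages need other tables.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two
    for (char ch : in) {
        const unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            // ASCII is identical in both encodings.
            out.push_back(static_cast<char>(b));
        } else {
            // U+0080..U+00FF encode as two bytes: 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (b >> 6)));
            out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
        }
    }
    return out;
}
```

With this in place, the assignment in the original report would become d["test"] = latin1_to_utf8(test), provided the literal was in fact stored as Latin-1.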

It might be worth clarifying this in the readme since when you say

Note the library only supports UTF-8. When you store strings with different encodings in the library, calling dump() may throw an exception unless json::error_handler_t::replace or json::error_handler_t::ignore are used as error handlers.

It does not explicitly say that, in order to get dump to work, any strings stored in the JSON object must be UTF-8 encoded.

Anyway, thank you for a great library. It works really well.

@Duchex closed this as completed on Jan 14, 2019
@nlohmann
Owner

Well, "only supports UTF-8" and "strings with different encodings" should be clear.


3 participants