non-ascii utf-8 string encoding error #1331
Comments
Correction: exporting to usdc is fine (sorry!). Only usda is affected.
Another example, which may need a different fix. t1.usda
usdcat t1.usda > t2.usda
usdcat t2.usda
Filed as internal issue #USD-6353
Hey @takahito-tejima -- thanks so much! Great catch, we'll try to get this fixed up as soon as we can!
Thank you for the fix!
Oh! I missed that second case, Takahito. Sorry about that. :-/ I will take a look and see what's going on there.
long hex escape sequences. This was added to match C's treatment of escape sequences in string literals. Unfortunately, this means you cannot have a string with a hex code followed by characters that are valid hex digits. For example, the sequence "\x02defaced" would be treated as a single character. In C you can work around this by breaking the literal in two, since adjacent literals are concatenated after escapes are evaluated. You could write this example as "\x02" "defaced". But this feature does more harm than good, and no current code relies on this behavior, so we're changing it. Now we limit hex constants to at most two digits, and we encourage encoders to always write two digits to ensure the above confusion cannot occur. Fixes #1331 (Internal change: 2121412)
…ed by characters that are hex digits but not part of the hex code correctly. Fixes #1331 (Internal change: 2121413)
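The difference between the old greedy parsing and the two-digit limit described in these commits can be sketched in Python. This is only an illustration under stated assumptions, not the actual Sdf parser: the function name `decode_hex_escapes` and its `max_digits` parameter are hypothetical, and each escape is simply returned as one integer value.

```python
import string

HEX_DIGITS = set(string.hexdigits)

def decode_hex_escapes(text, max_digits=None):
    """Decode \\x escapes into a list of integer values (hypothetical helper).

    max_digits=None consumes hex digits greedily, as in C string literals;
    max_digits=2 models the two-digit limit described in the commit message.
    """
    values = []
    i = 0
    while i < len(text):
        if text.startswith("\\x", i):
            j = i + 2
            # Consume hex digits until a non-hex character or the digit limit
            while (j < len(text) and text[j] in HEX_DIGITS
                   and (max_digits is None or j - (i + 2) < max_digits)):
                j += 1
            if j > i + 2:
                values.append(int(text[i + 2:j], 16))
                i = j
                continue
        # Any other character is passed through as its code point
        values.append(ord(text[i]))
        i += 1
    return values

greedy = decode_hex_escapes("\\x02defaced")
limited = decode_hex_escapes("\\x02defaced", max_digits=2)
```

With greedy parsing the whole of "02defaced" is read as one hex constant, so `greedy` holds a single value. With the two-digit limit, `limited` holds the value 2 followed by the code points of "defaced".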
Description of Issue
When I store a non-ASCII UTF-8 string (such as '日本語' or 'ピクサー') as a string attribute in USD, it always gets corrupted when I export to usda.
https://github.com/PixarAnimationStudios/USD/blob/release/pxr/usd/sdf/fileIO_Common.cpp#L679
static const char* hexdigit = "0123456789abcedf";
The 'e' and 'd' in this table appear to be swapped. Is this intentional? (I hope not...)
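To see why a swapped table would corrupt non-ASCII text, here is a small Python sketch. It assumes the writer escapes each non-ASCII byte as `\x` followed by two characters looked up in the table by nibble. That is an assumption about the surrounding C++ code, which is not quoted here, and the helper `escape_bytes` is hypothetical.

```python
CORRECT_TABLE = "0123456789abcdef"
SWAPPED_TABLE = "0123456789abcedf"  # the table quoted above, 'e' and 'd' swapped

def escape_bytes(data, table):
    """Write every byte as \\xHH using the given nibble-to-digit table."""
    return "".join("\\x" + table[b >> 4] + table[b & 0xF] for b in data)

# 'ピ' (U+30D4) is the three bytes E3 83 94 in UTF-8
data = "ピ".encode("utf-8")
good = escape_bytes(data, CORRECT_TABLE)
bad = escape_bytes(data, SWAPPED_TABLE)
```

Here `good` is `\xe3\x83\x94`, while `bad` is `\xd3\x83\x94`: every nibble with value 0xD or 0xE comes out as the other digit. Since the lead byte of most Japanese characters in UTF-8 is 0xE3 to 0xE9, nearly every such character would be written with the wrong first byte.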
Steps to Reproduce
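from pxr import Sdf, Usd
stage = Usd.Stage.CreateInMemory()  # assumed: any stage reproduces this
prim = stage.DefinePrim('/prim')
prim.CreateAttribute('str', Sdf.ValueTypeNames.String).Set('ピクサー')
print(stage.ExportToString())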
(snip)
System Information (OS, Hardware)
Package Versions
20.08
Build Flags