-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XML2GFF: Serialize color codes #55
Comments
Hi, can I vouch for this ticket? I happen to need to parse thousands of character (BIC) files, process them in XML a save them back. When I fixed up encodings (cp1250->utf-8->cp1250) I encountered color codes in one of the files. Specifically <c\x01\x01\x01> ie. black which does convert to UTF-8 (noop, remains <c\x01\x01\x01>) but does not convert back to cp-1250 in xml2gff. The actual error was:
Thanks. |
To clarify: gff2xml does NOT transform them to <CRRGGBBAA>, they stays in raw values. See attachment. |
Ah, yes, seems like we're only doing that for LocStrings and TLK strings, not for ExoStrings. Since we're not reconstructing them anyway, do consider gff2xml and xml2gff broken for GFF files with color codes for now. What do you mean with wanting to "vouch" for this ticket? Just to add your vote that you consider this an important issue, or that you'd want to work on a fix yourself? I can offer pointers for the latter case, but I can't guarantee when I'll have time to work on this myself. To be fair, I do have a few weeks of vacations coming up, though, so chances are good I'll find time to work on this before the end of the year, but I wouldn't say no to "converting" you to a contributors to the xoreos project. ;) |
😄 I intended to just add a vote but if it annoys me too much, I'd try to fix it somehow. My issue is not only with color codes but with any invalid characters. I would be quite happy if it was just encoded like Maybe it would be more reasonable to base64 encode whole string (when it contains |
Yeah, gff2xml already base64 encodes string with invalid byte values (which it detects by checking if iconv throws an error), but again not for ExoStrings, that's one thing I overlooked. Also apparently not for all LocStrings? I need to recheck that. You can see this happening for your example file, because it shows a warning on stderr, and some strings consist of "[!?!]" (a dummy value) instead. That's certainly a bug, in that case gff2xml should base64-encode the string instead, so that we're at least no losing any data. In the case of the color codes, I would still like to convert them to something reasonable and human-readable, like the |
I haven't look into code yet but I don't think that iconv would fail (in some instances) as all characters < 0x80 are valid in UTF-8 but not in XML. That's what causes the error. I use cp1250 and so far didn't see an issue with dummy characters. Entire cp1250 (0-255) should map to UTF-8 and back to cp1250 correctly. |
No, there are undefined values in CP1250, for example 0x90. Compare
vs
For gorky.bic, look at line 593ff of the xml produced by gff2xml:
Still, for color codes, certain values might not throw an error. Which is why they still need to be preparsed first. See line 287, for example:
For readability, it might an idea to wrap color-coded strings in a CDATA section, to increase readiblity with the angle brackets. |
The Aurora games has a weird way to do color codes:
<cRGB>Foo</c>
where R, G and B are raw byte values (!).When reading in LocStrings, we demunge those into
<cRRGGBBAA>Foo</c>
where RR, GG, BB and AA are textual hexadecimal values.When converting XML files back to GFF, we need to reverse this as well.
The text was updated successfully, but these errors were encountered: