Support for iso-8859-1 #149
Comments
I didn't try, but is it not possible to use And give the possibility to define a custom Charset with TikXmlConfig that will be used in TikXml.java :
@qLag It is possible. Okio's API allows you to provide a charset with Buffer#readString(), Buffer#writeString(), and ByteString#encodeString(). I think the only issue is skipping the leading BOM for each charset. This is the current implementation:

    private int nextNonWhitespace(boolean throwOnEof, boolean isDocumentBeginning) throws IOException {
      // Look for UTF-8 BOM sequence 0xEFBBBF and skip it
      if (isDocumentBeginning && source.rangeEquals(0, UTF8_BOM)) {
        source.skip(3);
      }
      ...
    }

Not sure if this is the most optimal way to support skipping the BOM for each charset, but here's how OkHttp does it for several UTF charsets.

Edit:
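As a rough illustration of the BOM question raised in this comment, the check could be generalised to more than one charset along the following lines. This is only a sketch: `BomSniffer`, `Result`, and `sniff` are hypothetical names that exist in neither TikXml nor Okio, it works on a plain `byte[]` rather than an Okio `BufferedSource`, and it deliberately leaves out UTF-32.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of TikXml or Okio): maps a leading BOM to a charset. */
final class BomSniffer {

  /** The charset to decode with and the number of BOM bytes to skip. */
  static final class Result {
    final Charset charset;
    final int bomLength;

    Result(Charset charset, int bomLength) {
      this.charset = charset;
      this.bomLength = bomLength;
    }
  }

  private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
  private static final byte[] UTF16_BE_BOM = {(byte) 0xFE, (byte) 0xFF};
  private static final byte[] UTF16_LE_BOM = {(byte) 0xFF, (byte) 0xFE};

  /**
   * Inspects the first bytes of a document. If a known BOM is present, returns the
   * matching charset and the BOM length; otherwise returns the fallback charset
   * (for example the one configured by the caller) with nothing to skip.
   * UTF-32 is not handled here, so a UTF-32LE BOM would be mistaken for UTF-16LE.
   */
  static Result sniff(byte[] head, Charset fallback) {
    if (startsWith(head, UTF8_BOM)) return new Result(StandardCharsets.UTF_8, UTF8_BOM.length);
    if (startsWith(head, UTF16_BE_BOM)) return new Result(StandardCharsets.UTF_16BE, UTF16_BE_BOM.length);
    if (startsWith(head, UTF16_LE_BOM)) return new Result(StandardCharsets.UTF_16LE, UTF16_LE_BOM.length);
    return new Result(fallback, 0);
  }

  private static boolean startsWith(byte[] data, byte[] prefix) {
    if (data.length < prefix.length) return false;
    for (int i = 0; i < prefix.length; i++) {
      if (data[i] != prefix[i]) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] doc = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
    Result result = sniff(doc, StandardCharsets.ISO_8859_1);
    System.out.println(result.charset + ", skip " + result.bomLength + " bytes");
  }
}
```

A reader could call something like this once at the start of the document and then skip `bomLength` bytes, instead of hard-coding the three-byte UTF-8 case.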
I made a draft here #150. It still needs unit tests, but I went ahead and started the legwork.
Hi reline,
@qLag In the meantime you can always build a snapshot off of that branch if it's urgent and meets your needs.
Hi reline, I tried your draft using this line in Gradle : And this in my code : And... it works great! 👍 😊 🎉 It's really good news. How can we proceed now to get it included in Tickaroo/tikXML? Qlag
@qLag Glad that worked for you! I updated the PR with some unit tests. The only significant change I made was fixing the XML declaration when writing in charsets other than UTF-8:

    - XML_DECLARATION = ByteString.encodeUtf8("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
    + XML_DECLARATION = ByteString.encodeString("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>", charset);
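To make the effect of that change concrete, here is a small standalone sketch that builds the declaration for an arbitrary charset. It is an approximation, not the PR's code: `XmlDeclaration` and `forCharset` are hypothetical names, and `String#getBytes(Charset)` stands in for Okio's `ByteString.encodeString()` so the example runs with the JDK alone.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not TikXml code): builds an XML declaration for a given charset. */
final class XmlDeclaration {

  /**
   * Returns the declaration encoded in the target charset, with the charset's
   * canonical name in the encoding attribute so that a reader of the output
   * knows how to decode the rest of the document.
   */
  static byte[] forCharset(Charset charset) {
    String declaration = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>";
    return declaration.getBytes(charset);
  }

  public static void main(String[] args) {
    Charset charset = StandardCharsets.ISO_8859_1;
    // Decode with the same charset purely to display the bytes that were produced.
    System.out.println(new String(forCharset(charset), charset));
  }
}
```

The point of the change is that the declared encoding and the bytes actually written stay in agreement, whichever charset is configured.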
I am trying to parse an XML document that comes from an ISO-8859-1 API (some strings have French accents).
Unfortunately, tikXml seems only to work with UTF-8.
I tried to use a TypeConverter :
`class StringUT8Converter : TypeConverter {
}
`
but it doesn't work.
Do you think you can include other encodings than UTF-8 (for poor old webservices 😝 )?
Thx
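For context on why the accents break, the following sketch shows what happens when ISO-8859-1 bytes are decoded as UTF-8. It uses only the JDK, not TikXml; `CharsetMismatchDemo` and `decode` are hypothetical names, and the sample string is made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo (JDK only): decoding ISO-8859-1 bytes with the wrong charset. */
final class CharsetMismatchDemo {

  /** Decodes raw bytes with the given charset; malformed input becomes U+FFFD. */
  static String decode(byte[] bytes, Charset charset) {
    return new String(bytes, charset);
  }

  public static void main(String[] args) {
    // "café" as an ISO-8859-1 web service would send it: the accent is the single byte 0xE9.
    byte[] fromService = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

    // Decoding with the matching charset restores the accent.
    System.out.println(decode(fromService, StandardCharsets.ISO_8859_1));

    // Decoding as UTF-8 cannot interpret the lone 0xE9 byte, so the accent is lost.
    System.out.println(decode(fromService, StandardCharsets.UTF_8));
  }
}
```

By the time a TypeConverter sees the string, the decoding has already happened, which is consistent with the converter approach not helping here.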