It looks like InvalidCharHandler can only be set when writing, not when reading. Do you think it makes sense to support this for reading as well? If a user is dealing with a Reader, they could do this sort of transformation pretty easily before passing the stream to Woodstox. However, that requires their code to understand which characters are invalid (rather than having Woodstox be the source of truth for that). And if the user is dealing with an InputStream, they may not have an easy way to do character-based filtering or replacement.
I am a bit hesitant about trying to support a fully configurable approach, given the complexity of XML character validity rules. But maybe an option to fully disable validity checks for, say, textual content would be ok -- because then the user could provide a custom InputStream (or, more likely, Reader) that implements whatever validation they want, and Woodstox would just take whatever it gets.
To me it seems that validation at the Reader level is probably much easier to layer on than trying to make the decoder issue validation calls.
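To make the Reader-layer idea concrete, here is a minimal sketch of a filtering Reader. `XmlCharFilterReader` is a hypothetical name, not part of Woodstox, and it assumes XML 1.0 rules and a single fixed replacement character. It is also simplified in one respect: surrogate code units are passed through without checking that they are correctly paired.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of Woodstox: replaces characters that are
// not allowed in XML 1.0 documents before the parser ever sees them.
class XmlCharFilterReader extends FilterReader {
    private final char replacement;

    XmlCharFilterReader(Reader in, char replacement) {
        super(in);
        this.replacement = replacement;
    }

    // XML 1.0 "Char" production, checked per UTF-16 code unit.
    // Simplification: surrogates are accepted without verifying pairing.
    static boolean isInvalidXml10(char c) {
        if (c < 0x20) {
            return c != '\t' && c != '\n' && c != '\r';
        }
        return c == 0xFFFE || c == 0xFFFF;
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return (c >= 0 && isInvalidXml10((char) c)) ? replacement : c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // When n is -1 (end of stream) the loop body never runs.
        for (int i = off; i < off + n; i++) {
            if (isInvalidXml10(buf[i])) {
                buf[i] = replacement;
            }
        }
        return n;
    }
}
```

For the InputStream case, one option would be to wrap the stream in an `InputStreamReader` first and then in this filter, before handing the result to `XMLInputFactory.createXMLStreamReader(Reader)`. That assumes the caller already knows the encoding, since wrapping in a Reader bypasses the parser's own encoding detection.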
I probably won't have time to work on this on my own, either way.
But if anyone wants to create a PR that does not add measurable overhead for the default case, I'd of course be happy to sanity-check it and help get it merged if and when it makes sense.
Also, now that I think about this -- aside from the question of performance, I don't think I am against an InvalidCharHandler that works on a per-character basis, if someone has time to implement it (I don't, but I always do my best to find time to review contributions).
It should be possible to hide the complexity behind error reporting functionality; I assume return value could be the character to use and so on.
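As a rough illustration of that shape, the sketch below has the handler return the character to use, or throw to reject the input. Everything here is hypothetical: `ReadInvalidCharHandler` and `TextValidator` are made-up names, not Woodstox API, and the validity check only covers control characters below 0x20.

```java
import java.io.IOException;

// Hypothetical read-side callback; no such interface exists in Woodstox.
interface ReadInvalidCharHandler {
    // Returns the character to use instead of the invalid one,
    // or throws to report the problem as an error.
    char onInvalidChar(int invalidChar, int offset) throws IOException;
}

// Hypothetical stand-in for the spot in a decoder where validity is checked.
final class TextValidator {
    static void validate(char[] buf, int start, int end,
                         ReadInvalidCharHandler handler) throws IOException {
        for (int i = start; i < end; i++) {
            char c = buf[i];
            // The handler is only consulted on the (rare) invalid path,
            // so valid input pays just for the comparison.
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                buf[i] = handler.onInvalidChar(c, i);
            }
        }
    }
}
```

A replacing handler would then be a one-liner such as `(c, off) -> '?'`, while a strict handler would throw, which keeps today's failing behavior available as the default.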