[JENKINS-71139] Do not even try to save NULs #521

Merged (1 commit) on Apr 24, 2023

jglick (Member) commented Apr 24, 2023

Continues #520 by just avoiding the core regression (if it is even considered valid).

There may be other Unicode characters which cause problems in XML serialization; for purposes of this PR I am focusing just on NUL.

jglick added the bug label Apr 24, 2023
jglick requested a review from timja April 24, 2023 16:02
@@ -209,6 +209,10 @@ static String possiblyTrimStdio(Collection<CaseResult> results, boolean keepLong
        return stdio.subSequence(0, halfMaxSize) + "\n...[truncated " + middle + " chars]...\n" + stdio.subSequence(len - halfMaxSize, len);
    }

    static String fixNULs(String stdio) { // JENKINS-71139
        return stdio == null ? null : stdio.replace("\u0000", "^@");
Member Author

Choice of display idiom for NUL is of course negotiable; I think this format is pretty conventional in the Unix world.
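
For readers unfamiliar with the idiom, here is a minimal sketch of the caret notation behind `^@`. The `fixNULs` method is copied from the diff; the `caret` helper and the class name are hypothetical and exist only to show where `^@` comes from (the control character's code XORed with 0x40).

```java
final class NulDisplayDemo {
    // Same logic as the fixNULs method in the diff (JENKINS-71139).
    static String fixNULs(String stdio) {
        return stdio == null ? null : stdio.replace("\u0000", "^@");
    }

    // Hypothetical helper: caret notation for a C0 control character.
    // NUL (0x00) maps to '@' (0x40), TAB (0x09) maps to 'I' (0x49).
    static String caret(char control) {
        return "^" + (char) (control ^ 0x40);
    }

    public static void main(String[] args) {
        String raw = "before" + '\0' + "after";
        System.out.println(fixNULs(raw));   // prints before^@after
        System.out.println(caret('\0'));    // prints ^@
        System.out.println(caret('\t'));    // prints ^I
    }
}
```

Note that the substitution lengthens the string by one character per NUL, so it is a display aid rather than a reversible encoding.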

Member

The U+FFFD � replacement character can also be used to replace an unknown, unrecognized, or unrepresentable character.
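
A sketch of what that alternative might look like, for comparison only. The method name `replaceNULs` and the class name are hypothetical; the merged change uses `^@`, not U+FFFD.

```java
final class ReplacementCharDemo {
    // Hypothetical variant of fixNULs: substitutes U+FFFD for each NUL.
    static String replaceNULs(String stdio) {
        return stdio == null ? null : stdio.replace('\u0000', '\uFFFD');
    }

    public static void main(String[] args) {
        String raw = "a" + '\0' + "b";
        String fixed = replaceNULs(raw);
        // Length is unchanged because the swap is one char for one char.
        System.out.println(fixed.length() == raw.length()); // prints true
    }
}
```

One trade-off: U+FFFD keeps the string length stable, but a reader can no longer tell a replaced NUL apart from a replacement character produced by a failed decode elsewhere.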

timja enabled auto-merge (squash) April 24, 2023 16:14
timja merged commit e38dbd1 into jenkinsci:master Apr 24, 2023
jglick deleted the xml-workaround-JENKINS-71139 branch April 24, 2023 16:21
basil (Member) commented Apr 24, 2023

> There may be other Unicode characters which cause problems in XML serialization

In XML 1.1, all surrogates, U+FFFE, and U+FFFF are forbidden. I could not get Surefire to produce an XML file with a surrogate even when I tried, so I do not think that is an issue. U+FFFE and U+FFFF are guaranteed not to be Unicode characters at all; if they become a problem, they could be handled the same way this PR handles the null character.
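
If that ever became necessary, the handling might be sketched roughly as follows. Everything here is hypothetical and not part of the PR: the class and method names are invented, and substituting U+FFFD for the non-NUL cases is an assumption. The sketch walks code points, so a valid surrogate pair (one supplementary character) is left alone and only unpaired surrogates are treated as forbidden.

```java
final class XmlCharSanitizer {
    // Code points named in this thread as unrepresentable in XML 1.1:
    // NUL, (unpaired) surrogates, U+FFFE and U+FFFF.
    static boolean isForbidden(int cp) {
        return cp == 0x0
                || (cp >= 0xD800 && cp <= 0xDFFF)
                || cp == 0xFFFE
                || cp == 0xFFFF;
    }

    // Hypothetical generalization of fixNULs. NUL keeps the "^@" idiom
    // from the PR; the other cases become U+FFFD (an assumption).
    static String sanitize(String s) {
        if (s == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(s.length());
        s.codePoints().forEach(cp -> {
            if (cp == 0x0) {
                out.append("^@");
            } else if (isForbidden(cp)) {
                out.append((char) 0xFFFD);
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "x" + (char) 0xFFFE + "y" + '\0' + "z";
        System.out.println(sanitize(raw).length()); // prints 6
    }
}
```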
