Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono] Fix assembly name parser to accommodate non-ASCII UTF8 strings #103363

Merged
merged 5 commits into from
Jun 14, 2024

Conversation

fanyang-mono
Copy link
Member

@fanyang-mono fanyang-mono commented Jun 12, 2024

Fixes #103276

Manually tested that the testcase failed in the above issue now passed with this change.

Ideally, Mono should share the same assembly name parsing logic as CoreCLR, which is AssemblyNameParser.TryParse.

@@ -1548,7 +1548,7 @@ assembly_name_to_aname (MonoAssemblyName *assembly, char *p)
}
assembly->name = p;
s = p;
while (*p && (isalnum (*p) || *p == '.' || *p == '-' || *p == '_' || *p == '$' || *p == '@' || g_ascii_isspace (*p)))
while (*p && (*p != ','))
p++;
if (quoted) {
if (*p != '"')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the quoted path will fail now?

Copy link
Member Author

@fanyang-mono fanyang-mono Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does quoted path look like?

Copy link
Member

@lewing lewing Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like there are some examples here https://github.com/adamsitnik/runtime/blob/b57b100933e306242ae422b9995b47fb3d778dbd/src/libraries/System.Runtime/tests/System.Reflection.Tests/AssemblyNameTests.cs#L37

but for the purposes of my comment I was just pointing out that if the name was qoted your change would walk to the , not the " then return 1 when if (*p != '"') was true

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea. @fanyang-mono I would suggest doing two separate loops for quoted and !quoted. if it's quoted, just take everything until the next close quote (I'm not sure if there's a possibility of escaping a quote). if !quoted, go until the first comma.


I can't actually find anything in the docs about quoted assembly names or about escaping quotes.

https://learn.microsoft.com/en-us/dotnet/api/system.reflection.assemblyname?view=net-8.0#remarks
https://learn.microsoft.com/en-us/dotnet/api/system.type.assemblyqualifiedname?view=net-8.0#remarks

Maybe this is some mono invention, or maybe it's backward compat for .NET Framework? need to check what CoreCLR does

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, just for clarity if you really want to walk character by character p += g_utf8_jump_table[*p]; is how you could do it here or with more validation like https://github.com/adamsitnik/runtime/blob/4246ba19bd196c5f374d94e5c1fc7b21d53bd9fc/src/mono/mono/eglib/gutf8.c#L106-L113

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like CoreCLR these days AssemblyNameParser and TryGetNextToken does try to handle single and double quotes and escape characters:

if (quoteChar != 0 && c == quoteChar)
break; // Terminate: Found closing quote of quoted string.
if (quoteChar == 0 && (c is ',' or '='))
{
_index--;
break; // Terminate: Found start of a new ',' or '=' token.
}
if (quoteChar == 0 && (c is '\'' or '\"'))
{
token = default;
return false;
}
if (c is '\\')
{
if (!TryGetNextChar(out c))
{
token = default;
return false;
}
switch (c)
{
case '\\':
case ',':
case '=':
case '\'':
case '"':
sb.Append(c);
break;
case 't':
sb.Append('\t');
break;
case 'r':
sb.Append('\r');
break;
case 'n':
sb.Append('\n');
break;
default:
token = default;
return false;
}
}
else
{
sb.Append(c);
}

Copy link
Member Author

@fanyang-mono fanyang-mono Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to share this logic between Mono and CoreCLR somehow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably share for most uses in reflection. The runtime also calls this internally from some places, and also I think the AOT compiler might need it. One idea is to compile the managed TypeNameParser as a library using NativeAOT and call it from the AOT compiler.

@kasperk81
Copy link
Contributor

any tests? enable some of these https://github.com/search?q=repo%3Adotnet%2Fruntime%2045033&type=code

@fanyang-mono fanyang-mono requested a review from steveisok as a code owner June 13, 2024 15:44
@steveisok
Copy link
Member

/azp run runtime-ioslike runtime-android runtime-androidemulator

Copy link

No pipelines are associated with this pull request.

@steveisok
Copy link
Member

/azp run runtime-ioslike

@steveisok
Copy link
Member

/azp run runtime-android

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@steveisok
Copy link
Member

/azp run runtime-ioslike

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@steveisok
Copy link
Member

/azp run runtime-ioslike

@steveisok
Copy link
Member

/azp run runtime-android

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@fanyang-mono
Copy link
Member Author

/azp run runtime-ioslike

@fanyang-mono
Copy link
Member Author

/azp run runtime-android

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@steveisok
Copy link
Member

/ba-g The failures were leg timeouts that are hard to ping good error messages down to.

@steveisok steveisok merged commit 39c4905 into dotnet:main Jun 14, 2024
117 of 129 checks passed
@steveisok
Copy link
Member

/backport to release/8.0-staging

Copy link
Contributor

Started backporting to release/8.0-staging: https://github.com/dotnet/runtime/actions/runs/9508872460

Copy link
Contributor

@steveisok backporting to release/8.0-staging failed, the patch most likely resulted in conflicts:

$ git am --3way --ignore-whitespace --keep-non-patch changes.patch

Applying: Fix assembly name parser
Applying: Handle quoted case
Applying: Add utf8 validation
error: sha1 information is lacking or useless (src/mono/mono/metadata/reflection.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0003 Add utf8 validation
Error: The process '/usr/bin/git' failed with exit code 128

Please backport manually!

Copy link
Contributor

@steveisok an error occurred while backporting to release/8.0-staging, please check the run log for details!

Error: git am failed, most likely due to a merge conflict.

@fanyang-mono
Copy link
Member Author

/backport to release/8.0

Copy link
Contributor

Started backporting to release/8.0: https://github.com/dotnet/runtime/actions/runs/9509965297

Copy link
Contributor

@fanyang-mono backporting to release/8.0 failed, the patch most likely resulted in conflicts:

$ git am --3way --ignore-whitespace --keep-non-patch changes.patch

Applying: Fix assembly name parser
Applying: Handle quoted case
Applying: Add utf8 validation
error: sha1 information is lacking or useless (src/mono/mono/metadata/reflection.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0003 Add utf8 validation
Error: The process '/usr/bin/git' failed with exit code 128

Please backport manually!

Copy link
Contributor

@fanyang-mono an error occurred while backporting to release/8.0, please check the run log for details!

Error: git am failed, most likely due to a merge conflict.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mono is unable to find assemblies named with some international characters
5 participants