Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbolication for AOT-compiled stack traces on iOS #2180

Closed
Tracked by #1174
mattjohnsonpint opened this issue Feb 14, 2023 · 27 comments
Closed
Tracked by #1174

Symbolication for AOT-compiled stack traces on iOS #2180

mattjohnsonpint opened this issue Feb 14, 2023 · 27 comments
Assignees
Labels
Milestone

Comments

@mattjohnsonpint
Copy link
Contributor

mattjohnsonpint commented Feb 14, 2023

Package

Sentry.Maui

.NET Flavor

.NET

.NET Version

7.0.102

OS

iOS

SDK Version

3.28.1

Steps to Reproduce

  1. Make a new MAUI application
  2. Add Sentry.Maui
  3. Write code that throws an exception
  4. Configure MSBuild with org and project
  5. Build in Release mode
  6. Deploy to a real iOS device (not the simulator)
  7. Run the app and cause the exception to fire

Expected Result

Error in Sentry should include file names and line numbers in the stack trace.

Actual Result

No symbolication occurs.

image

Event:
https://sentry-sdks.sentry.io/issues/3936606932/events/6292f7f7ec3744e6a410e0bd111879ce/?project=5428537

The JSON of the event does not contain any debug_meta information, symbolication is not attempted.

@mattjohnsonpint
Copy link
Contributor Author

mattjohnsonpint commented Feb 14, 2023

Note - works in debug mode. Also works in debug or release mode on the simulator - just not on a real device.

@mattjohnsonpint
Copy link
Contributor Author

Just a note, Android does not have this issue. It symbolicates fine:
https://sentry-sdks.sentry.io/issues/3929952795/events/395517c261754371a951f61970be83f0/?project=5428537

image

@mattjohnsonpint
Copy link
Contributor Author

Found I can get at at the logs in release mode by connecting the iOS device via USB and running:

/Library/Frameworks/Xamarin.iOS.framework/Versions/Current/bin/mtouch --logdev | grep Sentry

With SentryOptions.Debug = true, I now can see the problem:

Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:    Info: Capturing event.
Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:   Debug: Running processor on exception: This exception was thrown and captured manually, without crashing the app.
Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:   Debug: Creating SentryStackTrace. isCurrentStackTrace: False.
Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:    Info: Skipping DebugImage for module 'Sentry.Samples.Maui.dll' because DebugId couldn't be determined
Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:   Debug: Created DebugStackTrace with 1 frames.
Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:   Debug: Running main event processor on: Event 42dbc64b0868418cad97794f1f95c39d
Feb 14 14:05:33 mattjs-iPhone Sentry.Samples.Maui[809] <Notice>:   Debug: Merging 0 debug images from stacktrace.

@mattjohnsonpint mattjohnsonpint moved this from Needs Investigation to In Progress in Mobile & Cross Platform SDK Feb 14, 2023
@mattjohnsonpint
Copy link
Contributor Author

Well, I found the problem.

if (!entry.IsPortableCodeView)

This is returning false. It's defined as:

public bool IsPortableCodeView => MinorVersion == PortablePdbVersions.PortableCodeViewVersionMagic;

And in the case of the released iOS app, MinorVersion is 0. I can get some parts of it working if I test with entry.Type == DebugDirectoryEntryType.CodeView instead. But the debug id doesn't fully match the PDB.

The problem appears to be caused by dotnet assembly trimming, which is not something that can be disabled for iOS apps. (I've tried various options with no success as of yet.)

Building in release mode will create a regular dll and pdb where you'd expect them. For example:

  • bin/Release/net6.0-ios/ios-arm64/Sentry.Samples.Maui.dll
  • bin/Release/net6.0-ios/ios-arm64/Sentry.Samples.Maui.pdb

But then it will write a trimmed dll to the app bundle at:

  • bin/Release/net6.0-ios/ios-arm64/Sentry.Samples.Maui.app/Sentry.Samples.Maui.dll

Both dll's will have the same guid part of the debug id, but the last component (the timestamp) will vary.

Check with sentry-cli:

matt@XHFHLG2CJP Sentry.Samples.Maui % sentry-cli dif check bin/Release/net6.0-ios/ios-arm64/Sentry.Samples.Maui.dll
Debug Info File Check
  Type: pe executable
  Contained debug identifiers:
    > Debug ID: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-1
      Code ID:  d0a102d41c000
      Arch:     arm64
  Contained debug information:
    > none
  Usable: no (missing debug or unwind information)
matt@XHFHLG2CJP Sentry.Samples.Maui % sentry-cli dif check bin/Release/net6.0-ios/ios-arm64/Sentry.Samples.Maui.pdb
Debug Info File Check
  Type: portablepdb debug companion
  Contained debug identifiers:
    > Debug ID: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-d0a102d4
      Arch:     unknown
  Contained debug information:
    > debug, sources
  Usable: yes
matt@XHFHLG2CJP Sentry.Samples.Maui % sentry-cli dif check bin/Release/net6.0-ios/ios-arm64/Sentry.Samples.Maui.app/Sentry.Samples.Maui.dll 
Debug Info File Check
  Type: pe executable
  Contained debug identifiers:
    > Debug ID: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-1
      Code ID:  63ec4f2fc000
      Arch:     x86
  Contained debug information:
    > none
  Usable: no (missing debug or unwind information)
  • The PDB was uploaded with DebugID: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-d0a102d4
  • The non-trimmed version will match: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-d0a102d4
  • The trimmed version has a different one: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-63ec4f2f

(Note the output of sentry-cli is showing -1 but the first 8 digits of the Code ID are what will actually be used there.)

I can see the same detail in Rider's "Assembly Explorer":

The non-trimmed DLL's debug headers:

image

The trimmed DLL's debug headers:

image

Notice that the non-trimmed version has valid values for MajorVersion and MinorVersion, the TimeDateStamp is a pseudo-timestamp (due to deterministic builds), and the Deterministic entry indicator is present.

In the non- trimmed headers, MajorVersion and MinorVersion are zero, and the TimeDateStamp field is a real value (the UTC time the linker ran), and the Deterministic entry is absent.

@mattjohnsonpint
Copy link
Contributor Author

Not sure what to do about this. At runtime, we need the debug_meta information to contain the full original Debug ID to match with the one in the PDB sent to Sentry. There is no corresponding PDB to upload that will match the trimmed version.

One thought I had was that it might be possible to use an MSBuild tasks with ILLink custom steps to read the full debug ID from the un-trimmed assembly and write it to another location (such as an AssemblyMetadata attribute) in the trimmed assembly. We could read it from there at runtime.

@bruno-garcia @vaind - thoughts?

@mattjohnsonpint
Copy link
Contributor Author

mattjohnsonpint commented Feb 15, 2023

Oh, and note that both parts of the Code ID differ as well.

Debug ID: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-d0a102d4
Code ID: d0a102d41c000
Signature: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb
TimeDateStamp: d0a102d4
SizeOfData: 1c000

vs.

Debug ID: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb-63ec4f2f
Code ID: 63ec4f2fc000
Signature: c66dceee-2c72-4bf6-9a15-6e2aad6ab8fb
TimeDateStamp: 63ec4f2f
SizeOfData: c000

... which makes sense because the trimmed version is smaller, and the timestamp of the non-trimmed version is deterministic

@vaind
Copy link
Collaborator

vaind commented Feb 15, 2023

Notice that the non-trimmed version has valid values for MajorVersion and MinorVersion, the TimeDateStamp is a pseudo-timestamp (due to deterministic builds), and the Deterministic entry indicator is present.

In the non- trimmed headers, MajorVersion and MinorVersion are zero, and the TimeDateStamp field is a real value (the UTC time the linker ran), and the Deterministic entry is absent.

Seems to match the specification

Not sure what to do about this. At runtime, we need the debug_meta information to contain the full original Debug ID to match with the one in the PDB sent to Sentry. There is no corresponding PDB to upload that will match the trimmed version.

We may need to change the lookup logic to support this properly @Swatinem ?

@mattjohnsonpint
Copy link
Contributor Author

@vaind - I think you're saying that I'm seeing the first part of the spec there (Version Major=0, Minor=0...), and we've designed only for the second part (Version Major=any, Minor=0x504d...)? In that case should we be sending the Age instead of the TimeDateStamp? Even then, I'm not sure how it would match up on the server side, as the pdb would still have the TimeDateStamp in its debug_id.

@vaind
Copy link
Collaborator

vaind commented Feb 15, 2023

I'm not very familiar with the symbolication part of this, that's why I've tried pinging @Swatinem, I think he have originally programmed that.

@vaind
Copy link
Collaborator

vaind commented Feb 15, 2023

But yes, my suspicion is we're not handling both scenarios.

@mattjohnsonpint
Copy link
Contributor Author

TBH, feels like a .NET linker or SDK bug to me. I mean, if you set <Deterministic>true</Deterministic> on a project, that should apply to the trimmed assembly also, shouldn't it? And even without that, you'd think it would re-use the original timestamp instead of making a new one. I may need to open an issue over there.

Even so, if we can find a workaround, that would be useful. I can easily change the code in the SDK to handle both parts of the spec. Not sure what would need to change on the back-end.

@mattjohnsonpint
Copy link
Contributor Author

Note, I tried sending the debug id with the age instead, but that did not work either. I didn't expect it to though, as it still doesn't match the PDB sent.

Example event:
https://sentry-sdks.sentry.io/issues/3513674105/events/1ae28960b5424187b5ab85116a40b311/?project=5428537

image

Debug ID: 8c7772a1-42fb-4a8d-a1d0-cf77887c9f4d-1

The Debug ID on the PDB is 8c7772a1-42fb-4a8d-a1d0-cf77887c9f4d-d0a102d4, so it's a mismatch.

@mattjohnsonpint
Copy link
Contributor Author

mattjohnsonpint commented Feb 16, 2023

Another path I'm exploring - it seems that trimming a console app by default does give a new PDB that matches the debug id of the assembly generated by ILLink. There's a -b parameter for ILLink that controls this. It's off by default for ILLink, but the -b option is indeed passed by MSBuild both for the console app and for the iOS app.

Why then doesn't the modified PDB show up for the iOS app? Not sure, but I do see that there are custom ILLink steps for the iOS app. It's possible one of them is the culprit.

@mattjohnsonpint
Copy link
Contributor Author

mattjohnsonpint commented Feb 16, 2023

Forget the last post. I found that it's not the ILLink task that is the cause. That part actually works just fine. It seems that Xamarin.iOS pulls in an additional ILStrip task which is the source of the pain.

Trimming and Stripping are different tools. (Who knew? 😅 )

The good news is, I can turn stripping off with <EnableAssemblyILStripping>false</EnableAssemblyILStripping> in the csproj of the app. With it disabled, the debug ID appears to be matching now.

The bad news is, symbolication is still not working. Now I get status "Unreferenced".

https://sentry-sdks.sentry.io/issues/3513674105/events/bfbf35f2c9914dc1bb20684946f965fd/?project=5428537

image

image

image

Yet, if I click "Open in Settings", it is found there.

image

What am I missing?

@mattjohnsonpint
Copy link
Contributor Author

FYI, I figured out (through various GitHub issues, as there are no good docs) that ILStrip is a port from mono-cil-strip to the new .NET platform. Its purpose is to remove all IL from assemblies that have been AOT compiled, to reduce the bundle size of the app. So, not calling ILStrip will keep the original assembly as-is. But it is still AOT compiled - which might be why it's not working above.

@Swatinem
Copy link
Member

TBH, feels like a .NET linker or SDK bug to me. I mean, if you set <Deterministic>true</Deterministic> on a project, that should apply to the trimmed assembly also, shouldn't it? And even without that, you'd think it would re-use the original timestamp instead of making a new one. I may need to open an issue over there.

I also believe this might be an upstream bug. The PDB is deterministic, and the dll referencing it should retain that fact. Trimming the DLL does not touch the PDB in any way, does it? If not, it should indeed keep the exact same reference to the PDB.

I can take a look at the last issue without ILStrip.

@Swatinem
Copy link
Member

Well, the stack frame simply has no instruction_addr. The addr_mode references the debug file, and the function_id is there as well. There just isn’t any instruction_addr. Is it possible that this particular platform does not provide any GetILOffset value at runtime?

@mattjohnsonpint
Copy link
Contributor Author

I also believe this might be an upstream bug. The PDB is deterministic, and the dll referencing it should retain that fact. Trimming the DLL does not touch the PDB in any way, does it? If not, it should indeed keep the exact same reference to the PDB.

I was mistaken originally. ILLink (which does trimming) doesn't ignore the deterministic setting. However, it does create a new deterministic timestamp based on the IL that was left after it finished trimming. It also creates a matching pdb for that new assembly, which has that same debug id, so we are good there.

Well, the stack frame simply has no instruction_addr.

Still looking at exactly why, but I think it has to do with AOT compilation. I think we'll need to do something more than we currently do for AOT compiled assemblies. Perhaps this is what #579 was about? (I'm still learning about all these things.)

@mattjohnsonpint
Copy link
Contributor Author

#579 won't help here, as the stacktrace string it was parsing is not available at runtime.

However, we do still need some way to deal with full-AOT compiled apps - as with iOS.

@Swatinem - You are correct, GetILOffset() returns -1. There's no IL information available on the stack. We do have a value from GetNativeOffset() though. Could that be used in conjunction with the native dSYM?

I think for now we'll need to document that .NET symbolication doesn't work on iOS, and work on a more comprehensive solution.

@loewenheim
Copy link

@mattjohnsonpint IIUC GetNativeOffset() returns an offset relative to the start of the function. For native symbolication, we use instruction addresses that are at most relative to a module.

@bruno-garcia
Copy link
Member

Not sure if relevant but when I looked at this years ago (Xamarin.iOS) it was required 2 lookups to get to the C# line numbers:

The "AOT Id", likely a debug_id for dsyms would find the native symbols. And from there it would still rely on the .net pdb to do a further lookup. For that reason Mono encoded 2 debugs in a single stack trace: mvid#aotid.

https://github.com/mono/mono/blob/3a98eea66d20fb1db6c468690afbe500fc762ff9/mcs/class/corlib/System.Diagnostics/StackTrace.cs#L201-L204

On mono-symbolicate you can see they don't expect an IL offset when they rely on aotid:

https://github.com/mono/mono/blob/d336d6be307dfea8b7a07268270c6d885db9d399/mcs/tools/mono-symbolicate/SymbolManager.cs#L49

@mattjohnsonpint
Copy link
Contributor Author

Agreed, but I currently don't get stack traces that include either mvid or aotid. There may be some way to get them at runtime - I'll keep looking.

We also don't get .msym files, which is what mono-symbolicate is looking for. It's possible those are merged into another file somewhere, perhaps the dSYM - but really not sure.

@mattjohnsonpint mattjohnsonpint changed the title Symbolication not working with real iOS device Symbolication for AOT-compiled stack traces on iOS Feb 24, 2023
@mattjohnsonpint
Copy link
Contributor Author

Blocked on https://github.com/getsentry/team-stacktrace/issues/43. Will resume later.

@bruno-garcia
Copy link
Member

We possibly can support iOS through Native AOT for apps targeting .NET 8 and later. Assuming this lands:

If so, this ticket, given its complexity, becomes lower priority. That is, if Mono AOT can be replaced with Native AOT.

And what we could focus on is:

@SamMonoRT
Copy link

SamMonoRT commented Aug 11, 2023

We possibly can support iOS through Native AOT for apps targeting .NET 8 and later. Assuming this lands:

If so, this ticket, given its complexity, becomes lower priority. That is, if Mono AOT can be replaced with Native AOT.

And what we could focus on is:

We'll encourage you to try steps for NativeAOT targetting iOS as mentioned in this blog post on any .NET8 Preview 6 or 7 builds and give us feedback - https://devblogs.microsoft.com/dotnet/announcing-dotnet-8-preview-6/#support-for-targeting-ios-platforms-with-nativeaot

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 Aug 11, 2023
@getsantry getsantry bot removed the status in GitHub Issues with 👀 Aug 14, 2023
@vaind vaind self-assigned this Oct 20, 2023
@vaind vaind added this to the 4.0.0 milestone Oct 20, 2023
@vaind
Copy link
Collaborator

vaind commented Nov 11, 2023

@bruno-garcia @SamMonoRT support for NativeAOT on .net8-ios works fine with Sentry, I'm preparing a PR #2820

@vaind
Copy link
Collaborator

vaind commented Dec 5, 2023

For completeness, it doesn't look like we can add support for this to MonoAOT because of the missing information. This is all the info we have available at runtime:

        var realFrame = stackFrame.Frame as StackFrame;
        _options.LogDebug("Capturing frame {0}", frame.ToUtf8Json());
        _options.LogDebug("### ToString = {0}", realFrame!.ToString());
        _options.LogDebug("### Has Native Image = {0}", realFrame.HasNativeImage());
        _options.LogDebug("### Native Image Base = 0x{0:x}", realFrame.GetNativeImageBase());
        _options.LogDebug("### Native Interface pointer = 0x{0:x}", realFrame.GetNativeIP());
        _options.LogDebug("### Offset = {0}", realFrame.GetNativeOffset());
        _options.LogDebug("### FileName = {0}", realFrame.GetFileName());
        _options.LogDebug("### Line no = {0}", realFrame.GetFileLineNumber());
        _options.LogDebug("### HasMethod = {0}", realFrame.HasMethod());
        _options.LogDebug("### Method = {0}", realFrame.GetMethod());
        _options.LogDebug("### ILOffset = {0}", realFrame.GetILOffset());
        _options.LogDebug("====================");

Has the following output (the first line is from another log line above in the normal code):

Got debug image for 'Sentry.Samples.Ios.dll' having Debug ID: 04ef1045-df1d-4429-ad12-eb7d79019d57-1
Capturing frame {"FileName":"","Function":"void Program.\u003CMain\u003E$(?)","Module":"Program","LineNumber":null,"ColumnNumber":null,"AbsolutePath":null,"ContextLine":null,"InApp":null,"Package":"Sentry.Samples.Ios, Version=4.0.0.0, Culture=neutral, PublicKeyToken=fba2ec45388e2af0","Platform":null,"ImageAddress":null,"SymbolAddress":null,"InstructionAddress":null,"AddressMode":"rel:1","FunctionId":1}
### ToString = void Program.<Main>$(?)
### Has Native Image = False
### Native Image Base = 0x0
### Native Interface pointer = 0x0
### Offset = 63
### FileName =
### Line no = 0
### HasMethod = True
### Method = Void <Main>$(System.String[])
### ILOffset = 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Archived in project
Development

No branches or pull requests

7 participants