Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sentry-native SDK integration #2704

Merged
merged 32 commits into from
Oct 17, 2023
Merged

Conversation

vaind
Copy link
Collaborator

@vaind vaind commented Oct 6, 2023

Summary of the changes

  • This PR adds a Sentri.Native.Bindings nuget packages which collects compiled static libraries of the sentry-native submodule.
  • These libs are linked statically via NativeLibrary and DirectPInvoke if a dependant app is AOT compiled. The static libs mean end-users don't get an additional library alongside their single executable 🥳
  • the integration is currently limited to collecting native debug images, no crash handling yet

Next steps

#skip-changelog

@github-actions
Copy link
Contributor

github-actions bot commented Oct 6, 2023

Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Generated by 🚫 dangerJS against eb30270

@vaind vaind marked this pull request as ready for review October 10, 2023 18:44
@jamescrosswell jamescrosswell added this to the 4.0.0 milestone Oct 11, 2023
</None>
</ItemGroup>

<ItemGroup Condition="'$(CI_PUBLISHING_BUILD)' == 'true' or $([MSBuild]::IsOsPlatform('Linux'))">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this CI_PUBLISHING_BUILD stuff is on the other binding projects and got copied here. But why do we need this? I'm concerned about stuff behaving differently in CI vs locally

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's intentional here though not yet necessary, I've extracted that as a "next step" in the PR description:

collect both platforms for the nuget package when running in CI

As to why: because we can't build all platforms on a single machine and I don't really want to do the download from a CI job the way I had to do in sentry-unity because that is very painful whenever there's a change, e.g. you want to run different sentry-native version on your branch than what has been compiled as an artifact on the main branch.

The way this project is currently implemented is that it compiles for your platform locally and only if necessary (i.e. sentry-native submodule checkout has changed or if there's a change in this Sentry.Bindings.Native.csproj file). In CI, it will collect and package for all platforms. At least that's the idea.

src/Sentry/DebugImage.cs Outdated Show resolved Hide resolved
{
relocations.Add(GetRelativeAddressMode(i), GetRelativeAddressMode(@event.DebugImages.Count));
@event.DebugImages.Add(DebugImages[i]);
_options.LogWarning("Unexpected debug image: neither ModuleVersionId nor ImageAddress is defined: {0}", DebugImages[i].ToString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No ToString seems to be implemented here:

public sealed class DebugImage : IJsonSerializable

Did you mean to add one? Or access a property here instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't really happen, I've removed .ToString() and added a Debug.Assert()

src/Sentry/Internal/ScopeObserver.cs Outdated Show resolved Hide resolved
src/Sentry/Platforms/Native/BaseContextWriter.cs Outdated Show resolved Hide resolved
src/Sentry/Platforms/Native/SentryNativeBridge.cs Outdated Show resolved Hide resolved
src/Sentry/Sentry.csproj Show resolved Hide resolved
@vaind
Copy link
Collaborator Author

vaind commented Oct 14, 2023

@vaind I'm wondering how to approach reviewing this... it's a pretty massive set of changes to reverse engineer.

Is there any chance you could give a high level problem summary and solution description in the PR? The Native integrations are a bit mysterious to me in general... so I don't have a lot of context coming into this.

I've update a description a bit (also make sure to read the issue first as that's where the problem statement is). However, it may still be insufficient if you're not very familiar with the native integrations (and native stack trace handling). In that case, either let me know and I'll try to walk you through some relevant bits or maybe someone else would have a chance to review this in the meantime.

Copy link
Contributor

@bitsandfoxes bitsandfoxes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned you'll move the ImageAddress type change to another PR?

src/Sentry/Platforms/Native/SentryNativeBridge.cs Outdated Show resolved Hide resolved
src/Sentry/Platforms/Native/SentryNativeBridge.cs Outdated Show resolved Hide resolved
@vaind vaind force-pushed the feat/sentry-native-integration branch from 0ba14ff to 29d02a4 Compare October 16, 2023 11:59
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to @bitsandfoxes, it'd be really good to get a breakdown of the changes in this file. I'll try to put some specific questions in where I've got enough context to ask something sensible 😉


if (!found)
{
relocations.Add(GetRelativeAddressMode(i), GetRelativeAddressMode(@event.DebugImages.Count));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why @event.DebugImages.Count here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because that's the index we're adding the image at on the next line.

@@ -88,25 +91,55 @@ internal void MergeDebugImagesInto(SentryEvent @event)
Dictionary<string, string> relocations = new();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only get this far if there are already debug images on the event... from the comments above, it seems that only ever happens when there are InnerExceptions (and so multiple stack traces) - correct?

I guess a high level question at that point: What is the information that could overlap from one stack trace to another, in this scenario and how/what are we trying to merge here?

{
if (i != j)
if (DebugImages[i].ModuleVersionId == @event.DebugImages[j].ModuleVersionId)
Copy link
Collaborator

@jamescrosswell jamescrosswell Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what we mean by ModuleVersionId here? That seems to identify a particular assembly... I take it the offsets in PDBs are module relative. I'm wondering what relocations is doing then... my best guess is that maybe the modules are stored in a different order in each of the different stack traces, but we only send the modules once in the SentryEvent so we need to map debug images from each stack trace to the correct module index to get the correct offsets??? It's just as likely that I have no idea whatsoever what's going on here though 😜

Copy link
Collaborator Author

@vaind vaind Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, this code is actually old, it's just moved under a if (DebugImages[i].ModuleVersionId is not null) condition so not really in need for reviewing (#2050). Also, this would be a good read if you wanted to start looking into this topic: https://github.com/getsentry/rfcs/blob/main/text/0013-portable-pdb.md

But I'll try to answer your questions to the best of my knowledge (and memory :) ):

Is this what we mean by ModuleVersionId here?

yes

I'm wondering what relocations is doing then...

To Sentry (backend), we're sending a list of debug images and each stack frame references an image in this list by "rel:IMAGE_INDEX" (see addr_mode).

my best guess is that maybe the modules are stored in a different order in each of the different stack traces

correct

, but we only send the modules once in the SentryEvent so we need to map debug images from each stack trace to the correct module index to get the correct offsets???

kinda - we can send the same image multiple times but it's wasteful. One could argue that running this code is also wasteful, but to me, it feels better to run the code once on the client and reduce the payload (less stuff to serialize, transfer, deserialize and process, store)

Copy link
Collaborator

@jamescrosswell jamescrosswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least as far as I understand, it looks good. Most of the meat is in DebugStackTrace.cs and we try to get a managed stack trace but now, failing that, fall back on native stack traces.

One important question is, other than the fact that it compiles, how do we know it works? Do we have any automated tests for this or was it just manual smoke tests?

Copy link
Collaborator Author

@vaind vaind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the outstanding questions were unrelated to the changes made in this PR, I'm merging

{
if (i != j)
if (DebugImages[i].ModuleVersionId == @event.DebugImages[j].ModuleVersionId)
Copy link
Collaborator Author

@vaind vaind Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, this code is actually old, it's just moved under a if (DebugImages[i].ModuleVersionId is not null) condition so not really in need for reviewing (#2050). Also, this would be a good read if you wanted to start looking into this topic: https://github.com/getsentry/rfcs/blob/main/text/0013-portable-pdb.md

But I'll try to answer your questions to the best of my knowledge (and memory :) ):

Is this what we mean by ModuleVersionId here?

yes

I'm wondering what relocations is doing then...

To Sentry (backend), we're sending a list of debug images and each stack frame references an image in this list by "rel:IMAGE_INDEX" (see addr_mode).

my best guess is that maybe the modules are stored in a different order in each of the different stack traces

correct

, but we only send the modules once in the SentryEvent so we need to map debug images from each stack trace to the correct module index to get the correct offsets???

kinda - we can send the same image multiple times but it's wasteful. One could argue that running this code is also wasteful, but to me, it feels better to run the code once on the client and reduce the payload (less stuff to serialize, transfer, deserialize and process, store)


if (!found)
{
relocations.Add(GetRelativeAddressMode(i), GetRelativeAddressMode(@event.DebugImages.Count));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because that's the index we're adding the image at on the next line.

@vaind
Copy link
Collaborator Author

vaind commented Oct 17, 2023

One important question is, other than the fact that it compiles, how do we know it works? Do we have any automated tests for this or was it just manual smoke tests?

https://github.com/getsentry/sentry-dotnet/blob/5e0d377e38ebfa8c6e6056225b1f2390d5ddc615/test/Sentry.Tests/Internals/DebugStackTraceTests.verify.cs

And there will be an integration test for Native AOT once we can build one, see "Next steps" in the PR description

@vaind vaind merged commit d32f05b into feat/4.0.0 Oct 17, 2023
13 of 14 checks passed
@vaind vaind deleted the feat/sentry-native-integration branch October 17, 2023 09:21
@jamescrosswell
Copy link
Collaborator

https://github.com/getsentry/sentry-dotnet/blob/5e0d377e38ebfa8c6e6056225b1f2390d5ddc615/test/Sentry.Tests/Internals/DebugStackTraceTests.verify.cs

And there will be an integration test for Native AOT once we can build one, see "Next steps" in the PR description

OK, but none of those tests changed in the PR right? And we'd need to be generating native debug images for any of the new code to get tested so I think those tests are just ensuring the PR doesn't break any existing functionality with managed code???

We will test the new code once we have the Native AOT integration tests though, so that answers my question.

@jamescrosswell
Copy link
Collaborator

Also, this would be a good read if you wanted to start looking into this topic: https://github.com/getsentry/rfcs/blob/main/text/0013-portable-pdb.md

🤯 OMG I had no idea those existed - thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants