Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stack_data, executing, and pure_eval #748

Closed
alexmojaki opened this issue Jun 27, 2020 · 9 comments
Closed

stack_data, executing, and pure_eval #748

alexmojaki opened this issue Jun 27, 2020 · 9 comments

Comments

@alexmojaki
Copy link
Contributor

Hi! I've recently created a few libraries that are helpful for extracting information from frames and tracebacks, and I think they could be very useful to sentry.

  • executing can identify the exact AST node being executed by a frame in many cases. It's always disappointed me that Python only points to a line in each frame in tracebacks, especially when the expression/statement at fault spans many lines. Sometimes it can be really hard to interpret. Currently executing is used in IPython (in master, unreleased) to highlight the node, here's what it looks like:

ipython

  • executing can also infer a function __qualname__ from a code object, meaning your traceback can say lorem/ipsum.py in MyClass.__init__ at line 123 instead of just __init__, which is much more informative.
  • pure_eval can safely evaluate certain AST nodes without side effects, so that if the source for a frame contains expressions like self.foo and bar[key] their values can often be shown alongside plain variables.
  • stack_data collects data from tracebacks for the purpose of formatting and displaying them. It uses executing and pure_eval and also provides a lot of functionality of its own. For example, in this frame:

Screen Shot 2020-06-27 at 19 57 57

there are lines from complex_filter which the user doesn't need to see, and stack_data knows how to filter those out by only collecting lines which belong in the current scope. I integratedstack_data into IPython which allowed removing a lot of flaky custom introspection code, fixing several bugs and making the code more maintainable.

  • stack_data identifies the locations of variables and evaluated expressions in the source lines. An intelligent UI could use this, for example, to allow the user to hover over a variable in source code to see its value rather than scroll through a list to find it.

Overall there are lots of possibilities. What would interest you? I'm willing to make a PR and make any necessary changes to the libraries.

@untitaker
Copy link
Member

Hey @alexmojaki, this seems like a nice library!

  • We want to avoid adding more dependencies than absolutely necessary, which is why you'll find we have no external dependencies beyond urllib3 and certifi.
  • I have strong concerns about the runtime overhead of this library, not only for the happy path when nothing is crashing but also when reporting a 500 to Sentry. Particularly concerning your idea to transfer the values of sub-expressions to sentry.

I'm only going to comment on the ideas you presented.

there are lines from complex_filter which the user doesn't need to see, and stack_data knows how to filter those out by only collecting lines which belong in the current scope.

Please keep in mind that you have no guarantee of the filename containing valid sourcecode, for example in Jinja templates that isn't the case (it's also one of the reasons we have refrained from adding syntax highlighting to Sentry, the other reason being laziness).

An intelligent UI could use this, for example, to allow the user to hover over a variable in source code to see its value rather than scroll through a list to find it.

We would have to establish how to send this kind of information to the server in a way that makes sense for other languages than Python.

I believe an initial PoC, of whichever feature you're thinking of implementing, should pull those libraries in as optional dependency only. before_send seems like a very cheap way to hook into the SDK since its hint object will contain exc_info and you have access to the entire payload that is sent to sentry.

Hope that helps,
Markus

@alexmojaki
Copy link
Contributor Author

OK, if these are going to be optional dependencies, I'm going to avoid stack_data, as that would just make the code more complicated.

I've made PRs for a couple of the features mentioned. There's some finicky details I haven't handled - for example the tests currently assume that the optional libraries are installed. But you can see the general idea well enough for a PoC as you said. Before I polish things up, can you confirm if the concept is proven?

The feature that would interest me the most is the first one I listed - highlighting the executing node. This would require changes on the server side in order to render the information. Shall I just ignore that complication for now and make another PoC PR which adds the data to the frame and we can worry about rendering later?

@untitaker
Copy link
Member

Shall I just ignore that complication for now and make another PoC PR which adds the data to the frame and we can worry about rendering later?

Yes, preferrably behind an integration like mentioned in #749. The server can handle superfluous attributes, sort of.

@untitaker
Copy link
Member

@alexmojaki did I get it right if I say that pure_eval is py3-only?

@alexmojaki
Copy link
Contributor Author

Yes.

@untitaker
Copy link
Member

Ok cool. Just so you know you're in charge of creating the sentry-docs PRs. There should be one per integration so one for executing and one for pure_eval, unless you plan to refactor those further.

@alexmojaki
Copy link
Contributor Author

I'm going to make a docs PR for pure_eval. I don't know how I can add an integration suggestion for it when the integration has three dependencies: pure_eval, executing, and asttokens.

For executing, I don't think there's much point in people using it unless the feature for highlighting the node range goes through. I've made a server side PR which I don't think I can complete on my own.

@untitaker
Copy link
Member

Makes sense. I think pure-eval has most value anyway. Still unsure what to do about the server-side changes. We would probably need to review the exact protocol changes with others in the company.

@github-actions
Copy link

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants