Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADR 022 - Custom baseapp panic handling #6072

Merged
merged 16 commits into from
May 13, 2020
Merged

Conversation

iTiky
Copy link

@iTiky iTiky commented Apr 24, 2020

/cc @alexanderbez


For contributor use:

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer

For admin use:

  • Added appropriate labels to PR (ex. WIP, R4R, docs, etc)
  • Reviewers assigned
  • Squashed all commits, uses message "Merge pull request #XYZ: [title]" (coding standards)

@codecov
Copy link

codecov bot commented Apr 24, 2020

Codecov Report

Merging #6072 into master will decrease coverage by 0.24%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #6072      +/-   ##
==========================================
- Coverage   54.64%   54.39%   -0.25%     
==========================================
  Files         426      430       +4     
  Lines       25890    26106     +216     
==========================================
+ Hits        14148    14201      +53     
- Misses      10763    10922     +159     
- Partials      979      983       +4     

@aaronc
Copy link
Member

aaronc commented Apr 24, 2020

I think we should allow more customizability of these sorts of things in general. The approach in #6053 handles one such case and I don't think we should block it. But I do think we should reconsider a composable middleware approach to BaseApp, rather than handling each edge case one by one.

@alexanderbez alexanderbez added T: ADR An issue or PR relating to an architectural decision record R4R labels Apr 24, 2020
Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @iTiky! Left some minor feedback.

docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
@iTiky iTiky requested a review from alexanderbez April 24, 2020 16:59
Copy link
Collaborator

@fedekunze fedekunze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well written ADR. Left a few suggestions. Thanks @iTiky! 👍

docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another pass through over this ADR and things started to make more sense to me. I think certain sections can be either clarified or omitted entirely. But overall, the premise looks solid and I see no reason why we can't support this.

One thing I would like note (I've mentioned this in a comment), is that we should guarantee or assert that the last handler in the decorator/middleware stack is guaranteed to return an error -- a fail-safe/terminator.

docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
docs/architecture/adr-022-custom-panic-handling.md Outdated Show resolved Hide resolved
Mikhail Kornilov and others added 2 commits April 27, 2020 13:31
Co-Authored-By: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>
Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conceptACK as I'm generally in favor of (opinionated) flexibility where the concept is clean and modular and extensible. I would like to see an ACK from @aaronc and @cwgoes potentially before proceeding with the implementation.

Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK but I would like to see more clearly explained examples

might be handled in a "standard" way (middleware) alongside the others.

We propose middleware-solution, which could help developers implement the following cases:
* add external logging (let's say sending reports to external services like [Sentry](https://sentry.io));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand how this would work. Would every node make an HTTP call to Sentry?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are examples of how developers can use those middlewares. They can send panic logs to Sentry if they need to or do other struf needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand how Sentry would be used in a replicated state machine.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are just examples of what developers of Cosmos SDK based project could use those handlers for. May be they want to get notifications on every panic() event (Sentry), that doesn't collide with state machine ideology.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that (in a production blockchain) the state machine is replicated, so many copies of every error would be sent to Sentry, and you would need to be careful to ensure that making HTTP requests doesn't introduce non-determinism. It's an alright example, but these points need to be clarified.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually Sentry can aggregate multiple similar event into one issue with some meta and statistics. As for non-deterministic behavior: developers can do that in so many ways including the proposed middlewares =)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's nice. The determinism bit is more important. I think it would be sufficient if we add a warning that middlewares must be deterministic, or consensus safety will be lost.

// ...
defer func() {
if r := recover(); r != nil {
recoveryMW := newOutOfGasRecoveryMiddleware(gasWanted, ctx, app.runTxRecoveryMiddleware)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we hard-coding newOutOfGasRecoveryMiddleware into BaseApp? I thought the idea was to defer to app.runTxRecoveryMiddleware?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is because OutOfGas error handling needs some extra context and can't be statically created.

Copy link
Member

@aaronc aaronc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I'm in favor of well-reasoned customizability. As I noted before, I would like us to think of a more general framework for BaseApp being composed of middleware in the future.

But, this seems like a good starting point with a concrete use case and I'd like to see it move forward.

Mikhail Kornilov and others added 7 commits April 28, 2020 12:15
@iTiky iTiky requested a review from cwgoes April 28, 2020 15:41
Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

func newOutOfGasRecoveryMiddleware(gasWanted uint64, ctx sdk.Context, next recoveryMiddleware) recoveryMiddleware {
handler := func(recoveryObj interface{}) error {
err, ok := recoveryObj.(sdk.ErrorOutOfGas)
if !ok { return nil }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should middleware decorators simply pass recoveryObj along to the next middleware if it isn't an error they handle instead of returning?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually they do. The example above creates a handler and passes it to func newRecoveryMiddleware which creates a middleware. That way developer only needs to create a RecoveryHandler (which only return an error / nil). The rest is done with helper functions automatically.

@iTiky
Copy link
Author

iTiky commented Apr 29, 2020

@alexanderbez ADR's PR is approved, what are the next steps? I can update the implementation PR (add link to the ADR, make some minor improvements proposed in the discussion).

@alexanderbez
Copy link
Contributor

@iTiky, yes let's update this PR so we can merge it and then we can proceed with reviewing the implementation 👍

Thank you for all the hard work :)

@aaronc
Copy link
Member

aaronc commented May 5, 2020

Looks like the main thing needed to move this forward is merging master into this branch. @iTiky can you do that and then we'll merge.

@aaronc aaronc added the A:automerge Automatically merge PR once all prerequisites pass. label May 5, 2020
@iTiky iTiky requested a review from clevinson as a code owner May 5, 2020 18:10
@iTiky
Copy link
Author

iTiky commented May 13, 2020

@alexanderbez Can we merge this ADR as it breaks the markdown analyser for PR (invalid link to the ADR file).

@iTiky iTiky mentioned this pull request May 13, 2020
11 tasks
@alexanderbez
Copy link
Contributor

Yes, we just need to rebase 👍

@alexanderbez
Copy link
Contributor

Something did not go right in the rebase @iTiky

@iTiky
Copy link
Author

iTiky commented May 13, 2020

@alexanderbez Yes, you're right. Can I just close this PR and make a new cleaner one preserving all changes? I'm not sure I can do the force push right.

@alexanderbez
Copy link
Contributor

alexanderbez commented May 13, 2020

I don't think that is necessary. Revert back to 35fc451, checkout and pull latest master, and finally merge master into this branch.

@iTiky
Copy link
Author

iTiky commented May 13, 2020

@alexanderbez Thanks! I did the rollback and merged the latest master.

@alexanderbez alexanderbez merged commit b1b46db into cosmos:master May 13, 2020
@iTiky iTiky deleted the adr-022 branch August 26, 2020 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A:automerge Automatically merge PR once all prerequisites pass. C:baseapp T: ADR An issue or PR relating to an architectural decision record
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants