-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Classify & format panic logs correctly #521
Comments
[Fixing this issue will also unblock chainhook alerting for thread panic.] |
MicaiahReid
added a commit
that referenced
this issue
Mar 27, 2024
This PR introduces a few fixes in an effort to improve reliability and debugging problems when running Chainhook as a service: - Revisits log levels throughout the tool (fixes #498, fixes #521). The general approach for the logs were: - `crit` - fatal errors that will crash mission critical component of Chainhook. In these cases, Chainhook should automatically kill all main threads (not individual scanning threads, which is tracked by #404) to crash the service. - `erro` - something went wrong the could lead to a critical error, or that could impact all users - `warn` - something went wrong that could impact an end user (usually due to user error) - `info` - control flow logging and updates on the state of _all_ registered predicates - `debug` - updates on the state of _a_ predicate - Crash the service if a mission critical thread fails (see #517 (comment) for a list of these threads). Previously, if one of these threads failed, the remaining services would keep running. For example, if the event observer handler crashed, the event observer API would keep running. This means that the stacks node is successfully emitting blocks that Chainhook is acknowledging but not ingesting. This causes gaps in our database Fixes #517 - Removes an infinite loop with bitcoin ingestion, crashing the service instead: Fixes #506 - Fixes how we delete predicates from our db when one is deregistered. This should reduce the number of logs we have on startup. Fixes #510 - Warns on all reorgs. Fixes #519
MicaiahReid
added a commit
that referenced
this issue
Mar 27, 2024
This PR introduces a few fixes in an effort to improve reliability and debugging problems when running Chainhook as a service: - Revisits log levels throughout the tool (fixes #498, fixes #521). The general approach for the logs were: - `crit` - fatal errors that will crash mission critical component of Chainhook. In these cases, Chainhook should automatically kill all main threads (not individual scanning threads, which is tracked by #404) to crash the service. - `erro` - something went wrong the could lead to a critical error, or that could impact all users - `warn` - something went wrong that could impact an end user (usually due to user error) - `info` - control flow logging and updates on the state of _all_ registered predicates - `debug` - updates on the state of _a_ predicate - Crash the service if a mission critical thread fails (see #517 (comment) for a list of these threads). Previously, if one of these threads failed, the remaining services would keep running. For example, if the event observer handler crashed, the event observer API would keep running. This means that the stacks node is successfully emitting blocks that Chainhook is acknowledging but not ingesting. This causes gaps in our database Fixes #517 - Removes an infinite loop with bitcoin ingestion, crashing the service instead: Fixes #506 - Fixes how we delete predicates from our db when one is deregistered. This should reduce the number of logs we have on startup. Fixes #510 - Warns on all reorgs. Fixes #519
github-actions bot
pushed a commit
that referenced
this issue
Mar 27, 2024
## [1.4.0](v1.3.1...v1.4.0) (2024-03-27) ### Features * detect http / rpc errors as early as possible ([ad78669](ad78669)) * use stacks.rocksdb for predicate scan ([#514](#514)) ([a4f1663](a4f1663)), closes [#513](#513) [#485](#485) ### Bug Fixes * enable debug logs in release mode ([#537](#537)) ([fb49e28](fb49e28)) * improve error handling, and more! ([#524](#524)) ([86b5c78](86b5c78)), closes [#498](#498) [#521](#521) [#404](#404) [/github.com//issues/517#issuecomment-1992135101](https://github.com/hirosystems//github.com/hirosystems/chainhook/issues/517/issues/issuecomment-1992135101) [#517](#517) [#506](#506) [#510](#510) [#519](#519) * log errors on block download failure; implement max retries ([#503](#503)) ([0fc38cb](0fc38cb)) * **metrics:** update latest ingested block on reorg ([#515](#515)) ([8f728f7](8f728f7)) * order and filter blocks used to seed forking block pool ([#534](#534)) ([a11bc1c](a11bc1c)) * seed forking handler with unconfirmed blocks to improve startup stability ([#505](#505)) ([485394e](485394e)), closes [#487](#487) * skip db consolidation if no new dataset was downloaded ([#513](#513)) ([983a165](983a165)) * update scan status for non-triggering predicates ([#511](#511)) ([9073f42](9073f42)), closes [#498](#498)
🎉 This issue has been resolved in version 1.4.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
vabanaerytk
added a commit
to vabanaerytk/chainhook
that referenced
this issue
Aug 7, 2024
## [1.4.0](hirosystems/chainhook@v1.3.1...v1.4.0) (2024-03-27) ### Features * detect http / rpc errors as early as possible ([4a3e7e8](hirosystems/chainhook@4a3e7e8)) * use stacks.rocksdb for predicate scan ([#514](hirosystems/chainhook#514)) ([bff1372](hirosystems/chainhook@bff1372)), closes [#513](hirosystems/chainhook#513) [#485](hirosystems/chainhook#485) ### Bug Fixes * enable debug logs in release mode ([#537](hirosystems/chainhook#537)) ([5ff21e0](hirosystems/chainhook@5ff21e0)) * improve error handling, and more! ([#524](hirosystems/chainhook#524)) ([2cfedd7](hirosystems/chainhook@2cfedd7)), closes [#498](hirosystems/chainhook#498) [#521](hirosystems/chainhook#521) [#404](hirosystems/chainhook#404) [/github.com/hirosystems/chainhook/issues/517#issuecomment-1992135101](https://github.com/hirosystems//github.com/hirosystems/chainhook/issues/517/issues/issuecomment-1992135101) [#517](hirosystems/chainhook#517) [#506](hirosystems/chainhook#506) [#510](hirosystems/chainhook#510) [#519](hirosystems/chainhook#519) * log errors on block download failure; implement max retries ([#503](hirosystems/chainhook#503)) ([0454a4f](hirosystems/chainhook@0454a4f)) * **metrics:** update latest ingested block on reorg ([#515](hirosystems/chainhook#515)) ([7653766](hirosystems/chainhook@7653766)) * order and filter blocks used to seed forking block pool ([#534](hirosystems/chainhook#534)) ([ff775de](hirosystems/chainhook@ff775de)) * seed forking handler with unconfirmed blocks to improve startup stability ([#505](hirosystems/chainhook#505)) ([063e0e2](hirosystems/chainhook@063e0e2)), closes [#487](hirosystems/chainhook#487) * skip db consolidation if no new dataset was downloaded ([#513](hirosystems/chainhook#513)) ([f34dee3](hirosystems/chainhook@f34dee3)) * update scan status for non-triggering predicates ([#511](hirosystems/chainhook#511)) ([9647e72](hirosystems/chainhook@9647e72)), closes [#498](hirosystems/chainhook#498)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
It seems panic logs are not being printed in correct JSON syntax, which may cause issues when our log collectors try to parse them. Additionally, panic logs may not be getting classified correctly as
fatal
level (a rank aboveerror
level). See here for various supported log levels we can use.Ideally all panics would be printed in valid JSON, and be classified as
fatal
(or something similar to thecritical
log level referenced in the link above).Referenced panic, but a fix may be needed for all panics.
The text was updated successfully, but these errors were encountered: