Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-1268] [Spike] Integrate dbt exceptions into structured logging system #5958

Open
gshank opened this issue Sep 28, 2022 · 6 comments
Open
Assignees
Labels
logging stale Issues that have gone stale

Comments

@gshank
Copy link
Contributor

gshank commented Sep 28, 2022

Right now we have structured logging events for the majority of our logging, except for exceptions. In some but not all cases we have a separate logging event. Ideally all exceptions should be logged automatically. This ticket is labeled a "spike" because in order to scope this work we need to identify a mechanism/policies for doing this.

@github-actions github-actions bot changed the title [Spike] Integrate dbt exceptions into structured logging system [CT-1268] [Spike] Integrate dbt exceptions into structured logging system Sep 28, 2022
@gshank gshank added the logging label Sep 28, 2022
@emmyoop emmyoop assigned emmyoop and unassigned gshank Oct 26, 2022
@jtcohen6
Copy link
Contributor

Copying from Nate's comment in #4836 (comment), because the underlying thrust of it continues to feel important:

whenever possible, we want to be able to programmatically distinguish when the root cause of an error is a warehouse issue, an adapter issue, or a core issue. This will be a big improvement in the process for debugging failed runs.

Basically, when a dbt run fails, how can we tell whether it was user error, database error, or internal dbt error?

In a Pythonic world, we'd expect to do this via class inheritance and type introspection. Given that our log events are now proto-composed, we probably need a field on each exception-log. Something like exception_cause or exception_category. Categories I can imagine:

  • parsing / validation error (problem in user code)
  • disallowed behavior, e.g. using deprecated functionality or a version of dbt-core that's disallowed by require-dbt-version
  • database error due to connection timeout
  • database error due to malformed query
  • internal / unhandled error
  • ... more? ...

@emmyoop
Copy link
Member

emmyoop commented Nov 18, 2022

This can probably be accomplished with an enum that defines all the categories. Without verifying the syntax I think something like this would work:

enum ExceptionCause {
{
        PARSING = 0;
        DISALLOWED = 1;
        DATABASE_CONNECTION = 2;
        DATABASE_QUERY = 3;
        UNHANDLED = 4;
        ...
 }

And then within the message add

message SomeException {
   EventInfo info = 1;
   repeated ExceptionCause exception_cause = 2;
   ...
}

This also allows the flexibility to add more types to the end of the enum which I'm sure we'll end up needing.

@jtcohen6
Copy link
Contributor

Blocking question we need to answer: Should we have separate LogCategory versus ExceptionCategory, or combine the two into a single field?

(Idea with LogCategory is to formally serve the role that the event codes' starting letters have been informally playing: this is parsing-related, deps-related, adapter-related, ...)

@emmyoop
Copy link
Member

emmyoop commented Jan 18, 2023

One of the unknowns here is exactly what would be useful to consumers to get logged. We likely don't want to log all exceptions, just specific ones. Is it best to fire an event as part of those exceptions or where the exception is raised?

Another option would be to add more structured information to MainEncounteredError. This may cover a lot of use cases and then just give a single event that would need to be subscribed to for exception details. The exception could contain the error category and then the MainEncounteredError event could pass the category (connection error, etc) in the proto message.

Minimum:
Easily differentiate between errors in the users code or errors related to dbt itself

Better:
Categorize Errors

  • dbt code error (sql syntax or jinja, md, yml, etc.)
  • dbt project config error (something in project.yml not being done properly)
  • permissions error (to warehouse or git provider)
  • and more

@emmyoop emmyoop removed their assignment Jan 18, 2023
@jtcohen6
Copy link
Contributor

Another option would be to add more structured information to MainEncounteredError. This may cover a lot of use cases and then just give a single event that would need to be subscribed to for exception details. The exception could contain the error category and then the MainEncounteredError event could pass the category (connection error, etc) in the proto message.

After chatting about this with @isaac-taylor, this might get us a lot of what we need, at least in the medium-term. It would be a quality-of-life improvement over parsing the string exc from MainEncounteredError.

Even better if that catch all error had some sort of error code that we can use to categorize the error to either our system, or the user's code (like connection problems)

Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logging stale Issues that have gone stale
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants