[consumer] Add retryable error types and add counts to permanent errors #7439

Closed
wants to merge 8 commits into from

evan-bradley
Contributor

Description:
This will replace exporterhelper.throttleRetry with consumererror.Retryable[Traces|Metrics|Logs] types and add success/fail counts to the permanent error types.

Right now this PR is just intended to be a discussion point for how these types should look. I will complete a full implementation once there is consensus on how these should look.
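
As a discussion aid, here is a minimal sketch of what a retryable error type carrying the failed data could look like. It is not the code from this PR: `Traces` is a stand-in for `ptrace.Traces` so that the example needs only the standard library, and `RetryableTraces`, `NewRetryableTraces`, and `FailedTraces` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Traces is a stand-in for ptrace.Traces (assumption: only a span count matters here).
type Traces struct {
	SpanCount int
}

// RetryableTraces is a hypothetical error type that carries the subset of
// data that failed, so the caller can retry only that subset.
type RetryableTraces struct {
	err    error
	failed Traces
}

// NewRetryableTraces wraps err together with the data that should be retried.
func NewRetryableTraces(err error, failed Traces) error {
	return RetryableTraces{err: err, failed: failed}
}

func (e RetryableTraces) Error() string { return "retryable: " + e.err.Error() }

// Unwrap lets errors.Is and errors.As see the underlying cause.
func (e RetryableTraces) Unwrap() error { return e.err }

// FailedTraces extracts the data to retry from anywhere in the error chain.
// The boolean is false when err is not retryable.
func FailedTraces(err error) (Traces, bool) {
	var r RetryableTraces
	if errors.As(err, &r) {
		return r.failed, true
	}
	return Traces{}, false
}

func main() {
	cause := errors.New("backend throttled")
	err := fmt.Errorf("export failed: %w", NewRetryableTraces(cause, Traces{SpanCount: 3}))

	if td, ok := FailedTraces(err); ok {
		fmt.Println("spans to retry:", td.SpanCount)
	}
}
```

Because the type implements `Unwrap`, a retry helper could find it with `errors.As` even when intermediate components wrap the error.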

Link to tracking Issue:

Resolves #7047

@jpkrohling jpkrohling self-requested a review April 5, 2023 16:45
Member

@jpkrohling jpkrohling left a comment

LGTM. Given the conversation we had today, I think there are a few things that can be changed here:

  1. add the toHTTP from #7486 (Propagate errors from exporters to receivers) somewhere here, so that gRPC <-> HTTP translation can happen
  2. add new constructors to be used with exporters, like NewFromGRPCStatus(*status.Status) and NewFromHTTPStatus(int). Perhaps also the other way around would make sense, to be used in receivers: ToHTTPStatus() and ToGRPCStatus().
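
The two suggestions above could be sketched roughly as follows. This is only an illustration under several assumptions: `Code` is a local stand-in for the gRPC `codes.Code` type (the numeric values match the gRPC codes), `NewFromGRPCStatus` takes a code and a message in place of `*status.Status` so that the example needs only the standard library, `StatusError` is a hypothetical name, and the small code mapping is illustrative rather than the mapping from #7486.

```go
package main

import (
	"fmt"
	"net/http"
)

// Code is a stand-in for google.golang.org/grpc/codes.Code.
type Code uint32

const (
	InvalidArgument   Code = 3
	ResourceExhausted Code = 8
	Internal          Code = 13
	Unavailable       Code = 14
)

// StatusError is a hypothetical error that records both representations of a status.
type StatusError struct {
	msg        string
	grpcCode   Code
	httpStatus int
}

func (e *StatusError) Error() string { return e.msg }

// NewFromHTTPStatus would be used by HTTP exporters (illustrative mapping).
func NewFromHTTPStatus(status int) *StatusError {
	e := &StatusError{msg: http.StatusText(status), httpStatus: status}
	switch status {
	case http.StatusBadRequest:
		e.grpcCode = InvalidArgument
	case http.StatusTooManyRequests:
		e.grpcCode = ResourceExhausted
	case http.StatusServiceUnavailable:
		e.grpcCode = Unavailable
	default:
		e.grpcCode = Internal
	}
	return e
}

// NewFromGRPCStatus would be used by gRPC exporters (illustrative mapping).
func NewFromGRPCStatus(code Code, msg string) *StatusError {
	e := &StatusError{msg: msg, grpcCode: code}
	switch code {
	case InvalidArgument:
		e.httpStatus = http.StatusBadRequest
	case ResourceExhausted:
		e.httpStatus = http.StatusTooManyRequests
	case Unavailable:
		e.httpStatus = http.StatusServiceUnavailable
	default:
		e.httpStatus = http.StatusInternalServerError
	}
	return e
}

// ToHTTPStatus and ToGRPCStatus would be used by receivers to answer their clients.
func (e *StatusError) ToHTTPStatus() int  { return e.httpStatus }
func (e *StatusError) ToGRPCStatus() Code { return e.grpcCode }

func main() {
	// An HTTP exporter saw a 429; a gRPC receiver upstream needs a gRPC code.
	fromHTTP := NewFromHTTPStatus(http.StatusTooManyRequests)
	fmt.Println("gRPC code:", fromHTTP.ToGRPCStatus())

	// A gRPC exporter saw Unavailable; an HTTP receiver upstream needs a status.
	fromGRPC := NewFromGRPCStatus(Unavailable, "backend unavailable")
	fmt.Println("HTTP status:", fromGRPC.ToHTTPStatus())
}
```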

Contributor

@codeboten codeboten left a comment

I think this approach makes sense, just a couple of questions. I agree w/ @dmitryax that going w/ rejected makes the most sense here.

@jpkrohling
Member

@evan-bradley, is this PR going to incorporate parts of #7486? I'm waiting to proceed with that PR under the assumption that parts of it will be incorporated here.

@evan-bradley
Contributor Author

@jpkrohling Yes, that was my intent. Sorry for the delay, I was looking into that and got stopped just short of being able to implement it. I hope to have something this week.

// NewPermanentWithCount wraps an error to indicate that it is a permanent error, i.e. an
// error that will be always returned if its source receives the same inputs.
// The error should also include the number of rejected records.
func NewPermanentWithCount(err error, rejected int) error {
Member

I'm a bit worried that we require providing the rejected field here. I'm wondering if we can keep this field optional. If it is not provided, the caller can assume that all the items were rejected. My concern is that some exporters return this error when they no longer have access to the original pdata, because it was translated to some other schema. It'll be difficult for them to fetch the number of items. One option is to have:

NewPermanent(err error, opts ...PermanentErrorOptions)

and supply rejected via WithRejectedCount(int) option.

Or we don't even need the Permanent part here and can assume the error is always permanent unless a Retryable(ptrace.Traces) option is provided. Later we might have some other options.

These are just thoughts, not suggestions for changes. I believe we need to think more about an overarching design for this API that would cover all the use cases, including embedding the GRPC/HTTP and possible future additions. Maybe having some small design doc would be beneficial.
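
For reference, the optional rejected count described above could be sketched with functional options as shown below. This is an assumption-laden illustration, not an agreed design: the `PermanentError` struct and the `RejectedCount` helper are hypothetical, and the option type keeps the `PermanentErrorOptions` name used in this comment.

```go
package main

import (
	"errors"
	"fmt"
)

// PermanentError is a hypothetical permanent error with an optional rejected count.
type PermanentError struct {
	err      error
	rejected int
	hasCount bool
}

// PermanentErrorOptions configures a PermanentError (name taken from the comment above).
type PermanentErrorOptions func(*PermanentError)

// WithRejectedCount records how many items were rejected.
func WithRejectedCount(n int) PermanentErrorOptions {
	return func(p *PermanentError) {
		p.rejected = n
		p.hasCount = true
	}
}

// NewPermanent wraps err as permanent; the rejected count stays optional.
func NewPermanent(err error, opts ...PermanentErrorOptions) error {
	p := &PermanentError{err: err}
	for _, opt := range opts {
		opt(p)
	}
	return p
}

func (p *PermanentError) Error() string { return "permanent: " + p.err.Error() }
func (p *PermanentError) Unwrap() error { return p.err }

// RejectedCount returns the number of rejected items for a permanent error.
// When no count was supplied, it assumes all `total` items were rejected.
// The boolean is false when err is not a permanent error.
func RejectedCount(err error, total int) (int, bool) {
	var p *PermanentError
	if !errors.As(err, &p) {
		return 0, false
	}
	if p.hasCount {
		return p.rejected, true
	}
	return total, true
}

func main() {
	cause := errors.New("schema validation failed")

	// Exporter that still knows how many items were rejected.
	n, _ := RejectedCount(NewPermanent(cause, WithRejectedCount(2)), 10)
	fmt.Println("rejected with count:", n)

	// Exporter that lost access to the original pdata: the caller assumes all items.
	n, _ = RejectedCount(NewPermanent(cause), 10)
	fmt.Println("rejected without count:", n)
}
```

A `Retryable(...)` option would slot into the same pattern as one more `PermanentErrorOptions` value, at which point the type would probably need a more neutral name.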

Contributor Author

I agree that having a design doc would be good. I'll think through some of your points and add them to a design doc here. If it looks like the PR is growing, we can move the document to a separate PR.

@jpkrohling
Member

Folks, what's the state of this? This is blocking #7486 and I would really love to move forward with that, perhaps even with a suboptimal solution at first, and a proper solution in the longer term.

@evan-bradley
Contributor Author

@jpkrohling Sorry, I had to focus on a few other things and haven't made as much progress on this as I would have liked. In my opinion, if we can keep the errors you have in #7486 internal, we could move to a long-term solution in the OTLP receiver and exporter once we've agreed on a design here. That way the functionality wouldn't change from the user's perspective (with respect to error code propagation) and we wouldn't have to make any breaking changes in the API. What do you think?

@jpkrohling
Member

Sounds good -- that error is internal already anyway, usable only by collector components in the core repository. Or do you mean making them even more private?

@evan-bradley
Contributor Author

> that error is internal already anyway, usable only by collector components in the core repository. Or do you mean making them even more private?

I think the internal scope you have in that PR is good.

@github-actions
Contributor

github-actions bot commented Jun 8, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 8, 2023
@github-actions
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Jun 23, 2023
@dmitryax dmitryax reopened this Jun 29, 2023
@dmitryax
Member

Reopening so it's still on our radar

@dmitryax dmitryax removed the Stale label Jun 29, 2023
@github-actions
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 14, 2023
@evan-bradley
Contributor Author

@dmitryax thanks for re-opening this. I began the design document that we discussed, but I feel that it is still woefully inadequate, and I haven't been able to get it to a spot where I feel it's really ready for a review. If you take a look, feel free to let me know if I'm totally off-track; otherwise I hope to improve it in the coming weeks and request another review.

@github-actions github-actions bot removed the Stale label Jul 26, 2023
@github-actions
Contributor

github-actions bot commented Aug 9, 2023

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Aug 9, 2023
@github-actions
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

Successfully merging this pull request may close these issues.

Investigate how to expose exporterhelper.NewThrottleRetry in the consumererror
5 participants