Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GRPC API error handling is wrong #6225

Closed
2 of 5 tasks
gzigzigzeo opened this issue Mar 30, 2021 · 4 comments
Closed
2 of 5 tasks

GRPC API error handling is wrong #6225

gzigzigzeo opened this issue Mar 30, 2021 · 4 comments
Labels

Comments

@gzigzigzeo
Copy link
Contributor

gzigzigzeo commented Mar 30, 2021

  • (nit) When I try to delete role belongs to a user via tctl, I get:

    "ERROR: failed to delete role that still in use by a user. Check system server logs for more details."

    It does not specify which user or users are involved, and, in the same time, logs are empty (most likely, -d will tell me something, but I do not usually start Terraform with -d)

  • (important) If keys are expired, no meaningful error is generated. After a long timeout, it reports the following instead:

    "Error: all auth methods failed
    context deadline exceeded"

    Code 503 (Gateway timeout?). This error could mean network timeout. I need to know for sure that keys are wrong to report it to the user.

  • (nit) If I try to generate keys for a missing user via tctl, instead of meaningful message, I get:
    "Error: [10] access denied"
    I expect to have "user is missing".

  • (important) If I use keys for a user missing on Teleport side, instead of meaningful message, I get:
    "Error: [10] access denied"
    I expect to get a message that something is wrong with my keys, "User %u not found" is preferrable.

  • (blocking) If I use user which has no rights on performing specific action, instead of meaningful message, I get:
    "Error: not found"

    In the same time, if I forgot to add permissions on some resources to this user, I get the same error.

    I expect to get something like "User %u has no permissions to perform CreateUser" with 403 in case I have no perms, and 404 in case I try to access a resource which does not exist, but I have rights to operate on.

    That is very confusing because I need to distinguish "no access" and "does not exist" situations in order to make Terraform provider work correctly with resources deleted outside Terraform (see https://github.com/hashicorp/terraform-provider-aws/blob/main/aws/resource_aws_cloudformation_stack_set_instance.go#L192, I need to know for sure if a resource is missing in the same situation on my side)

@gzigzigzeo gzigzigzeo added the bug label Mar 30, 2021
@Joerger
Copy link
Contributor

Joerger commented Mar 30, 2021

All of these issues seem to be related to the way the Teleport library performs authorization and is not specific to gRPC. Most of these are intentional and are not easily changed, so it would be best to work around them when possible.

  1. This is intentional, the debug logs does log more info: log.Warnf("Failed to delete role: role %v is used by user %v.", name, u.GetName())

  2. Unfortunately there is no way to propagate a more meaningful error to the client here. Teleport is currently limited to grpc v1.29.1 which does not have the WithReturnConnectionErrors dial option - gRPC with connection error #6163. The only way to see a more specific error for now would be in the auth server debug logs.

This could mean that the server address is wrong or the user's keys are wrong. If you are using client.LoadProfile credentials, then their current profile must be wrong. They either need to re-login with tsh login, or update their user's role to have proper permissions/logins.

  1. I'm not sure I understand this one. If the user does not exist, how could an error be made with the user's name? If you mean to pull the user's name from the keys, wouldn't this be possible on your side as well?

3, 5) This is intentional for security purposes:

teleport/lib/utils/utils.go

Lines 368 to 375 in fbae7ad

// OpaqueAccessDenied returns a generic NotFound instead of AccessDenied
// so as to avoid leaking the existence of secret resources.
func OpaqueAccessDenied(err error) error {
if trace.IsAccessDenied(err) {
return trace.NotFound("not found")
}
return trace.Wrap(err)
}

Are you able to avoid the access denied case by ensuring you only use users's with proper permissions? Or only go through with affecting the cloudformation state when the action is performed by an admin role?

@klizhentas
Copy link
Contributor

@Joerger @russjones regarding OpaqueAccessDenied - I don't think it adds anything security-wise, but breaks a lot of use-cases for us. What is the purpose of replacing access denied to not found?

@Joerger
Copy link
Contributor

Joerger commented Mar 30, 2021

OpaqueAccessDenied was added as part of @andrejtokarcik's security fixes - #5793

@gzigzigzeo
Copy link
Contributor Author

1 - Okay, this is not important at all.
2 - Yes, it could mean all that things. Okay, I could show generic error in this case (like "check that your certs are not expired and address is correct")
4 - I probably could pull user name on my side, yes. But I think that this case might be generic, hence, it might be better to handle it by the client.
5 - I really need to know if resource is missing or it is inaccessible, otherwise either process will be unclear for user or I could corrupt my local Terraform state. I can not rely on admin role because making Terraform an admin might be risky. I also could not pre-check generic permissions because Terraform might not have access to a specific resource instance having generic access to a resource at all.

I think, if knowing the access state presents the security risk - we should sacrifice convenience in the sake of security and treat all the "Not Found" errors as real. @klizhentas @Joerger

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants