-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate and fix flaky sql user test. #1212
Investigate and fix flaky sql user test. #1212
Conversation
This test fails for a few reasons: one, there are persistent quota and service-unavailable errors with the second-gen SQL instances. Two, the issue mentioned in hashicorp#1184, that there's unspecified behavior in tests where two resources have the same ID. This changes the ID to be user/host/instance instead of user/instance, and adds 429/503 handling to the sqladminOperationWait().
1bf4a3e
to
f1deed8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also confirm for me that if you have a sql user in your config from before this change, that adding this change (as if it were an update to terraform) doesn't break things? I don't think you need a migration for this but it doesn't hurt to check.
google/sqladmin_operation.go
Outdated
|
||
log.Printf("[DEBUG] self_link: %s", w.Op.SelfLink) | ||
op, err = w.Service.Operations.Get(w.Project, w.Op.Name).Do() | ||
|
||
for { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to add a new loop inside the loop we already have (the refreshfunc itself, which will get called with backoff from the WaitForState
call). Maybe just treat 429/503s as PENDING?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that's way better.
google/resource_sql_user.go
Outdated
@@ -119,7 +119,7 @@ func resourceSqlUserRead(d *schema.ResourceData, meta interface{}) error { | |||
|
|||
var user *sqladmin.User | |||
for _, currentUser := range users.Items { | |||
if currentUser.Name == name && currentUser.Host == host { | |||
if currentUser.Name == name && (!hostOk || currentUser.Host == host.(string)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious, for second gen, what does currentUser.Host
return? If it's the empty string then you could leave this as is since d.Get("host").(string)
would be the empty string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I'm checking that now. I think it should be okay, but let's be sure. |
Here's what I did to be sure: (first, write a config which uses a 2nd-gen sql user)
I think that proves that it's not a problem. What do you think? |
google/resource_sql_user.go
Outdated
users, err = config.clientSqlAdmin.Users.List(project, instance).Do() | ||
|
||
if e, ok := err.(*googleapi.Error); ok && (e.Code == 429 || e.Code == 503) { | ||
backoff = backoff * 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be simplified with retry
, right? That already retries (with backoff) on 429s and 503s. The only obstacle I'd see is the users == nil
case- was that one that came up in testing or were you including it just-in-case?
err = retry(func() {
users, err = config.clientSqlAdmin.Users.List(project, instance).Do()
return err
})
if err != nil {
return handleNotFoundError(...)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using that as an entry condition. Yours is smarter, changing it.
Done, that's better, thank you.
|
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks! |
This test fails for a few reasons: one, there are persistent quota
and service-unavailable errors with the second-gen SQL instances.
Two, the issue mentioned in #1184, that there's unspecified behavior
in tests where two resources have the same ID. This changes the ID
to be user/host/instance instead of user/instance, and adds 429/503
handling to the sqladminOperationWait().