Investigate and fix flaky sql user test. #1212

Merged
merged 3 commits into from
Mar 23, 2018

Conversation

nat-henderson
Contributor

This test fails for two reasons. First, the second-gen SQL instances
return persistent quota and service-unavailable errors. Second, as
mentioned in #1184, behavior is unspecified in tests where two
resources have the same ID. This change makes the ID
user/host/instance instead of user/instance, and adds 429/503
handling to sqladminOperationWait().

@danawillow danawillow self-requested a review March 21, 2018 20:56
@danawillow danawillow self-assigned this Mar 21, 2018
Contributor

@danawillow danawillow left a comment

Can you also confirm that if you have a sql user in your config from before this change, applying this change (as if it were an update to terraform) doesn't break things? I don't think you need a migration for this, but it doesn't hurt to check.


log.Printf("[DEBUG] self_link: %s", w.Op.SelfLink)
op, err = w.Service.Operations.Get(w.Project, w.Op.Name).Do()

for {
Contributor

I don't think we need to add a new loop inside the loop we already have (the refreshfunc itself, which will get called with backoff from the WaitForState call). Maybe just treat 429/503s as PENDING?
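As a rough illustration of that suggestion, here is a minimal sketch of a refresh function that reports 429/503 responses as PENDING instead of looping itself. The `apiError` and `operation` types and the `refreshFunc` helper are hypothetical stand-ins (the real code would use `*googleapi.Error` and the SQL Admin operation type); only the shape of the idea is shown.

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a hypothetical stand-in for *googleapi.Error; only the HTTP
// status code matters for this sketch.
type apiError struct {
	Code int
}

func (e *apiError) Error() string { return fmt.Sprintf("api error: HTTP %d", e.Code) }

// operation is a hypothetical stand-in for the SQL Admin operation object.
type operation struct {
	Status string
}

// isTransient reports whether err is a 429 (quota) or 503 (unavailable) error.
func isTransient(err error) bool {
	var e *apiError
	return errors.As(err, &e) && (e.Code == 429 || e.Code == 503)
}

// refreshFunc returns a closure shaped like a state refresh function:
// (result, state, error). A transient error is reported as PENDING with a
// nil error, so the caller's existing polling loop retries on its own
// schedule. The last known operation is returned as the result because some
// polling helpers treat a nil result as "not found".
func refreshFunc(last *operation, get func() (*operation, error)) func() (interface{}, string, error) {
	return func() (interface{}, string, error) {
		op, err := get()
		if isTransient(err) {
			return last, "PENDING", nil
		}
		if err != nil {
			return nil, "", err
		}
		last = op
		return op, op.Status, nil
	}
}

func main() {
	calls := 0
	// Fake getter: fails once with a 503, then reports DONE.
	get := func() (*operation, error) {
		calls++
		if calls == 1 {
			return nil, &apiError{Code: 503}
		}
		return &operation{Status: "DONE"}, nil
	}
	refresh := refreshFunc(&operation{Status: "PENDING"}, get)
	for i := 0; i < 2; i++ {
		_, state, err := refresh()
		fmt.Println(state, err)
	}
}
```

The design point is that the backoff lives in one place (the caller), and the refresh function only classifies the error.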

Contributor Author

You're right, that's way better.

@@ -119,7 +119,7 @@ func resourceSqlUserRead(d *schema.ResourceData, meta interface{}) error {

var user *sqladmin.User
for _, currentUser := range users.Items {
- if currentUser.Name == name && currentUser.Host == host {
+ if currentUser.Name == name && (!hostOk || currentUser.Host == host.(string)) {
Contributor

just curious, for second gen, what does currentUser.Host return? If it's the empty string then you could leave this as is since d.Get("host").(string) would be the empty string
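To make the point concrete, here is a small sketch of the lookup with a plain string comparison. It assumes (as the question does) that the API reports an empty host for second-gen users with no host set; `sqlUser` and `findUser` are hypothetical names, not the provider's types.

```go
package main

import "fmt"

// sqlUser is a hypothetical stand-in for the API's user object.
type sqlUser struct {
	Name string
	Host string
}

// findUser returns the first user whose name and host both match, or nil.
// When host is the empty string it only matches users whose Host is also
// empty, so no separate "host was set" flag is needed.
func findUser(users []sqlUser, name, host string) *sqlUser {
	for i := range users {
		if users[i].Name == name && users[i].Host == host {
			return &users[i]
		}
	}
	return nil
}

func main() {
	users := []sqlUser{
		{Name: "admin", Host: ""},     // assumed second-gen shape
		{Name: "admin", Host: "gmail.com"},
	}
	fmt.Printf("%+v\n", findUser(users, "admin", ""))
	fmt.Printf("%+v\n", findUser(users, "admin", "gmail.com"))
}
```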

Contributor Author

Done.

@nat-henderson
Contributor Author

I'm checking that now. I think it should be okay, but let's be sure.

@nat-henderson
Contributor Author

Here's what I did to be sure:

(first, write a config which uses a 2nd-gen sql user)

git checkout master
make
terraform init
terraform apply
git checkout sql-user-flaky-test
make
terraform init
terraform state show google_sql_user.user
<result shows id = admin/testinstance>
terraform refresh
terraform state show google_sql_user.user
<result shows id = admin//testinstance>

I think that proves that it's not a problem. What do you think?
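For reference, the before and after IDs in that check follow directly from the two layouts. A minimal sketch, with hypothetical helper names (`legacySQLUserID`, `sqlUserID`), of how an empty host produces the doubled slash:

```go
package main

import "fmt"

// legacySQLUserID builds the old user/instance form of the ID.
func legacySQLUserID(name, instance string) string {
	return fmt.Sprintf("%s/%s", name, instance)
}

// sqlUserID builds the new user/host/instance form. An empty host leaves an
// empty middle segment rather than being dropped.
func sqlUserID(name, host, instance string) string {
	return fmt.Sprintf("%s/%s/%s", name, host, instance)
}

func main() {
	fmt.Println(legacySQLUserID("admin", "testinstance"))
	fmt.Println(sqlUserID("admin", "", "testinstance"))
}
```

Two users with the same name and instance but different hosts now get distinct IDs, which is what the same-ID problem from the description needed.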

users, err = config.clientSqlAdmin.Users.List(project, instance).Do()

if e, ok := err.(*googleapi.Error); ok && (e.Code == 429 || e.Code == 503) {
backoff = backoff * 2
Contributor

This could also be simplified with retry, right? That already retries (with backoff) on 429s and 503s. The only obstacle I'd see is the users == nil case. Was that one that came up in testing, or were you including it just in case?

err = retry(func() error {
  // assigns to the outer users and err; retry sees the returned error
  users, err = config.clientSqlAdmin.Users.List(project, instance).Do()
  return err
})
if err != nil {
  return handleNotFoundError(...)
}
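For readers unfamiliar with the pattern, here is a self-contained sketch of a retry helper that backs off exponentially on 429s and 503s. It is not the provider's `retry` implementation; `retryTransient`, `apiError`, and the attempt and delay parameters are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// apiError is a hypothetical stand-in for *googleapi.Error.
type apiError struct {
	Code int
}

func (e *apiError) Error() string { return fmt.Sprintf("api error: HTTP %d", e.Code) }

// retryTransient calls f up to maxAttempts times. It retries only when f
// returns a 429 or 503 apiError, doubling the delay after each failed
// attempt. Success or any other error is returned immediately.
func retryTransient(maxAttempts int, baseDelay time.Duration, f func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := baseDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = f()
		var e *apiError
		if err == nil || !errors.As(err, &e) || (e.Code != 429 && e.Code != 503) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err // still failing after the last attempt
}

func main() {
	calls := 0
	var users []string
	// Fake list call: rate-limited twice, then succeeds.
	err := retryTransient(5, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return &apiError{Code: 429}
		}
		users = []string{"admin"}
		return nil
	})
	fmt.Println(calls, users, err)
}
```

Because the closure assigns to `users` in the enclosing scope, the caller does not need a separate nil check as a loop entry condition.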

Contributor Author

I was using that as an entry condition. Yours is smarter, changing it.

@nat-henderson
Contributor Author

nat-henderson commented Mar 23, 2018

Done, that's better, thank you.

TF_ACC=1 go test ./google -v -run=TestAccSqlUser -timeout 120m
=== RUN   TestAccSqlUser_firstGen
=== RUN   TestAccSqlUser_secondGen
--- PASS: TestAccSqlUser_firstGen (90.28s)
--- PASS: TestAccSqlUser_secondGen (651.36s)
PASS
ok      github.com/terraform-providers/terraform-provider-google/google 651.380s

@ghost

ghost commented Nov 19, 2018

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 19, 2018