Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws_vpn_connection: inconsistent remote state data (occasionally) #11293

Closed
n3ph opened this issue Dec 14, 2019 · 7 comments · Fixed by #19077
Closed

aws_vpn_connection: inconsistent remote state data (occasionally) #11293

n3ph opened this issue Dec 14, 2019 · 7 comments · Fixed by #19077
Assignees
Labels
bug Addresses a defect in current functionality. service/ec2 Issues and PRs that pertain to the ec2 service.
Milestone

Comments

@n3ph
Copy link
Contributor

n3ph commented Dec 14, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.17

  • provider.aws v2.42.0

Affected Resource(s)

  • aws_vpn_connection

Terraform Configuration Files

does not matter

Background

resourceAwsVpnConnectionCreate assembles the connection options [1] and calls AWS API to create the VPN connection with these accordingly. It seems AWS is creating the VPN connections asynchronously as sometimes the order of the VPN connections in the XML config blob does not apply to the order of options[]. Since tunnel1_inside_cidr and tunnel2_inside_cidr are saved directly to the remote state and all other attributes are added later in resourceAwsVpnConnectionRead [2], the data for the resource might be inconsistent.

The code already tries to cover this unsuccessfully via sorting the XML config blob [3]. sort.Sort(vpnConfig) is acting on OutsideAddress of type XmlIpsecTunnel struct [4].

We need to check wether:

  • vpnConfig.Tunnels[0].VgwInsideAddress or vpnConfig.Tunnels[0].CgwInsideAddress is a vaild address of tunnel1_inside_cidr
  • vpnConfig.Tunnels[1].VgwInsideAddress or vpnConfig.Tunnels[1].CgwInsideAddress is a vaild address of tunnel2_inside_cidr

[1] https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_vpn_connection.go#L277-L291
[2] https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_vpn_connection.go#L452-L463
[3] https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_vpn_connection.go#L596
[4] https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_vpn_connection.go#L27-L28

Expected Behavior

terraform state show aws_vpn_connection.example | grep -e tunnel1 -e tunnel2 | grep -v -e tunnel1_address -e tunnel2_address  -e bgp -e key
    tunnel1_cgw_inside_address     = "169.254.254.250"
    tunnel1_inside_cidr            = "169.254.254.248/30"
    tunnel1_vgw_inside_address     = "169.254.254.249"
    tunnel2_cgw_inside_address     = "169.254.254.254"
    tunnel2_inside_cidr            = "169.254.254.252/30"
    tunnel2_vgw_inside_address     = "169.254.254.253"

Actual Behavior

terraform state show aws_vpn_connection.example | grep -e tunnel1 -e tunnel2 | grep -v -e tunnel1_address -e tunnel2_address  -e bgp -e key
    tunnel1_cgw_inside_address     = "169.254.254.250"
    tunnel1_inside_cidr            = "169.254.254.252/30"
    tunnel1_vgw_inside_address     = "169.254.254.249"
    tunnel2_cgw_inside_address     = "169.254.254.254"
    tunnel2_inside_cidr            = "169.254.254.248/30"
    tunnel2_vgw_inside_address     = "169.254.254.253"

References

@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Dec 14, 2019
@n3ph
Copy link
Contributor Author

n3ph commented Dec 14, 2019

What a hassle. :-( sigh.
I wanted to push an updated version of the test first to show that this current kind of sorting does not work. But I fucked up commiting the right subset of my changes.

All good things come in threes...

@n3ph
Copy link
Contributor Author

n3ph commented Dec 15, 2019

As stated in #10584 (comment) it might be enough to just get rid of sort.Sort(vpnConfig)
As I was trying to reproduce #10584 as a solution I mixed up the custom provider builds..

It is NOT sufficient to just remove sort.Sort(vpnConfig).

@jochen42
Copy link

As stated in #10584 (comment) it might be enough to just get rid of sort.Sort(vpnConfig)
As I was trying to reproduce #10584 as a solution I mixed up the custom provider builds..

It is NOT sufficient to just remove sort.Sort(vpnConfig).

hi @n3ph ,

i think you are facing another issue than me. My issue is just the wrong order of the tunnels itself.
With the removed complex-object-sort i just want to have the correct order like aws-api answers.

it is hard to debug for the operators at the client-side if they are wrong ordered.

f.e. the results of the following command should be in the same order as the provided connection-parameter vom resource-output or statefile.

aws ec2 describe-vpn-connections --vpn-connection-id <some_connection_idf> --query 'VpnConnections[*].VgwTelemetry[*]'

the same for aws console.

your issue seems to be much more complex. for me it looks like you are right with the wrong tunnel*_vgw_inside_address, but your solution:

  • will also provide the wrong tunnel-order/sorting (hard to debug)
  • tries to fix a problem on aws-side with an assumption of their internal-behavior (what if you are wrong or they change the wrong behavior next week?)

Perhaps your issue should be reported to aws support?

@bflad bflad added bug Addresses a defect in current functionality. service/ec2 Issues and PRs that pertain to the ec2 service. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 18, 2019
bflad added a commit that referenced this issue Dec 18, 2019
…2 and aws_kinesis_analytics_* as service/kinesisanalytics

Reference: #11293
Reference: #11347
Reference: #11349
bflad added a commit that referenced this issue Dec 20, 2019
…2 and aws_kinesis_analytics_* as service/kinesisanalytics (#11355)

Reference: #11293
Reference: #11347
Reference: #11349
@bplunkert
Copy link

bplunkert commented Jan 13, 2020

I am experiencing this issue as well. While I like the idea in #11298 upon testing it does not seem to have any effect at all for me.

My tunnels are still down, with the inside addresses definitely flipped. tunnel1_cgw_inside_address is in the tunnel2_inside_cidr, and vice versa.

This is blocking me from deploying a VPN connection. Any workarounds (aside from removing VPN from Terraform management) would be extremely helpful.

Edit: I think I found a workaround: delete the VPN connection entirely, and re-create it again. Sometimes it comes back correctly. If it isn't correct, keep trying until it is created correctly. (It took me several tries.)

bflad added a commit that referenced this issue Apr 23, 2021
…_*` ordering when `tunnel1_inside_cidr`, `tunnel1_inside_ipv6_cidr`, or `tunnel1_preshared_key` is configured

Reference: #396
Reference: #3359
Reference: #4728
Reference: #5809
Reference: #11293

Previously (race condition of automatically assigned outside IP addresses):

```
=== CONT  TestAccAWSVpnConnection_tunnelOptions
resource_aws_vpn_connection_test.go:210: Step 15/15 error: Check failed: Check 4/6 error: aws_vpn_connection.test: Attribute 'tunnel1_preshared_key' expected "12345678", got "abcdefgh"
--- FAIL: TestAccAWSVpnConnection_tunnelOptions (738.28s)
```

Output from acceptance testing:

```
--- PASS: TestAccAWSVpnConnection_Tunnel1PresharedKey (251.02s)
--- PASS: TestAccAWSVpnConnection_withoutStaticRoutes (263.77s)
--- PASS: TestAccAWSVpnConnection_Tunnel1InsideCidr (335.14s)
--- PASS: TestAccAWSVpnConnection_tunnelOptions (342.30s)
--- PASS: TestAccAWSVpnConnection_disappears (388.07s)
--- PASS: TestAccAWSVpnConnection_tags (445.29s)
--- PASS: TestAccAWSVpnConnection_basic (797.33s)
--- PASS: TestAccAWSVpnConnection_withIpv6 (1235.35s)
--- PASS: TestAccAWSVpnConnection_TransitGatewayID (1235.72s)
--- PASS: TestAccAWSVpnConnection_withEnableAcceleration (1352.28s)
--- PASS: TestAccAWSVpnConnection_Tunnel1InsideIpv6Cidr (1595.79s)
```
@bflad bflad self-assigned this Apr 23, 2021
bflad added a commit that referenced this issue Apr 23, 2021
…_*` ordering when `tunnel1_inside_cidr`, `tunnel1_inside_ipv6_cidr`, or `tunnel1_preshared_key` is configured (#19077)

* resource/aws_vpn_connection: Prevent flipped `tunnel1_*` and `tunnel2_*` ordering when `tunnel1_inside_cidr`, `tunnel1_inside_ipv6_cidr`, or `tunnel1_preshared_key` is configured

Reference: #396
Reference: #3359
Reference: #4728
Reference: #5809
Reference: #11293

Previously (race condition of automatically assigned outside IP addresses):

```
=== CONT  TestAccAWSVpnConnection_tunnelOptions
resource_aws_vpn_connection_test.go:210: Step 15/15 error: Check failed: Check 4/6 error: aws_vpn_connection.test: Attribute 'tunnel1_preshared_key' expected "12345678", got "abcdefgh"
--- FAIL: TestAccAWSVpnConnection_tunnelOptions (738.28s)
```

Output from acceptance testing:

```
--- PASS: TestAccAWSVpnConnection_Tunnel1PresharedKey (251.02s)
--- PASS: TestAccAWSVpnConnection_withoutStaticRoutes (263.77s)
--- PASS: TestAccAWSVpnConnection_Tunnel1InsideCidr (335.14s)
--- PASS: TestAccAWSVpnConnection_tunnelOptions (342.30s)
--- PASS: TestAccAWSVpnConnection_disappears (388.07s)
--- PASS: TestAccAWSVpnConnection_tags (445.29s)
--- PASS: TestAccAWSVpnConnection_basic (797.33s)
--- PASS: TestAccAWSVpnConnection_withIpv6 (1235.35s)
--- PASS: TestAccAWSVpnConnection_TransitGatewayID (1235.72s)
--- PASS: TestAccAWSVpnConnection_withEnableAcceleration (1352.28s)
--- PASS: TestAccAWSVpnConnection_Tunnel1InsideIpv6Cidr (1595.79s)
```

* tests/resource/aws_vpn_connection: Add nosemgrep comment for errant situation

* resource/aws_vpn_connection: Fix comment typo
@github-actions github-actions bot added this to the v3.38.0 milestone Apr 23, 2021
bflad added a commit that referenced this issue Apr 23, 2021
…_*` ordering when `tunnel1_inside_cidr`, `tunnel1_inside_ipv6_cidr`, or `tunnel1_preshared_key` is configured (#19077)

* resource/aws_vpn_connection: Prevent flipped `tunnel1_*` and `tunnel2_*` ordering when `tunnel1_inside_cidr`, `tunnel1_inside_ipv6_cidr`, or `tunnel1_preshared_key` is configured

Reference: #396
Reference: #3359
Reference: #4728
Reference: #5809
Reference: #11293

Previously (race condition of automatically assigned outside IP addresses):

```
=== CONT  TestAccAWSVpnConnection_tunnelOptions
resource_aws_vpn_connection_test.go:210: Step 15/15 error: Check failed: Check 4/6 error: aws_vpn_connection.test: Attribute 'tunnel1_preshared_key' expected "12345678", got "abcdefgh"
--- FAIL: TestAccAWSVpnConnection_tunnelOptions (738.28s)
```

Output from acceptance testing:

```
--- PASS: TestAccAWSVpnConnection_Tunnel1PresharedKey (251.02s)
--- PASS: TestAccAWSVpnConnection_withoutStaticRoutes (263.77s)
--- PASS: TestAccAWSVpnConnection_Tunnel1InsideCidr (335.14s)
--- PASS: TestAccAWSVpnConnection_tunnelOptions (342.30s)
--- PASS: TestAccAWSVpnConnection_disappears (388.07s)
--- PASS: TestAccAWSVpnConnection_tags (445.29s)
--- PASS: TestAccAWSVpnConnection_basic (797.33s)
--- PASS: TestAccAWSVpnConnection_withIpv6 (1235.35s)
--- PASS: TestAccAWSVpnConnection_TransitGatewayID (1235.72s)
--- PASS: TestAccAWSVpnConnection_withEnableAcceleration (1352.28s)
--- PASS: TestAccAWSVpnConnection_Tunnel1InsideIpv6Cidr (1595.79s)
```

* tests/resource/aws_vpn_connection: Add nosemgrep comment for errant situation

* resource/aws_vpn_connection: Fix comment typo
@ghost
Copy link

ghost commented Apr 30, 2021

This has been released in version 3.38.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 31, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.