1.6.0 Statefiles Not Backwards Compatible when Variables Use Validation #34014

Closed
bjf-rga opened this issue Oct 6, 2023 · 16 comments
Labels: bug, confirmed, explained, terraform test

Comments

@bjf-rga

bjf-rga commented Oct 6, 2023

Terraform Version

1.6.0, 1.5.6

Terraform Configuration Files

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }

  backend "s3" {
    bucket = "<bucket here>"
    key    = "issue_test/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "aws" {
  region = "us-east-1"
}

Debug Output

n/a

Expected Behavior

State files produced by 1.6.0 should be readable by earlier versions.

Actual Behavior

Running terraform init with a Terraform version older than 1.6.0 fails with the error: Error refreshing state: unsupported checkable object kind "var"

Steps to Reproduce

variable "tfversion" {
  type = string

  validation {
    condition     = length(var.tfversion) > 2
    error_message = "This is an example"
  }
}

output "test-variable" {
  value = "This was produced with tf version ${var.tfversion}"
}

  1. tfenv use 1.6.0
  2. terraform init
  3. terraform apply -var tfversion=1.6.0 -auto-approve
  4. tfenv use 1.5.6
  5. terraform init

Additional Context

The statefile produced by 1.6.0 contains an additional check_results section that versions before 1.6.0 do not appear to be able to process. In a statefile produced by 1.5.6, this key exists but has a value of null.
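To see whether a given state snapshot is affected, one can look for entries of kind "var" in its check_results section. The following is a minimal sketch, not Terraform's own parsing logic: the function name is hypothetical, and the per-entry field names object_kind and config_addr are assumptions about the state format that this thread does not show.

```python
import json


def variable_check_results(state_text: str) -> list:
    """Return the check_results entries that describe input variables."""
    state = json.loads(state_text)
    # The issue reports that 1.5.6 writes "check_results": null,
    # so a null value is treated the same as an empty list.
    results = state.get("check_results") or []
    # Assumption: each entry names its kind in an "object_kind" field.
    return [r for r in results if r.get("object_kind") == "var"]


# Hypothetical, heavily trimmed snapshot for illustration only.
sample_state = json.dumps({
    "version": 4,
    "terraform_version": "1.6.0",
    "check_results": [
        {"object_kind": "var", "config_addr": "var.tfversion", "status": "pass"},
    ],
})

for entry in variable_check_results(sample_state):
    print(entry["config_addr"])
```

A non-empty result suggests that consumers on an unpatched release would hit the unsupported checkable object kind "var" error reported in this issue.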

References

No response

bjf-rga added the bug and new labels Oct 6, 2023
@bjf-rga
Author

bjf-rga commented Oct 6, 2023

In hindsight, the AWS portions of this configuration are entirely irrelevant, but this was the simplest code I had on hand that reproduced the issue.

@apparentlymart
Contributor

Hi @bjf-rga! Thanks for reporting this.

This problem was caused by a bug in Terraform v1.5 where its state parsing logic wasn't fully forward-compatible with unknown check types. That bug was fixed in Terraform v1.5.7, and so if you first upgrade everything you have using v1.5.6 to v1.5.7 and then start upgrading to v1.6.0 you should not have this problem.

Unfortunately since prior releases are immutable we cannot retroactively fix the bug in Terraform v1.5.6, but v1.5.7 only has two small changes relative to v1.5.6 (one of which is the fix to this problem) and so upgrading from v1.5.6 to v1.5.7 should not require any changes to any other part of your system.

@bjf-rga
Author

bjf-rga commented Oct 6, 2023

Hello @apparentlymart! Thanks for the quick response.

Could you clarify the upgrade path here? I'm not having any problems going from 1.x -> 1.6.0. The issue I'm seeing is that once I've created a statefile with 1.6.0, no earlier versions of Terraform can read the state file. So using the TF above, let's say I do an initial apply with 1.5.6, then apply with 1.5.7, then apply with 1.6.0. This all works great.

But as soon as I switch back to a version of TF < 1.6.0, I can no longer initialize because none of the previous versions can parse the data in the check_results section of the state file. I've not tested remote state reads yet, but what I'm seeing suggests that once I've done an apply with 1.6.0, there's no going back. And that would imply broken compatibility within 1.x.

Is this the expected behavior, or am I correct in thinking I should be able to use a version of TF < 1.6.0 to initialize against a stack that's been applied with >= 1.6.0?

@apparentlymart
Contributor

Once you have a state snapshot that was created using 1.6.0 (assuming that you have custom validation rules for a variable, which is what causes the incompatibility), you cannot return to Terraform 1.5.6, but you should be able to downgrade to 1.5.7 instead.

@bjf-rga
Author

bjf-rga commented Oct 6, 2023

As a follow up, I've done some remote state read testing, so I think I understand part of what you're saying. Using 1.5.7, I can do a remote state read of a statefile produced by 1.6.0. But no other version of Terraform can read the 1.6.0 remote state file.

This would mean that for me to update the Terraform version on any stack past 1.5.7, I would need every one of our partners (as defined by "folks reading the state files I'm producing") to be on 1.5.7+. That would prove a nearly insurmountable lift for my organization.

I can understand not being able to move backwards with the version of Terraform managing the stack, but shouldn't I have some guarantee that a statefile produced by a 1.x version of Terraform is remotely readable by other 1.x versions?

@apparentlymart
Contributor

Indeed, if you intend to produce state files that can be read by Terraform v1.5.6 then you will need to either:

  • Not use Terraform v1.6.0 yet, or
  • Remove the validation blocks from your input variables until all of your consumers are using a version of Terraform that doesn't have this bug (that is, v1.5.7), since the recorded outcomes of those rules are the new element that v1.5.6 cannot handle

Upgrading from v1.5.6 to v1.5.7 is intended to be a trivial operation; other than this fix, it includes only a security fix for module installation that should not affect any non-malicious modules.


I should also note that although the forward-compatibility problem here was accidental and thus fixed in a patch release for the v1.5 series, our compatibility promises do not guarantee that it will always be possible to roll backwards; the compatibility promises are primarily about upgrading rather than downgrading.

Upgrading and Downgrading:

You should be able to upgrade from any v1.x release to any later v1.x release. You might also be able to downgrade to an earlier v1.x release, but that isn't guaranteed: later releases may introduce new features that earlier versions cannot understand, including new storage formats for Terraform state snapshots.

In this case it was unintentional that v1.5.6 did not have sufficient forward-compatibility to read the newer version, and so that has been fixed in v1.5.7, but I'm pointing this out only because I think it's important to be aware of what is and is not promised for new versions in the v1.x series.

@bjf-rga
Author

bjf-rga commented Oct 6, 2023

I appreciate the quick and thorough responses, although I can't say that I'm overly thrilled. This is going to prove a substantial barrier to moving Terraform in our organization beyond 1.5.7 (and, I suspect, for many others). We have stacks that produce statefiles that are read by several consumers, and I cannot exercise control over the versions of Terraform that they're using.

I can understand and appreciate the argument about upgrading and downgrading, but I think this is larger than that -- a rather substantial backwards-compatibility issue has been introduced. The statefile is a contract between producers and consumers. Beginning with 1.6.0, that contract is now broken for consumers using an insufficiently recent version, despite operating within the same major version.

Was a version 2.0.0 considered for this release given that state files are no longer backwards compatible?

@apparentlymart
Contributor

apparentlymart commented Oct 6, 2023

Thanks for that context, @bjf-rga.

Since this new kind of check was introduced in support of the new module testing framework (which uses check results in the state as part of the definition of whether tests are passing), I'm going to label this as feedback related to that feature. I understand that's not what you are concerned about here, but I want to make this visible to the folks who worked on that feature so they can consider whether there are any alternative paths forward that I'm not thinking of, since I wasn't working directly on this change.

@bjf-rga
Author

bjf-rga commented Oct 6, 2023

Thanks. Again, I appreciate the quick feedback and the attention to the community. I'll see what creative solutions we can conjure to work through and around this. Thanks for raising visibility of this issue.

@alisdair
Contributor

alisdair commented Oct 7, 2023

Thanks for raising this issue. We unfortunately caught this state interoperability issue well after its release. While as Martin says it's not strictly a violation of the 1.x guarantees, I recognize that it is enormously inconvenient for those sharing data using remote state across multiple Terraform versions.

As a result we merged the fix for this in #33815, and we triggered exceptional bug fix releases of 1.3.10, 1.4.7, and 1.5.7. Earlier Terraform minor releases are not affected.

I understand that you have no control over the Terraform versions used by your state consumers, but in case you can exert any influence, these releases give an option for working around this issue with minimal impact:

  • Users of Terraform prior to 1.3.0 are unaffected;
  • Terraform 1.3 series users should upgrade to 1.3.10;
  • Terraform 1.4 series users should upgrade to 1.4.7;
  • Terraform 1.5 series users should upgrade to 1.5.7;
  • Users of Terraform 1.6.0 and later are unaffected.
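
The list above can be encoded as a small lookup, for example to audit the versions that state consumers report. This is a sketch under the assumption that versions are plain MAJOR.MINOR.PATCH strings without pre-release suffixes; the names FIXED_PATCH and is_affected are hypothetical.

```python
# First patch release in each affected series that contains the fix,
# taken from the list above.
FIXED_PATCH = {(1, 3): 10, (1, 4): 7, (1, 5): 7}


def is_affected(version: str) -> bool:
    """Report whether this Terraform version has the state-reading bug."""
    major, minor, patch = (int(part) for part in version.split("."))
    fixed = FIXED_PATCH.get((major, minor))
    # Series outside 1.3 to 1.5 are unaffected; within them, only
    # releases older than the fixed patch are affected.
    return fixed is not None and patch < fixed


for v in ["1.2.9", "1.3.9", "1.4.7", "1.5.6", "1.5.7", "1.6.0"]:
    print(v, "affected" if is_affected(v) else "ok")
```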

If it's an option for you, using the tfe_outputs data source instead of the full remote state is generally more robust and unaffected by this bug. In our documentation, we also describe some alternative ways to publish data for external consumers instead of using remote state.

The only other suggestion I have for avoiding this problem is unfortunately to remove all uses of checks from your configurations: variable validations, preconditions, postconditions, and check blocks. After an apply, this will remove the check results from state.
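
To find where a configuration uses checks, a rough text scan can help. The sketch below is a line-based heuristic rather than a real HCL parser (it will miss blocks written in unusual layouts), and find_check_blocks is a hypothetical helper name.

```python
import re

# Matches a line that opens one of the four block types, with an
# optional quoted label (as in: check "health" {).
CHECK_BLOCK = re.compile(
    r'^[ \t]*(validation|precondition|postcondition|check)'
    r'(?:[ \t]+"[^"\n]*")?[ \t]*\{',
    re.MULTILINE,
)


def find_check_blocks(source: str) -> list:
    """Return (line_number, block_type) pairs for each check-like block."""
    hits = []
    for match in CHECK_BLOCK.finditer(source):
        # Each match starts at the beginning of a line, so counting the
        # newlines before it gives the 1-based line number.
        line = source.count("\n", 0, match.start()) + 1
        hits.append((line, match.group(1)))
    return hits


config = '''variable "tfversion" {
  type = string

  validation {
    condition     = length(var.tfversion) > 2
    error_message = "This is an example"
  }
}
'''

print(find_check_blocks(config))
```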

I hope the patch releases mentioned above are viable for your organization. I'm sorry not to have better news here, and hope that if you find another workaround you'll let us know.

liamcervante self-assigned this Oct 10, 2023
liamcervante added the confirmed and explained labels and removed the new label Oct 10, 2023
@liamcervante
Member

Just a quick update here, we're aiming to have a workaround in place for this in v1.6.2.

It is important to note that, as highlighted by Alisdair, this issue affects Terraform releases in the 1.3, 1.4, and 1.5 series and has been patched in the latest release of each. We're committing to keeping a workaround in place for the 1.6 series to give users more time to update to the latest patch release of each affected series, but we can't maintain this workaround indefinitely. This error will eventually resurface as new types of custom conditions are introduced to Terraform and the variable condition is reintroduced to the state in the 1.7 series.

@liamcervante
Member

I've just merged #34058 into the 1.6 branch, so we should see any interoperability issues fixed in v1.6.2.

I do want to highlight that this change is only temporary. The issue here is a bug in earlier versions of Terraform that has been fixed in the patch versions listed in this comment: #34014 (comment). We will reintroduce the variable validations into the state file for the 1.7 minor series, so users of the affected minor series (1.3, 1.4, and 1.5) should upgrade to the latest patch release for the relevant series in order to maintain compatibility when 1.7.0 is released.
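
The plan described in this thread can be summarized as a function of the producing version. This is only a sketch of the stated intent, not a statement about what each release actually shipped: it assumes plain MAJOR.MINOR.PATCH version strings, assumes v1.6.1 behaves like v1.6.0 (the thread only says the workaround lands in v1.6.2), and uses the hypothetical name records_variable_checks.

```python
def records_variable_checks(producer: str) -> bool:
    """Whether state written by this version includes variable validation results."""
    major, minor, patch = (int(part) for part in producer.split("."))
    version = (major, minor, patch)
    if version < (1, 6, 0):
        return False  # variable checks in state were new in v1.6.0
    if version < (1, 6, 2):
        return True   # v1.6.0, and by assumption v1.6.1
    if version < (1, 7, 0):
        return False  # temporary workaround for the rest of the 1.6 series
    return True       # planned reintroduction in the 1.7 series


for v in ["1.5.7", "1.6.0", "1.6.2", "1.7.0"]:
    print(v, records_variable_checks(v))
```

Consumers on an unpatched 1.3, 1.4, or 1.5 release would only be at risk when reading state from a producer for which this returns True.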

@yermulnik

yermulnik commented Nov 10, 2023

Apologies for posting in a closed issue. I'm just trying to understand whether the behavior below is known, and this issue looks relevant from a compatibility point of view. Thanks.

terraform validate for this var definition:

variable "test" {
  type = number

  validation {
    condition     = var.test > 0
    error_message = "error"
  }
}

This is what I get with TF 0.14:

Validation error message must be at least one full English sentence starting
with an uppercase letter and ending with a period or question mark.

TF 0.15:

│ The validation error message must be at least one full sentence starting with an uppercase letter and ending with
│ a period or question mark.
│
│ Your given message will be included as part of a larger Terraform error message, written as English prose. For
│ broadly-shared modules we suggest using a similar writing style so that the overall result will be consistent.

TF 1.6, however, does not output any error 😕 The last TF version that gives me a validation error is 1.1.9; later versions (>= 1.2.0) produce none. Is this expected? Have the error_message validation rules been relaxed? If so, where can I find an announcement about this?
Thank you.

@apparentlymart
Contributor

Hi @yermulnik,

The behavior you've described here doesn't seem related to state snapshots and so isn't on topic for this issue.

If you'd like to discuss that behavior and what you have is a question rather than a specific bug report or enhancement request, please start a topic in the community forum about this question. Note that there is no version v0.16 of Terraform -- v0.15 was the last of the pre-v1.0 minor releases -- so I suggest also checking carefully which versions of Terraform you are running.

@yermulnik

yermulnik commented Nov 10, 2023

@apparentlymart Got you. Started a topic: https://discuss.hashicorp.com/t/variable-validation-error-message-requirements-relaxed/59951

Regarding the TF versions, that was just me typing inaccurately. I meant 1.6, not 1.16.


github-actions bot locked as resolved and limited conversation to collaborators Dec 11, 2023