cty Refinements: Unknown values with tighter range #590

apparentlymart · 2023-02-24T00:37:33Z

My upstream project cty, which HCL uses as its type system, has a new feature in its v1.13.0 release where it can often refine the range of an unknown result from an operation based either on innate characteristics of that operation (e.g. many operations never return a null value) or based on already-refined input values.

There's more information on how exactly refinements work, and the scope of the initial design, in the cty documentation on Refinements.

Along with updating to the new cty which includes that feature, and updating some of the existing tests to expect the newly-refined unknown results, this also makes HCL itself aware of refinements in some specific cases where my research indicates that it'll make it more likely for applications like Terraform to raise errors sooner (e.g. during validation or planning rather than during apply) and to more often produce known results in count and for_each arguments, which Terraform currently requires to be known during planning.

The specific cases I implemented here are:

In an expression like "https://${hostname}", if hostname is unknown then we still know that the final string will be non-null and will start with https://, which in particular means that result != null and result != "" can both produce known true instead of an unknown bool result.
In an expression like unknown ? a : b we can propagate some refinements that are shared by both a and b into the result even though we don't know which match arm to choose yet. This isn't super useful on its own but composes with other rules (both here in HCL and down in cty) to increase the likelihood of producing a wholly-known or partially-known result.
When processing optional attributes in the typeexpr extension -- which Terraform uses for its input variable type constraint mechanism -- if either the given value or the default value are refined as non-null then we can be sure that the value of that attribute won't be null. Again, this is not super useful on its own but knowing that a value definitely isn't null gives more opportunities for replacing unknown values with wholly-known or partially-known results.
When resolving the splat operator [*] we can propagate refinements about the length of the input into the output, as long as we already know the type of the input and so can be sure about whether we're working in the list mode or in the special "null or not" mode.

There's also one extra thing which does not have immediate benefit but may be useful downstream in Terraform:

hcldec has a new "spec" that wraps another spec and applies refinements to its value. Terraform uses hcldec to interpret the content of resource blocks (and similar) in terms of the provider schema, and so if Terraform later allows providers to mention certain static refinements as part of the provider schema then Terraform could potentially check and apply those automatically while decoding the configuration.

In particular, legacy SDK providers are guaranteed to never return a null value for id, and so a hypothetical change to refine that attribute as never being null would make it easier to use the id of a not-yet-created resource as part of a conditional or other similar expression downstream without getting an unknown result.

These changes are all consistent with cty's compatibility rule that Unknown Values can become "more known", which HCL effectively passes on to its own callers as a result of using cty as its type system. Therefore applications adopting a new version of HCL that includes these changes should expect to see more tightly-refined results in their tests involving unknown values, and so should expect to update those tests to match as part of upgrading HCL, as long as the new results are actually correctly-refined for the expected final result.

(If this change is accepted, I have a similar sort of change ready to submit to Terraform itself which would make it refine even more cases, again building on the foundations from both cty and HCL.)

This new concept allows constraining the range of an unknown value beyond what can be captured in a type constraint. We'll make more use of this in subsequent commits.

If we encounter an interpolated unknown value during template rendering, we can report the partial buffer we've completed so far as the refined prefix of the resulting unknown value, which can then potentially allow downstream comparisons to produce a known false result instead of unknown if the prefix is sufficient to satisfy them.

When ConditionalExpr has an unknown predicate it can still often infer some refinement to the range of its result by noticing characteristics that the two results have in common. In all cases we can test if either result could be null and return a definitely-not-null unknown value if not. For two known numbers we can constrain the range to be between those two numbers. This is primarily aimed at the common case where the two possible results are zero and one, which significantly constrains the range. For two known collections of the same kind we can constrain the length to be between the two collection lengths. In these last two cases we can also sometimes collapse the unknown into a known value if the range gets reduced enough. For example, if choosing between two collections of the same length we might return a known collection of that length containing unknown elements, rather than an unknown collection.

If either the given value is refined non-null or if the default value is refined non-null then the final attribute value after defaults processing is also guaranteed non-null even if we don't yet know exactly what the value will be. This rule is pretty marginal on its own, but refining some types of value as non-null creates opportunities to deduce further information when the value is used under other operations later, such as collapsing an unknown but definitely not null list of a known length into a known list of that length containing unknown values.

We know that a splat expression can never produce a null result, and also in many cases we can use length refinements from the source collection to also refine the destination collection because we know that a splat expression produces exactly one result for each input element. This also allows us to be a little more precise in the case where the splat operator is projecting a non-list/set value into a zero or one element list and we know the source value isn't null. This refinement is a bit more marginal since it would be weird to apply the splat operator to a value already known to be non-null anyway, but the refinement might come from far away from the splat expression and so could still have useful downstream effects in some cases.

This new spec type allows adding value refinements to the results of some other spec, as long as the wrapped spec does indeed enforce the constraints described by the refinements.

apparentlymart · 2023-05-30T22:34:49Z

The following Terraform issues would either be resolved completely or partially improved by merging this and then adopting it in Terraform:

In an unknown choice of 1 of 2 lists, if the lists are the same length, the list length should be known terraform#33237
Support partial rendering of templatefile instead of marking the whole value "known after apply" terraform#32651
Allow "count" for non-null check on resource attributes that are always present terraform#26755 (not completed by this change, but this change provides some building-blocks for it)

apparentlymart · 2023-05-30T22:36:06Z

I ended up submitting separately the Terraform-specific parts of adopting refined values: hashicorp/terraform#33234

...but once this PR is merged and released I'll make a new PR in Terraform to adopt the new version of HCL too.

jbardin · 2023-05-31T16:16:36Z

go.mod

@@ -14,7 +14,7 @@ require (
 	github.com/mitchellh/go-wordwrap v0.0.0-20150314170334-ad45545899c7
 	github.com/sergi/go-diff v1.0.0
 	github.com/spf13/pflag v1.0.2
-	github.com/zclconf/go-cty v1.12.1
+	github.com/zclconf/go-cty v1.13.0


should we bump this to the latest patch?

These changes I made only require 1.13.0, so I don't see a strong reason to force it since any calling application can select a newer version if it wants to. This is only here to represent that this new version of HCL literally won't work if you use v1.12 or earlier. 🤷‍♂️

(We could still do it, but I'm inclined to make that a separate change since this particular change doesn't need it.)

apparentlymart added enhancement v2 Relates to the v2 line of releases labels Feb 24, 2023

apparentlymart requested a review from a team February 24, 2023 00:37

apparentlymart self-assigned this Feb 24, 2023

apparentlymart force-pushed the cty-refinements branch from 9522187 to 6d8c5d5 Compare February 24, 2023 01:08

apparentlymart added 6 commits May 30, 2023 15:18

go.mod: Use cty v1.13.0, with support for refined unknown values

624ec4e

This new concept allows constraining the range of an unknown value beyond what can be captured in a type constraint. We'll make more use of this in subsequent commits.

hcldec: RefineValueSpec

75e3f8a

This new spec type allows adding value refinements to the results of some other spec, as long as the wrapped spec does indeed enforce the constraints described by the refinements.

apparentlymart force-pushed the cty-refinements branch from 6d8c5d5 to 75e3f8a Compare May 30, 2023 22:25

jbardin approved these changes May 31, 2023

View reviewed changes

apparentlymart merged commit e4bd3b4 into main May 31, 2023

apparentlymart deleted the cty-refinements branch May 31, 2023 16:26

wata727 mentioned this pull request Jun 30, 2023

hclsyntax: Template expression produces too large value refinements #616

Closed

wata727 mentioned this pull request Aug 30, 2023

ext/typeexpr: Avoid refinements on dynamic values #625

Merged

crazy-max mentioned this pull request Oct 19, 2023

vendor: update hcl dependencies docker/buildx#2088

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cty Refinements: Unknown values with tighter range #590

cty Refinements: Unknown values with tighter range #590

apparentlymart commented Feb 24, 2023 •

edited

Loading

apparentlymart commented May 30, 2023

apparentlymart commented May 30, 2023

jbardin May 31, 2023

apparentlymart May 31, 2023

apparentlymart May 31, 2023

cty Refinements: Unknown values with tighter range #590

cty Refinements: Unknown values with tighter range #590

Conversation

apparentlymart commented Feb 24, 2023 • edited Loading

apparentlymart commented May 30, 2023

apparentlymart commented May 30, 2023

jbardin May 31, 2023

Choose a reason for hiding this comment

apparentlymart May 31, 2023

Choose a reason for hiding this comment

apparentlymart May 31, 2023

Choose a reason for hiding this comment

apparentlymart commented Feb 24, 2023 •

edited

Loading