Generate singleton lists as embedded objects #387

ulucinar · 2024-04-17T09:17:35Z

Description of your changes

Fixes #136

Terraform configuration blocks, even if they have a MaxItems constraint of 1, are (almost) always generated as lists. We now generate the lists with a MaxItems constraint of 1 as embedded objects in our MR APIs.

This also helps when updating or patching via SSA these list objects (which are now converted to embedded objects). The merging strategy implemented by SSA requires configuration for associative lists and converting the singleton lists into embedded objects removes the configuration need.

In addition to the changes in the code generation pipelines, this PR also implements the runtime conversion logic needed to convert the CRD's embedded objects back to the singleton lists when passing data to the Terraform stack. We also need to convert the singleton lists we read from the Terraform stack into embedded objects for updating spec.forProvider (late-initialization), status.atProvider, or when updating the connection details secrets.

This PR also introduces a schema traverser, which can decouple the Terraform schema traversal logic from the actions (such as code generation, inspection, or singleton-list-to-embedded-object conversion) taken while traversing the schema.

When providers consume these changes, they are forced to generate the singleton lists in their APIs as embedded objects. By default, the generated APIs should not be affected. A provider can opt in the conversion, by specifying the following Provider option in upjet's config.NewProvider:

pc := ujconfig.NewProvider(...
	ujconfig.WithSchemaTraversers(&ujconfig.SingletonListEmbedder{}),

This traverser will detect the singleton lists and mark them to be generated as embedded objects. Provider maintainers will also have the flexibility to override the resource configuration generated by this traverser with upjet's resource configuration framework. This implies that the resource configuration overrides are executed after the mutating traversers act on the resource configurations.

Although it's possible to overwrite the existing CRD API versions, in the providers, with the converted APIs, the providers consuming this change can also generate new versions of their CRD APIs with the embedded object schemas. We extend the existing multi-version CRD generation support to allow specifying the storage version and the hub version via config.Resource.SetCRDStorageVersion and config.Resource.SetCRDHubVersion, respectively. config.Resource.CRDStorageVersion and config.Resource.CRDHubVersion can be used to access the configured storage and hub versions, respectively. These have been implemented as functions instead of struct fields to enforce defaulting: If the storage version or the hub version is not specified, then the default is to use the configured API version. This is also the current behavior.

If the providers opt to generate the converted APIs as new versions, this PR also implements two common CRD API conversion functions to be invoked, at runtime, by their conversion webhooks. These conversion functions can be registered with the API conversion registry as follows:

r.Conversions = []conversion.Conversion{
	conversion.NewIdentityConversionExpandPaths(conversion.AllVersions, conversion.AllVersions, []string{"spec.forProvider", "spec.initProvider", "status.atProvider"}, r.CRDListConversionPaths()...),
	conversion.NewSingletonListConversion("v1beta1", "v1beta2", r.CRDListConversionPaths(), conversion.ToEmbeddedObject),
	conversion.NewSingletonListConversion("v1beta2", "v1beta1", r.CRDListConversionPaths(), conversion.ToSingletonList)}

The identity converter will handle copying of the same name fields from the conversion source object to the target, ignoring the converted fields (remember that the embedded objects in the new API have the same field names as the singleton lists in the old API but their types differ and thus, they cannot be handled by the identity converter).

The singleton list converter works on the converted fields at any depth and is responsible for the conversions between the singleton lists and the embedded objects.

If the providers use these converters, they can keep their old APIs (e.g., v1beta1) intact and introduce the embedded object schemas in the new API versions (e.g., v1beta2). When it encounters an old v1beta1 API with the singleton lists, the registered conversion webhook will be able to convert it to the new v1beta2 API with the embedded objects seamlessly.

I have:

Read and followed Upjet's contribution process.
Run make reviewable to ensure this PR is ready for review.
Added backport release-x.y labels to auto-backport this PR if necessary.
Add unit tests
Add runtime conversion between singleton lists & embedded objects to the TF plugin framework external client
If we are not going to support this conversion in the CLI-based runtime, then we should not just convert their APIs (as their runtimes will simply be broken)

How has this code been tested

Manually tested via an affected resource (AccessPolicy. conditionalaccess, which currently has many singleton lists in its API) in crossplane-contrib/provider-upjet-azuread.

Also validated the same resource by consuming the upjet changes from this PR's feature branch but not configuring the list converter in crossplane-contrib/provider-upjet-azuread. This is to make sure that the proposed changes in the code generation pipeline & the upjet runtime are both backwards-compatible.

mergenci

Thanks for your work @ulucinar. Please find my comments below.

pkg/config/common.go

pkg/config/conversion/conversions.go

pkg/controller/external_tfpluginsdk.go

mergenci · 2024-04-18T22:36:47Z

pkg/pipeline/templates/crd_types.go.tmpl

 // +kubebuilder:storageversion
+{{ end }}


Question: Would {{- end }} be more appropriate here? Otherwise, it seems to me that double blank lines will follow.

sergenyalcin

Thanks @ulucinar I left a few comments. I also tested this PRs' some functionalities in the following one: #397

sergenyalcin · 2024-05-06T19:12:00Z

pkg/config/conversion/runtime_conversion.go

+	switch mode {
+	case ToSingletonList:
+		slices.Sort(paths)
+	case ToEmbeddedObject:
+		sort.Slice(paths, func(i, j int) bool {
+			return paths[i] > paths[j]
+		})
+	}


Do we need this sorting because of the order of the path processing? I mean, we ensure that conversions are performed correctly even when nested structures are present. For example, in the ToEmbeddedObject case, the deeper structures are processed first. Because if we change the deeper structures before the higher-level structures, it will not affect the conversion of the higher-level structures.

What do you think about whether there is any potential risk during this approach? Do you think it's worth coding this in more conventional?

The relevant issue here is the fieldpath expressions use indices (e.g., [*]) for traversing lists. We generate the fieldpath expressions for the singleton lists that have been converted using the expressions that work on the original Terraform topology. And we need to convert from embedded objects to singleton lists (when passing data from the XP layer to the TF layer) and from singleton lists to embedded objects (when passing data from the TF layer to the XP layer) at runtime using these fieldpath expressions. If we do not process the field conversions in the correct order, the expressions may mismatch the temporary topology during a conversion. For example, let's assume a singleton list contains another nested singleton list, both of which we've converted and let's assume the corresponding fieldpath expressions for the conversion targets are a.b[*] and a.b[*].c[*], respectively. When converting from embedded objects to singleton lists, converting a.b (which is an embedded object in the source) before its children relieves us from having to adjust the fieldpath expressions with the a.b[*]. prefix because once a.b is converted into a singleton list, then the fieldpath expressions that target its children now match the (converted) topology. Similarly, when converting from singleton lists to embedded objects at runtime (e.g., after reading the TF state), converting a.b[*] after a.b[*].c[*] relieves us from having to adjust the fieldpath expressions that target a.b[*]'s children.

This is the motivation of processing the conversions in an order. We could as well adjust the fieldpath expressions to match the topology as we perform the conversions, but this looks simpler to me.

sergenyalcin · 2024-05-06T19:32:47Z

pkg/controller/conversion/functions.go

-	gvk := dst.GetObjectKind().GroupVersionKind()
-	if err := runtime.DefaultUnstructuredConverter.FromUnstructured(srcMap, dst); err != nil {
-		return errors.Wrap(err, "cannot convert the map[string]any representation of the source object to the conversion target object")
-	}
-	// restore the original GVK for the conversion destination
-	dst.GetObjectKind().SetGroupVersionKind(gvk)
-


As I understand, we moved this block to IdentityConversion, right? What do you think about a wrong configuration (via Resource.Conversions) that may override this part? Then we will miss the identity copying.

Yes, correct. Previously, conversion.RoundTrip would unconditionally do the identity conversion which copies fields with the same name & path into the target object. But in the scenario where we have converted singleton lists into embedded objects, the source and the target end up having fields with different types (a list and an object) at identical paths, which breaks the identity conversion.

Furthermore, it's possible to have, for instance, a conversion.ManagedConversion, which performs the identity conversion itself (e.g., by directly copying some metadata fields from source into the target). We previously did not have this flexibility and conversion.RoundTrip would always do the identity conversion.

It's now optional and for backwards compatibility, config.DefaultResource now initializes the config.Resource.Conversions slice with the identity conversion. In contexts where we don't want the identity conversion, then we need to replace Conversions slice.

There could in theory be clients which just replace the slice here although they need the identity conversion and will be broken when they consume these changes. If they are overriding the default Conversions slice, then it means they are not interested in using the defaults. But probably they just assume there are no defaults instead of not using these (or any future) defaults. I consider this as a bad practice. Another complication is that there could be multiple places in the provider code where the converters are added. If multiple sites replace these converters (instead of appending to already existing converters), then again it's probably unintended.

Unfortunately, I think this has roots in not following the principle of encapsulation in the resource configuration API (the configuration data fields are just exported). This requires a self-discipline at the client side.

- Terraform configuration blocks, even if they have a MaxItems constraint of 1, are (almost) always generated as lists. We now generate the lists with a MaxItems constraint of 1 as embedded objects in our MR APIs. - This also helps when updating or patching via SSA the (previously list) objects. The merging strategy implemented by SSA requires configuration for associative lists and converting the singleton lists into embedded objects removes the configuration need. - A schema traverser is introduced, which can decouple the Terraform schema traversal logic from the actions (such as code generation, inspection, or singleton-list-to-embedded-object conversion) taken while traversing the schema. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

the generated CRD API version as the storage version. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

…versions between singleton list & embedded object API versions. - Export conversion.Convert to be reused in embedded singleton list webhook API conversions. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

the storage & hub API versions independently. - The default values for both of the storage & hub versions is the version being generated, i.e., Resource.Version. - Replace pipeline.ConversionHubGenerator & pipeline.ConversionSpokeGenerator with a common pipeline.ConversionNodeGenerator implementation - Now the hub generator can also inspect the generated files to regenerate the hub versions according to the latest resource configuration and we have removed the assumption that the hub version is always the latest version generated. - Fix duplicated GKVs issue in zz_register.go. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

…Converter Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

…ield paths when the conversion.RoundTrip copies the fields, from src to dst, with the same name. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

- Fix runtime conversion for expanded field paths of length greater than 1. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

- Prevent execution of these pipelines multiple times for each available versions of an API group. - Improve conversion.RoundTrip paved conversion error message Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

sergenyalcin

Thanks @ulucinar LGTM!

mergenci

Thanks for your changes @ulucinar 🙏 LGTM.

ulucinar requested review from sergenyalcin, erhancagirici and mergenci as code owners April 17, 2024 09:17

ulucinar marked this pull request as draft April 17, 2024 09:18

ulucinar force-pushed the embed-singleton branch 7 times, most recently from f22cff7 to 017f4f2 Compare April 18, 2024 13:38

mergenci requested changes Apr 18, 2024

View reviewed changes

ulucinar force-pushed the embed-singleton branch 2 times, most recently from 03260c8 to 0c57170 Compare April 19, 2024 09:20

ulucinar mentioned this pull request Apr 19, 2024

Convert all singleton lists in the MR APIs to embedded objects crossplane-contrib/provider-upjet-azuread#123

Merged

6 tasks

ulucinar force-pushed the embed-singleton branch 3 times, most recently from d6c4f08 to 9a463d5 Compare April 22, 2024 07:23

ulucinar mentioned this pull request Apr 22, 2024

Convert all singleton lists in the MR APIs to embedded objects crossplane-contrib/provider-upjet-gcp#508

Merged

7 tasks

ulucinar marked this pull request as ready for review April 24, 2024 14:36

sergenyalcin mentioned this pull request May 6, 2024

Add example converter for conversion of singleton lists to embedded objects #397

Merged

3 tasks

sergenyalcin reviewed May 6, 2024

View reviewed changes

ulucinar mentioned this pull request May 7, 2024

Convert all singleton lists in the MR APIs to embedded objects crossplane-contrib/provider-upjet-azure#733

Merged

8 tasks

ulucinar force-pushed the embed-singleton branch 3 times, most recently from 1543b80 to 83e4f07 Compare May 7, 2024 18:46

ulucinar added 4 commits May 8, 2024 02:53

Make singleton-list-to-embedded-object API conversions optional

fb97ee5

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Add a config.Resource configuration option to be able to mark

668de34

the generated CRD API version as the storage version. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

ulucinar added 9 commits May 8, 2024 02:53

Fix the error messages in the template implementations of conversion.…

914df98

…Converter Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Add conversion.identityConversion API converter and allow excluding f…

a87bc73

…ield paths when the conversion.RoundTrip copies the fields, from src to dst, with the same name. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Fix slice value length assertion in conversion.Convert

136fabd

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Initialize config.Resource.OverrideFieldNames with an empty map

5a6fd95

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Fix connection details state value map conversion

cf00700

- Fix runtime conversion for expanded field paths of length greater than 1. Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Add unit tests for runtime API conversions

26df74c

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

Add tests for config.SingletonListEmbedder

44c139e

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

ulucinar force-pushed the embed-singleton branch from 2e1b6cd to 44c139e Compare May 7, 2024 23:54

sergenyalcin approved these changes May 8, 2024

View reviewed changes

mergenci approved these changes May 8, 2024

View reviewed changes

ulucinar merged commit 03a207b into crossplane:main May 8, 2024
6 checks passed

ulucinar deleted the embed-singleton branch May 8, 2024 13:47

ulucinar mentioned this pull request May 14, 2024

Add config.Resource.PreviousVersions to specify the previous versions of an MR API #402

Merged

3 tasks

ulucinar mentioned this pull request May 29, 2024

Convert all singleton lists in the MR APIs to embedded objects crossplane-contrib/provider-upjet-aws#1332

Merged

9 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generate singleton lists as embedded objects #387

Generate singleton lists as embedded objects #387

ulucinar commented Apr 17, 2024 •

edited

Loading

mergenci left a comment

mergenci Apr 18, 2024

sergenyalcin left a comment

sergenyalcin May 6, 2024 •

edited

Loading

ulucinar May 7, 2024

sergenyalcin May 6, 2024

ulucinar May 7, 2024

sergenyalcin left a comment

mergenci left a comment

Generate singleton lists as embedded objects #387

Generate singleton lists as embedded objects #387

Conversation

ulucinar commented Apr 17, 2024 • edited Loading

Description of your changes

How has this code been tested

mergenci left a comment

Choose a reason for hiding this comment

mergenci Apr 18, 2024

Choose a reason for hiding this comment

sergenyalcin left a comment

Choose a reason for hiding this comment

sergenyalcin May 6, 2024 • edited Loading

Choose a reason for hiding this comment

ulucinar May 7, 2024

Choose a reason for hiding this comment

sergenyalcin May 6, 2024

Choose a reason for hiding this comment

ulucinar May 7, 2024

Choose a reason for hiding this comment

sergenyalcin left a comment

Choose a reason for hiding this comment

mergenci left a comment

Choose a reason for hiding this comment

ulucinar commented Apr 17, 2024 •

edited

Loading

sergenyalcin May 6, 2024 •

edited

Loading