Ensure shorter tokens for object types #1118

t0yv0 · 2023-05-12T14:49:54Z

Hello!

Vote on this issue by adding a 👍 reaction
If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

When mapping deeply nested TF properties to the .types section of the Pulumi Package Schema, tfgen needs to pick "tokens" to identify every record/object type it encounters, and sometimes it picks very long tokens. Unfortunately, these tokens are not simply abstract identifiers: every language codegen (C#, Python, TypeScript, Java, Go) uses them to pick names for the classes representing these structures. This can lead to very long class names, because the current naming heuristic makes the name length O(N) in the nesting depth of the property.

Example of an unusable name:

TemplateDefinitionSheetVisualGeospatialMapVisualChartConfigurationFieldWellsGeospatialMapAggregatedFieldWellsValueNumericalMeasureFieldFormatConfigurationNumericFormatConfigurationPercentageDisplayFormatConfigurationSeparatorConfigurationThousandsSeparator

In languages that require file-per-class this additionally creates problems with very long filenames.
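To make the O(N) growth concrete, here is a minimal sketch of a path-concatenating naming heuristic. It is not tfgen's actual code: `pascal`, `pathTypeName`, and the snake_case property names in `main` are assumptions chosen to resemble the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// pascal converts a snake_case property name to PascalCase.
func pascal(s string) string {
	var b strings.Builder
	for _, part := range strings.Split(s, "_") {
		if part == "" {
			continue
		}
		b.WriteString(strings.ToUpper(part[:1]) + part[1:])
	}
	return b.String()
}

// pathTypeName is a hypothetical stand-in for the naming heuristic: it joins
// the resource name with every property name on the path to the object type,
// so the result grows linearly with nesting depth.
func pathTypeName(resource string, path []string) string {
	name := resource
	for _, p := range path {
		name += pascal(p)
	}
	return name
}

func main() {
	// Assumed property path, for illustration only.
	path := []string{"definition", "sheet", "visual", "geospatial_map_visual", "chart_configuration"}
	for depth := 1; depth <= len(path); depth++ {
		name := pathTypeName("Template", path[:depth])
		fmt.Printf("depth=%d len=%d %s\n", depth, len(name), name)
	}
}
```

Each extra level of nesting appends another PascalCase segment, which is how a name like the one above ends up with more than a dozen segments.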

The feature request is to make a change to generate shorter tokens for object types. This is scoped to only types of resource/datasource/provider properties, and will not change anything for types representing resources, datasources or provider config itself. The change will make codegen class names shorter, easier to work with, more pleasant, and avoid hard limits on filenames in languages such as C#.

The last segment of the token (the type name) must be shorter than 256 characters, as that is where codegen breaks. Arguably we can also have a smaller cutoff, like 120, for aesthetic reasons. We can make this configurable, defaulting to some reasonable value.
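A length check along these lines could be sketched as follows. The helper names and the sample tokens are hypothetical, and the sketch assumes tokens are laid out as `pkg:module/Name:Name`, so that the type name is whatever follows the last colon.

```go
package main

import (
	"fmt"
	"strings"
)

// Limits taken from the discussion: 256 is where codegen is reported to
// break, 120 is the suggested aesthetic cutoff.
const (
	hardLimit    = 256
	defaultLimit = 120
)

// typeName returns the last segment of a token, assuming the
// "pkg:module/Name:Name" layout. A token without a colon is returned whole.
func typeName(token string) string {
	return token[strings.LastIndex(token, ":")+1:]
}

// overLimit returns the tokens whose type name is not shorter than limit.
func overLimit(tokens []string, limit int) []string {
	var out []string
	for _, tok := range tokens {
		if len(typeName(tok)) >= limit {
			out = append(out, tok)
		}
	}
	return out
}

func main() {
	long := strings.Repeat("Nested", 25) // synthetic 150-character name
	tokens := []string{
		"aws:quicksight/TemplateSourceEntity:TemplateSourceEntity", // illustrative token
		"aws:quicksight/" + long + ":" + long,
	}
	for _, tok := range overLimit(tokens, defaultLimit) {
		fmt.Printf("%d chars: %s\n", len(typeName(tok)), typeName(tok))
	}
}
```

Passing the limit as a parameter keeps the cutoff configurable, with `defaultLimit` as one possible default and `hardLimit` as the upper bound.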

Note that this change will be technically breaking, because it will translate to auxiliary types being moved, and some programs depending on the old location of these types will be broken by the update. To mitigate the break, we need to ensure that an escape hatch is available for the bridged provider maintainer to hand-write the token of a given auxiliary type. It looks like the .Type or .NestedType fields on SchemaInfo may help with this purpose, but we need to confirm, as it's not obvious. The expectation is that maintainers will choose to mostly accept shorter names as a "welcome breaking change", but make exceptions for auxiliary types that happen to be widely used, based on user feedback.


t0yv0 · 2023-05-12T14:56:53Z

Noting some comments per @danielrbradley

Scoping per resource may be beneficial for a gradual rollout instead of switching the default wholesale: something that abstracts an ObjectTypeNamingStrategy parameter, keeping the current strategy as the default but allowing opt-in to shorter names per resource/datasource. For the ecosystem team, we can also have a way to opt in to a different global strategy if we need that, say on larger providers like OCI.
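One possible shape for such a per-resource opt-in is sketched below. Only the name ObjectTypeNamingStrategy comes from the comment above; the function signature, the `NamingConfig` struct, and both strategies are assumptions made for illustration, and the leaf-only strategy ignores the collision handling a real implementation would need.

```go
package main

import "fmt"

// ObjectTypeNamingStrategy computes a type name from a resource name and the
// (already PascalCased) property path. The signature is hypothetical.
type ObjectTypeNamingStrategy func(resource string, path []string) string

// fullPath mimics the current behavior: concatenate every path segment.
func fullPath(resource string, path []string) string {
	name := resource
	for _, p := range path {
		name += p
	}
	return name
}

// leafOnly is one conceivable shorter strategy: resource name plus the last
// path segment. It can produce collisions, which are not handled here.
func leafOnly(resource string, path []string) string {
	if len(path) == 0 {
		return resource
	}
	return resource + path[len(path)-1]
}

// NamingConfig keeps a global default and lets individual resources opt in
// to a different strategy.
type NamingConfig struct {
	Default     ObjectTypeNamingStrategy
	PerResource map[string]ObjectTypeNamingStrategy
}

// NameFor applies the resource-specific strategy if present, else the default.
func (c NamingConfig) NameFor(resource string, path []string) string {
	if s, ok := c.PerResource[resource]; ok {
		return s(resource, path)
	}
	return c.Default(resource, path)
}

func main() {
	cfg := NamingConfig{
		Default:     fullPath,
		PerResource: map[string]ObjectTypeNamingStrategy{"Template": leafOnly},
	}
	path := []string{"Definition", "Sheet", "Visual"}
	fmt.Println(cfg.NameFor("Template", path)) // opted in to shorter names
	fmt.Println(cfg.NameFor("Analysis", path)) // keeps the default strategy
}
```

Switching the global strategy for a large provider would then amount to changing `Default`, while the per-resource map supports the gradual rollout.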

t0yv0 · 2023-05-12T15:04:31Z

Note it looks like .Type/.NestedType cannot be relied on right now to influence the picked tokens, so that is definitely something that we need to make sure is available here.

t0yv0 · 2023-09-06T16:49:17Z

@mjeffryes this is the exciting ticket I was thinking of.

`tfbridge.SchemaInfo` has an `Omit bool` field, which will omit a property from the schema. Unfortunately, omitted fields don't omit their associated types from the schema, leading to unnecessary bloat. This leads to code such as this (from aws):

```go
// We removed the `definition` property from quicksights.Template, see
// #1118
// But the types are still present in the schema, which pollutes the Go SDK
// specifically. This function removes those types from the schema.
func removeUnusedQuicksightTypes(pulumiPackageSpec *schema.PackageSpec) {
	var elidedTypes []string
	for tok := range pulumiPackageSpec.Types {
		if strings.Contains(tok, ":quicksight/AnalysisDefinition") ||
			strings.Contains(tok, ":quicksight/DashboardDefinition") ||
			strings.Contains(tok, ":quicksight/TemplateDefinition") {
			elidedTypes = append(elidedTypes, tok)
		}
	}
	for _, tok := range elidedTypes {
		delete(pulumiPackageSpec.Types, tok)
	}
}
```

`Omit: true` is only used in 2 providers: `pulumi-aws` and `pulumi-alicloud` ([source](https://github.com/search?q=org%3Apulumi%20Omit%3A%20true&type=code)). Datadog has recently added some hefty (+900k lines) types, and I am considering using `Omit: true` to trim unwanted types. This is a necessary precursor for that usage. It also has the advantage of making more intuitive sense to future users of the bridge.
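A more general alternative to matching token prefixes by hand would be a reachability pass: keep only the types that some non-omitted property still refers to, directly or transitively. The sketch below uses a deliberately simplified, hypothetical `spec` type rather than the real `schema.PackageSpec`, and the tokens in `main` are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// spec is a hypothetical, simplified stand-in for a package schema.
type spec struct {
	// Roots maps a resource to the type tokens its non-omitted properties use.
	Roots map[string][]string
	// Types maps a type token to the type tokens its properties use.
	Types map[string][]string
}

// pruneUnreachable deletes every type that cannot be reached from Roots and
// returns the removed tokens in sorted order.
func pruneUnreachable(s *spec) []string {
	reachable := map[string]bool{}
	var visit func(tok string)
	visit = func(tok string) {
		if reachable[tok] {
			return // already seen; also guards against reference cycles
		}
		refs, ok := s.Types[tok]
		if !ok {
			return // reference to a type not defined in this schema
		}
		reachable[tok] = true
		for _, ref := range refs {
			visit(ref)
		}
	}
	for _, refs := range s.Roots {
		for _, tok := range refs {
			visit(tok)
		}
	}
	var removed []string
	for tok := range s.Types {
		if !reachable[tok] {
			removed = append(removed, tok)
		}
	}
	sort.Strings(removed)
	for _, tok := range removed {
		delete(s.Types, tok)
	}
	return removed
}

func main() {
	s := &spec{
		Roots: map[string][]string{"Template": {"pkg:m/TemplatePermission:TemplatePermission"}},
		Types: map[string][]string{
			"pkg:m/TemplatePermission:TemplatePermission":     nil,
			"pkg:m/TemplateDefinition:TemplateDefinition":     {"pkg:m/TemplateDefinitionSheet:TemplateDefinitionSheet"},
			"pkg:m/TemplateDefinitionSheet:TemplateDefinitionSheet": nil,
		},
	}
	fmt.Println(pruneUnreachable(s), len(s.Types))
}
```

Sorting the removed tokens before deleting keeps the output stable across runs, since Go map iteration order is not.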

iwahbe · 2023-12-01T23:46:32Z

Merging #1550 will allow users to override type tokens by setting .Type to the old token. If a type changes and they want to change it back, a .Type based override will restore the old name.
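The effect of such an override can be sketched independently of the bridge. The code below is not the implementation in #1550; `applyOverrides`, the property path key, and the tokens are hypothetical and only illustrate "use the hand-written token when one is given, otherwise the generated one".

```go
package main

import "fmt"

// applyOverrides returns the final token per property path: the hand-written
// override when one exists, otherwise the generated token.
func applyOverrides(generated, overrides map[string]string) map[string]string {
	final := make(map[string]string, len(generated))
	for path, tok := range generated {
		if old, ok := overrides[path]; ok {
			final[path] = old
			continue
		}
		final[path] = tok
	}
	return final
}

func main() {
	// Tokens a new, shorter naming scheme might produce (made up).
	generated := map[string]string{
		"template.definition.sheet": "pkg:m/TemplateSheet:TemplateSheet",
		"template.permission":       "pkg:m/TemplatePermission:TemplatePermission",
	}
	// A maintainer pins one widely used type to its old token.
	overrides := map[string]string{
		"template.definition.sheet": "pkg:m/TemplateDefinitionSheet:TemplateDefinitionSheet",
	}
	final := applyOverrides(generated, overrides)
	fmt.Println(final["template.definition.sheet"])
	fmt.Println(final["template.permission"])
}
```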

- Most of the `schemaNestedTypes` code was just moved from `generate_schema.go`
- The new code is the code related to `nestedTypeGraph`

I picked an arbitrary max character limit of 120, but those still look pretty long. I checked in `pulumi-aws` and there are still a lot of types that are up to and over 120. I currently just pick the shorter name if the normal name would be +120, but I feel like it would be better to introduce a new schema property to enable this at the resource level.

TODO:

- I need to add some more tests to ensure this is deterministic, but I am not yet sure it is possible to make it completely deterministic without tracking state (maybe by reading in the existing schema.json)

re #1118

t0yv0 · 2024-09-13T18:02:36Z

We have some progress related to this issue.

- OmitType and TypeName to support sharing types (#2409) adds flags to manually control generated type names, which can be used to mitigate the long-token problem.
- Prototyping explicit sharing for Quicksight types (pulumi-aws#4449) explored this for Quicksight.

Long tokens arise from recursion emulation or non-recursive deep type sharing. We're incubating some tooling to help detect and mitigate both via the mechanism in #2409.

t0yv0 added needs-triage Needs attention from the triage team kind/enhancement Improvements or new features labels May 12, 2023

t0yv0 changed the title ~~Ensure shorter tokens for aux types~~ Ensure shorter tokens for object types May 12, 2023

t0yv0 mentioned this issue May 12, 2023

[Epic] Q2 Bridge Quality #886

Closed

8 tasks

danielrbradley mentioned this issue May 12, 2023

Upgrade v4.67.0 pulumi/pulumi-aws#2521

Merged

guineveresaenger added kind/bug Some behavior is incorrect or out of spec and removed needs-triage Needs attention from the triage team labels May 12, 2023

danielrbradley mentioned this issue May 15, 2023

Re-enable aws quicksight template.definition property pulumi/pulumi-aws#2525

Open

t0yv0 added size/S Estimated effort to complete (1-2 days). size/M Estimated effort to complete (up to 5 days). and removed size/S Estimated effort to complete (1-2 days). labels Aug 4, 2023

t0yv0 removed the kind/bug Some behavior is incorrect or out of spec label Sep 5, 2023

mjeffryes self-assigned this Sep 6, 2023

iwahbe unassigned mjeffryes Dec 1, 2023

t0yv0 mentioned this issue Jan 11, 2024

Bridge quality investment possibilities #1621

Closed

t0yv0 assigned corymhall Jul 22, 2024

mjeffryes added this to the 0.108 milestone Jul 24, 2024

corymhall mentioned this issue Aug 8, 2024

Generate shorter names for properties #2290

Closed

mjeffryes removed this from the 0.108 milestone Aug 19, 2024
