Providers of Large Size: Import is very slow in Python #3753

Open
1 task
brent-at-aam opened this issue Oct 25, 2024 · 7 comments
Labels
bug: Something isn't working
new: Un-triaged issue
pre-built providers: Issues around pre-built providers managed at https://github.com/hashicorp/cdktf-repository-manager

Comments


brent-at-aam commented Oct 25, 2024

Expected Behavior

When using Python, the time taken for a synth is only correlated to the volume of resources within a stack.

Actual Behavior

When using Python, running any operations with larger providers takes a long time due to lengthy module import load times. You could have a single resource and it would still take ~30s to even get started with the synth.

Steps to Reproduce

  1. Use Python
  2. Add the cdktf-cdktf-provider-aws library to your project
  3. Make a small stack that uses any resource from the aws provider module
  4. Run a synth

Versions

Running with the latest versions

Providers

No response

Gist

No response

Possible Solutions

This is really just a general issue for all providers, but only becomes a big problem for the large ones like AWS. Looking through the issues, this is related to #2792, which was ostensibly fixed by #3030.

Perhaps this is a regression caused by newer behavior in upstream packages, but importing AWS is back to taking around 30 seconds and has been for quite a while. I've switched a few projects from Python to TypeScript to get away from it, but it would be nice not to have to do that.

The bulk of the time is spent loading the submodules (thanks @giner), which is why I wonder whether this is a regression of the changes made in #3030.
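One way to confirm where the import time goes is CPython's `-X importtime` option, which writes a per-module timing line to stderr. The following is a minimal sketch, not something taken from this issue: the helper name `slowest_imports` is hypothetical, and it profiles the standard-library `json` module as a stand-in so it runs without the provider installed.

```python
import subprocess
import sys


def slowest_imports(module: str, top: int = 10) -> list[tuple[int, str]]:
    """Return (cumulative_microseconds, module_name) for the slowest imports."""
    # -X importtime makes the interpreter log one timing line per import to stderr.
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        # Each line looks like: "import time: <self us> | <cumulative us> | <name>"
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        try:
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # header line, no numbers to parse
        rows.append((cumulative_us, parts[2].strip()))
    return sorted(rows, reverse=True)[:top]


if __name__ == "__main__":
    # Stand-in module so the sketch runs anywhere; pass
    # "cdktf_cdktf_provider_aws" instead to profile the provider itself.
    for cumulative_us, name in slowest_imports("json"):
        print(f"{cumulative_us / 1e6:8.3f}s  {name}")
```

If submodule loading dominates, the provider's submodules should appear near the top of the list with large cumulative times.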

Another source of slowness is the large gzipped assembly in _jsii.

The root of the module loads it:

/__init__.py

from ._jsii import *

And worth noting that resource modules also load it:

/foo/__init__.py

from .._jsii import *

I know much of this is really just behavior of other libraries, and thus this might not be something you can control here. I also realize this is something you have probably already considered, but there is no discussion about it in issues I could find. Is it possible to instruct the package generation to build separate assemblies for each of the submodules of a provider package?

You would then remove the "from ._jsii import *" line from the root, and each resource would import only its own jsii assembly.

Workarounds

None that are feasible

Anything Else?

No response

References

Help Wanted

  • I'm interested in contributing a fix myself

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@brent-at-aam brent-at-aam added bug Something isn't working new Un-triaged issue pre-built providers Issues around pre-built providers managed at https://github.com/hashicorp/cdktf-repository-manager labels Oct 25, 2024
@DanielMSchmidt
Contributor

I think the changes we made in 0.18 might help here already: https://developer.hashicorp.com/terraform/cdktf/release/upgrade-guide-v0-18#python-performance-improvements-disable-root-level-provider-imports

I think any other improvements would need to be made on the JSII side, so I would suggest checking if there is a similar issue here: https://github.com/aws/jsii

@brent-at-aam
Author

@DanielMSchmidt Yes I referenced those changes above actually. It seems to have zero effect since the jsii assembly is imported at the root level, which is what slows down the runtime.

My question here really is if the package you are building could be structured differently so that it doesn't generate one large JSII assembly but many.


giner commented Nov 8, 2024

The provider libraries always import all submodules from the root __init__.py, which causes every submodule to be loaded every time. This in turn causes a very slow start time. Here is an example (we import cloudwatch_log_group and check whether alb_target_group also gets loaded):

time python3 -c "import cdktf_cdktf_provider_aws.cloudwatch_log_group; print(type(cdktf_cdktf_provider_aws.alb_target_group))"
<class 'module'>

real	0m24.241s
user	0m24.020s
sys	0m2.288s
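The same check can be made without touching attributes on the package, by looking at `sys.modules` after the import. This is a small sketch; `loaded_submodules` is a hypothetical helper, and `json` stands in for the provider package so the example runs without it installed.

```python
import importlib
import sys


def loaded_submodules(package: str) -> list[str]:
    """List every already-imported module that lives under `package`."""
    prefix = package + "."
    # Copy the keys first so the scan is not affected by concurrent imports.
    return sorted(name for name in list(sys.modules) if name.startswith(prefix))


if __name__ == "__main__":
    # Stand-in for: import cdktf_cdktf_provider_aws.cloudwatch_log_group
    importlib.import_module("json.decoder")
    print(loaded_submodules("json"))
```

Run against the provider, a list containing far more than the one requested submodule would indicate that the root package is importing everything eagerly.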

This looks like a mistake. Fixing it would significantly improve the CDKTF experience for Python users.


giner commented Nov 8, 2024

Here is a similar issue reported on jsii: aws/jsii#3389

@brent-at-aam
Author

Yeah, I think we could sidestep this entirely if the Python provider package were built similarly to something like boto3-stubs, where each subject area is an extra package. I would love to have an interface like:

pip install cdktf-cdktf-provider-aws[s3,iam,lambda]

So this would require that instead of one giant AWS provider package, we would have multiple python packages.

I know these providers are all generated so special tweaks per language like this might be hard to do, but it would seem like it's possible to work within the limitations of jsii, while still delivering a better experience.


giner commented Nov 8, 2024

The problem is not that this is a single package. The issue is that __init__.py imports all submodules, (likely) for no good reason.
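For illustration, a root `__init__.py` can defer submodule imports until first use with a module-level `__getattr__` (PEP 562). The sketch below is an assumption about what such a layout could look like, not how jsii or the provider generator actually emits code: the package name `demo_provider`, the `LAZY_INIT` template, and the `run_demo` helper are all hypothetical, and the package is written to a temporary directory so the example is self-contained.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical root __init__.py: submodules are imported on first attribute
# access (PEP 562) instead of eagerly at package import time.
LAZY_INIT = textwrap.dedent('''
    import importlib

    _SUBMODULES = {"cloudwatch_log_group", "alb_target_group"}

    def __getattr__(name):
        if name in _SUBMODULES:
            return importlib.import_module(f"{__name__}.{name}")
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
''')


def run_demo() -> tuple[bool, bool]:
    """Return whether alb_target_group is loaded before and after first access."""
    # Forget any earlier copy of the demo package so the function is repeatable.
    for name in [m for m in sys.modules if m.split(".")[0] == "demo_provider"]:
        del sys.modules[name]
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_provider"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(LAZY_INIT)
        for sub in ("cloudwatch_log_group", "alb_target_group"):
            (pkg / f"{sub}.py").write_text(f"NAME = {sub!r}\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_provider.cloudwatch_log_group")
            root = sys.modules["demo_provider"]
            before = "demo_provider.alb_target_group" in sys.modules
            _ = root.alb_target_group  # first access triggers the lazy import
            after = "demo_provider.alb_target_group" in sys.modules
        finally:
            sys.path.remove(tmp)
    return before, after


if __name__ == "__main__":
    print(run_demo())
```

With this layout, importing one submodule leaves its siblings unloaded until they are actually referenced, which is the behavior the timing example earlier in the thread was checking for.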

@brent-at-aam
Author

brent-at-aam commented Nov 8, 2024

For sure, the bulk of the time is the module imports. The jsii assembly load has a minimal impact by comparison, but it still slows things down.

I updated the issue description to point to the correct source of the slowdown.
