Hey all, could you check this discussion? I'm trying to regroup all the threads on that topic in one place.
-
Background
When the connection method is session for the Spark adapter and the user wants to pass specific application-level properties, the only option they have is to change the properties in the spark-defaults.conf file under the SPARK_HOME path. Users should have a way to pass these parameters so that they are taken into account when the SparkSession is initialized.
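For illustration, today these application-level settings live only in spark-defaults.conf; the property names below are standard Spark settings, and the values are made up:

```
spark.driver.memory             2g
spark.executor.memory           4g
spark.sql.shuffle.partitions    200
```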
Currently supported ways to achieve this:
If the user needs to provide dynamically configurable Spark properties, they can be passed using the pre_hook tag in model configurations (see the sketch below). But suppose there are multiple dbt projects and the user wants to pass application-specific memory configurations for each of them; in that case, the only option is to change the config in spark-defaults.conf, which is not feasible.
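A minimal sketch of the pre_hook workaround (the model and property value are hypothetical). Note that SET only affects session-scoped SQL configs; properties fixed at session startup, such as executor memory, cannot be changed this way, which is exactly the limitation described above:

```sql
-- models/my_model.sql (hypothetical model)
{{ config(pre_hook="SET spark.sql.shuffle.partitions=64") }}

select 1 as id
```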
What can be done
Currently, server_side_parameters can be given in the profile, according to this PR. So code changes could be made to read that info and apply it when starting the Spark session in sessions.py (a sketch follows). But suppose multiple dbt projects refer to the same profile while their application properties or memory requirements differ; in that case the user again has to intervene and update the profiles.yml file, which is not what we want.
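A minimal sketch of what that profile could look like, assuming server_side_parameters is honored by the session connection method (the profile name, schema, and property values are illustrative):

```yaml
# profiles.yml (illustrative)
spark_profile:
  target: dev
  outputs:
    dev:
      type: spark
      method: session
      schema: analytics
      server_side_parameters:
        "spark.driver.memory": "2g"
        "spark.executor.memory": "4g"
```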
Proposal
I'm planning to introduce a feature to support the above use case: a new, generalized config file for adding custom properties that are specific to individual dbt projects. This way, the above use cases would be satisfied. Please evaluate this and let's discuss it.
I'm planning to contribute this feature to open source and need help from the community to evaluate and finalize a solution; a rough sketch of the idea follows.
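A minimal sketch, assuming the session is built with PySpark and the per-project file is named spark_properties.yml (the file name, location, and helper function are hypothetical, not an agreed design):

```python
# Hypothetical sketch of the proposal: read a per-project properties file and
# apply its entries before the SparkSession is created, so that startup-time
# settings (e.g. spark.driver.memory) take effect per dbt project.
import os

import yaml  # assumes PyYAML is available
from pyspark.sql import SparkSession


def build_spark_session(project_dir: str) -> SparkSession:
    builder = SparkSession.builder.appName("dbt-spark-session")
    props_path = os.path.join(project_dir, "spark_properties.yml")  # hypothetical name
    if os.path.exists(props_path):
        with open(props_path) as f:
            props = yaml.safe_load(f) or {}
        for key, value in props.items():
            # Applied before getOrCreate(), so these become part of the
            # session's initial configuration.
            builder = builder.config(key, str(value))
    return builder.getOrCreate()
```

Because each dbt project would carry its own file, multiple projects sharing one profiles.yml could still get different memory settings without anyone editing spark-defaults.conf or the shared profile.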