Fix DBConnect support in VS Code #1253
Conversation
try:
    from databricks.connect import DatabricksSession
    return DatabricksSession.builder.getOrCreate()
except ImportError:
This really doesn't seem like it should be how we recommend customers write their code...
I agree, but until we have something better I'd rather be explicit and verbose than drop support for old DBRs or hide it in non-standard libraries.
This does seem like a very specific factory pattern we want users to follow. How about moving it inside the main function and not having the get_spark function? That should make it very clear that this is not intended to be used everywhere and is here only to enable per-file runs.
Am I correct to understand that the only case in which `DatabricksSession.builder.getOrCreate()` will fail is when the user is targeting DBR < 13?
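For context, the pattern under discussion is roughly the following. This is a minimal sketch, not the exact template code: Databricks Connect (and `DatabricksSession`) is only available on DBR 13+ / when `databricks-connect` is installed, so the helper falls back to a plain PySpark session otherwise.

```python
# Sketch of the get_spark() factory discussed above (illustrative, not the
# exact template code). Prefer Databricks Connect when it is installed,
# otherwise fall back to a plain PySpark session for older DBRs or when
# running directly on a cluster.
def get_spark():
    try:
        from databricks.connect import DatabricksSession

        return DatabricksSession.builder.getOrCreate()
    except ImportError:
        from pyspark.sql import SparkSession

        return SparkSession.builder.getOrCreate()
```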
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1253      +/-   ##
==========================================
- Coverage   52.52%   51.74%   -0.78%
==========================================
  Files         308      314       +6
  Lines       17589    17864     +275
==========================================
+ Hits         9238     9244       +6
- Misses       7657     7919     +262
- Partials      694      701       +7

☔ View full report in Codecov by Sentry.
CLI:
 * The SDK update fixes `fs cp` calls timing out when copying large files.

Bundles:
 * Fix summary command when internal Terraform config doesn't exist ([#1242](#1242)).
 * Configure cobra.NoArgs for bundle commands where applicable ([#1250](#1250)).
 * Fixed building Python artifacts on Windows with WSL ([#1249](#1249)).
 * Add `--validate-only` flag to run validate-only pipeline update ([#1251](#1251)).
 * Only transform wheel libraries when using trampoline ([#1248](#1248)).
 * Return `application_id` for service principal lookups ([#1245](#1245)).
 * Support relative paths in artifact files source section and always upload all artifact files ([#1247](#1247)).
 * Fix DBConnect support in VS Code ([#1253](#1253)).

Internal:
 * Added test to verify scripts.Execute mutator works correctly ([#1237](#1237)).

API Changes:
 * Added `databricks permission-migration` command group.
 * Updated nesting of the `databricks settings` and `databricks account settings` commands.
 * Changed `databricks vector-search-endpoints delete-endpoint` command with new required argument order.
 * Changed `databricks vector-search-indexes create-index` command with new required argument order.
 * Changed `databricks vector-search-indexes delete-data-vector-index` command with new required argument order.
 * Changed `databricks vector-search-indexes upsert-data-vector-index` command with new required argument order.

OpenAPI commit d855b30f25a06fe84f25214efa20e7f1fffcdf9e (2024-03-04)

Dependency updates:
 * Bump github.com/stretchr/testify from 1.8.4 to 1.9.0 ([#1252](#1252)).
 * Update Go SDK to v0.34.0 ([#1256](#1256)).
Changes
With the current template, we can't execute the Python file and the jobs notebook using DBConnect from VS Code because we import `SparkSession` from `pyspark.sql`, which doesn't support Databricks unified auth. This PR fixes this by passing `spark` into the library code and by explicitly instantiating a Spark session where the `spark` global is not available. A sketch of the resulting layout follows below.

Other changes:
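The sketch below illustrates the approach described above; the function and table names (`get_taxis`, `samples.nyctaxi.trips`) are illustrative, not necessarily the exact template contents. Library code takes `spark` as a parameter, and the entry point creates the session only where no `spark` global exists, for example when running the file locally from VS Code.

```python
from pyspark.sql import DataFrame, SparkSession


def get_taxis(spark: SparkSession) -> DataFrame:
    # Library code never builds its own session; it uses the one passed in.
    return spark.read.table("samples.nyctaxi.trips")


def main():
    # Prefer Databricks Connect when it is installed (local runs from VS Code);
    # otherwise fall back to a plain PySpark session (e.g. older DBRs).
    try:
        from databricks.connect import DatabricksSession

        spark = DatabricksSession.builder.getOrCreate()
    except ImportError:
        spark = SparkSession.builder.getOrCreate()

    get_taxis(spark).show(5)


if __name__ == "__main__":
    main()
```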