Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add validation for single node clusters #1909

Merged
merged 6 commits into from
Nov 22, 2024
Merged

Conversation

shreyas-goenka
Copy link
Contributor

@shreyas-goenka shreyas-goenka commented Nov 18, 2024

Changes

This PR adds a warning validating that the configuration for a single node cluster is valid for interactive, job, job-task, and pipeline clusters.

Note: We skip the validation if a cluster policy is configured because the policy is likely to configure spark_conf / custom_tags itself.

Note: Terrform originally only had validation for interactive, job, and job-task clusters. This PR adding the validation for pipeline clusters as well is new.

This PR follows the same logic as we used to have in Terraform. The validation was removed from Terraform because we had no way to demote the error to a warning:
databricks/terraform-provider-databricks#4222

Background

Single-node clusters require spark_conf and custom_tags to be correctly set in the cluster definition for them to function optimally. The cluster will be created even if incorrectly configured, but its performance will not be great.

For example, if both spark_conf and custom_tags are not set and num_workers is 0, then only the driver process will be launched on the cluster compute instance thus leading to sub-optimal utilization of available compute resources and no parallelization across worker processes when processing a spark query.

Issue

This PR addresses some issues reported in #1546

Tests

Unit tests and manually.

Example output of the warning:

➜  bundle-playground git:(master) ✗ cli bundle validate
Warning: Single node cluster is not correctly configured
  at resources.pipelines.bar.clusters[0]
  in databricks.yml:29:11

num_workers should be 0 only for single-node clusters. To create a
valid single node cluster please ensure that the following properties
are correctly set in the cluster specification:

  spark_conf:
    spark.databricks.cluster.profile: singleNode
    spark.master: local[*]

  custom_tags:
    ResourceClass: SingleNode
  

Name: foobar
Target: default
Workspace:
  User: shreyas.goenka@databricks.com
  Path: /Workspace/Users/shreyas.goenka@databricks.com/.bundle/foobar/default

Found 1 warning

}
}

func TestValidateSingleNodePipelineClustersFail(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We duplicate the test cases for pipeline clusters because they have a separate proto / Go SDK type for pipeline clusters that mirrors the spec from the cluster APIs.

Duplicating the validation code and test seemed easier and more robust than generalizing the code by converting between the two cluster structs or having a common representation.

bundle/config/validate/single_node_cluster.go Outdated Show resolved Hide resolved
bundle/config/validate/single_node_cluster.go Outdated Show resolved Hide resolved
bundle/config/validate/single_node_cluster.go Outdated Show resolved Hide resolved
Copy link

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 1909
  • Commit SHA: 382a4efd6efb98adbc7532a3286d81521c899843

Checks will be approved automatically on success.

@eng-dev-ecosystem-bot
Copy link
Collaborator

Test Details: go/deco-tests/11974490258

@shreyas-goenka shreyas-goenka added this pull request to the merge queue Nov 22, 2024
Merged via the queue into main with commit b323703 Nov 22, 2024
10 checks passed
@shreyas-goenka shreyas-goenka deleted the warning/single-node branch November 22, 2024 15:55
pietern added a commit that referenced this pull request Dec 5, 2024
**New features for Databricks Asset Bundles:**

This release adds support for managing Unity Catalog volumes as part of your bundle configuration.

Bundles:
 * Add DABs support for Unity Catalog volumes ([#1762](#1762)).
 * Support lookup by name of notification destinations ([#1922](#1922)).
 * Extend "notebook not found" error to warn about missing extension ([#1920](#1920)).
 * Skip sync warning if no sync paths are defined ([#1926](#1926)).
 * Add validation for single node clusters ([#1909](#1909)).
 * Fix segfault in bundle summary command ([#1937](#1937)).
 * Add the `bundle_uuid` helper function for templates ([#1947](#1947)).
 * Add default value for `volume_type` for DABs ([#1952](#1952)).
 * Properly read Git metadata when running inside workspace ([#1945](#1945)).
 * Upgrade TF provider to 1.59.0 ([#1960](#1960)).

Internal:
 * Breakout variable lookup into separate files and tests ([#1921](#1921)).
 * Add golangci-lint v1.62.2 ([#1953](#1953)).

Dependency updates:
 * Bump golang.org/x/term from 0.25.0 to 0.26.0 ([#1907](#1907)).
 * Bump github.com/Masterminds/semver/v3 from 3.3.0 to 3.3.1 ([#1930](#1930)).
 * Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 ([#1932](#1932)).
 * Bump github.com/databricks/databricks-sdk-go from 0.51.0 to 0.52.0 ([#1931](#1931)).
github-merge-queue bot pushed a commit that referenced this pull request Dec 5, 2024
**New features for Databricks Asset Bundles:**

This release adds support for managing Unity Catalog volumes as part of
your bundle configuration.

Bundles:
* Add DABs support for Unity Catalog volumes
([#1762](#1762)).
* Support lookup by name of notification destinations
([#1922](#1922)).
* Extend "notebook not found" error to warn about missing extension
([#1920](#1920)).
* Skip sync warning if no sync paths are defined
([#1926](#1926)).
* Add validation for single node clusters
([#1909](#1909)).
* Fix segfault in bundle summary command
([#1937](#1937)).
* Add the `bundle_uuid` helper function for templates
([#1947](#1947)).
* Add default value for `volume_type` for DABs
([#1952](#1952)).
* Properly read Git metadata when running inside workspace
([#1945](#1945)).
* Upgrade TF provider to 1.59.0
([#1960](#1960)).

Internal:
* Breakout variable lookup into separate files and tests
([#1921](#1921)).
* Add golangci-lint v1.62.2
([#1953](#1953)).

Dependency updates:
* Bump golang.org/x/term from 0.25.0 to 0.26.0
([#1907](#1907)).
* Bump github.com/Masterminds/semver/v3 from 3.3.0 to 3.3.1
([#1930](#1930)).
* Bump github.com/stretchr/testify from 1.9.0 to 1.10.0
([#1932](#1932)).
* Bump github.com/databricks/databricks-sdk-go from 0.51.0 to 0.52.0
([#1931](#1931)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants