
Python code is not updated when using Python wheel tasks with existing Cluster ID in the task definition #1050

Closed
FrancoisLem opened this issue Dec 8, 2023 · 2 comments
Labels: DABs (DABs related issues)

@FrancoisLem

Describe the issue

When working with a workflow that uses a python_wheel_task (built with Poetry), modifications to the Python code of the wheel package are not deployed when re-deploying the bundle with `databricks bundle deploy`.

Configuration

```yaml
bundle:
  name: my_bundle

include:
  - ./resources/*.yml # Jobs, models and clusters descriptions

artifacts:
  my-wheel:
    type: whl
    build: poetry build
```

and the job resources:

```yaml
jobs:
  my_job:
    name: my_workflow
    tasks:
      ################## ETL TASK ##########################
      - task_key: "etl_task"
        # job_cluster_key: basic-spark-cluster
        existing_cluster_id: XXXX-YYYYY-2jtbhpqj
        max_retries: 0
        python_wheel_task:
          package_name: my_package
          entry_point: my_entrypoint
          parameters:
            [
              "--config-file-path",
              "/Workspace/${workspace.file_path}/conf/tasks/databricks/main_dbx_config.yml",
              "--mode", "train"
            ]
        libraries:
          - whl: ./dist/my_wheel-*.whl
```

Steps to reproduce the behavior


  1. Run `databricks bundle deploy ...`
  2. Run `databricks bundle run ...`
  3. Modify the source code of the package, e.g. by adding a log statement or a `sys.exit()`
  4. Run `databricks bundle deploy ...` again
  5. Run `databricks bundle run ...` again
  6. Observe that your modifications are not deployed

Expected Behavior

Since my package is built during the bundle deploy step, the modifications should be included and deployed to the cluster.

Actual Behavior

Modifications are not deployed on the existing cluster.

OS and CLI version

WSL Ubuntu 20.04.6 LTS -- Databricks CLI v0.209.1

Is this a regression?

I don't think so.

We use an existing cluster ID in our bundle precisely because we are in a development phase and want to iterate quickly, without waiting for a new job cluster to be provisioned on every deploy, so bumping the version of our Python package on each build is not really an option.

@FrancoisLem added the DABs label on Dec 8, 2023
@andrewnester (Contributor)

Hi @FrancoisLem! This is a limitation on the cluster libraries side: when the wheel is updated, the cluster has to be restarted for the changes to be picked up.
You have 2 options to work around this:

  1. Set the `experimental -> python_wheel_wrapper` option to `true` in your YAML config (see the sketch after this list). For details, see "Make a notebook wrapper for Python wheel tasks optional" #797 and "Added transformation mutator for Python wheel task for them to work on DBR <13.1" #635.
  2. In your Python project, set up the package version to be auto-updated on each build, for example as done here:
     https://github.com/databricks/bundle-examples/blob/main/default_python/setup.py#L18-L20
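
For option 1, a minimal sketch of how that flag is set at the top level of the bundle configuration (the surrounding file layout is illustrative; `python_wheel_wrapper` is the option referenced in the issues linked above):

```yaml
# databricks.yml
experimental:
  # Wraps the Python wheel task in a generated notebook, which installs the
  # freshly built wheel on each run instead of relying on cluster libraries.
  python_wheel_wrapper: true
```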

Hope this helps.

@JonasDev1

We handled it with an automatic Poetry version bump:

```yaml
artifacts:
  default:
    type: whl
    build: poetry version patch && poetry build
    path: .
```
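
As a note on why this works: `poetry version patch` bumps the patch version in `pyproject.toml` before each build, so every deploy produces a wheel with a new filename and version, and the existing cluster installs the fresh artifact instead of reusing the previously installed one. This pairs naturally with a glob such as `./dist/*.whl` in the task's `libraries` section (the exact glob is illustrative), so the wheel reference does not have to be updated by hand.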
