
[BUG]: Job failed using spark-submit with parameters when the job runs in Azure Databricks. INIT_SCRIPT_FAILURE (CLIENT_ERROR). #1182

Open
bmorshed opened this issue Sep 1, 2024 · 0 comments
Labels
bug Something isn't working

bmorshed commented Sep 1, 2024

I created a simple .NET Core 3.1 project and ran it locally without problems. I then wanted to run this Microsoft .NET Core 3.1 application as a job on an Azure Databricks cluster. I followed the instructions in the Microsoft documentation below, but the job failed.

https://learn.microsoft.com/en-us/previous-versions/dotnet/spark/tutorials/databricks-deployment

Cluster configuration

  • Single node
  • Databricks runtime version: Runtime 10.4 (Scala 2.12, Spark 3.2.1)
  • Node type: Standard_D4ds_v5
  • Advanced Options: Init Scripts (only Workspace and ABFSS destinations are offered; there is no DBFS option, and a warning message is shown)

InitScript

(screenshot of the cluster init script configuration)

I copied db-init.sh to the Workspace -> Shared folder because there is no DBFS option.
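For reference, the db-init.sh from the tutorial is essentially a small wrapper that copies install-worker.sh out of DBFS and runs it on each node. A minimal sketch of what mine does (paths match my /dbfs/spark-dotnet upload; the exact arguments reflect my reading of the tutorial, not a verbatim copy):

```bash
#!/bin/bash
# Sketch of db-init.sh per the deployment tutorial (not verbatim).
# Copies the worker installation script out of DBFS and runs it on the node.
cp /dbfs/spark-dotnet/install-worker.sh /tmp/install-worker.sh
chmod +x /tmp/install-worker.sh

# Arguments: cloud provider, path to the Microsoft.Spark.Worker release, install destination.
/tmp/install-worker.sh azure /dbfs/spark-dotnet/Microsoft.Spark.Worker.netcoreapp3.1.linux-x64-2.1.1.tar.gz /usr/local/bin
```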

Job configuration

  • Task name: testjob
  • Type: Spark Submit
  • Cluster: the cluster created above
  • Parameters (the equivalent spark-submit invocation is sketched after this list):
    ["--class", "org.apache.spark.deploy.dotnet.DotnetRunner",
    "/dbfs/spark-dotnet/microsoft-spark-3-2_2.12-2.1.1.jar",
    "/dbfs/spark-dotnet/HelloSparkCore31.zip", "HelloSparkCore31"]

Publish application:

  • Published with VS2019 publish, using the self-contained option and the win-x64 runtime (a CLI sketch of the same settings is below)
  • Created a zip file from the publish output
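A rough dotnet CLI equivalent of those VS2019 publish settings (the project file name HelloSparkCore31.csproj is mine; the flags are standard dotnet publish options):

```bash
# CLI equivalent of the VS2019 publish settings: self-contained, win-x64, netcoreapp3.1
dotnet publish HelloSparkCore31.csproj \
  -c Release \
  -f netcoreapp3.1 \
  -r win-x64 \
  --self-contained true \
  -o ./publish

# The publish output folder is then zipped into HelloSparkCore31.zip
```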

Databricks /dbfs/spark-dotnet folder contents, as per the MS documentation (copied up with the Databricks CLI; see the sketch after this list):
- db-init.sh
- install-worker.sh
- microsoft-spark-3-2_2.12-2.1.1.jar
- Microsoft.Spark.Worker.netcoreapp3.1.linux-x64-2.1.1.tar.gz
- HelloSparkCore31.zip
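Roughly how I copied the files up (Databricks CLI, assuming it is already configured against the workspace):

```bash
# Copy the deployment files into dbfs:/spark-dotnet (CLI must already be authenticated)
databricks fs cp db-init.sh dbfs:/spark-dotnet/db-init.sh
databricks fs cp install-worker.sh dbfs:/spark-dotnet/install-worker.sh
databricks fs cp microsoft-spark-3-2_2.12-2.1.1.jar dbfs:/spark-dotnet/microsoft-spark-3-2_2.12-2.1.1.jar
databricks fs cp Microsoft.Spark.Worker.netcoreapp3.1.linux-x64-2.1.1.tar.gz dbfs:/spark-dotnet/Microsoft.Spark.Worker.netcoreapp3.1.linux-x64-2.1.1.tar.gz
databricks fs cp HelloSparkCore31.zip dbfs:/spark-dotnet/HelloSparkCore31.zip
```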

Job creation is okay, but I get an exception when the job starts.

Exception as per job output

Cluster '0901-204609-m9yiukve' was terminated. Reason: INIT_SCRIPT_FAILURE (CLIENT_ERROR). Parameters: instance_id:93d671e9f0884221b689a09b125d2655, databricks_error_message:Cluster scoped init script /Shared/db-init.sh failed: Script exit status is non-zero.

(screenshot of the job run output showing the failure)

I am still at the learning stage with Databricks. I have searched Google a lot but could not resolve this.

Any kind of help or hints would be greatly appreciated.

bmorshed added the bug label Sep 1, 2024