I've been trying to send a steady stream of up to 50 concurrent jobs through a large all-purpose driver node (in Azure Databricks). However, the jobs come back with unexpected durations (overall time from start to finish). I'm wondering if .NET for Spark is synchronizing them in some artificial way, or imposing some concurrency restriction. I'm using v1.1.1. I noticed the following message, which doesn't make much sense to me. I will investigate the code as well, but thought I would ask in case anyone knows what it means.
This message may be the reason for the unexpected durations of my jobs. Any help would be appreciated.
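For reference, the submission pattern I'm using looks roughly like the sketch below. The `Range`/`Count` body is only a placeholder for my real workload; the point is that up to 50 jobs are fanned out onto thread-pool tasks against one shared session:

```csharp
using System.Threading.Tasks;
using Microsoft.Spark.Sql;

class ConcurrentJobs
{
    static void Main()
    {
        // One shared session on the Databricks driver.
        SparkSession spark = SparkSession.Builder().GetOrCreate();

        // Fan out up to 50 jobs; each task triggers a Spark action.
        var tasks = new Task[50];
        for (int i = 0; i < tasks.Length; i++)
        {
            tasks[i] = Task.Run(() =>
                // Placeholder action standing in for the real job body.
                spark.Range(0, 1_000_000).Count());
        }
        Task.WaitAll(tasks);
    }
}
```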
-
I found the environment variable that controls the message above: it is "DOTNET_SPARK_NUM_BACKEND_THREADS", read in ConfigurationService. Oddly enough, a different environment variable is used on the Scala side:
Is this intentional? Here are logs from the .NET side and the JVM side, with seemingly contradictory information:
Any pointers would be appreciated. I think I will start setting both environment variables for good measure (sketched below).
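In case it helps anyone, this is roughly what I mean by setting both. It's a sketch only: on Databricks these values would more likely be set at the cluster level (Advanced Options > Environment Variables) so that both processes see them, and the Scala-side name below is a placeholder since I haven't confirmed what the JVM actually reads:

```csharp
using System;

// Set as early as possible in the driver process, before the SparkSession
// is created. The .NET-side name is the one found in ConfigurationService.
Environment.SetEnvironmentVariable("DOTNET_SPARK_NUM_BACKEND_THREADS", "50");

// Placeholder: substitute the variable name the Scala side actually reads.
// Environment.SetEnvironmentVariable("<SCALA_SIDE_VARIABLE>", "50");
```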