Include number of executors per node in cluster information #1119

Merged
5 commits merged into NVIDIA:dev from spark-rapids-tools-1117 on Jun 17, 2024

Conversation

@parthosa (Collaborator) commented Jun 13, 2024

Fixes #1117. Currently, for cluster information, we calculate the number of nodes correctly but do not track the number of executors per node. This can generate incorrect GPU cluster recommendations because a node may host multiple executors.

This PR adds numExecsPerNode to the cluster information output file.

Changes:

Core/Java:

  • Calculate numExecsPerNode as the maximum number of executors on any host (a minimal sketch of this follows the list).
  • Update ClusterInfo and related methods.
  • Add Num Executor Per Node as a new field in the cluster information output CSV file.
  • Update unit tests to include a case with multiple executors on a single node.
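
A minimal sketch of the per-host computation, not the exact tool code (the helper name is made up for illustration):

// Assumes the host name of every executor is available as a sequence of strings.
// The first value is the maximum number of executors observed on any single host
// (numExecsPerNode); the second is the number of distinct hosts.
def summarizeExecutorHosts(executorHosts: Seq[String]): (Int, Int) = {
  if (executorHosts.isEmpty) {
    (0, 0)
  } else {
    val execsByHost = executorHosts.groupBy(identity)
    (execsByHost.values.map(_.size).max, execsByHost.size)
  }
}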

Output

Cluster Information Generated from Core:

File: rapids_4_spark_qualification_output_cluster_information.json

Previously:

[ {
  "appName" : "NDS - Power Run",
  "appId" : "application_169685947xxxxx",
  "eventLogPath" : "file:/Users/psarthi/Work/event-logs/xxxxxx",
  "clusterInfo" : {
    "vendor" : "dataproc",
    "coresPerExecutor" : 16,
    "numExecutorNodes" : 4,
    "driverHost" : "xxxx-dataproc-cpu-m.c.xxxx.internal"
  }
} ]

After this fix:

[ {
  "appName" : "NDS - Power Run",
  "appId" : "application_169685947xxxxx",
  "eventLogPath" : "file:/Users/psarthi/Work/event-logs/xxxxxx",
  "clusterInfo" : {
    "vendor" : "dataproc",
    "coresPerExecutor" : 16,
    "numExecsPerNode" : 6,
    "numExecutorNodes" : 4,
    "driverHost" : "xxxx-dataproc-cpu-m.c.xxxx.internal"
  }
} ]

Follow Up

Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@parthosa added the bug, user_tools, and core_tools labels Jun 13, 2024
@parthosa self-assigned this Jun 13, 2024
Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@parthosa changed the title from "Include number of executors per node in cluster inference" to "Include number of executors per node in cluster information" Jun 13, 2024
@parthosa removed the user_tools label Jun 14, 2024
@parthosa marked this pull request as ready for review June 14, 2024 00:08
@@ -865,9 +865,13 @@ class QualificationAppInfo(
logWarning(s"Application $appId: Cluster with variable executor cores detected. " +
s"Using maximum value.")
}
// Group by host name, find max executors per host, and number of unique hosts
val groupedHosts = activeHosts.groupBy(identity)
tgravescs (Collaborator) commented:

So we don't want to just use activeHosts; we want to use all hosts to determine execs per node. That is more likely to be right because, as an application finishes, if it has dynamic allocation on it could release executors, and thus the final active count could be inaccurate.

Put the number of executor nodes back to the active hosts for now. Really we should try to do some timelining of this to see what the max number in use at any time was, but filed #1121 to follow up on that.

tgravescs (Collaborator) commented:

Also, it would be good to get an event log where dynamic allocation is disabled and the number of execs/hosts changes over time. This is relatively easy in interactive mode, where you just run something, wait for the execs to idle timeout, then maybe run something again, maybe smaller.

parthosa (Collaborator, Author) commented:

"we don't want to just use activeHosts, we want to use all hosts to determine execs per node."

Updated the code to use all hosts.

"Put the number of executor nodes back to the active hosts for now"

Reverted to the original logic.
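
In other words, the two metrics are now derived from different host sets. A rough sketch of the idea, with illustrative names rather than the actual tool internals:

// Assumes we track every host that ever ran an executor (allHosts) separately
// from the hosts with executors still active at the end (activeHosts).
// Execs per node is derived from all hosts; the node count stays based on the
// active hosts, as discussed above.
case class HostCounts(numExecsPerNode: Int, numExecutorNodes: Int)

def computeHostCounts(allHosts: Seq[String], activeHosts: Seq[String]): HostCounts = {
  val maxExecsPerNode =
    if (allHosts.isEmpty) 0
    else allHosts.groupBy(identity).values.map(_.size).max
  HostCounts(maxExecsPerNode, activeHosts.distinct.size)
}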

@parthosa (Collaborator, Author) commented Jun 14, 2024:

Created a scenario for this (I think this meant dynamic allocation to be enabled):

"get a event log where dynamic allocation is disabled and the number of execs/hosts change over time"

[screenshot omitted]

Steps:

  1. Create a Dataproc cluster with 4 n1-standard-16 worker nodes.
  2. Start a spark shell with --conf spark.dynamicAllocation.enabled=true --conf spark.executor.instances=8 --conf spark.executor.cores=8
  3. Run a large spark application. UI shows all 8 executors are active (2 per worker node).
  4. Wait for 30 min; 7 executors are dead.
  5. Run a small spark application. UI shows 1 executor is active.

Now, since we are calculating execs from all hosts, we get the correct number of execs per host (2) instead of 1.
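
For reference, the workloads in steps 3 and 5 could look like the following spark-shell input; the PR does not include the exact jobs, so these are placeholders:

// Hypothetical spark-shell commands; the actual jobs used are not shown in the PR.
// Step 3: a large job that keeps all 8 executors busy.
spark.range(0L, 50000000000L, 1L, 512).selectExpr("sum(id)").show()
// Step 4: let the shell sit idle until dynamic allocation removes idle executors.
// Step 5: a small job that needs only a single executor/task.
spark.range(0L, 1000L, 1L, 1).count()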

Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@parthosa requested a review from tgravescs June 14, 2024 19:51
Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
tgravescs previously approved these changes Jun 14, 2024

@tgravescs (Collaborator) left a comment:

do we have any tests with the dynamic allocation test you had in the description?

Signed-off-by: Partho Sarthi <psarthi@nvidia.com>
@parthosa (Collaborator, Author) commented:

"do we have any tests with the dynamic allocation test you had in the description?"

Added a unit test for dynamic allocation with a comment.

@tgravescs (Collaborator) commented:

"We cannot perform coresPerNode = numExecsPerNode * coresPerExecutor since a node may be oversubscribed."

There is no way for us to know this on some platforms like YARN, where we don't know what it schedules by. For now we will just have to assume the node is not oversubscribed, but document that and warn the user.
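
Under that assumption, the node size can only be estimated from the executor layout. A small sketch of the calculation, with illustrative names rather than the actual tool code:

// If executors do not oversubscribe the node, numExecsPerNode * coresPerExecutor
// is a lower bound on the node's core count. On YARN this cannot be verified from
// the event log, so the result should be documented as an estimate and a warning
// emitted to the user.
def estimatedCoresPerNode(numExecsPerNode: Int, coresPerExecutor: Int): Int =
  numExecsPerNode * coresPerExecutor

// Using the reproduction above (2 executors per node, 8 cores each), this yields
// 16 cores per node, matching the n1-standard-16 workers.
val cores = estimatedCoresPerNode(numExecsPerNode = 2, coresPerExecutor = 8)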

@tgravescs merged commit 18e8e7f into NVIDIA:dev Jun 17, 2024
17 checks passed
amahussein added a commit to amahussein/spark-rapids-tools that referenced this pull request Jun 25, 2024:

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>

This is a follow-up to NVIDIA#1119

This fix only works for Dataproc when the cluster argument is not in the CLI.
The instance type will be set after multiplying `numExecutorCores` by `numExecutorsPerNode`.
@parthosa deleted the spark-rapids-tools-1117 branch October 9, 2024 17:11
Labels: bug (Something isn't working), core_tools (Scope the core module (scala))
Projects: None yet
Development: Successfully merging this pull request may close these issues:
[BUG] Cluster Information: Include number of executors per node
2 participants