DGX Nightly Benchmark run 20201201 #37

Open
quasiben opened this issue Dec 1, 2020 · 2 comments

quasiben commented Dec 1, 2020

Historical Throughput

Benchmark Image

Raw Data

<Client: 'tcp://127.0.0.1:37017' processes=10 threads=10, memory=540.94 GB>
Distributed Version: 2.31.0.dev0+58.g863b0105
simple
5.971e-01 +/- 4.167e-02
shuffle
2.224e+01 +/- 5.193e-01
rand_access
1.039e-02 +/- 1.207e-03
anom_mean
9.748e+01 +/- 1.237e+00

Raw Values

simple
[0.63938522 0.56597996 0.56482959 0.5679853 0.63865495 0.58807492
0.61041498 0.64601135 0.51409435 0.63523746]
shuffle
[21.48743486 22.00875306 21.55290961 21.57217765 22.52186704 22.36727738
22.85407138 22.73326921 22.85354495 22.49844527]
rand_access
[0.01017618 0.00788522 0.01258779 0.00999117 0.01085591 0.01135015
0.00982213 0.01149797 0.00977087 0.0099709 ]
anom_mean
[99.054811 97.62341428 98.87981558 94.83508325 97.62352633 97.95679402
97.43829679 95.69920492 98.11229801 97.60029459]
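The summary lines under Raw Data appear to be the mean and the population standard deviation (ddof=0) of the ten raw values per benchmark. A minimal sketch to check this for the `simple` benchmark, using only the Python standard library; the helper name `summarize` is hypothetical and not part of the benchmark script:

```python
from statistics import fmean, pstdev

# Raw timings for the "simple" benchmark, copied from the Raw Values above.
simple = [
    0.63938522, 0.56597996, 0.56482959, 0.5679853, 0.63865495,
    0.58807492, 0.61041498, 0.64601135, 0.51409435, 0.63523746,
]


def summarize(values):
    """Format timings as 'mean +/- std' using the population standard deviation."""
    # Assumption: the report uses ddof=0 (the numpy.std default), which pstdev matches.
    return f"{fmean(values):.3e} +/- {pstdev(values):.3e}"


print(summarize(simple))  # 5.971e-01 +/- 4.167e-02
```

The same helper can be applied to the `shuffle`, `rand_access`, and `anom_mean` lists to compare against their reported summaries.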

Dask Profiles

Scheduler Execution Graph

Sched Graph Image

Client Execution Graph

Client Graph Image

@jakirkham

JFYI, this is the first profile since we started compiling parts of Distributed with Cython; previously we were running pure Python. We are still working on optimizing Distributed, though, so I wouldn't expect this to change the numbers much.

@jakirkham

Interestingly, this is the first run where shuffle takes a bit less time. In addition to enabling Cythonization, we optimized check_idle_saturated (dask/distributed#4289), a slow function that plays a role in the two slowest transitions: transition_processing_memory and transition_waiting_processing. We also annotated ClientState and typed all variables holding a ClientState (dask/distributed#4290).
