Possible improvement for the OrtEngine #2095
Comments
@andreabrduque
DJL will pick the initialized OrtEnvironment.
Hey @frankfliu, thanks for the reply. The problem with that is that I need to be able to set the global ThreadingOptions at the moment the OrtEnvironment is created. I can make a PR to add that option, but if users try to use the environment after it has already been initialized elsewhere, the option can no longer be applied.
Description
I think it would be interesting to add support for disabling per-session thread pools for ONNX models, using a shared global thread pool instead.
As far as my understanding of DJL goes, when we create one predictor per thread for different models, we end up running several ONNX sessions in parallel.
In my benchmarks, being able to control the thread pool when multiple ONNX sessions run in parallel offers slightly better performance and better resource utilisation for some models. A sketch of the corresponding ONNX Runtime API calls follows below.
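For reference, here is a minimal sketch of what this looks like with the ONNX Runtime Java API directly; the thread counts, environment name, and model path are placeholder assumptions:

```java
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtEnvironment.ThreadingOptions;
import ai.onnxruntime.OrtLoggingLevel;
import ai.onnxruntime.OrtSession;

public class SharedThreadPoolSketch {
    public static void main(String[] args) throws Exception {
        // Global thread pool settings must be supplied when the environment
        // is created; they cannot be attached to an existing environment.
        ThreadingOptions threadingOptions = new ThreadingOptions();
        threadingOptions.setGlobalIntraOpNumThreads(4); // placeholder value
        threadingOptions.setGlobalInterOpNumThreads(1); // placeholder value

        OrtEnvironment env =
                OrtEnvironment.getEnvironment(
                        OrtLoggingLevel.ORT_LOGGING_LEVEL_WARNING, "shared-pool", threadingOptions);

        try (OrtSession.SessionOptions opts = new OrtSession.SessionOptions()) {
            // Opt this session out of its own thread pools so it uses the global one.
            opts.disablePerSessionThreads();
            try (OrtSession session = env.createSession("model.onnx", opts)) {
                // ... run inference with session.run(...) ...
            }
        }
    }
}
```

Every session created with disablePerSessionThreads() then shares the environment's global intra-op and inter-op pools instead of spinning up its own.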
Will this change the current API? How?
I thought about exposing the option disablePerSessionThreads as a setter, the same way the other SessionOptions are exposed. However, it would also require passing the global thread pool settings through OrtEnvironment.ThreadingOptions.
The part where I got really stuck is that OrtEnvironment.getEnvironment() is called in both the OrtEngine and OrtNDManager implementations. According to the ONNX Runtime Java API, there is no way to guarantee that the environment has the appropriate thread pool configuration: if I pass ThreadingOptions to an environment and then retrieve it again, I will get an IllegalStateException. I hacked a bit here.
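To make the ordering problem concrete, here is a minimal sketch of the failure mode described above, assuming the singleton behaviour of the onnxruntime Java environment; the class and environment names are illustrative:

```java
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtEnvironment.ThreadingOptions;
import ai.onnxruntime.OrtLoggingLevel;

public class EnvironmentOrderingSketch {
    public static void main(String[] args) throws Exception {
        // Somewhere early (e.g. in OrtNDManager), the default environment is created:
        OrtEnvironment env = OrtEnvironment.getEnvironment();

        // A later attempt to recreate it with global thread pool settings fails,
        // because the environment is a singleton and cannot be reconfigured:
        ThreadingOptions threadingOptions = new ThreadingOptions();
        threadingOptions.setGlobalIntraOpNumThreads(4); // placeholder value
        OrtEnvironment tooLate =
                OrtEnvironment.getEnvironment(
                        OrtLoggingLevel.ORT_LOGGING_LEVEL_WARNING, "too-late", threadingOptions);
        // the call above throws IllegalStateException before reaching this line
    }
}
```

This is why the fix cannot live only in SessionOptions: whichever of OrtEngine or OrtNDManager touches the environment first fixes its configuration for the lifetime of the process.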
Who will benefit from this enhancement?
This benefits the use case in which more than one model instance is used in parallel (for example, loading one ZooModel per GPU across 4 GPUs).
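For context, a hypothetical sketch of that setup with the DJL API; the model path, input/output types, and GPU count are assumptions:

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

import ai.djl.Device;
import ai.djl.ndarray.NDList;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class MultiGpuSketch {
    public static void main(String[] args) throws Exception {
        List<ZooModel<NDList, NDList>> models = new ArrayList<>();
        for (int i = 0; i < 4; i++) { // one model instance per GPU
            Criteria<NDList, NDList> criteria =
                    Criteria.builder()
                            .setTypes(NDList.class, NDList.class)
                            .optModelPath(Paths.get("model.onnx")) // placeholder path
                            .optEngine("OnnxRuntime")
                            .optDevice(Device.gpu(i))
                            .build();
            models.add(criteria.loadModel());
        }
        // Each worker thread would then call models.get(i).newPredictor() and
        // run inference. Every loaded model owns its own ONNX session, so the
        // sessions execute in parallel, each with its own thread pools today,
        // or with a shared global pool under the proposed option.
    }
}
```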