Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible improvement for the OrtEngine #2095

Closed
andreabrduque opened this issue Oct 20, 2022 · 2 comments · Fixed by #2104
Closed

Possible improvement for the OrtEngine #2095

andreabrduque opened this issue Oct 20, 2022 · 2 comments · Fixed by #2104
Labels
enhancement New feature or request

Comments

@andreabrduque
Copy link
Contributor

andreabrduque commented Oct 20, 2022

Description

I think it would be interesting to add support for onnx models to disable session threadpools by using a shared global thread pool.

As far as my understanding of DJL goes, when we create one predictor of different models for each thread, we end up running several onnx sessions in parallel.

In my benchmarks being able to control the threadpool when multiple onnx sessions are run in parallel offers slightly better performance and better resource utilisation for some models.

Will this change the current api? How?

I thought about passing the option disablePerSessionThreads as a setter, such as done for the other options from SessionOptions.

However It would also require to pass global thread pool settings through the OrtEnvironment.ThreadingOptions

The part where I got really stuck was because the method OrtEnvironment.getEnvironment() is called both in OrtEngine and OrtNDManager implementations. According to the Onnxruntime java API, there is no way to guarantee that the environment has the appropriate thread pool configuration. So if I pass ThreadingOptions to an environment and then retrieve it again, I will get an IllegalStateException.

I hacked a bit here

Who will benefit from this enhancement?

This benefits the use case in which more than one model instance is being used in parallel (for example, if we load a ZooModel each in one GPU and do that for 4 GPUs).

@andreabrduque andreabrduque added the enhancement New feature or request label Oct 20, 2022
@frankfliu
Copy link
Contributor

frankfliu commented Oct 21, 2022

@andreabrduque
OrtEnvironment use a global singleton. If you want to customize ThreadingOptions, you can call the following before DJL is loaded:

OrtEnvironment.getEnvironment(OrtLoggingLevel.ORT_LOGGING_LEVEL_WARNING, "ort-java", threadOptions);

DJL will pick the initialized OrtEnvironment.

@andreabrduque
Copy link
Contributor Author

andreabrduque commented Oct 26, 2022

Hey @frankfliu , thanks for the reply.

The problem with is that I need to be able to set OrtSession.SessionOptions.disablePerSessionThreads() for the ThreadingOptions to make an effect, and the SessionOptions are passed through DJL.

I can make a PR to add that option, but if users try to use the disablePerSessionThreads() without passing the global threading options, they will get an exception, but I can add a section to the docs to warn that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants