Optimize for inference when using call api #162

joeyballentine · 2024-02-19T07:34:05Z

Generally speaking, it's always good to put a model in inference mode when performing inference. I figure it's probably good to do this automatically when using the call api to prevent possible problems.
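For illustration, here is a minimal sketch of what "inference mode on the call API" could look like. `CallableModel` is a hypothetical wrapper made up for this example, not the project's actual class; only `torch.inference_mode()` is real PyTorch API.

```python
import torch
from torch import nn


class CallableModel:
    """Hypothetical wrapper used only for illustration."""

    def __init__(self, model: nn.Module) -> None:
        self.model = model

    @torch.inference_mode()
    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        # The decorator disables autograd tracking for the whole call,
        # so callers do not have to remember to do it themselves.
        return self.model(x)


wrapped = CallableModel(nn.Linear(4, 2))
out = wrapped(torch.randn(1, 4))
```

With this shape, the output is an inference tensor that carries no autograd graph. Note that the decorator does not switch layers such as dropout or batch norm to evaluation behavior; that is what `model.eval()` is for.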

Could theoretically be related to #160, but I think they are doing the right things there, so I don't think that's it.

RunDevelopment · 2024-02-19T10:57:21Z

Can @torch.inference_mode() and model.eval() negatively affect performance if the model is already in inference mode?

joeyballentine · 2024-02-19T15:07:40Z

I haven't tested it, but I don't believe so.

For the record, I'm pretty sure we call that multiple times in chaiNNer. And the inference mode thing is meant to be used individually each time the model is run. Check the docs.
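A small sketch of the nesting case asked about above. It only checks behavior (repeated `model.eval()` calls and nested `torch.inference_mode()` blocks are allowed and leave the state consistent); it does not measure performance. The `run` function and the toy model are assumptions made up for this example.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

# eval() only sets the `training` flag to False, so calling it twice
# leaves the model in the same state as calling it once.
model.eval()
model.eval()


@torch.inference_mode()
def run(x: torch.Tensor):
    # We are already inside inference mode here; entering it again is allowed.
    with torch.inference_mode():
        nested = torch.is_inference_mode_enabled()
        y = model(x)
    # Leaving the inner block restores the outer state, which is still enabled.
    outer = torch.is_inference_mode_enabled()
    return y, nested, outer


y, nested, outer = run(torch.ones(1, 4))
after = torch.is_inference_mode_enabled()
```

Inside `run`, both flags come back enabled, and once the call returns the mode is disabled again, so a caller that already wrapped its code in inference mode should see no change in behavior.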

joeyballentine added 2 commits February 19, 2024 02:28

Optimize for inference when using call api

cec94b1

We only need the decorator

19bfe69

RunDevelopment approved these changes Feb 19, 2024


joeyballentine merged commit 4e647a7 into main Feb 19, 2024
7 checks passed

joeyballentine deleted the optimize-inference branch February 19, 2024 20:00
