
fix(ml): race condition when loading models #3207

Merged 3 commits on Jul 11, 2023
Conversation

@mertalev (Contributor) commented Jul 11, 2023

Description

This change reverts to loading models synchronously, preventing concurrent calls from loading the same model multiple times.

An earlier change moved model loading to a background thread so other requests could continue to be handled. However, this allowed concurrent requests to each trigger a load of the same model, so one model could be loaded several times at once. While the default behavior is to load all models at startup, this race condition caused memory usage to spike dramatically after the models were unloaded.

This change also defaults to never unloading models. The current unloading implementation isn't robust enough and can result in higher memory usage than simply keeping the models in memory.

Fixes #3142
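The serialized approach can be sketched as follows, assuming an asyncio service; `ModelCache` and its API are hypothetical illustrations, not the project's actual classes:

```python
import asyncio

class ModelCache:
    """Hypothetical cache that serializes model loads behind a lock."""

    def __init__(self):
        self.load_count = 0
        self._models = {}
        self._lock = asyncio.Lock()

    async def get(self, name):
        # Check the cache while holding the lock, so concurrent
        # requests for the same model trigger exactly one load.
        async with self._lock:
            if name not in self._models:
                self.load_count += 1
                await asyncio.sleep(0.01)     # stands in for the slow load
                self._models[name] = object() # placeholder for the model
            return self._models[name]

async def main():
    cache = ModelCache()
    await asyncio.gather(*(cache.get("buffalo_l") for _ in range(8)))
    return cache.load_count

print(asyncio.run(main()))  # prints 1: one load despite 8 concurrent requests
```

The first request acquires the lock and loads the model; the other seven wait, then hit the cache, matching the single log line expected in the test plan below.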

How Has This Been Tested?

Set the job concurrency to a high number (such as 8) for either image tagging or facial recognition (the CLIP model produces no logs) and start the job. The logs should show only one instance of the corresponding output below. I tested this with all ML jobs running simultaneously and confirmed correct behavior (on main, memory usage swelled to over 10 GB).

Image classification log:
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.

Facial recognition log:

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
model ignore: /cache/facial-recognition/buffalo_l/models/buffalo_l/1k3d68.onnx landmark_3d_68
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
model ignore: /cache/facial-recognition/buffalo_l/models/buffalo_l/2d106det.onnx landmark_2d_106
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /cache/facial-recognition/buffalo_l/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
model ignore: /cache/facial-recognition/buffalo_l/models/buffalo_l/genderage.onnx genderage
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /cache/facial-recognition/buffalo_l/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)

@mertalev mertalev requested a review from alextran1502 July 11, 2023 07:17
@alextran1502 alextran1502 merged commit 848ba68 into main Jul 11, 2023
@alextran1502 alextran1502 deleted the ml/fix-race-condition branch July 11, 2023 17:01
Linked issue this pull request may close: [BUG] Machine learning memory leak