This repository has been archived by the owner on Jun 6, 2024. It is now read-only.
I have investigated this issue. Currently, memory issues exist in both the framework watcher and the poller.
For the framework watcher, a re-listing happens whenever it is restarted or the watch connection is dropped. If there are too many frameworks, the re-listing runs out of memory.
For the poller, the memory issue occurs in any polling round where the database holds too many frameworks with deleted=false and (completed=true or synced=false).
I have tested the memory usage of basic framework objects in Node.js: 10000 simple framework objects consume more than 265 MB of memory. So it is no surprise that 30000 real framework objects can consume more than 1 GB.
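A measurement of this kind can be reproduced with a short script. The following is only a sketch: `makeFramework` builds a hypothetical, much-simplified object (real framework objects carry many more fields), and `measureHeapGrowth` is an illustrative helper name, not code from the DB controller. The number it reports depends on the Node.js version and on garbage-collector timing, so treat it as a rough estimate.

```javascript
// Hypothetical, simplified framework object; the real schema is much larger.
function makeFramework(i) {
  return {
    kind: 'Framework',
    metadata: {
      name: `framework-${i}`,
      namespace: 'default',
      labels: { index: String(i) },
    },
    spec: { taskRoles: [{ name: 'worker', taskNumber: 1 }] },
    status: { state: 'pending' },
  };
}

// Allocates `count` objects, keeps them alive in an array, and reports
// how much the V8 heap grew while doing so.
function measureHeapGrowth(count) {
  const before = process.memoryUsage().heapUsed;
  const held = [];
  for (let i = 0; i < count; i += 1) {
    held.push(makeFramework(i));
  }
  const after = process.memoryUsage().heapUsed;
  return { count: held.length, bytes: after - before };
}

const result = measureHeapGrowth(10000);
console.log(`${result.count} objects, heap grew by ${(result.bytes / 1024 / 1024).toFixed(1)} MB`);
```

Running it with different counts gives a feel for how heap usage scales with the number of objects held at once.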
Apart from re-listing/polling, if there are too many frameworks in the processing queue, the service gets restarted, which in turn triggers a re-listing/polling. So solving the memory problem in re-listing/polling is more critical.
To solve the memory problem in the framework watcher, we can take advantage of chunked listing in the k8s API server. On each re-listing, we query the API server chunk by chunk and synchronize each chunk to write-merger, then start watching from the resource version returned with the chunks. The informer in the k8s Node.js client doesn't support this (see here for its logic), so we have to handle it ourselves.
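The chunked re-listing could look roughly like the sketch below. It relies only on the `limit` and `continue` list parameters of the Kubernetes API and on the `metadata.continue` and `metadata.resourceVersion` fields of the list response. `listPage` and `onChunk` are hypothetical caller-supplied functions: the first wraps whatever client call actually lists frameworks, the second forwards a chunk to write-merger. Neither is an existing API of the k8s Node.js client.

```javascript
// listPage({ limit, continueToken }) is a hypothetical wrapper around the real
// list call. It must resolve to a Kubernetes list object:
//   { metadata: { continue, resourceVersion }, items: [...] }
// onChunk(items) is a hypothetical callback that syncs one chunk downstream.
async function listInChunks(listPage, onChunk, limit = 500) {
  let continueToken;
  let resourceVersion;
  do {
    const page = await listPage({ limit, continueToken });
    // Only one chunk is held in memory at a time.
    await onChunk(page.items);
    resourceVersion = page.metadata.resourceVersion;
    // An empty or missing continue token marks the last chunk.
    continueToken = page.metadata.continue || undefined;
  } while (continueToken);
  // The caller can start its watch from this resource version.
  return resourceVersion;
}
```

A real implementation would also need to handle an expired continue token (the API server answers with HTTP 410), most likely by restarting the listing from the first chunk.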
To solve the memory problem in the poller, we could use SELECT ... WHERE ... ORDER BY "submissionTime" ASC LIMIT N in every polling round. But we must make sure that every job is eventually polled.
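One possible way to bound each round while still reaching every row is to carry a cursor from one round to the next, so that a batch of stuck frameworks at the front cannot starve the rest. This is only a sketch under several assumptions: a PostgreSQL-style database with `$n` placeholders and row-value comparison, a table called `frameworks`, and a unique `name` column used as a tie-breaker. Only the `deleted`, `completed`, `synced` and `submissionTime` columns come from the discussion above. The cursor itself is a swapped-in technique, not something the issue prescribes.

```javascript
// Builds a parameterized query for one polling round.
// `cursor` is null for the first round, or the position returned by nextCursor.
// Table name "frameworks" and column "name" are assumptions for illustration.
function buildPollQuery(batchSize, cursor) {
  const values = [];
  let cursorClause = '';
  if (cursor) {
    values.push(cursor.submissionTime, cursor.name);
    cursorClause = ' AND ("submissionTime", "name") > ($1, $2)';
  }
  values.push(batchSize);
  const text =
    'SELECT * FROM "frameworks"' +
    ' WHERE "deleted" = false AND ("completed" = true OR "synced" = false)' +
    cursorClause +
    ' ORDER BY "submissionTime" ASC, "name" ASC' +
    ` LIMIT $${values.length}`;
  return { text, values };
}

// Returns the cursor for the next round, or null to wrap around to the start
// once a round comes back with fewer rows than requested.
function nextCursor(rows, batchSize) {
  if (rows.length < batchSize) return null;
  const last = rows[rows.length - 1];
  return { submissionTime: last.submissionTime, name: last.name };
}
```

Each round then holds at most `batchSize` rows in memory, and the wrap-around means rows skipped in one pass are picked up in the next.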
To mitigate this issue while frameworks are being processed in the queue, we can:
- raise the memory limit
- add concurrency control to handle bursts
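The concurrency control could be as small as a promise-based limiter. The sketch below is a generic one written for illustration. `createLimiter` is a hypothetical helper, not part of the existing service, and the right value for `maxConcurrent` would have to be found by measurement.

```javascript
// Returns a function that runs async tasks with at most `maxConcurrent`
// of them in flight; the rest wait in a FIFO queue.
function createLimiter(maxConcurrent) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        // Free the slot and start the next queued task, if any.
        active -= 1;
        next();
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}
```

Usage would be along the lines of `const limit = createLimiter(50); await limit(() => processFramework(fw));`, where `processFramework` stands in for whatever handler the queue currently calls. Note that queued tasks still occupy memory, so this smooths bursts in processing but does not by itself cap the queue length.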
hzy46 changed the title from "Framework watcher consumes too much memory when re-listing frameworks" to "Memory issues for framework watcher and poller in DBController" on Aug 28, 2020
If there are a lot of waiting jobs (e.g. 30000+ frameworks) and the framework watcher does a re-listing, memory usage quickly rises above 1000 MB.