This repository has been archived by the owner on Jun 6, 2024. It is now read-only.
As introduced in #5292, we create a job-specific token for each job submission. However, during a stress test, I found that this adds significant overhead to the cluster.
For example:
1. If N jobs are submitted, N tokens are created.
2. The rest-server API `verify` calls `purge`, and `purge` takes O(N) time when N tokens are present.
3. The SSH plugin in each of these N jobs needs to call rest-server, which calls the `verify` function internally.
4. Overall, this brings O(N^2) overhead to rest-server, the database, and the API server.
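To make the quadratic growth concrete, here is a minimal sketch that counts how many token entries get scanned when every job triggers one `verify` call. `TokenStore` and `total_scans` are hypothetical names for a toy in-memory model, not the rest-server code, and the sketch assumes `purge` inspects every stored token once per call.

```python
class TokenStore:
    """Toy in-memory stand-in for the token list (hypothetical, not rest-server code)."""

    def __init__(self):
        self.tokens = []
        self.scanned = 0  # number of token entries examined by purge

    def create(self, job_id):
        # One job-specific token per submission.
        token = f"token-{job_id}"
        self.tokens.append(token)
        return token

    def purge(self):
        # Assumption: purge looks at every stored token, so one call costs O(N).
        for _ in self.tokens:
            self.scanned += 1

    def verify(self, token):
        # As described above, verify calls purge before checking the token.
        self.purge()
        return token in self.tokens


def total_scans(n_jobs):
    """Submit n_jobs, then let each job call verify once; return entries scanned."""
    store = TokenStore()
    tokens = [store.create(i) for i in range(n_jobs)]
    for token in tokens:  # each job's SSH plugin calls rest-server once
        store.verify(token)
    return store.scanned


print(total_scans(10))   # 100
print(total_scans(100))  # 10000
```

Ten times more jobs leads to a hundred times more scanned entries in this model, which matches the O(N^2) estimate.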
Another potential issue: we save all tokens in one secret object, but Kubernetes objects have a 1 MB size limit. This caps the number of jobs a user can run at the same time.
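A rough back-of-the-envelope estimate of that cap can be sketched as follows. The token and key sizes here are illustrative assumptions, not measured values from the cluster, and `max_tokens_in_secret` is a hypothetical helper.

```python
SECRET_LIMIT_BYTES = 1024 * 1024  # the 1 MB object size limit mentioned above


def max_tokens_in_secret(token_bytes, key_bytes, limit_bytes=SECRET_LIMIT_BYTES):
    """Estimate how many (key, token) entries fit in one secret object.

    Ignores object metadata and any encoding overhead, so the real
    number would be somewhat lower.
    """
    return limit_bytes // (token_bytes + key_bytes)


# Assumed sizes: a 300-byte token stored under a 40-byte key.
print(max_tokens_in_secret(token_bytes=300, key_bytes=40))  # 3084
```

Under these assumed sizes, a single secret holds only a few thousand tokens, so a stress test with many concurrent jobs could plausibly reach the limit.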
One possible solution is to create just one job-specific token per user and delete it when the user has no active jobs:
1. In `purge`, query the database to check whether the user has any incomplete jobs. If there are none, remove the user's job-specific token.
2. In the submit job API, create a job-specific token if the user does not already have one, and generate tokenSecretDef for the database controller.
3. The logic of the database controller and runtime remains the same.
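The proposed flow could look roughly like the sketch below. It is an in-memory illustration only: `UserTokenManager`, its dictionaries, and `complete_job` are hypothetical stand-ins for the secret object and the database query, and the real token format and tokenSecretDef generation are not modeled.

```python
import secrets


class UserTokenManager:
    """Toy model of one job-specific token per user (hypothetical names)."""

    def __init__(self):
        self.job_tokens = {}       # user -> the single job-specific token
        self.incomplete_jobs = {}  # user -> set of job names (stand-in for the database)

    def submit_job(self, user, job_name):
        # Create the token only if the user does not already have one.
        if user not in self.job_tokens:
            self.job_tokens[user] = secrets.token_hex(16)
        self.incomplete_jobs.setdefault(user, set()).add(job_name)
        return self.job_tokens[user]

    def complete_job(self, user, job_name):
        # Stand-in for the job reaching a completed state in the database.
        self.incomplete_jobs.get(user, set()).discard(job_name)

    def purge(self, user):
        # Remove the token only when the user has no incomplete jobs left.
        if not self.incomplete_jobs.get(user):
            self.job_tokens.pop(user, None)


manager = UserTokenManager()
first = manager.submit_job("alice", "job-1")
second = manager.submit_job("alice", "job-2")
print(first == second)  # True: both jobs share one token

manager.complete_job("alice", "job-1")
manager.purge("alice")
print("alice" in manager.job_tokens)  # True: job-2 is still incomplete

manager.complete_job("alice", "job-2")
manager.purge("alice")
print("alice" in manager.job_tokens)  # False: no active jobs, token removed
```

With this shape, the number of stored tokens grows with the number of active users rather than the number of jobs, so `purge` no longer scans one entry per job.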