Want to exclude lc nodes from job submission in xenon1t partition #87

Closed
FaroutYLq opened this issue Mar 5, 2024 · 3 comments

@FaroutYLq
Collaborator

A couple of weeks ago, the node configuration of the xenon1t partition changed:

yuanlq@midway2-login2:~$ nodestatus xenon1t
                       --------------***----------------
                       Status of nodes:
                       --------------***----------------
NODES             CPU     MEM     Features                                 STATUS  CORES IN USE      MEM IN USE   PURPOSE  NOTES
-----             ---     ---     --------                                  -----  ------------      ---  -----   -------  ------
midway2-0110  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b            mix      6  21.4%     21GB    37%   xenon1t
midway2-0112  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b            mix      3  10.7%     23GB  39.6%   xenon1t
midway2-0113  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b            mix      3  10.7%     13GB  23.8%   xenon1t
midway2-0119  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b            mix      2   7.1%      8GB  15.4%   xenon1t
midway2-0120  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      9GB  16.4%   xenon1t
midway2-0121  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      6GB  11.4%   xenon1t
midway2-0122  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      6GB    12%   xenon1t
midway2-0123  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      7GB  12.5%   xenon1t
midway2-0124  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      6GB  11.6%   xenon1t
midway2-0125  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      6GB  11.3%   xenon1t
midway2-0126  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      7GB  13.3%   xenon1t
midway2-0127  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      9GB  15.8%   xenon1t
midway2-0129  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b            mix      2   7.1%     27GB  46.4%   xenon1t
midway2-0130  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b            mix      3  10.7%      8GB  15.3%   xenon1t
midway2-0138  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      8GB    15%   xenon1t
midway2-0139  28-core    58GB tc,e5-2680v4,64GB,ib,fdr,ibspine-d9b           idle      0     0%      8GB  14.4%   xenon1t
midway2-0411  28-core   125GB lc,e5-2680v4,128GB,noib                         mix      2   7.1%     31GB  24.8%   xenon1t
midway2-0412  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%     10GB     8%   xenon1t
midway2-0413  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB   7.8%   xenon1t
midway2-0414  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB   7.7%   xenon1t
midway2-0415  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB   7.2%   xenon1t
midway2-0416  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB     8%   xenon1t
midway2-0417  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB   7.2%   xenon1t
midway2-0418  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   7.2%   xenon1t
midway2-0419  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB   7.5%   xenon1t
midway2-0420  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   6.8%   xenon1t
midway2-0421  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   6.8%   xenon1t
midway2-0422  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   6.9%   xenon1t
midway2-0423  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   6.9%   xenon1t
midway2-0424  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   6.8%   xenon1t
midway2-0425  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      9GB   7.3%   xenon1t
midway2-0426  28-core   125GB lc,e5-2680v4,128GB,noib                        idle      0     0%      8GB   6.8%   xenon1t
midway2-0462  28-core    58GB tc,e5-2680v4,64GB,ib,ibspine-d9b                mix      2   7.1%     17GB  30.8%   xenon1t
midway2-0463  28-core    58GB tc,e5-2680v4,64GB,ib,ibspine-d9b                mix      8  28.5%     12GB  20.8%   xenon1t
midway2-0464  28-core    58GB tc,e5-2680v4,64GB,ib,ibspine-d9b               resv      0     0%     11GB  19.2%   xenon1t
midway2-0465  28-core    58GB tc,e5-2680v4,64GB,ib,ibspine-d9b               resv      0     0%      6GB  10.4%   xenon1t

Here, lc means loosely coupled: these nodes have NO access to /project. Jobs that land on them therefore cannot read the rucio data stored there. We want to introduce a mechanism that excludes these nodes from job submission.

This can be done either by brute force (hardcoding all lc nodes into a to-exclude list) or more elegantly (reading the output of nodestatus xenon1t at submission time and excluding the lc nodes it reports).
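
A minimal sketch of the second approach, assuming the nodestatus output keeps the column layout shown above (node name, "N-core", memory, comma-separated features). The function names are hypothetical and not part of utilix; the only scheduler feature relied on is Slurm's `--exclude` option, which takes a comma-separated node list.

```python
import subprocess


def lc_nodes_from_nodestatus(text):
    """Return the names of nodes whose feature list contains the 'lc' tag."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: <node> <N-core> <mem> <features> <status> ...
        # Banner, header, and separator lines fail this check and are skipped.
        if len(fields) < 4 or not fields[1].endswith("-core"):
            continue
        if "lc" in fields[3].split(","):
            nodes.append(fields[0])
    return nodes


def exclude_option(nodes):
    """Format a Slurm --exclude option, or an empty string if there are no nodes."""
    return "--exclude=" + ",".join(nodes) if nodes else ""


def current_lc_nodes(partition="xenon1t"):
    """Run the site-specific nodestatus command and parse its output."""
    result = subprocess.run(
        ["nodestatus", partition], capture_output=True, text=True, check=True
    )
    return lc_nodes_from_nodestatus(result.stdout)
```

On a login node, `exclude_option(current_lc_nodes())` would give the string to append to the sbatch arguments. If nodestatus is unavailable or its layout changes, the hardcoded list remains a fallback.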

@FaroutYLq FaroutYLq added the bug Something isn't working label Mar 5, 2024
@yuema137
Contributor

yuema137 commented Mar 7, 2024

Thanks Lanqing! I will add an option that lets users avoid the lc nodes, and a warning if they don't use it.

@FaroutYLq
Collaborator Author

FaroutYLq commented Mar 7, 2024

> Thanks Lanqing! I will add an option for users to avoid the lc nodes and add a warning if they don't

Please just make sure to do it in a separate PR :) Thanks a lot for taking care of utilix.

@yuema137
Contributor

This issue is solved by #91
