Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ac kubernetes #182

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

Ac kubernetes #182

wants to merge 12 commits into from

Conversation

Acribbs
Copy link
Contributor

@Acribbs Acribbs commented Nov 10, 2024

No description provided.

@Acribbs
Copy link
Contributor Author

Acribbs commented Nov 12, 2024

@snsansom @IanSudbery @nickilott @david-cgat @jscaber (and all). This is a big change. I have refactored the code to introduce a new base class called the Executor that is inherited by Slurm, Torque, SGE, Kubernetes and local (it replaces make_runner()). The main aim was to implement kubernetes functonality which I couldnt do with the current framework. I introduced a new class function in base_executor.py that is inherited in executor.py (There is also a kubernets.py which I will move into executor.py before merging). I have tested it locally and on slurm on my end and it works, but if others could give it a go that would be great. It also defaults to LocalExecutor if you dont specify it to run locally, but now I think about it, I think im going to make it raise an error as that keeps it consistent with the current behaviour when it complains if DRMMA API isnt found if its ran for cluster config but is ran locally. Thinking of implementing a lot more functionalities in my free time (when i travel on trains and planes) so would be happy to accept suggestions on what to implement next.

@nickilott
Copy link

Thanks Adam for keeping going with cgat-core! I'm happy to test it out on existing pipelines and will let you know here how it goes

@Acribbs
Copy link
Contributor Author

Acribbs commented Dec 1, 2024

Planning on merging this soon

@nickilott
Copy link

Hey Adam, I just pulled the branch and tried to run a pipeline and got the following error:

File /gpfs3/well/powrie/users/utu691/devel/venv/Python-3.10.8-GCCcore-12.2.0/skylake/lib/python3.10/site-packages/cgatcore/pipeline/executors.py, line 69, in run
self.monitor_job_completion(job_id)
AttributeError: 'SlurmExecutor' object has no attribute monitor_job_completion

I haven'[t had a chance to look into the source of the error in any detail at all but wanted to flag before the merge...

@Acribbs
Copy link
Contributor Author

Acribbs commented Dec 8, 2024

Thanks for this! I have no idea why the tests are not picking this up. Will investigate further. May need some more tests to capure this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants