Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify Python code within certain snakemake rules #120

Open
jmtsuji opened this issue Feb 13, 2024 · 6 comments
Open

Simplify Python code within certain snakemake rules #120

jmtsuji opened this issue Feb 13, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@jmtsuji
Copy link
Collaborator

jmtsuji commented Feb 13, 2024

Some snakemake rules (e.g., rule rule summarize_contigs_by_coverage in polish.smk) currently include long Python code blocks that contain functions etc. Would it be better to simplify the code included in the Snakefile, e.g., by calling a separate Python script, for such code?

@jmtsuji jmtsuji added the question Further information is requested label Feb 13, 2024
@jmtsuji
Copy link
Collaborator Author

jmtsuji commented Feb 13, 2024

Previous discussion for rule summarize_contigs_by_coverage in #114:

@jmtsuji

Thanks for improving the function definition here. How do you think we should handle such functions? Might it be better to move functions like this into a different part of the snakefile (or to a different file in the repo) rather than keep it nested inside the rule?

@LeeBergstrand

@jmtsuji, do you have to move imports inside the run block as if it's a script? I suspect everything outside the run block can't be seen inside unless it is a particular variable created by Snakemake. That is my suspicion, but I am unsure; it could be injecting it differently than expected. If we wanted to move it, I would move it to utils.py (most of utils.py would be moved to its library, but we can have two util files) or a dedicated coverage so as not to clutter up to code.

@jmtsuji
Copy link
Collaborator Author

jmtsuji commented Feb 13, 2024

@LeeBergstrand Regarding your comment above, I think you are probably right... I also suspect that everything outside the code block can't be seen inside. We will probably need to import the code via an import command or just write a Python script that we can call for the rule.

@jmtsuji jmtsuji mentioned this issue Feb 13, 2024
@jmtsuji
Copy link
Collaborator Author

jmtsuji commented Feb 13, 2024

Given that we don't have very many rules with long Python code blocks at the moment, I think it's not a high priority to try to move around the code at this point. However, it might be something we could consider in the future, especially if we create more rules like summarize_contigs_by_coverage that have complex Python code. OK to table this issue for now? Or do you think it could be worthwhile to explore simplifying the code in the snakefile for this rule?

@LeeBergstrand
Copy link
Collaborator

LeeBergstrand commented Feb 13, 2024

Given that we don't have very many rules with long Python code blocks at the moment, I think it's not a high priority to try to move around the code at this point. However, it might be something we could consider in the future, especially if we create more rules like summarize_contigs_by_coverage that have complex Python code. OK to table this issue for now? Or do you think it could be worthwhile to explore simplifying the code in the snakefile for this rule?

I would table the issue for now. I would probably move the code to utils.py but after we pull the the sample object code out into its own library.

@jmtsuji
Copy link
Collaborator Author

jmtsuji commented Feb 13, 2024

OK, sounds good -- let's table this for now.

@jmtsuji jmtsuji added enhancement New feature or request long_term and removed question Further information is requested enhancement New feature or request long_term labels Feb 13, 2024
@jmtsuji
Copy link
Collaborator Author

jmtsuji commented Feb 14, 2024

Update: as discussed in #126 , the logic for rule summarize_contigs_by_coverage is getting complex enough that it might make sense to move the code into its own script in future:

From #126:

In the long term, I think it might make sense to move the code in summarize_contigs_by_coverage into a Python utility script -- this would make it easier to write/run tests on some of the logic being performed in this code.

@jmtsuji jmtsuji changed the title Consider simplifying Python code within snakemake rules Simplify Python code within certain snakemake rules Feb 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants