Upgrade to snakemake version 8 with plugin profile #80

Merged: 32 commits, Jun 19, 2024
Commits
ef4517e
Update to snakemake version 8
verku Jun 14, 2024
ada3e0b
Fix syntax warning by adding escape characters
verku Jun 14, 2024
381dc23
Update slurm config files
verku Jun 14, 2024
5c78406
Switch to qualimap version 2.3
verku Jun 14, 2024
cee9004
Switch to seqera singularity container
verku Jun 14, 2024
654f4b5
Undo double escape characters in regular expression due to syntaxwarning
verku Jun 14, 2024
d73f649
Add resources specific to rules and groups
verku Jun 14, 2024
801fd7d
Prefix the string with r so that Python treats the string as a raw st…
verku Jun 14, 2024
1e9a163
Switch to version 1.20 from seqera
verku Jun 14, 2024
4769cab
Remove deprecated command line option to print reasons for rule execu…
verku Jun 14, 2024
b69bac9
Add a README with instructions on how to run GenErode with the slurm …
verku Jun 14, 2024
0639bad
Replace sequera container with galaxyproject container for bcftools 1:20
verku Jun 16, 2024
5d53485
Update documentation
verku Jun 16, 2024
9df5bc6
Add flag -k to keep going when a job fails until none of the jobs can…
verku Jun 16, 2024
d784a84
Switch back to bcftools 1.19
verku Jun 16, 2024
4caeb02
Update compute resources
verku Jun 16, 2024
5f31057
Add snakemake executor plugin slurm to GenErode conda environment
verku Jun 16, 2024
201ecd5
Switch back to bcftools 1.20 from the galaxy project
verku Jun 17, 2024
a57ada1
Update calculation of mem to provide to tools based on mem_mb specifi…
verku Jun 17, 2024
9f7b9b3
Add separate config files for Dardel and Rackham
verku Jun 17, 2024
edb1a28
Add set-threads section back to config.yaml for slurm plugin
verku Jun 17, 2024
1563a35
Remove flag -m
verku Jun 17, 2024
3c437dc
Remove path to temporary directory
verku Jun 17, 2024
4fa505a
Switch to sequera container and upgrade to version 1.4
verku Jun 17, 2024
bca2e56
Switch to galaxyproject container
verku Jun 17, 2024
09b02a2
Replace plink container with galaxyproject container and switch to ve…
verku Jun 17, 2024
57f3ba5
Update the README file according to latest changes made to the reposi…
verku Jun 17, 2024
e6598cc
Specify threads in rule
verku Jun 17, 2024
150c604
Switch back to same version as before but from galaxy project
verku Jun 17, 2024
bc89adf
Update the readme for slurm profiles
verku Jun 18, 2024
cb95f67
Clarify compute resource specifications in current configuration file
verku Jun 18, 2024
248d623
Relaxed depth filters to get more SNPs for test runs
verku Jun 18, 2024
4 changes: 2 additions & 2 deletions .github/workflows/gerp.yaml
@@ -67,9 +67,9 @@ jobs:
  - name: gerp_dry
  shell: bash -l {0}
  run: |
- snakemake -npr --configfile .test/config/config_gerp.yaml -j 4 --cores 1 --use-singularity
+ snakemake -np -k --configfile .test/config/config_gerp.yaml -j 4 --cores 1 --use-singularity

  - name: gerp
  shell: bash -l {0}
  run: |
- snakemake --configfile .test/config/config_gerp.yaml -j 4 --cores 1 --use-singularity
+ snakemake -k --configfile .test/config/config_gerp.yaml -j 4 --cores 1 --use-singularity
4 changes: 2 additions & 2 deletions .github/workflows/mitogenome_mapping.yaml
@@ -91,10 +91,10 @@ jobs:
  - name: mitogenome_mapping_dry
  shell: bash -l {0}
  run: |
- snakemake -npr --configfile .test/config/config_mitogenomes.yaml -j 4 --cores 1 --use-singularity
+ snakemake -np -k --configfile .test/config/config_mitogenomes.yaml -j 4 --cores 1 --use-singularity

  - name: mitogenome_mapping
  shell: bash -l {0}
  run: |
- snakemake --configfile .test/config/config_mitogenomes.yaml -j 4 --cores 1 --use-singularity
+ snakemake -k --configfile .test/config/config_mitogenomes.yaml -j 4 --cores 1 --use-singularity

4 changes: 2 additions & 2 deletions .github/workflows/mlRho_options.yaml
@@ -73,9 +73,9 @@ jobs:
  - name: mlRho_options_dry
  shell: bash -l {0}
  run: |
- snakemake -npr --configfile .test/config/config_mlRho_options.yaml -j 4 --cores 1 --use-singularity
+ snakemake -np -k --configfile .test/config/config_mlRho_options.yaml -j 4 --cores 1 --use-singularity

  - name: mlRho_options
  shell: bash -l {0}
  run: |
- snakemake --configfile .test/config/config_mlRho_options.yaml -j 4 --cores 1 --use-singularity
+ snakemake -k --configfile .test/config/config_mlRho_options.yaml -j 4 --cores 1 --use-singularity
4 changes: 2 additions & 2 deletions .github/workflows/pca_roh.yaml
@@ -65,9 +65,9 @@ jobs:
  - name: pca_roh_dry
  shell: bash -l {0}
  run: |
- snakemake -npr --configfile .test/config/config_pca_roh.yaml -j 4 --cores 1 --use-singularity
+ snakemake -np -k --configfile .test/config/config_pca_roh.yaml -j 4 --cores 1 --use-singularity

  - name: pca_roh
  shell: bash -l {0}
  run: |
- snakemake --configfile .test/config/config_pca_roh.yaml -j 4 --cores 1 --use-singularity
+ snakemake -k --configfile .test/config/config_pca_roh.yaml -j 4 --cores 1 --use-singularity
4 changes: 2 additions & 2 deletions .github/workflows/snpeff.yaml
@@ -67,9 +67,9 @@ jobs:
  - name: snpeff_dry
  shell: bash -l {0}
  run: |
- snakemake -npr --configfile .test/config/config_snpeff.yaml -j 4 --cores 1 --use-singularity
+ snakemake -np -k --configfile .test/config/config_snpeff.yaml -j 4 --cores 1 --use-singularity

  - name: snpeff
  shell: bash -l {0}
  run: |
- snakemake --configfile .test/config/config_snpeff.yaml -j 4 --cores 1 --use-singularity
+ snakemake -k --configfile .test/config/config_snpeff.yaml -j 4 --cores 1 --use-singularity
4 changes: 2 additions & 2 deletions .test/config/config_pca_roh.yaml
@@ -153,13 +153,13 @@ zerocoverage: False
  # to set a minimum depth threshold. For ultra low coverage samples, a
  # minimum hard threshold of 3X is applied that overrides this parameter.
  # A minimum depth of 6X should be aimed for.
- minDP: 0.33
+ minDP: 0.1

  # Maximum depth threshold calculation per sample.
  # Will be applied to mlRho analysis and in VCF file filtering.
  # Factor by which the average genome-wide depth should be multiplied
  # to set a maximum depth threshold.
- maxDP: 10
+ maxDP: 100
  #####


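As a rough illustration of what the `minDP` and `maxDP` factors in the test configuration above mean for a single sample, the sketch below multiplies an average depth by the previous and the relaxed factors. It assumes that the minimum threshold is simply the larger of the product and the 3X floor mentioned in the comments; the pipeline's own rounding and the exact conditions under which the floor applies may differ. The function name `depth_thresholds` and the 12X average depth are made up for this example.

```shell
# Sketch only: assumes threshold = factor * average depth, with a 3X floor
# on the minimum threshold. The pipeline's own rounding may differ.
# Arguments: average depth, minDP factor, maxDP factor.
depth_thresholds() {
    awk -v d="$1" -v lo="$2" -v hi="$3" 'BEGIN {
        min = d * lo
        if (min < 3) min = 3      # hard minimum of 3X
        max = d * hi
        printf "min=%.2f max=%.2f\n", min, max
    }'
}

# Hypothetical sample with 12X average genome-wide depth.
depth_thresholds 12 0.33 10     # previous test settings: min=3.96 max=120.00
depth_thresholds 12 0.1 100     # relaxed test settings:  min=3.00 max=1200.00
```

Under these assumptions the relaxed factors widen the accepted depth window at both ends, which is consistent with the commit message about retaining more SNPs in the test runs.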
109 changes: 109 additions & 0 deletions config/slurm/README.md
@@ -0,0 +1,109 @@
# GenErode execution on SLURM clusters

With the switch to Snakemake version 8, GenErode can be run
on SLURM clusters as follows:

1) Create the GenErode conda environment or update an earlier
version. The latest conda environment contains the Snakemake
executor plugin for slurm:

```
conda env create -f environment.yaml -n generode
```

2) Copy one of the example configuration files, `config/slurm/profile/config_plugin_rackham.yaml`
or `config/slurm/profile/config_plugin_dardel.yaml`, to
`slurm/config.yaml`. This file specifies the compute resources
for each rule or group job. Any rule or group job that is
not listed under `set-threads` or `set-resources` uses the
default resources specified under `default-resources`. If
any rule or group job fails because of too little memory or run
time, its compute resources can be updated in this file.

> Note that the current configuration files were adjusted to the
HPC clusters Rackham (UPPMAX) and Dardel (PDC/KTH). Details
on how to configure and run GenErode on Dardel are provided below.
The configuration file for Snakemake version 7, which was also
written for Rackham/UPPMAX, was kept for comparison.

3) Start GenErode as follows:

- Open a tmux or screen session
- Activate the GenErode conda environment
- Start the dry run:

```
snakemake --profile slurm -np &> YYMMDD_dry.out
```

- Start the main run:

```
snakemake --profile slurm &> YYMMDD_main.out
```

> Useful flags for running the pipeline: `--ri` to re-run
incomplete jobs and `-k` to keep going in case a job fails.
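
The dry run and main run from step 3 can be wrapped in small shell helpers that fill in the `YYMMDD` part of the log file name automatically. This is a minimal sketch, not part of the repository: the function names `log_name`, `generode_dry_run` and `generode_main_run` are hypothetical, while the Snakemake flags are the ones listed above. The redirection `> file 2>&1` is used as a portable equivalent of `&> file`.

```shell
# Sketch only: helper names are hypothetical; the snakemake flags are the
# ones used in the steps above.

# Build a log name such as 240619_dry.out from today's date (YYMMDD).
log_name() {
    printf '%s_%s.out\n' "$(date +%y%m%d)" "$1"
}

# Dry run: -n lists the planned jobs without running them, -p prints the
# shell commands.
generode_dry_run() {
    snakemake --profile slurm -np > "$(log_name dry)" 2>&1
}

# Main run: --ri re-runs incomplete jobs, -k keeps going after a failure.
generode_main_run() {
    snakemake --profile slurm --ri -k > "$(log_name main)" 2>&1
}

# Usage, inside a tmux or screen session with the conda environment active:
#   generode_dry_run && less "$(log_name dry)"
#   generode_main_run
```

Defining the runs as functions keeps the sketch safe to source; nothing is submitted until one of the functions is called.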

## Specific instructions for Dardel

1) Load the following modules on Dardel:

```
module load PDC UPPMAX bioinfo-tools conda singularity tmux
```

2) After cloning the repository, change permissions for the
Snakefile:

```
chmod 755 Snakefile
```

3) Create the GenErode conda environment or update an earlier
version. The latest conda environment contains the Snakemake
executor plugin for slurm:

```
conda env create -f environment.yaml -n generode
```

4) Copy the configuration file `config/slurm/profile/config_plugin_dardel.yaml`
to `slurm/config.yaml`. This file specifies the compute resources
for each rule or group job to be run on Dardel. Any rule or
group job that is not listed under `set-threads` or `set-resources`
uses the default resources specified under `default-resources`. If
any rule or group job fails because of too little memory or run
time, its compute resources can be updated in this file.

> Note that the current version of `config/slurm/profile/config_plugin_dardel.yaml`
is still being tested. Threads are currently specified under
`set-threads` and under `set-resources` as `cpus_per_task`.

5) Start GenErode as follows:

- Open a tmux session (alternatively, you can use screen)

- Activate the GenErode conda environment (created or updated
from `environment.yaml`), replacing the path below with the
location of your conda environments:

```
export CONDA_ENVS_PATH=/cfs/klemming/home/.../
conda activate generode
```

- Start the dry run:

```
snakemake --profile slurm -np &> YYMMDD_dry.out
```

- Start the main run:

```
snakemake --profile slurm &> YYMMDD_main.out
```

> Useful flags for running the pipeline: `--ri` to re-run
incomplete jobs and `-k` to keep going in case a job fails.