CADD v1.7 on Docker #60

Open
parsboy66 opened this issue Mar 26, 2024 · 14 comments

@parsboy66

parsboy66 commented Mar 26, 2024

Hi there,

Regarding the installation of the new CADD v1.7 on Docker, I get the following error when I try to install it:

"The following environment variables are requested by the workflow but undefined. Please make sure that they are correctly defined before running Snakemake:
0.802 CADD
"

The previous version installed perfectly, but I cannot install the new one. I would appreciate any suggestions or hints.

The Snakemake version I use is v7, and the base image is FROM mambaorg/micromamba:1.5.8.

Full error:
RUN cd /opt && curl -L https://github.com/kircherlab/CADD-scripts/archive/refs/tags/v1.7.tar.gz | tar xz && cd CADD-scripts-1.7 && ln -s CADD.sh cadd.sh 1.1s
=> ERROR [5/5] RUN cd /opt/CADD-scripts-1.7 && snakemake -j 1 test/input.tsv.gz --use-conda --conda-create-envs-only --conda-prefix envs --configfile config/config_GRCh38_v1.7.yml --snakefile Snakefile 0.8s

[5/5] RUN cd /opt/CADD-scripts-1.7 && snakemake -j 1 test/input.tsv.gz --use-conda --conda-create-envs-only --conda-prefix envs --configfile config/config_GRCh38_v1.7.yml --snakefile Snakefile:
0.714 WorkflowError in file /opt/CADD-scripts-1.7/Snakefile, line 20:
0.714 The following environment variables are requested by the workflow but undefined. Please make sure that they are correctly defined before running Snakemake:
0.714 CADD

0.714 File "/opt/CADD-scripts-1.7/Snakefile", line 20, in

##########################################################################
If I just create the Docker image without CADD, then enter the container, install CADD with ./install, and commit the result as a new image, it seems to work. However, I then hit a new error: CADD cannot find some files (I have all folders in /usr/bin, and they are all bind-mounted to the annotations on my PC).
MissingInputException in rule annotate_esm in file /usr/bin/Snakefile, line 123:
Missing input files for rule annotate_esm:
output: /tmp/tmp.9DnBOrJwwP/input.esm_missens.vcf.gz, /tmp/tmp.9DnBOrJwwP/input.esm_frameshift.vcf.gz, /tmp/tmp.9DnBOrJwwP/input.esm.vcf.gz
wildcards: file=/tmp/tmp.9DnBOrJwwP/input
affected files:
data/annotations/GRCh38_v1.7/esm/esm1v_t33_650M_UR90S_1.pt
data/annotations/GRCh38_v1.7/esm/esm1v_t33_650M_UR90S_5.pt
data/annotations/GRCh38_v1.7/esm/esm1v_t33_650M_UR90S_2.pt
data/annotations/GRCh38_v1.7/esm/esm1v_t33_650M_UR90S_4.pt
data/annotations/GRCh38_v1.7/esm/esm1v_t33_650M_UR90S_3.pt
data/annotations/GRCh38_v1.7/esm/pep.110.fa

Thanks

@visze
Collaborator

visze commented Mar 26, 2024

I think you have to set a bash environment variable called CADD.
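
As a minimal sketch of that suggestion, assuming the variable should hold the directory that CADD-scripts was unpacked into (/opt/CADD-scripts-1.7 in the build log above; adjust it if yours differs), it could be set in the shell before Snakemake is invoked. In a Dockerfile, an `ENV CADD=/opt/CADD-scripts-1.7` instruction placed before the `RUN snakemake ...` step would play the same role.

```shell
# Assumption: CADD-scripts lives in /opt/CADD-scripts-1.7, as in the build log.
export CADD=/opt/CADD-scripts-1.7

# Stop early with a readable message if the variable is unset or empty.
: "${CADD:?CADD must point at the CADD-scripts directory}"

echo "CADD=$CADD"
```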

@parsboy66
Author

parsboy66 commented Mar 27, 2024

> I think you have to set a bash environment variable called CADD.

Hi, thank you. That solved the last problem, but now there is another issue, which seems strange, doesn't it?
Do I need to define other variables, for example ones related to the prescored files and annotations?

Thanks in advance for any help.

RUN snakemake CADD-scripts/test/input.tsv.gz --use-conda --conda-create-envs-only --conda-prefix CADD-scripts/envs --configfile CADD-scripts/config/config_GRCh38_v1.7.yml --conda-frontend conda --cores 4 --snakefile CADD-scripts/Snakefile && mkdir -p CADD-scripts/data/prescored/GRCh38_v1.7/:
0.646 Building DAG of jobs...
0.658 WorkflowError:
0.658 MissingInputException: Missing input files for rule prescore:
0.658 output: CADD-scripts/test/input.novel.vcf, CADD-scripts/test/input.pre.tsv
0.658 wildcards: file=CADD-scripts/test/input
0.658 affected files:
0.658 $(dirname/data/prescored/GRCh38_v1.7/incl_anno
0.658 MissingInputException: Missing input files for rule decompress:
0.658 output: CADD-scripts/test/input.novel.vcf
0.658 wildcards: file=CADD-scripts/test/input.novel
0.658 affected files:
0.658 CADD-scripts/test/input.novel.vcf.gz


@visze
Collaborator

visze commented Mar 29, 2024

Not sure what is going on. It looks like an input file is missing, here CADD-scripts/test/input.novel.vcf.gz.

Make sure that it is available, especially given the relative paths. It should be in the directory where you run the command.
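
One way to check this is to print where a relative path resolves to from the current working directory. The sketch below uses a hypothetical helper, `check_input`, which is not part of CADD-scripts; it only illustrates the idea that the same relative path points somewhere else when the command is run from a different directory.

```shell
# Hypothetical helper (not part of CADD-scripts): report the absolute location
# a path resolves to from the current directory and whether it exists there.
check_input() {
    path="$1"
    case "$path" in
        /*) abs="$path" ;;            # already absolute
        *)  abs="$(pwd)/$path" ;;     # relative to the working directory
    esac
    if [ -e "$abs" ]; then
        echo "found: $abs"
    else
        echo "missing: $abs" >&2
        return 1
    fi
}

# Example call, run from the directory where snakemake is started:
#   check_input CADD-scripts/test/input.novel.vcf.gz
```

In a Dockerfile, the working directory of a RUN step is whatever the last WORKDIR instruction set, so running the check in the same step as snakemake shows the paths as the workflow sees them.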

@visze
Collaborator

visze commented Apr 25, 2024

I think issue #64 is related. I created a small patch (#65). Can you try it again with the latest master branch?

@parsboy66
Author

> I think issue #64 is related. I created a small patch (#65). Can you try it again with the latest master branch?

I could install and run CADD 1.6 successfully; the problem is with CADD 1.7, where the mmsplice env cannot be created successfully. I'll give it a try and let you know.

@visze
Collaborator

visze commented Apr 25, 2024

This seems to be a common issue (e.g. see #54). But we cannot update the mmsplice version, because then the mmsplice scores would change, resulting in raw scores that differ from the whole-genome file, and so on. So we are a bit lost in that case.

One possibility might be to wrap our internal conda environment into a container and provide that for the rule. I have to discuss that internally and (more tricky) find the time to do it :-)

@raghvendra44

> I think issue #64 is related. I created a small patch (#65). Can you try it again with the latest master branch?

I tried this, but the error still persists.
image

@parsboy66
Author

> This seems to be a common issue (e.g. see #54). But we cannot update the mmsplice version, because then the mmsplice scores would change, resulting in raw scores that differ from the whole-genome file, and so on. So we are a bit lost in that case.
>
> One possibility might be to wrap our internal conda environment into a container and provide that for the rule. I have to discuss that internally and (more tricky) find the time to do it :-)

Could you please export the .yml file from your conda container and share it?

@visze
Collaborator

visze commented Apr 29, 2024

Have a look here: #56 (comment)

@raghvendra44

That mostly solves this issue, but what about the error that appeared after fixing the mmsplice error, the "snakemake uses strict bash mode" one that occurs when we run test/input.vcf.gz?
How do we fix that?

@parsboy66
Author

parsboy66 commented Apr 29, 2024 via email

@visze
Collaborator

visze commented Jul 16, 2024

I released a new CADD-scripts version, v1.7.1. Maybe you can try that one. It is now recommended to use apptainer/singularity: all environments are packed within a container, so no conda builds are needed (the container is 17 GB). You also now need Snakemake 8.

I also updated the environments, so if you use mamba/conda instead, I hope you will not face the issues you had above.
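
Since v1.7.1 needs Snakemake 8, a quick check of the installed major version before starting can save a failed run. This is only a sketch: `is_snakemake8_or_newer` is a hypothetical helper, and it assumes `snakemake --version` prints a plain version string such as 8.x.y.

```shell
# Hypothetical helper: succeed if the given version string has a major
# version of 8 or higher (non-numeric input is treated as a failure).
is_snakemake8_or_newer() {
    major="${1%%.*}"
    [ "$major" -ge 8 ] 2>/dev/null
}

# Only query snakemake if it is actually on PATH.
if command -v snakemake >/dev/null 2>&1; then
    installed="$(snakemake --version)"
    if is_snakemake8_or_newer "$installed"; then
        echo "snakemake $installed: OK for v1.7.1"
    else
        echo "snakemake $installed: too old, version 8 is required"
    fi
fi
```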

@parsboy66
Author

> I released a new CADD-scripts version, v1.7.1. Maybe you can try that one. It is now recommended to use apptainer/singularity: all environments are packed within a container, so no conda builds are needed (the container is 17 GB). You also now need Snakemake 8.
>
> I also updated the environments, so if you use mamba/conda instead, I hope you will not face the issues you had above.

Thanks for the hard work. I'll give it a shot and keep you in the loop.

@petersam23

Thanks for your help. I tried to resolve the problems regarding local dependencies (using -m) but still run into the issues mentioned previously https://github.com/kircherlab/CADD-scripts/issues/63. When using the default apptainer option, I encounter the error:

CreateCondaEnvironmentException:
Conda must be version 24.7.1 or later, found version 24.3.0. Please update conda to the latest version. Note that you can also install conda into the snakemake environment without modifying your main conda installation.

Everything is installed according to the readme, and my local mamba/conda version is 24.9.0. Am I using the apptainer feature incorrectly?
Thank you
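
The error reports conda 24.3.0 while the local installation is 24.9.0, which may mean that Snakemake is resolving a different conda than the one checked by hand; that is an assumption, not something the message confirms. A rough way to test it is to run the sketch below in the same environment that launches Snakemake. `version_ge` is a hypothetical helper and relies on `sort -V` from GNU coreutils.

```shell
# Hypothetical helper: succeed if version $1 is at least version $2.
# Uses version sort (`sort -V`, GNU coreutils) to find the smaller of the two.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Show which conda this shell resolves and compare it with the minimum
# version named in the error message (24.7.1).
if command -v conda >/dev/null 2>&1; then
    found="$(conda --version | awk '{print $2}')"
    echo "conda on PATH: $(command -v conda) (version $found)"
    if version_ge "$found" 24.7.1; then
        echo "meets the 24.7.1 minimum"
    else
        echo "older than 24.7.1"
    fi
fi
```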
