max size for a Tcl value (2147483647 bytes) exceeded #15

Open
ekarlins opened this issue Jun 11, 2020 · 13 comments
Labels: bug (Something isn't working), enhancement (New feature or request)

Comments

@ekarlins

I'm running v2.2 of AnnotSV on an SGE cluster. I'm running it on a couple of gene lists separately using "-candidateGenesFile" and "-candidateGenesFiltering". The run completes for a small gene list, but fails for the larger gene list and exits abruptly. The error message is in the subject line. I tried giving the job more memory (200G), but that doesn't fix it, as there doesn't seem to be a way to tell AnnotSV (or Tcl) to use more memory. I'm going to try splitting my gene list into two and see if that works. Is there another fix for this?

Thanks,
Eric

@lgmgeo
Owner

lgmgeo commented Jun 11, 2020

Hi Eric,

Can you tell me the size of your candidateGenesFile?
And how many lines does it contain?

Best regards,
Véronique

@ekarlins
Author

Véronique,
Thanks for your quick reply! The original candidateGenesFile has 1730 lines, but I've since split it into smaller parts of 346 lines and it still fails with the same error.
Is there any way to tell Tcl to use more memory?

Thanks,
Eric

@lgmgeo
Owner

lgmgeo commented Jun 11, 2020

OK, one more question:
Are the gene names space-separated, tab-separated, or line-break-separated?

@lgmgeo
Owner

lgmgeo commented Jun 11, 2020

Can you contact me by email (veronique.geoffroy@inserm.fr) to continue the debugging?

@ekarlins
Author

Ok. I sent you an email. Thanks!

@lgmgeo
Owner

lgmgeo commented Jun 11, 2020

Conclusion on this post:

  • Running AnnotSV without using the candidateGenes flags fails as well with the same error.
  • AnnotSV was used to annotate a very large SV input file.

To work around this, you can split your input file into smaller files, run AnnotSV on each of them, and then merge the results into a single output file.

This will be fixed in a future release.
Sorry for any inconvenience this may cause.
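
The split-and-merge workaround described above could be scripted roughly as follows. This is only a minimal sketch, not part of AnnotSV: the function names `split_vcf` and `merge_tsv`, the chunk file naming, and the default of 50000 records per chunk are all hypothetical choices. It assumes an uncompressed VCF input and assumes that every per-chunk AnnotSV output has the same column layout, so that keeping the header line of the first output only is enough. Per-run identifiers in the outputs are not reconciled, and breakend (BND) mates may end up in different chunks.

```python
from itertools import chain, islice
from pathlib import Path


def split_vcf(vcf_path, out_dir, records_per_chunk=50000):
    """Split an uncompressed VCF into chunk files, each carrying the full header.

    Returns the list of chunk paths in order. The naming scheme
    (chunk_0001.vcf, ...) is an arbitrary choice for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with open(vcf_path) as src:
        header = []
        first_record = None
        # Collect the "#"-prefixed header lines; stop at the first record.
        for line in src:
            if line.startswith("#"):
                header.append(line)
            else:
                first_record = line
                break
        if first_record is None:
            return chunk_paths  # header-only file: nothing to split
        records = chain([first_record], src)
        while True:
            chunk = list(islice(records, records_per_chunk))
            if not chunk:
                break
            path = out_dir / f"chunk_{len(chunk_paths) + 1:04d}.vcf"
            path.write_text("".join(header + chunk))
            chunk_paths.append(path)
    return chunk_paths


def merge_tsv(tsv_paths, merged_path):
    """Concatenate TSV files, keeping the header line of the first file only.

    Assumes every file starts with one header line and has identical columns.
    """
    with open(merged_path, "w") as out:
        for index, path in enumerate(tsv_paths):
            with open(path) as src:
                header = src.readline()
                if index == 0:
                    out.write(header)
                out.writelines(src)
```

Each chunk would then be annotated in its own AnnotSV run (for example with `-SVinputFile` and `-outputFile` set per chunk, as in the argument listings in this thread), and the resulting TSV files passed to `merge_tsv` in order.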

Repository owner deleted a comment from ekarlins Jun 12, 2020
@lgmgeo lgmgeo closed this as completed Jun 12, 2020
@Mkddb

Mkddb commented Feb 15, 2021

Hello Veronique,

Hope you are doing great.
I also had a similar issue a while ago, so posting my query in the same thread.

You are right: AnnotSV doesn't annotate a very large SV input file, such as a merged file from multiple genomes or a large-scale study. Can we expect a fix for this in an upcoming version,
or should I continue annotating split input files and merging the results afterwards?

Thanks in advance.

@lgmgeo
Owner

lgmgeo commented Feb 15, 2021

Hi Mkddb,

Good news, a new collaboration is just being set up to implement AnnotSV in Python. It should solve this bug!
But it may take a while, I can't say exactly how long.

I will keep this thread open until then.

Best,
Véronique

@lgmgeo lgmgeo reopened this Feb 15, 2021
@Mkddb

Mkddb commented Feb 16, 2021

Okay, great.

That's good news. Good luck with the new collaboration.
Thanks for letting me know.

@Stikus

Stikus commented Feb 24, 2021

Hello @lgmgeo
We experienced the same problem:

AnnotSV 3.0

Copyright (C) 2017-2020 GEOFFROY Veronique

Please feel free to contact me for any suggestions or bug reports
email: veronique.geoffroy@inserm.fr

Tcl/Tk version: 8.6

Application name used (defined with the "ANNOTSV" environment variable):
/soft/AnnotSV-3.0


...downloading the configuration data (February 21 2021 - 20:49)
        ...configuration data by default
        ...configuration data from /soft/AnnotSV-3.0/etc/AnnotSV/configfile
        ...configuration data given in arguments
        ...checking all these configuration data

...checking the annotation data sources

...listing arguments
        ******************************************
        AnnotSV has been run with these arguments:
        ******************************************
        -REreport no
        -SVinputFile /var/lib/cwl/stg3e23f422-fa51-450a-906d-1d842ff98155/77050025_NEB_manta.candidateSV.vcf
        -SVinputInfo 1
        -SVminSize 50
        -annotationMode both
        -annotationsDir /var/lib/cwl/stgff43072e-534a-4e34-a491-3fde1a061db3/ref/AnnotSV/3.0
        -bcftools bcftools
        -bedtools bedtools
        -candidateGenesFiltering no
        -genomeBuild GRCh38
        -includeCI yes
        -metrics us
        -minTotalNumber 500
        -organism Human
        -outputDir /LlySzz/output
        -outputFile 77050025_NEB.annotsv.tsv
        -overlap 100
        -overwrite yes
        -promoterSize 500
        -rankFiltering 1 2 3 4 5
        -reciprocal no
        -samplesidBEDcol -1
        -snvIndelPASS 0
        -svtBEDcol -1
        -tx RefSeq
        ******************************************

max size for a Tcl value (2147483647 bytes) exceeded

We will wait for the Python implementation and the fix.

@nvnieuwk
Contributor

Hi, any news on the Python version? We're also hitting this maximum value size error. I'll implement a temporary splitting solution for now, but that isn't really the best for SV VCFs with a lot of BNDs 😁

@lgmgeo
Owner

lgmgeo commented Jan 15, 2024

> but that isn't really the best for SV VCFs with a lot of BNDs

You're right.

OK, I have to admit that I can't find the time to reimplement AnnotSV in Python.

I'll look at this bug from another angle and see how to fix it in the Tcl code. Not in the near future, but as soon as possible.

@nvnieuwk
Contributor

Thanks for your response!
Don't worry, it's not that urgent since I've implemented a workaround for now in my pipeline. (Splitting per SVTYPE)
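
For readers who want to try a similar per-SVTYPE split, here is a rough sketch. It is not the pipeline code mentioned above, which is not shown in this thread. The function names `group_by_svtype` and `write_groups` and the output naming are hypothetical, and the sketch assumes an uncompressed VCF whose eighth column is INFO and carries an `SVTYPE=` key. Records without that key are collected under `UNKNOWN`. Since all breakend records share `SVTYPE=BND`, they stay in one file, but a single very frequent type could still be large.

```python
import re
from collections import defaultdict

# Matches the SVTYPE key at the start of INFO or after a ";" separator.
SVTYPE_RE = re.compile(r"(?:^|;)SVTYPE=([^;]+)")


def group_by_svtype(vcf_lines):
    """Return (header_lines, {svtype: [record_lines]}) for an iterable of VCF lines."""
    header = []
    groups = defaultdict(list)
    for line in vcf_lines:
        if line.startswith("#"):
            header.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        info = fields[7] if len(fields) > 7 else ""
        match = SVTYPE_RE.search(info)
        svtype = match.group(1) if match else "UNKNOWN"
        groups[svtype].append(line)
    return header, dict(groups)


def write_groups(header, groups, prefix):
    """Write one VCF per SV type as <prefix>.<SVTYPE>.vcf and return the paths."""
    paths = {}
    for svtype, records in groups.items():
        path = f"{prefix}.{svtype}.vcf"
        with open(path, "w") as out:
            out.writelines(header)
            out.writelines(records)
        paths[svtype] = path
    return paths
```

A typical use would be `header, groups = group_by_svtype(open("input.vcf"))` followed by `write_groups(header, groups, "input")`, with one AnnotSV run per resulting file.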
