Out of bounds error #12

Open
diego-rt opened this issue Nov 14, 2023 · 1 comment

Comments

@diego-rt

Hey there neighbors,

Thanks a lot for producing the only competent dot plotter out there!

I am getting the following error after running Gepard on a 1.08 Gbp sequence.

Loading substitution matrix...
Loading sequence from split.fasta
Loading sequence from split.fasta
Calculating suffix array... 
Calculating dotplot... 
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index -1072693259 out of bounds for length 1083253305
	at org.gepard.common.SuffixArray.search(SuffixArray.java:84)
	at org.gepard.common.DotMatrix.calcDotMatrix(DotMatrix.java:211)
	at org.gepard.common.DotMatrix.<init>(DotMatrix.java:144)
	at org.gepard.client.cmdline.CommandLine.main(CommandLine.java:310)

I assume this out-of-bounds error comes from an integer overflow somewhere? My understanding is that a Java `int` can hold values up to about 2.1 billion (2^31 - 1), so the sequence length itself should fit. I therefore assume the culprit is some arithmetic that isn't overflow-safe?
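
Here is a minimal sketch of the kind of non-overflow-safe math that could produce a negative index like the one above. It assumes the suffix array search is a binary search that computes its midpoint as `(low + high) / 2`, which is a guess and not confirmed from `SuffixArray.java`. The class name, the method names, and the value of `low` are hypothetical. `low` was picked only so that the wrapped result matches the index in the trace.

```java
public class MidpointOverflowDemo {
    // Naive midpoint: the int sum low + high wraps around once it exceeds Integer.MAX_VALUE.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    // Overflow-safe midpoint: high - low always fits in an int when 0 <= low <= high.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int length = 1083253305;  // array length reported in the stack trace
        int high = length - 1;    // last valid index
        int low = 1066327474;     // hypothetical lower bound, chosen for illustration only

        System.out.println(naiveMid(low, high)); // prints -1072693259
        System.out.println(safeMid(low, high));  // prints 1074790389
    }
}
```

Both bounds fit in an `int` on their own, but their sum (2149580778) does not, so the naive version goes negative as soon as the search window sits in the upper part of an array longer than about 1.07 billion entries. If this is what is happening, `low + (high - low) / 2` or `(low + high) >>> 1` would avoid it.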

Thanks a lot!

@diego-rt

I've tried splitting the huge monolithic FASTA sequence (i.e. 1.5 Gbp) into a multi-sequence FASTA where no sequence is larger than 1 Gbp (i.e. 1 Gbp + 0.5 Gbp), but that hasn't worked either. Perhaps enabling this is the best approach for dealing with genome-scale sequences?
