
command "paladin prepare -r2" throwing a memory-related error #40

jaidevjoshi83 opened this issue Mar 14, 2019 · 7 comments

jaidevjoshi83 commented Mar 14, 2019

Hi,
Is there any way (or hack) to run this step without hitting the error below on a machine with limited memory (256 GB in my case)? Alternatively, can I use a pre-prepared or pre-indexed database?

Error: "Constructing BWT for the packed sequence... [is_bwt] Failed to allocate 482146077304 bytes at is.c line 212: Cannot allocate memory"

Kindly suggest.

@davidfbibby

Hello,
I get the same error when attempting to index the RVDB protein database. The protein FASTAs are renamed to protein.faa.gz, and when I run paladin index -r3 protein.faa.gz, I get the following:

[M::command_index] Translating protein sequence...0.00 sec
[M::command_index] Packing protein sequence... 93.97 sec
[M::command_index] Constructing BWT for the packed sequence... [is_bwt] Failed to allocate 134975224056 bytes at is.c line 212: Cannot allocate memory

Although protein.faa.gz is large (461 MB), I am working on a large cluster, so I was surprised to encounter this problem.
Many thanks for any help you can provide.

Dave

@ToniWestbrook
Owner

Hi @davidfbibby - to double-check, I just indexed the latest revision of the clustered RVDB (around 3.1 GB of amino acids uncompressed) while profiling memory usage. The maximum resident size during indexing for this reference is 56 GB, but as you can see above, it actually allocates a larger working buffer (roughly 128 GB), so you'll need at least that much memory (in terms of system RAM and/or job constraints) to complete the indexing process. Does your system have that much memory?
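For anyone wanting a quick sanity check before submitting an indexing job: the two data points in this thread (a ~3.1 GB uncompressed protein reference triggering a 134,975,224,056-byte is_bwt request) work out to roughly 43-44 bytes allocated per byte of uncompressed reference. The snippet below extrapolates from that ratio; it is a back-of-the-envelope heuristic derived from this thread, not a documented paladin constant.

```shell
# Rough estimate of the is_bwt allocation paladin will request, using the
# ~43.5 bytes-per-reference-byte ratio observed in this thread.
ref_bytes=3100000000   # uncompressed reference size in bytes (example: ~3.1 GB)
awk -v n="$ref_bytes" 'BEGIN { printf "estimated BWT allocation: ~%.0f GB\n", n * 43.5 / 1e9 }'
```

For the 3.1 GB clustered reference this prints an estimate of ~135 GB, in line with the allocation shown in the log above; compare the estimate against your available RAM or scheduler memory limit before starting.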

@davidfbibby

Hi,
I was using the unclustered dataset, which is over 8 GB! Maybe I should try the clustered version...
I'm not sure about my available memory tbh, but if the clustered version fails, I'll enquire.

Thanks for the quick response,

Dave

@davidfbibby

Another question - on https://rvdb-prot.pasteur.fr/, I can only find the unclustered dataset.

@ToniWestbrook
Owner

ToniWestbrook commented Apr 12, 2022

Here's the link to the group that maintains the clustered RVDB: https://rvdb.dbi.udel.edu/ (both the clustered and unclustered references are available there). Indexing the unclustered DB would need significantly more memory, so it would be best to try the clustered version first if it works for your purposes. Hope that helps.

@ToniWestbrook
Owner

ToniWestbrook commented Apr 12, 2022

Apologies, that's a nucleotide version of the reference at that link! I totally missed that when I downloaded it yesterday. I'll look around for a clustered version of the protein database - if there isn't one, you may have to cluster it yourself to fit into memory. Sorry again.
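If it does come to clustering the protein set yourself, one common route (a hypothetical sketch, not advice from the paladin docs) is CD-HIT; the flags below are standard CD-HIT options, and the filenames follow the protein.faa.gz naming used earlier in this thread.

```shell
# Hypothetical sketch: shrink the unclustered RVDB protein FASTA by
# clustering at 90% identity with CD-HIT before indexing.
# Assumes cd-hit is installed and protein.faa.gz is the reference.
gunzip -k protein.faa.gz                  # CD-HIT reads uncompressed FASTA; -k keeps the .gz
cd-hit -i protein.faa -o protein90.faa \
       -c 0.9 -n 5 -M 64000 -T 8         # 90% identity, word size 5, 64 GB cap, 8 threads
paladin index -r3 protein90.faa           # index the smaller, clustered set
```

The identity threshold (-c) trades sensitivity against index size and memory, so pick it to suit your analysis rather than copying 0.9 blindly.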

@davidfbibby

Ooof. I don't fancy clustering them. I'll see if I can get more memory to allocate...
Thanks again for the quick responses.

Dave
