Hi. I ran downloadRefSeq.pl as follows:
downloadRefSeq.pl --seqencesOutDirectory data/metamaps-db/refseq --taxonomyOutDirectory data/metamaps-db/taxonomy
After about two days of downloading data and printing progress output, it failed with "Cannot change working directory into assembly path na na: No such file or directory" and no other explanation. It had successfully processed all bacterial genomes but only got through 5 out of 323 fungal genomes. Looking in the data/metamaps-db/refseq/fungi directory, I see only six subdirectories for six species, while assembly_summary.txt lists many more. I have about 20TB of free disk space left, so that can't be the cause.
Does it mean that some previous data retrieval steps failed? Is there a way to safeguard against this? Or fix it and resume from where it left off?
I fixed the error by changing ftp to https in one line of downloadRefSeq.pl.
Original: (my $assembly_path_FTP = $assembly_path_fullURL) =~ s/ftp:\/\/ftp.ncbi.nlm.nih.gov//g;
New: (my $assembly_path_FTP = $assembly_path_fullURL) =~ s/https:\/\/ftp.ncbi.nlm.nih.gov//g;
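If the URL scheme in assembly_summary.txt ever changes again, a pattern that accepts either prefix would avoid another edit. Below is a minimal shell sketch of that idea using sed, not the MetaMaps code itself; `strip_ncbi_prefix` is a hypothetical helper name and the example paths are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetaMaps): strip the NCBI host prefix
# whether the URL starts with ftp:// or https://.
strip_ncbi_prefix() {
  printf '%s\n' "$1" | sed -E 's#^(ftp|https)://ftp\.ncbi\.nlm\.nih\.gov##'
}

# Both calls print the same server-relative path (example paths are invented).
strip_ncbi_prefix "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/example"
strip_ncbi_prefix "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/example"
```

The same alternation should carry over to the Perl substitution as `s/^(?:ftp|https):\/\/ftp\.ncbi\.nlm\.nih\.gov//`, though I have only sketched it here and not run it inside downloadRefSeq.pl.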
I added a conditional statement that skips to the next species if $assembly_path_fullURL eq "na"; an "na" path is what caused that error to be thrown. I used the following sed command to insert the logic:
sed -i 's|# last SPECIES if($downloaded_assemblies > 100);|if($assembly_path_fullURL eq "na"){\n\t\t\t\tnext SPECIES; \n\t\t\t}\n|g' ./downloadRefSeq.pl
This will replace this comment line # last SPECIES if($downloaded_assemblies > 100); with the following if statement:
if($assembly_path_fullURL eq "na"){ next SPECIES; }
Keep in mind that if an update to MetaMaps removes the # last SPECIES if($downloaded_assemblies > 100); comment, this sed command will no longer match anything and the patch will silently not be applied.
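To make that failure mode visible, the patch can check for the marker comment before running sed. This is a minimal sketch, assuming GNU sed (for `-i` and for `\n`/`\t` in the replacement); `patch_skip_na` is a hypothetical wrapper name, not something shipped with MetaMaps.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sed patch: refuse to run if the marker
# comment is missing, and keep a backup of the script before editing it.
patch_skip_na() {
  local script="$1"
  local marker='# last SPECIES if($downloaded_assemblies > 100);'

  # -F matches the marker as a fixed string, so $ and ( ) are not treated as regex.
  if ! grep -qF "$marker" "$script"; then
    echo "marker comment not found in $script; patch not applied" >&2
    return 1
  fi

  # Back up the original, then apply the same substitution as the sed command above.
  cp "$script" "$script.bak"
  sed -i 's|# last SPECIES if($downloaded_assemblies > 100);|if($assembly_path_fullURL eq "na"){\n\t\t\t\tnext SPECIES; \n\t\t\t}\n|g' "$script"
}
```

Usage would be `patch_skip_na ./downloadRefSeq.pl`. A second run on an already patched file reports that the marker is missing instead of silently doing nothing.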