Commit

update documentations
smirarab committed Aug 10, 2021
1 parent 370ae2d commit 8e2811b
Showing 2 changed files with 27 additions and 1 deletion.
4 changes: 3 additions & 1 deletion README.md
@@ -18,7 +18,9 @@ All questions and inquires should be addressed to our user email group: `pasta-u
* The current PASTA code is heavily based on the [SATé code](http://phylo.bio.ku.edu/software/sate/sate.html) developed by Mark Holder's group at KU. Refer to sate-doc directory for documentation of the SATé code, including the list of authors, license, etc.
* [Niema Moshiri](https://github.com/niemasd) has contributed to the import to dendropy 4 and python 3 and to the Docker image.

**Documentation**: In addition to this README file, you can consult our [Tutorial](pasta-doc/pasta-tutorial.md).
# Documentation

In addition to this README file, you can consult this [Tutorial](pasta-doc/pasta-tutorial.md).

INSTALLATION
===
24 changes: 24 additions & 0 deletions pasta-doc/pasta-tutorial.md
@@ -393,6 +393,30 @@ This reads your final PASTA alignment and masks out columns with at most 20 non-

If there are features that you would like to see implemented in `run_seqtools.py`, let us know, and we will try to add them.

#### Restarting PASTA from a previous run

Suppose you ran PASTA and the job stopped after one iteration had finished.
Now you want to run another iteration.
What do you do?

You can use
~~~bash
zcat pastajob_temp_iteration_1_seq_unmasked_alignment.gz |run_seqtools.py -informat COMPACT3 -outformat FASTA -outfile iterate1.fasta -rename pastajob_temp_name_translation.txt
~~~
to get a file called `iterate1.fasta` that you can give as input to PASTA for the next iteration.
Here is what the command does:
1. `zcat` uncompresses the unmasked alignment file.
2. The unmasked alignment, which is stored in the COMPACT3 format, is converted to FASTA.
3. The internal PASTA names are translated back to the original names.
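
If you prefer to script these three steps instead of typing the pipe by hand, they can be sketched in Python as follows. This is a minimal sketch and not part of PASTA: the function names `build_seqtools_command` and `export_iteration` are hypothetical, and the sketch assumes that `run_seqtools.py` is on your `PATH` and reads the alignment from standard input, as the pipe in the command above implies.

```python
import gzip
import shutil
import subprocess


def build_seqtools_command(outfile, rename_file):
    """Build the same run_seqtools.py arguments as the tutorial command."""
    return [
        "run_seqtools.py",
        "-informat", "COMPACT3",
        "-outformat", "FASTA",
        "-outfile", outfile,
        "-rename", rename_file,
    ]


def export_iteration(alignment_gz, outfile, rename_file):
    """Hypothetical helper: decompress the alignment and stream it to run_seqtools.py."""
    cmd = build_seqtools_command(outfile, rename_file)
    # gzip.open plays the role of zcat; the data is copied in chunks,
    # so the uncompressed alignment is never held in memory all at once.
    with gzip.open(alignment_gz, "rb") as src:
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
        shutil.copyfileobj(src, proc.stdin)
        proc.stdin.close()
        if proc.wait() != 0:
            raise RuntimeError("run_seqtools.py exited with an error")
```

Calling `export_iteration("pastajob_temp_iteration_1_seq_unmasked_alignment.gz", "iterate1.fasta", "pastajob_temp_name_translation.txt")` would then be expected to produce the same `iterate1.fasta` as the shell command.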

The output file may be very large.
For very large input, you may be OK with losing some very gappy portions of the alignment between the two steps.
If so, you can use
~~~bash
zcat pastajob_temp_iteration_1_seq_unmasked_alignment.gz |run_seqtools.py -informat COMPACT3 -outformat FASTA -outfile iterate1.fasta -rename pastajob_temp_name_translation.txt -masksitesp 0.0001
~~~
to remove sites that have (for example) 99.99% or more gaps.
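
To make the effect of a fractional threshold concrete, here is a small Python illustration of site masking on a toy alignment. It is only a sketch of the idea, not the code that `run_seqtools.py` runs: the function name `mask_gappy_sites` is hypothetical, and treating a column that sits exactly at the threshold as removed is an assumption.

```python
def mask_gappy_sites(alignment, min_nongap_fraction, gap="-"):
    """Drop columns whose fraction of non-gap characters is at or below the threshold.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    names = list(alignment)
    rows = [alignment[name] for name in names]
    n_seqs = len(rows)
    # Keep a column only if its non-gap fraction is strictly above the threshold
    # (the handling of the exact boundary is an assumption of this sketch).
    keep = [
        col
        for col in range(len(rows[0]))
        if sum(row[col] != gap for row in rows) / n_seqs > min_nongap_fraction
    ]
    return {name: "".join(row[c] for c in keep) for name, row in zip(names, rows)}


toy = {
    "s1": "AC-G",
    "s2": "A--G",
    "s3": "A--T",
    "s4": "AC-G",
}
# A deliberately large threshold (0.5) so the effect is visible on four sequences.
masked = mask_gappy_sites(toy, 0.5)
print(masked)
```

With a threshold as small as 0.0001, a column is dropped only when nearly every sequence has a gap there, so a setting like this mainly reduces file size on very large alignments.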

Step 7: Running PASTA using configuration files
---
As mentioned before, the configurations used for running PASTA are all saved to a configuration file, and PASTA can also be run using a configuration file. These configuration files are useful for multiple purposes: for example, if you want to reproduce a PASTA run, or if you want to report the exact configurations used. Always make sure to keep the produced configuration files for future reference. Note, however, that configuration files can be used as input only from the command line.
