TAMA GO: Degradation Signature
This tool in TAMA-GO is used to assess the degradation signature of cDNA libraries and to get useful stats about the annotation.
The degradation signature (DegSig) is calculated from two counts: the number of collapsed transcripts produced by TAMA Collapse with the "capped" algorithm and the number produced with the "no_cap" algorithm.
Formula:
DegSig = ((capped transcripts) - (no cap transcripts)) / (capped transcripts)
However, only transcripts from genes with multiple exons and multiple read support are used. This prevents counting transcript models that cannot contribute to the signal: genes with single read support will not differ between the two collapsing modes, and genes with only single exons have no splice junctions with which to assess exon cascading.
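As a worked example of the formula, the following minimal Python sketch computes DegSig from the two multi-exon, multi-read transcript counts. The function name `degradation_signature` is hypothetical and is not part of TAMA; the counts are the ones reported in the example output on this page.

```python
def degradation_signature(capped_count, nocap_count):
    """Return (capped - no_cap) / capped for multi-exon, multi-read transcript counts.

    Hypothetical helper for illustration only; not part of TAMA.
    """
    if capped_count <= 0:
        raise ValueError("capped transcript count must be positive")
    return (capped_count - nocap_count) / capped_count


# Counts taken from the example output on this page.
deg_sig = degradation_signature(61187, 35672)
print(round(deg_sig, 4))  # 0.417
```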
Note: The Degradation Signature calculation requires that the TAMA Collapse runs were done on mapped FLNC reads that had not yet been clustered. Clustering before mapping (as in the official Iso-Seq pipeline) already collapses a portion of the degraded products into longer reads, hiding them from TAMA Collapse. Running the Degradation Signature tool on TAMA Collapse runs from cluster/polish reads will therefore underestimate degradation.
tama_degradation_signature.py
To run tama_degradation_signature.py you first need to run TAMA Collapse with both the "Capped" and "No Cap" algorithms. Both runs should otherwise have identical parameters. The inputs for this tool are the trans_read.bed files from these runs.
USAGE:
python tama_degradation_signature.py -c capped_trans_read.bed -nc nocap_trans_read.bed -o outfile_name
Input explanation:
capped_trans_read.bed - This is the trans_read.bed file that was output from the TAMA Collapse capped run.
nocap_trans_read.bed - This is the trans_read.bed file that was output from the TAMA Collapse no_cap run.
outfile_name - This is the name of the output file which will contain a summary of stats including the degradation signature.
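For scripted pipelines, the USAGE line above could be assembled and launched from Python. This is a minimal sketch rather than part of TAMA: the helper names `build_degsig_command` and `run_degsig` are hypothetical, and it assumes that `python` on your PATH is an interpreter compatible with the TAMA scripts and that the script path is correct for your installation.

```python
import subprocess


def build_degsig_command(capped_bed, nocap_bed, outfile,
                         script="tama_degradation_signature.py", python="python"):
    """Assemble the argument list matching the documented USAGE line."""
    return [python, script, "-c", capped_bed, "-nc", nocap_bed, "-o", outfile]


def run_degsig(capped_bed, nocap_bed, outfile, **kwargs):
    """Run the tool; raises CalledProcessError if it exits with a non-zero status."""
    cmd = build_degsig_command(capped_bed, nocap_bed, outfile, **kwargs)
    subprocess.run(cmd, check=True)


# Only builds and prints the command; call run_degsig(...) to execute it.
cmd = build_degsig_command("capped_trans_read.bed", "nocap_trans_read.bed", "outfile_name")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.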
The output will look like this:
Degradation Signature = 0.41700034321
Capped multi-exon, multi-read, transcript count = 61187
No-cap multi-exon, multi-read, transcript count = 35672
Capped total transcript count = 76518
No-cap total transcript count = 49544
Capped single exon trans count = 28755
No-cap single exon trans count = 15854
Capped multi exon trans count = 47763
No-cap multi exon trans count = 33690
Capped total gene count = 21722
No-cap total gene count = 21722
Capped single exon gene count = 11158
No-cap single exon gene count = 11158
Capped multi exon gene count = 10564
No-cap multi exon gene count = 10564
Capped single exon single read gene count = 9794
No-cap single exon single read gene count = 9794
Capped multi exon single read gene count = 2019
No-cap multi exon single read gene count = 2019
Note that gene counts should be the same for the capped and no_cap runs. These numbers are shown for troubleshooting in case the wrong input files are used.
A degradation signature higher than 0.25 is considered high and indicates a large number of degraded products in the sequenced RNA.
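The two checks described above (matching gene counts and the 0.25 threshold) could be automated along these lines. This is a sketch under assumptions: it supposes the summary file holds one `name = value` pair per line, and the function names `parse_degsig_summary` and `check_summary` are hypothetical, not part of TAMA.

```python
def parse_degsig_summary(text):
    """Parse 'name = value' lines into a dict of floats (assumed file layout)."""
    stats = {}
    for line in text.splitlines():
        if " = " not in line:
            continue  # skip blank or unexpected lines
        name, value = line.rsplit(" = ", 1)
        stats[name.strip()] = float(value)
    return stats


def check_summary(stats, threshold=0.25):
    """Flag a high degradation signature and any capped/no-cap gene count mismatch."""
    gene_keys = [k for k in stats if k.startswith("Capped") and "gene count" in k]
    mismatched = [k for k in gene_keys
                  if stats[k] != stats.get(k.replace("Capped", "No-cap", 1))]
    return {
        "high_degradation": stats["Degradation Signature"] > threshold,
        "mismatched_gene_counts": mismatched,
    }


# A few lines from the example output on this page.
example = """Degradation Signature = 0.41700034321
Capped total gene count = 21722
No-cap total gene count = 21722
Capped multi exon gene count = 10564
No-cap multi exon gene count = 10564
"""
result = check_summary(parse_degsig_summary(example))
print(result)
```

A non-empty `mismatched_gene_counts` list suggests the two input files did not come from otherwise identical TAMA Collapse runs.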