TAMA collapse IndexError #126
Comments
Hi, My bad. I am the developer of uLTRA. What you observe is an error in how uLTRA outputs the cigar (problematic cigar string as it ends with In the meantime, perhaps it is possible to handle such errors gracefully in TAMA with a Best,
Thanks a lot for the fast reply! I've been trying to make sense of this error for a few days and I would never have found the answer. When you say that the read I showed looks problematic, is it a formatting issue, or does it look like bad quality or a bad alignment?
So I still have a bunch of problematic reads... I attach the sam file with the reads mentioned in the error message here:
I was referring to the long homopolymer/STR. However, a mature aligner should handle these things seamlessly (diss to my own aligner). Although I now see that other established aligners are also struggling with the cigar output, as reported in this repo.
Not sure. Perhaps you can try to also remove those reads (as done here: #113). The 30-40 reads affected are only a tiny subset of your data, I assume(?), so removing them shouldn't change your results notably unless they are all from some gene(s) that you are interested in.
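One way to remove such reads before running TAMA is a small pre-filter on the CIGAR column. The sketch below is not the approach from #113 and not part of TAMA or uLTRA; the function names and the choice of `N` and `D` as "bad" first or last operations are assumptions based on the read posted in this thread, so adjust the set if your failing reads look different.

```python
import re

# Captures the operation letter of each "<length><op>" CIGAR element.
CIGAR_OP = re.compile(r"\d+([MIDNSHP=X])")

# Assumption: an alignment that starts or ends with a reference skip (N)
# or a deletion (D) is the kind of record that trips the look-ahead.
BAD_TERMINAL_OPS = {"N", "D"}


def has_bad_terminal_op(cigar):
    """True if the first or last CIGAR operation is in BAD_TERMINAL_OPS."""
    ops = CIGAR_OP.findall(cigar)
    if not ops:  # "*" (no alignment) or an unparseable string: leave it alone
        return False
    return ops[0] in BAD_TERMINAL_OPS or ops[-1] in BAD_TERMINAL_OPS


def filter_sam_lines(lines):
    """Split SAM lines into (kept_lines, dropped_read_ids)."""
    kept, dropped = [], []
    for line in lines:
        if line.startswith("@"):  # header lines are always kept
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 6 (index 5) of a SAM record is the CIGAR string.
        if len(fields) > 5 and has_bad_terminal_op(fields[5]):
            dropped.append(fields[0])
        else:
            kept.append(line)
    return kept, dropped
```

To apply it, read the lines of `reads.filtered.sorted.sam`, pass them to `filter_sam_lines`, and write the kept lines to a new file. It is worth checking the dropped read IDs to see whether they cluster in a gene of interest.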
Thanks for the clarification! And yeah, I ended up removing those reads too.
Hi,
I'm trying to run TAMA collapse on ONT aligned reads that have gone through the following pipeline:
-Adapter trimming and reorienting with Pychopper
Now, I'm using this command to run TAMA collapse
python tama/tama_collapse.py -s reads.filtered.sorted.sam -f $ref_genome_dir/$ref_genome -x no_cap -a 100 -z 100 -m 10 -p tama/collapse/mysample
And I'm getting this error:
sam count 1530000
Traceback (most recent call last):
  File "tama/tama_collapse.py", line 5190, in <module>
    [h_count,s_count,i_count,d_count,mis_count,nomatch_dict,sj_pre_error_list,sj_post_error_list] = calc_error_rate(start_pos,cigar,seq_list,scaff_name,read_id)
  File "tama/tama_collapse.py", line 847, in calc_error_rate
    next_cig_flag = cig_char_list[i + 1]
IndexError: list index out of range
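The failing line reads the element after the current one, which cannot work when the current element is the last in the list. The toy example below reproduces that pattern and shows a bounds-checked alternative. Only the name `cig_char_list` comes from the traceback; the helper `next_op_or_none` and the toy list are hypothetical, and TAMA's real logic presumably looks ahead only for certain operations, since most reads pass.

```python
def next_op_or_none(cig_char_list, i):
    """Return the operation after index i, or None if i is the last index."""
    return cig_char_list[i + 1] if i + 1 < len(cig_char_list) else None


# Toy list of CIGAR operation characters ending in a reference skip.
cig_char_list = ["=", "X", "D", "N"]

# Unguarded look-ahead: fails once i reaches the final element.
unguarded_error = None
try:
    for i in range(len(cig_char_list)):
        next_cig_flag = cig_char_list[i + 1]
except IndexError as err:
    unguarded_error = str(err)

# Guarded look-ahead: the last position simply has no successor.
guarded = [next_op_or_none(cig_char_list, i) for i in range(len(cig_char_list))]
```

A caller using the guarded form still has to decide what a missing successor means, for example skipping the read and logging its ID.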
I guess it has something to do with the cigar string and errors around splice junctions. I've also found the first read in my sam file that causes the error (I think there's more than one). It's this one:
99:1914|a1ac6b88-3c74-4652-8b19-b682f6e0e40d 0 NC_027836.1 1435 60 2X3=1D1X1=1X4=4I3=3X3=3X166=1X8=1X11=3D104=1I84=2D56=1X111=2X113=1X74=1I1X12=2X6=1X8=1X26=1X2=1X17=1X6=1X88=1X15=1X58=1X137=1X38=1X11=1X5=1X14=1X7=1X10=1X31=1I3=1I32=44I1X2=1I2=1D1X2=1D2=180I1=1X2=1X3=1X2=1D1=1X43=1X2=1X3=1D1=1X3=1X45=1D112=1X7D2992N * 0 0 CGACTTGGCATTAGAATTAGGCTTTGGGGCGAAAATGACTTTATTCAACAAATCATAAAGATATTGGAACATTATATTTTATTTTTGGAATTTGAGCAGGGATAGTAGGTACTTCTTTAAGTTTATTAATTCGAGCTGAATTAGGGACTCCAGGATCTTTAATTGGAGATGATCAAATTTATAATACTATTGTAGCAGCTCATACTTTTATTATATTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTTGTACCTTTAATATTAGGAGCCCCTGATATAGCTTTCCCACGTATAAATAATATAAGTTTTTTGACTTTTACCCCCATCTTTAACTTTATTAATTTCTAGTAGCATTGTAGAAAATGGAGCAGGAACTGGATGAACAGTTTACCCCCCTCTCCTCTAATATTGCTCATGGTGGTAGTTCAGTAGATTTAGCTATTTTCCCACTTCATTTAGCTGGAATTTCATCTATTTTAGGAGCTATTAACTTTATTACTACTATTATTAATATACGATTAAATAATTTATCATTTGATCAAATACCTTTATTTATTTAGGCTGTAGGTATTACTGCATTCTTATTATTATTATCTTTACCTGTTTTAGCCGGAGCTATTACTATATTACTTACTGATCGAAATTTAAATACATCATTTTTCGATCCTGCAGGTGGAGGTGATCCTATTCTTTATCAACATTTATTTTGATTTTTTGGACATCCTGAAGTATATATTTTAATTTTACCAAGGATTTGGTATACATTCTCATATTATTTCCCAAGAAAGAGGTAAAAAGGAAACATTCGGGTGTTTAGGTATAATTTACGCTATACTAGCAATTGGTTTATTAGGATTTATTGTTTGAGCTCATCATATATTTACTGTAGGAATAGATATTGATACACGAGCATATTTTACATCAGCAACAATAATTATTACTGTACCAACAGGTATTAAAATTTTTAGTTGATTAGCTACTTTCCATGGAACTCAAATTAATTATTCCCCATCTATTTTATGAAGATTAGGATTTGTATTTTTATTTACTGTAGGAGGATTAACAGGTGTAATTTTATCTAATTCTTCTATTGATATTACTTTACATGATACTTACTATGTAGTTGCTCATTTCCATTATGTTTTATCAATAGGAGCTGTATTTGCTATTTTAGGGGGATTTATTCATTGATACCCATTATTTACTGGTTTATCTTCAAATCCTTATTTATTAAAAATTCAATTTTTTATTATATTTATCCGGAAGTAAATTTAACTTTCTTCCCACAACATTTTTGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGATGCATGCCTGGTGGTATTCTGATTATCCTGATTCTTATATTTCATGAAATATTATTTTATTATTGAATCATATATTTCATTATTAGCAATCATATTTATATTAATTATTATTTGAAATCTATAATTAATCAA
CGAATTTCTTTATTTACTTTAAATTTATCTTCTTCAATTGAATGATATCAAAATTTACCACCAGCTGAACATTCATATAATGAATTGCCTATTTTT * XA:Z: XC:Z:Insufficient_junction_coverage_unclassified NM:i:297
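To check what the CIGAR of a read like this ends with, it can be split into (length, operation) pairs. This is a minimal sketch using only the standard library; `parse_cigar` is a hypothetical helper, and `tail` is just the last few operations copied from the CIGAR above.

```python
import re

# One CIGAR element: a run length followed by a SAM operation letter.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) tuples."""
    ops = [(int(n), op) for n, op in CIGAR_ELEMENT.findall(cigar)]
    # Sanity check: the parsed pieces must rebuild the input exactly,
    # otherwise the string contained something that is not a CIGAR element.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"unparseable CIGAR: {cigar!r}")
    return ops


tail = "45=1D112=1X7D2992N"  # end of the CIGAR string of the read above
ops = parse_cigar(tail)
last_len, last_op = ops[-1]
print(last_len, last_op)
```

Here the last two elements are a 7 bp deletion followed by a 2992 bp reference skip, so the alignment ends on an `N` with no aligned bases after it.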
Could you help me understand what's going on? How could I filter the sam file to get rid of problematic reads?
Thank you!