Error with exon end earlier than exon start #113
Comments
Hello, This typically happens when you get a lot of reads from a messy gene, where wobble bridging starts to create a collapsed exon end that occurs before the collapsed exon start. However, to be sure, it would help to see the SAM lines from those two reads. Could you post those? Thank you,
Hello, Richard
What parameters are you using to run TAMA Collapse?
python /public/home/tama/tama_collapse.py -d merge_dup -x no_cap -b BAM -f /public/home/genome.fasta -s /public/home/Gb.bam -p Gb
This is quite strange. What mapper did you use and what parameters?
pbmm2 align --preset ISOSEQ --unmapped --sort -j 6 --log-level INFO --log-file pbmm2.log /public/home/genome.fasta --sample Gb ./Gb.flnc.bam Gb.bam
The problem seems to be coming from the alignment. You can see it from the CIGAR string for read "m64041_201001_163935/99746185/ccs":
2194=2I1172=2767N183=1I132=1D2=510N326I850N254=1D518=202N914=
The issue is here:
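One way to inspect a CIGAR string like the one quoted above is to split it into exon blocks at every N (intron) operation and look at how much reference each block covers. The following is a minimal sketch, not TAMA Collapse's own code: the function name `exon_blocks`, the 0-based half-open coordinates, and the alignment start of 0 are all assumptions made for illustration. A block made only of insertions consumes no reference bases, so its length is zero, and in 1-based inclusive coordinates its end would fall one base before its start. Whether that is the exact spot referred to above is an assumption.

```python
import re

# CIGAR string quoted in the thread for read m64041_201001_163935/99746185/ccs
CIGAR = "2194=2I1172=2767N183=1I132=1D2=510N326I850N254=1D518=202N914="

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference (SAM specification)
REF_CONSUMING = set("MDN=X")


def exon_blocks(cigar, ref_start=0):
    """Split an alignment into exon blocks at each N operation.

    Returns 0-based, half-open (start, end) reference intervals.
    ref_start is a hypothetical alignment start used for illustration.
    """
    blocks = []
    pos = ref_start
    block_start = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            # Close the current exon block, then skip over the intron
            blocks.append((block_start, pos))
            pos += length
            block_start = pos
        elif op in REF_CONSUMING:
            pos += length
        # I, S, H and P do not move along the reference
    blocks.append((block_start, pos))
    return blocks


blocks = exon_blocks(CIGAR)
# Blocks that cover no reference bases at all
empty = [b for b in blocks if b[1] - b[0] == 0]

for start, end in blocks:
    print(start, end, end - start)
print("zero-length blocks:", empty)
```

In this sketch the third block, produced by `510N326I850N`, is the only one with zero reference length.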
OK. So if I extract the reads, can the problem be solved?
In theory yes, but if the mapper has a bug then there are likely issues with more mappings, so I would not trust it until the bug is resolved. Until they understand how this bug works, you cannot know whether it is causing silent errors in other transcript models.
OK, I will give it a try. Thank you so much!
Just to clarify, are you trying to contact PacBio about the bug, or trying to remove the read info from the file to make it run through TAMA Collapse?
Both. The two approaches are being carried out simultaneously.
Nice! Very efficient.
Hi @GenomeRIK

minimap2 results
Total Gene Count: 59385
Total Transcript Count: 874323
Total Accepted Reads: 3944979
Total Discarded Reads: 4086923

pbmm2 results
Total Gene Count: 59485
Total Transcript Count: 876767
Total Accepted Reads: 3965819
Total Discarded Reads: 63775

Discarded reads: huge difference.
The larger number of discarded reads from minimap2 is probably because you did not turn off secondary mappings when running minimap2. You can check the read.txt file to see the reason each read was discarded.
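To check whether secondary mappings are present in an alignment file before collapsing, the SAM FLAG field can be tallied directly. The sketch below assumes plain-text SAM input (for example from `samtools view -h`); the function name `tally_alignment_types` and the toy records are made up for illustration, while the flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification. In minimap2, secondary alignments can be suppressed at mapping time with `--secondary=no`.

```python
# FLAG bits defined by the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def tally_alignment_types(sam_lines):
    """Count SAM records by alignment type, skipping header lines."""
    counts = {"primary": 0, "secondary": 0, "supplementary": 0, "unmapped": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or header line
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        elif flag & SECONDARY:
            counts["secondary"] += 1
        elif flag & SUPPLEMENTARY:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
    return counts


# Toy records (hypothetical, truncated to the first four SAM columns)
toy_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100",     # primary, forward strand
    "read1\t256\tchr2\t500",   # secondary
    "read2\t16\tchr1\t900",    # primary, reverse strand
    "read3\t2048\tchr3\t40",   # supplementary
    "read4\t4\t*\t0",          # unmapped
]

counts = tally_alignment_types(toy_sam)
print(counts)
```

On a real file the same function could be fed the lines of a SAM stream. A large secondary count would be consistent with the explanation above, though the read.txt file remains the direct record of why each read was discarded.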
There is some impact, but it may not make much of a difference to you depending on your objectives. Either way, the PacBio mapper is not working for you.
OK, thank you for your assistance!
Hello, RIK
When I use tama_collapse.py, I get this error: Error with exon end earlier than exon start
54827318 54827318
['m64041_201001_163935/150406967/ccs', 'm64041_201001_163935/99746185/ccs']
It is PacBio Iso-Seq data. Could you please tell me what causes this?