Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmented inverted alignments #148

Closed
armintoepfer opened this issue Apr 10, 2018 · 2 comments
Closed

Segmented inverted alignments #148

armintoepfer opened this issue Apr 10, 2018 · 2 comments

Comments

@armintoepfer
Copy link
Contributor

armintoepfer commented Apr 10, 2018

I'm looking at an example where a read has an inversion in the middle. Everything gets aligned properly, but I do not understand why the inverted part has its cigar reversed. The flag only states 2064, for supplements and reverse. Is this a special combination, leading to different record printing? Do you have documentation/specs about this case?

Examples:
Reference https://gist.github.com/armintoepfer/ef6505a5a9db5c1ba9b49ffe910b326e
Seqs: https://gist.github.com/armintoepfer/a24959b2e199784c35fc11651aee84d9

minimap2 -Y -x map-pb -a -z 400,20 ref.fasta seqs.fasta

First example:

The first sequence mid_inv should be aligned in three segments 180 fwd, 540 rev, 1380 fwd.
The longest is picked as primary,

mid_inv 0 chr02__substr__128948_131048 721 60 720S1380M

Makes totally sense.
The first part also gets aligned as forward and supplementary

mid_inv 2048 chr02__substr__128948_131048 1 60 180M1920S

Now the confusion for the middle part

mid_inv 2064 chr02__substr__128948_131048 181 0 1380S540M180S

The cigar is reverse and the actual SEQ is also reverse-complemented.

Second example end_inv

Here, the sequence is split into two segments 960 fwd, 1140 rev.
Minimap2 picks the second segment as the primary alignment, reversed the cigar and rc the SEQ

end_inv 16 chr02__substr__128948_131048 961 60 1140M960S

And the first part is aligned properly as forward

end_inv 2048 chr02__substr__128948_131048 1 60 960M1140S

But if I align a single segment that is reverse-complement, the cigar and sequence are forward strand (genomic) with flag 16 for rev.

Why are segments treated differently? And why are the inversions tagged with MAPQ 0?
Using 2.10-r763-dirty

@lh3
Copy link
Owner

lh3 commented Apr 10, 2018

The 0 mapping quality is a bug, which has just been fixed via 372c90c. I am not sure about the other questions, though. SAM/PAF only works with colinear alignments. If inversion alignment is not reported this way, what would you expect to see?

@armintoepfer
Copy link
Contributor Author

Thank you for fixing the MAPQ bug; this works now.

I'm not sure how I'd do it. I'll think about it...

@lh3 lh3 closed this as completed May 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants