minimap2 sometimes misses good base-level alignments #659
Comments
You may try setting mask level to 0 with Line 165 in c9874e2
After base alignment, the unaligned length ( PS: on that line,
Thank you, Heng. Using -M0 does in fact change the behavior.
yields
In practice,
If I'm understanding -M correctly, I think what I would ideally like is to be able to set -M in absolute terms to not miss alignments of less than approximately length L, rather than as a fraction of the alignment length. The surprising behavior, which bit me more than once, is that the absence of any reported alignment for a longish query segment does not necessarily imply that no good alignment exists for that segment. But knowing the behavior and the cause and how to control it is very helpful.
I have added a new option
I built and tested the branch and it seems to work fine. I tried it on another couple of examples also. It seems to produce a bit more stable behavior with respect to not dropping short but good alignments.
Thanks. I have merged that branch to the master.
I have run into more than one example where minimap2 fails to generate base-level alignments for a query sequence even though good alignments appear to exist.
One example is when I try to align the human hg38 alt contig chr17_GL000258v2_alt to the non-alt chr17. When I generate backbone alignments like this:
minimap2 -x asm10 chr17.fasta.gz chr17_GL000258v2_alt.fasta.gz
I get (in part) the following alignments (sorted by query start), which pretty much tile the query sequence in this region:
However, when I do base-level alignment:
minimap2 -c -x asm10 chr17.fasta.gz chr17_GL000258v2_alt.fasta.gz
Then the middle alignment from above largely disappears and the resulting alignments leave a ~10kb gap of unaligned query sequence:
The query gap segment chr17_GL000258v2_alt:534341-545038 has a good alignment to chr17:45592747-45603440:+, as suggested by the original backbone alignment.
In addition, most of the unaligned query region, specifically segment chr17_GL000258v2_alt:535164-545038 has a good alignment to the target "gap" at chr17:46242508-46252364:-.
I've attached a muscle file showing both of these alignments.
align_segments.txt
align_muscle.clw.txt
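For anyone who wants to check their own output for this kind of hole, here is a minimal sketch that lists uncovered query intervals from PAF records. It assumes standard PAF columns (query name, query length, query start, query end in columns 1 to 4); the function name `query_gaps` and the `min_gap` threshold are hypothetical names chosen for illustration, and the coordinates in the toy input are made up.

```python
from collections import defaultdict


def query_gaps(paf_lines, min_gap=1000):
    """Return {query_name: [(gap_start, gap_end), ...]} for stretches of the
    query that no PAF record covers and that are at least min_gap bases long.
    Only PAF columns 1-4 (name, length, start, end) are used."""
    qlen = {}
    spans = defaultdict(list)
    for line in paf_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        name, length = fields[0], int(fields[1])
        start, end = int(fields[2]), int(fields[3])
        qlen[name] = length
        spans[name].append((start, end))

    gaps = {}
    for name, intervals in spans.items():
        intervals.sort()
        found, covered_to = [], 0
        for start, end in intervals:
            # Anything between the covered prefix and this start is a gap.
            if start - covered_to >= min_gap:
                found.append((covered_to, start))
            covered_to = max(covered_to, end)
        # Uncovered tail of the query, if any.
        if qlen[name] - covered_to >= min_gap:
            found.append((covered_to, qlen[name]))
        gaps[name] = found
    return gaps


# Toy input with made-up coordinates (not the real chr17 alignments).
toy = [
    "q\t1000\t0\t300\t+\tt\t5000\t0\t300",
    "q\t1000\t600\t900\t+\tt\t5000\t600\t900",
]
print(query_gaps(toy, min_gap=200))
```

Running this on the output of the `-c` command and on the backbone output makes it easy to see which query segments are covered by one and not the other. A query with no records at all does not appear in the result, since its length is not known from the PAF alone.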
I have tried varying the presets (asm5, asm10 and asm20 produce the same results) and I have tried turning up -z (up to -z 10000) as suggested in #158, but neither of these seems to change the behavior.
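The combinations described above can be enumerated with a small helper, shown here as a sketch rather than the exact script that was used. The function name `sweep_commands` is hypothetical; the flags (`-c`, `-x`, `-z`) and file names are the ones quoted in this issue. It only builds the command lines and does not run minimap2.

```python
import shlex


def sweep_commands(target, query,
                   presets=("asm5", "asm10", "asm20"),
                   z_values=(None, 10000)):
    """Build one minimap2 base-level alignment command per (preset, -z) pair.
    A z value of None means the preset's default -z is left untouched."""
    commands = []
    for preset in presets:
        for z in z_values:
            cmd = ["minimap2", "-c", "-x", preset]
            if z is not None:
                cmd += ["-z", str(z)]
            cmd += [target, query]
            commands.append(cmd)
    return commands


for cmd in sweep_commands("chr17.fasta.gz", "chr17_GL000258v2_alt.fasta.gz"):
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` with stdout redirected to a separate PAF file so the results can be compared side by side.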
I have two questions:
Thanks very much for any insights you can offer.
I am happy to provide the fasta files if that would be helpful.