50bp deletion not detected #64
Comments
Clair3 reports SNPs and indels, and it follows the definition that indels are <50bp (variants ≥50bp are structural variants). A hardcoded threshold in Clair3 limits the longest indel that is reported (https://github.com/HKU-BAL/Clair3/blob/main/clair3/CallVariants.py#L27). Raising the threshold enables Clair3 to report insertions and deletions ≥50bp. However, we have not tested Clair3's performance on indels ≥50bp; it depends heavily on the read length and the performance of the sequence aligner.
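To make the size convention above concrete, here is a minimal sketch of how a variant could be labelled from its REF and ALT alleles. This is not Clair3's code: the function name `classify_variant` and the parameter `max_indel_len` are hypothetical and chosen only for illustration. It assumes simple biallelic records in which the size is the length difference between the two alleles.

```python
def classify_variant(ref: str, alt: str, max_indel_len: int = 50) -> str:
    """Label a biallelic variant by size.

    Hypothetical helper, not part of Clair3. Insertions and deletions
    shorter than max_indel_len are called indels; anything at or above
    the cutoff is treated as a structural variant (SV).
    """
    if len(ref) == len(alt):
        # Equal-length alleles: single-base or multi-base substitution.
        return "SNP" if len(ref) == 1 else "MNP"
    size = abs(len(ref) - len(alt))
    return "indel" if size < max_indel_len else "SV"


# A 49bp deletion is still an indel under the 50bp convention.
print(classify_variant("A" + "C" * 49, "A"))
# A 50bp deletion crosses the cutoff and counts as an SV.
print(classify_variant("A" + "C" * 50, "A"))
# With a raised cutoff (an assumed value of 500), the same event is an indel.
print(classify_variant("A" + "C" * 50, "A", max_indel_len=500))
```

Under this convention, a deletion of exactly 50bp falls just outside the indel class, which would be consistent with a ~50bp deletion being reported only some of the time if the called length varies around the cutoff.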
I see, that makes sense. I will try a SV caller for this example. Thanks!
No. Actually, you reminded me to question the rationale for using 50bp as the indel size cutoff in the ONT era. The 50bp cutoff was to a large extent a practical cutoff imposed by the short length of NGS reads. Before NGS, the cutoff was 1kbp. I will be extending the limit to a larger value, perhaps somewhere between 200bp and 500bp, depending on the maximum reliable length of an opened gap in a typical-length ONT read. The indel length cutoff increase is scheduled for v0.1-r9, stay tuned.
Please try out the latest version.
I'm struggling to get Clair3 to detect a ~50bp deletion. Every other variant I can see by eye is detected, but this deletion, which is present in ~100% of my reads, seems hard for Clair3 to find.
I use the 0.1-r8 release, with the following parameters:
/opt/bin/run_clair3.sh --bam_fn ${NEWBASE}.bam --ref_fn ${reference} --threads=4 --platform=ont --model_path="/opt/models/ont" \
--output ${NEWBASE}_clair3 --include_all_ctgs --snp_min_af=0.01 --indel_min_af=0.001
My data is ONT PCR amplicon reads at >1000x coverage, downsampled to ~300x.
I've tested a few subsamplings: in one case at 50x coverage it did find the deletion, but in another random subsampling, again to 50x, it did not.
Is there something I can do about this, or a parameter I can tweak so that it does find it?