Insert ALT format #19

Closed
awgymer opened this issue Jun 3, 2024 · 7 comments

@awgymer

awgymer commented Jun 3, 2024

I notice some insertions with an ALT value of the form <chr>:<start>-<stop> (e.g. chr8:141415620-141415787). As far as I can tell, this is not valid per the VCF spec, and it causes downstream tools to throw errors.

Where do these INS records with this ALT format come from, and is it possible to have them replaced by a spec-compliant format?
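For anyone wanting to check their own output for such records, here is a minimal sketch that scans VCF text for ALT values of the `<chr>:<start>-<stop>` form. The function name `find_region_style_alts` and the regular expression are illustrative assumptions, not part of Severus or of any VCF library.

```python
import re

# Assumption: a "region-style" ALT is <contig>:<start>-<stop> with no angle
# brackets or square brackets (so symbolic alleles and breakends are skipped).
REGION_ALT = re.compile(r"^[^\s:<>\[\]]+:\d+-\d+$")


def find_region_style_alts(vcf_lines):
    """Yield (line_number, alt) for records whose ALT looks like chr:start-stop."""
    for lineno, line in enumerate(vcf_lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        # ALT is the fifth column; multiple alleles are comma-separated.
        for alt in fields[4].split(","):
            if REGION_ALT.match(alt):
                yield lineno, alt
```

It can be run over an open file handle, e.g. `list(find_region_style_alts(open(path)))`, where `path` is whatever VCF you are checking.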

@aysegokce
Contributor

Hello,
That is an older representation of tandem duplications in VNTRs. To unify the representation of indels in VNTRs, we convert duplications to insertions (since the same event may be represented as a duplication in some reads and as an insertion in others).

In the current version, it is replaced with <INS> in the ALT column and ALINGED_POS in the INFO column.

Please let me know if this representation works.
Ayse

@awgymer
Author

awgymer commented Jun 12, 2024

Testing with the same sample on v1.0 returns the following line:

chr8    141415620       severus_INS20553        N       <DUP>   60.0    PASS    PRECISE;SVTYPE=INS;SVLEN=118;INSIDE_VNTR=TRUE;MAPQ=60.0 GT:GQ:VAF:hVAF:DR:DV    0/0:204:0.21:1.00,0.00,0.00:22:6

This appears to be slightly different from what you have outlined here?

@aysegokce
Contributor

In this case, it is a tandem duplication of a VNTR, as INSIDE_VNTR=TRUE suggests. For those cases, we change the ALT column to <DUP>.

We use the ALINGED_POS representation for insertions with known alignments (other than duplications) as in the example below:
chr16 71242755 severus_INS13587 N CTACTCCACATTTTCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGATTCTTATATTTTGCTTTATTCCAAATTATAGCCACTAAGATTCGTATCACTTTATAGTTTCTCAGGAAAAAGCACTTCCTTTCATATAGACAATTTTAGAAAAGTTTCTAAATTACAATATACCTTAATTTTTGAAATTGCATACCAGTTTTTTTCATCATTATAATCAAAATATAAAGATCCTTCTGAACTATGATAATGTGATAACTGATATAATCTATCAGAGTTACATATGATTTCAAGGTGAAATGCTGTTTTTCATAATCTGTGTTCATGATTTTTGAATGTTGATGCATTCTTGCCTTCCTGGAATAAACACTTAGTCATAATGTATAGTTTTTTTAAAAAAAAATACCCCAGTGACTTTGATTTTCTAATATTTTATTTTGGGTTTTTTACATCTGTAGTCCTAAGTGAGATTGAACTATTAATTGTCTTTGCTTCCCTTGGCCAGTTTTGTGAAGGAATAAACG 60.0 PASS PRECISE;SVTYPE=INS;SVLEN=520;ALINGED_POS=chr3:123876019-123876539;MAPQ=60.0;PHASESETID=68978924;HP=2 GT:GQ:VAF:hVAF:DR:DV 0/1:514:0.39:0.00,0.00,1.00:126:80

Sorry for the confusion.
Ayse
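As a rough illustration of how a downstream script might read the ALINGED_POS value shown in the record above, here is a minimal sketch. The key spelling is copied from that output; the helper names `parse_info` and `parse_aligned_pos` are hypothetical and not part of Severus.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    result = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        result[key] = value if sep else True
    return result


def parse_aligned_pos(info):
    """Return (contig, start, end) from ALINGED_POS, or None if it is absent."""
    raw = parse_info(info).get("ALINGED_POS")
    if not isinstance(raw, str):
        return None
    # Split on the last ':' so contig names containing ':' are kept whole.
    contig, _, span = raw.rpartition(":")
    start, _, end = span.partition("-")
    return contig, int(start), int(end)
```

Applied to the INFO column of the chr16 record above, `parse_aligned_pos` gives `("chr3", 123876019, 123876539)`; for a record without the key, such as the earlier `<DUP>` line, it returns `None`.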

@aysegokce
Contributor

Thank you for the feedback. The VCF output has been updated in version 1.1.

@awgymer
Author

awgymer commented Jul 29, 2024

@aysegokce I have been testing the newer output format. One issue I have noticed is that, as the above example shows, a row with an ALT of <DUP> still has SVTYPE=INS in INFO. Is this deliberate? If so, is there a reason not to set it to DUP?
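A quick way to look for this kind of mismatch across a whole file is sketched below. It assumes a simple symbolic ALT such as `<DUP>` or `<INS>` is expected to agree with SVTYPE in INFO; the function name `alt_svtype_mismatches` is made up for illustration.

```python
def alt_svtype_mismatches(vcf_lines):
    """Yield (line_number, record_id, alt, svtype) where a symbolic ALT disagrees with SVTYPE."""
    for lineno, line in enumerate(vcf_lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # need columns up to INFO
        alt, info = fields[4], fields[7]
        # Only simple symbolic alleles such as <DUP> are checked here.
        if not (alt.startswith("<") and alt.endswith(">")):
            continue
        # Assumption: compare only the top-level type, so <DUP:TANDEM> counts as DUP.
        alt_type = alt[1:-1].split(":")[0]
        svtype = None
        for item in info.split(";"):
            if item.startswith("SVTYPE="):
                svtype = item[len("SVTYPE="):]
                break
        if svtype is not None and svtype != alt_type:
            yield lineno, fields[2], alt, svtype
```

On the v1.0 record quoted earlier in this thread, it reports `severus_INS20553` with ALT `<DUP>` and SVTYPE `INS`.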

@minw2828 minw2828 mentioned this issue Aug 8, 2024
@aysegokce
Contributor

Hello @awgymer,
Can you please confirm it is with v1.1? If so, please share the line from the VCF and an IGV screenshot.
Best
Ayse

@aysegokce aysegokce reopened this Aug 9, 2024
@awgymer
Author

awgymer commented Aug 27, 2024

This appears resolved in 1.1 (all <DUP> records now have SVTYPE=DUP).

2 participants