tl;dr: Is it possible to make virusbreakend work on BAMs aligned to reference genomes containing decoy sequences, such that the output is identical to what would be obtained if the decoys were not included?
The human reference genomes used in many of our pipelines include viral decoy sequences. One common example is hs37d5 (1000 Genomes), which includes an EBV (NC_007605) decoy sequence that seems to interfere with virusbreakend's ability to correctly identify EBV-positive samples (presumably because the viral reads end up mapped to the decoy?).
In my case, it would be nice to treat reads that map to the sequences NC_007605 or hs37d5 as potentially viral (and thus include them in the kraken run). Further, when it comes to breakpoint calling, the breakpoints of interest are those that occur in the non-decoy sequences.
Would it be possible to add an option that makes virusbreakend aware of decoy sequences? This would allow end-users to easily integrate virusbreakend into existing workflows irrespective of the version of the human reference genome they use.
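To illustrate the behaviour I have in mind, here is a rough sketch in plain Python. It is not how virusbreakend works internally, and `candidate_viral_reads` and `DECOY_CONTIGS` are hypothetical names I made up for this example. It reads SAM-formatted text lines and treats a read as potentially viral if it is unmapped or aligned to one of the listed decoy contigs.

```python
# Hypothetical list of decoy contig names; adjust to match the reference in use.
DECOY_CONTIGS = frozenset({"NC_007605", "hs37d5", "chrEBV"})


def candidate_viral_reads(sam_lines, decoy_contigs=DECOY_CONTIGS):
    """Return names of reads that are unmapped or aligned to a decoy contig.

    sam_lines: iterable of SAM text lines (header lines are skipped).
    """
    names = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        # SAM flag bit 0x4 marks an unmapped read; RNAME "*" means no reference.
        unmapped = bool(flag & 0x4) or rname == "*"
        if unmapped or rname in decoy_contigs:
            names.append(qname)
    return names
```

Reads aligned to any other contig are left out, so only unmapped and decoy-mapped reads would be passed on for viral classification.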
Thanks again for the tool!
Kind regards,
Sam
@selkamand Are you aware of any other common reference genomes that contain viral decoys?
The GRCh38 human reference also has an EBV contig (chrEBV).
A bunch of the different versions include this EBV contig. The ref we use in our GRCh38 pipelines is based on:
ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa
I don't know of any non-human examples, I'm afraid.
Thanks for VirusBreakend, it's a really nice tool!