ENH: analyze mutation profile of a single variant to a reference #767

Open
AnonymousUserUse opened this issue Apr 5, 2023 · 1 comment

AnonymousUserUse commented Apr 5, 2023

To illustrate my feature request, I would like to start with a situation that I (and probably many other people) run into:
For example, when I want to analyze the mutation profile of XBB.1.16*, the display on the CoV-Spectrum website is not very clear: too many mutations are shown, and it is hard to distinguish the mutations inherited from the parental lineage, the mutations acquired on top of the parental lineage, and the diversity within the lineage.

image

My feature request for the CoV-Spectrum website is to let users compare the mutation profile of a variant to a reference sequence, e.g. the reference sequence of a PANGO lineage in the Nextclade reference tree.
For example, to analyze the mutation profile of XBB.1.16*, one could choose the reference sequence of XBB.1 to view the mutations on top of the parental lineage together with the diversity within the lineage, or choose the reference sequence of XBB.1.16 to view the diversity within the lineage directly.
If a variant has reversions relative to the chosen reference, they would be included in the mutation profile. For example, if one chooses XBB.1.15 as the reference to analyze XBB.1.16*, the mutation profile would include the reversion ORF1a:V540A.
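To make the idea concrete, here is a minimal Python sketch of how a mutation list given relative to the wild type could be re-expressed relative to another reference. It is only an illustration of the request, not how CoV-Spectrum or Nextclade work internally: the names `Mutation`, `parse` and `rebase` are hypothetical, and the mutation lists in the usage example are toy inputs, not real lineage definitions.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    gene: str
    ref: str   # residue in the sequence we compare against
    pos: int
    alt: str   # residue in the analyzed sequence

    def __str__(self) -> str:
        return f"{self.gene}:{self.ref}{self.pos}{self.alt}"


# Accepts strings such as "ORF1a:A540V"; "-" and "*" stand for deletion/stop.
_PATTERN = re.compile(r"^([^:]+):([A-Z*-])(\d+)([A-Z*-])$")


def parse(text: str) -> Mutation:
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot parse mutation: {text!r}")
    gene, ref, pos, alt = match.groups()
    return Mutation(gene, ref, int(pos), alt)


def rebase(sample: list[str], reference: list[str]) -> list[str]:
    """Re-express `sample` relative to `reference`.

    Both inputs are mutation lists relative to the wild type.
    """
    sample_by_site = {(m.gene, m.pos): m for m in map(parse, sample)}
    ref_by_site = {(m.gene, m.pos): m for m in map(parse, reference)}
    result = []
    for site in sorted(sample_by_site.keys() | ref_by_site.keys()):
        gene, pos = site
        s = sample_by_site.get(site)
        r = ref_by_site.get(site)
        if s is not None and r is not None:
            # Both mutated at this site: report only if the residues differ.
            if s.alt != r.alt:
                result.append(str(Mutation(gene, r.alt, pos, s.alt)))
        elif s is not None:
            # Mutation on top of the reference.
            result.append(str(s))
        else:
            # Reference is mutated, sample is wild type: a reversion.
            result.append(str(Mutation(gene, r.alt, pos, r.ref)))
    return result


# Toy inputs (hypothetical, for illustration only).
print(rebase(["S:F486P", "ORF9b:I5T"], ["S:F486P", "ORF1a:A540V"]))
# → ['ORF1a:V540A', 'ORF9b:I5T']
```

Shared mutations drop out, mutations on top of the reference stay as they are, and a mutation present only in the reference shows up as a reversion, which is the behaviour described for ORF1a:V540A above.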

I am aware that the "Compare variants" function can be used to compare two or more variants, but it does not show the proportion of each mutation: mutations present in a significant proportion of a variant are simply included in the mutation list (e.g. ORF1b:T1050N in BA.5.2), and mutations with a low proportion are ignored.
Therefore, I think it would be helpful to extend the feature "Substitutions and deletions" in the column "Analyze single variant" with the option to choose a reference sequence based on a PANGO lineage in the Nextclade reference tree, so that the mutation profile of a variant can be analyzed more easily.
Alternatively, the feature could be integrated into the column "Compare variants to a baseline" for the case where only one variant is compared with a Nextclade PANGO lineage, or added as a new column "Analyze single variant to a reference". Still, I think integrating it into the feature "Substitutions and deletions" would be the best choice.
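The difference between a proportion view and a plain mutation list can be sketched as follows. This is a hedged illustration only: the function names, the cutoff value and the toy sequences are all assumptions, and the cutoff actually used by "Compare variants" is not stated here.

```python
from collections import Counter


def mutation_proportions(sequences: list[list[str]]) -> dict[str, float]:
    """Fraction of sequences that carry each mutation."""
    if not sequences:
        return {}
    # set(seq) guards against a mutation being listed twice for one sequence.
    counts = Counter(m for seq in sequences for m in set(seq))
    total = len(sequences)
    return {mutation: count / total for mutation, count in counts.items()}


def thresholded_list(proportions: dict[str, float], min_proportion: float) -> list[str]:
    """Plain mutation list: keeps frequent mutations, drops the rest."""
    return sorted(m for m, p in proportions.items() if p >= min_proportion)


# Toy sequences (hypothetical, not real data).
sequences = [
    ["ORF1b:T1050N", "S:F486P"],
    ["ORF1b:T1050N"],
    ["ORF1b:T1050N", "ORF9b:I5T"],
    ["ORF1b:T1050N"],
]
proportions = mutation_proportions(sequences)
print(proportions["S:F486P"])              # proportion view keeps rare mutations
print(thresholded_list(proportions, 0.5))  # list view with a hypothetical 50% cutoff
```

With a cutoff, the rare mutations disappear from the list entirely, whereas the proportion view still shows how common each one is within the variant.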

SARS-CoV-2 is expected to keep evolving. If the wild type is always used as the reference, the list of mutations will keep getting longer. Being able to choose a reference sequence based on a PANGO lineage in the Nextclade reference tree would therefore make sense in the long run.

I hope that my request can be considered. Thanks a lot in advance.

chaoran-chen (Member) commented

Thanks for suggesting it! I like the idea and will add it to the backlog. However, as the task is not entirely trivial, it will take a while until we are able to address it.
