Add stitched consensus #745

Closed
2 tasks done
donkirkby opened this issue Aug 25, 2021 · 1 comment

Comments


donkirkby commented Aug 25, 2021

When MiCall assembles several contigs, users often want our best guess at a complete consensus. Add a new output file, conseq_stitched.csv, that has several lines for each sample. All consensus sequences are aligned against a reference, then overlaps are combined and gaps are filled with x or the reference sequence.

  • below100 - gaps or coverage below 100 are filled with x.

We have decided not to pursue these other options for now:

  • below1 - gaps are filled with x.
  • below10 - gaps or coverage below 10 are filled with x.
  • filled - gaps are filled with the reference sequence.

Gaps at the start or end are not filled.
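
A minimal sketch of the below100 strategy, to make the rule concrete. It is not MiCall's code: the function name `fill_gaps`, the `-` gap character, and the per-position coverage list are assumptions for illustration. It also assumes that "start or end" means everything outside the first and last covered positions.

```python
def fill_gaps(aligned, coverage, min_coverage=100, gap='-', fill='x'):
    """Fill internal gaps and low-coverage positions with `fill`.

    aligned: consensus aligned to the reference, with `gap` where no
        contig covers a position (hypothetical representation).
    coverage: read depth at each aligned position, same length.
    """
    if len(aligned) != len(coverage):
        raise ValueError('aligned and coverage must have the same length')

    covered = [i for i, nuc in enumerate(aligned) if nuc != gap]
    if not covered:
        return aligned  # nothing was assembled, so nothing to fill

    first, last = covered[0], covered[-1]
    result = list(aligned)
    # Only touch positions between the first and last covered base,
    # so gaps at the start or end are left unfilled.
    for i in range(first, last + 1):
        if aligned[i] == gap or coverage[i] < min_coverage:
            result[i] = fill
    return ''.join(result)


aligned = '--ACG--TTA-'
coverage = [0, 0, 150, 120, 50, 0, 0, 200, 300, 100, 0]
print(fill_gaps(aligned, coverage))  # --ACxxxTTA-
```

The below1 and below10 options would be the same call with `min_coverage=1` or `min_coverage=10`.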

Another option is to add these as more lines in conseq_all.csv, with an extra column to describe the gap strategy.

Still to do:

  • Add insertions into the stitched consensus.

CBeelen commented Oct 19, 2021

As of commit 0578d8e, there is now a stitched consensus, created by stitching together the nucleotide sequences of all regions, starting at the 5' end. If the sequences for different regions disagree, we go with the region closer to the 5' end, for now.
The consensus is stored in a new output file, conseq_stitched.csv, with a coverage cutoff of 100 and seven different mixture cutoffs (MAX, 0.01, 0.02, 0.05, 0.1, 0.2, and 0.25).
In the current version, insertions are left out, but they will be added in another update soon. Deletions are ignored, i.e., there are no dashes to represent them.
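
For readers who want a picture of the 5'-first rule, here is a rough sketch. It is not the code from commit 0578d8e: `stitch_regions`, the `(start, sequence)` tuples with 0-based reference coordinates, and the use of x for uncovered positions between regions are all assumptions made for illustration.

```python
def stitch_regions(regions, fill='x'):
    """Stitch region sequences into one consensus, 5' regions winning.

    regions: list of (start, sequence) pairs, where start is a 0-based
        position in reference coordinates (hypothetical representation).
    """
    stitched = {}
    # Visit regions from the 5' end; sorted() is stable, so regions with
    # the same start keep their input order.
    for start, seq in sorted(regions, key=lambda region: region[0]):
        for offset, nuc in enumerate(seq):
            # setdefault keeps the nucleotide from the region closer to
            # the 5' end when two regions overlap and disagree.
            stitched.setdefault(start + offset, nuc)

    if not stitched:
        return ''
    first, last = min(stitched), max(stitched)
    # Assumption: positions that no region covers are written as `fill`.
    return ''.join(stitched.get(pos, fill) for pos in range(first, last + 1))


regions = [(3, 'GGTTA'), (0, 'ACGTA'), (10, 'CC')]
print(stitch_regions(regions))  # ACGTATTAxxCC
```

In the example, the first two regions overlap at positions 3 and 4 and disagree there, so the stitched sequence keeps `TA` from the region that starts at position 0.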
