Mask all ambiguous #682
Codecov Report
@@            Coverage Diff             @@
##           master     #682      +/-   ##
==========================================
- Coverage   30.34%   30.28%   -0.07%
==========================================
  Files          40       40
  Lines        5549     5567      +18
  Branches     1349     1355       +6
==========================================
+ Hits         1684     1686       +2
- Misses       3805     3821      +16
  Partials       60       60
This PR replaces sites in the root sequence that have The root sequence json now looks like this
How exactly we handle the mask is up for debate, but I do think we should mask completely undetermined sites.
CI fails because I added a field to
@rneher I didn't get a chance to fully test this today, but the logic of what you're describing here makes a lot of sense. I'll try to finish my review tomorrow morning and can also push a fix for the test (maybe just to make it less brittle instead of more specific).
Adds the new "mask" attribute to the ancestral sequences JSON for the Zika build.
Adds a comment to describe what's happening and why in the new masking feature. Also, makes minor stylistic changes for readability.
This worked for me as expected with ncov and flu builds. I added a comment to the main masking logic so that future us can understand what's happening there and why, and I updated the functional test to expect the new mask key in the ancestral output JSON. When the CI tests have finished, I'll merge this.
Description of proposed changes
Some workflows mask parts of the alignment, but standard ML ancestral inference will still pick a "most likely" state among the equally likely states at these positions. This PR masks sites without information before exporting.
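
The idea can be illustrated with a minimal sketch. It does not reproduce the code in this PR: the function names, the set of characters treated as "no information", and the boolean-list form of the mask are all assumptions made for the example. It marks alignment columns where every sequence is ambiguous and overwrites the inferred root state at those columns with `N`.

```python
# Hypothetical sketch, not the implementation from this PR.
# Assumption: these characters carry no information about the site.
AMBIGUOUS = frozenset("Nn-?")


def undetermined_sites(alignment):
    """Return one boolean per column: True if every sequence is ambiguous there."""
    sequences = list(alignment.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    return [all(seq[i] in AMBIGUOUS for seq in sequences) for i in range(length)]


def mask_sequence(sequence, mask, mask_char="N"):
    """Replace the state at every masked position with mask_char."""
    return "".join(mask_char if m else base for base, m in zip(sequence, mask))


# Toy alignment: columns 4-6 contain no information in any sequence.
alignment = {
    "tip_A": "ACGTNN-A",
    "tip_B": "ACNTNN-A",
    "tip_C": "ANGTNNNA",
}
# An ML inference would still report some state at the uninformative columns.
inferred_root = "ACGTACGA"

mask = undetermined_sites(alignment)
masked_root = mask_sequence(inferred_root, mask)
print(masked_root)
```

Columns where at least one sequence has a determined state are left untouched, so only sites that are undetermined across the whole alignment are masked.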