Update annotate_freq and qual_hists, add split_vds and compute_freq_by_strata #571
Just some areas where documentation needs to be added.
gnomad/utils/annotations.py
Compute call statistics and, when passed, entry aggregation function(s) by strata.

The computed call statistics are AC, AF, AN, and homozygote_count. Downsamplings are
added to the strata when downsamplings when passed. The entry aggregation functions
- added to the strata when downsamplings when passed. The entry aggregation functions
+ added to the strata when `downsamplings` is passed. The entry aggregation functions
Just a small request
Small doc string changes
Co-authored-by: jkgoodrich <33063077+jkgoodrich@users.noreply.github.com>
LGTM!
This updates annotate_freq to use the array aggregation functionality in Hail (originally added by Tim in #537) and adds the ability to add entry aggregation annotations. It also cleans up the function by splitting the existing annotate_freq into two functions: annotate_freq, which now calls the new compute_freq_by_strata.
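For readers unfamiliar with what "call statistics by strata" means, here is a minimal plain-Python sketch, not the Hail implementation in this PR. It assumes a single biallelic site, and the function name `call_stats_by_strata` and the `pop=...` group labels are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def call_stats_by_strata(genotypes, sample_strata):
    """Tally AC, AN, AF and homozygote_count per stratum at one biallelic site.

    genotypes: one allele tuple per sample, e.g. (0, 1), or None if missing.
    sample_strata: for each sample, the group labels it belongs to.
    (Illustrative stand-in; names and input shapes are assumptions.)
    """
    stats = defaultdict(lambda: {"AC": 0, "AN": 0, "homozygote_count": 0})
    for gt, groups in zip(genotypes, sample_strata):
        if gt is None:
            continue  # missing calls contribute to no statistic
        n_alt = sum(1 for allele in gt if allele == 1)
        for group in groups:
            s = stats[group]
            s["AC"] += n_alt
            s["AN"] += len(gt)
            # Count diploid calls where both alleles are the alternate allele.
            s["homozygote_count"] += int(len(gt) == 2 and n_alt == 2)
    for s in stats.values():
        s["AF"] = s["AC"] / s["AN"] if s["AN"] else None
    return dict(stats)


genotypes = [(0, 1), (1, 1), None, (0, 0)]
sample_strata = [
    ("all", "pop=afr"),
    ("all", "pop=nfe"),
    ("all", "pop=nfe"),
    ("all", "pop=afr"),
]
result = call_stats_by_strata(genotypes, sample_strata)
```

Each sample contributes to every stratum it belongs to, so one pass over the genotypes fills all groups at once, which is the same idea as aggregating into an array with one element per stratum.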
This also adds split_vds_by_strata, which returns a list of VDSs, one for each unique value of the passed expression.
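The splitting behaviour can be pictured with a small plain-Python sketch. This is only an illustration of "one output per unique strata value": it partitions sample IDs rather than filtering a real VDS, the name `split_samples_by_strata` is hypothetical, and returning the groups in sorted order is an assumption made here to keep the output deterministic.

```python
def split_samples_by_strata(sample_ids, strata_values):
    """Partition sample IDs into one list per unique strata value.

    Stand-in for splitting a VDS: each returned list represents the samples
    that one of the resulting datasets would keep.
    """
    groups = {}
    for sample_id, value in zip(sample_ids, strata_values):
        groups.setdefault(value, []).append(sample_id)
    # Sorted order is an assumption of this sketch, not documented behaviour.
    return [groups[value] for value in sorted(groups)]


sample_ids = ["s1", "s2", "s3", "s4"]
strata_values = ["exome", "genome", "exome", "genome"]
splits = split_samples_by_strata(sample_ids, strata_values)
```

With two unique values in `strata_values`, the sketch returns two lists, mirroring the "as many VDSs as unique values" description.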
We also add the ability to return a single struct of qual histograms, containing separate structs of raw and adj qual histograms.
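To show the shape of such a combined result, here is a hedged plain-Python sketch. The field names (`raw_qual_hists`, `qual_hists`, `gq_hist_all`, `dp_hist_all`), the 0 to 100 range with 20 bins, and the precomputed `adj` flag on each entry are all assumptions for illustration, and the `histogram` helper is only loosely modeled on a fixed-width histogram aggregator.

```python
def histogram(values, start, end, bins):
    """Fixed-width histogram; values outside [start, end] are tallied separately."""
    width = (end - start) / bins
    bin_freq = [0] * bins
    n_smaller = n_larger = 0
    for v in values:
        if v < start:
            n_smaller += 1
        elif v > end:
            n_larger += 1
        else:
            # A value equal to `end` falls in the last bin.
            bin_freq[min(int((v - start) / width), bins - 1)] += 1
    return {"bin_freq": bin_freq, "n_smaller": n_smaller, "n_larger": n_larger}


def qual_hists(entries):
    """Return one struct holding separate raw and adj histogram structs."""

    def hists(subset):
        return {
            "gq_hist_all": histogram([e["GQ"] for e in subset], 0, 100, 20),
            "dp_hist_all": histogram([e["DP"] for e in subset], 0, 100, 20),
        }

    adj_entries = [e for e in entries if e["adj"]]
    return {"raw_qual_hists": hists(entries), "qual_hists": hists(adj_entries)}


entries = [
    {"GQ": 15, "DP": 8, "adj": False},
    {"GQ": 40, "DP": 30, "adj": True},
    {"GQ": 99, "DP": 120, "adj": True},
]
hist_struct = qual_hists(entries)
```

The raw histograms cover every entry, while the adj histograms cover only entries flagged as passing the adj filter, so both can be read from a single returned struct.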